Thursday, April 26, 2007

Weaning yourself off the keyboard

Today, I'll be talking about Vista's new speech recognition technology. Unfortunately, it's received a lot of bad press in the last few months, mostly due to disastrous technical difficulties experienced at a Microsoft-hosted demonstration. Perhaps you’ve seen the video online. Everyone loves to laugh at and/or trash Microsoft. I’m not going to comment either way right now as that’s not the focus of this particular post. What I do hope to achieve is to draw attention to a very usable tool. Vista’s built-in speech recognition has enabled me to reduce my use of both keyboard and mouse by over 90%. To a sufferer of repetitive strain injury, this is an astonishing relief.

I won't be going over the setup process as it is very simple and other web sites already have excellent introductions (check out ExtremeTech's view of it: Vista's Speech Recognition).

Don’t get me wrong. Microsoft has not perfected speech recognition. What they have done is integrate it at the OS level so that anyone could technically do away with both keyboard and mouse. As makers of the operating system, perhaps they're the only ones who could have brought this about. I’m not technical enough to know if that’s true. I have seen how Dragon NaturallySpeaking has approached this, but beyond dictation it seems to be quite awkward controlling anything else (although I do know it's possible to program as many custom macros as you need with the Pro edition, but who wants to spend all that time doing that?).

The magic of Vista's speech recognition is in controlling programs. I can surf the web, respond to emails, and control every aspect of the operating system with my voice. Sometimes it really does feel like magic. The first time I was able to "click" on a link just by reading it I was sold. The first time the program threw up numbers on the screen to figure out which link I meant, I thought "Now that makes sense!"

Disambiguation or "Which one?"
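
For the programmers reading along, here is roughly how I imagine the numbering trick could work. This is only my own sketch in Python, not Microsoft's actual code: the function names and the sample links are made up for illustration, and I'm assuming a simple "does the link text contain what I said" match.

```python
def find_matches(spoken, link_texts):
    """Return the links whose text contains the spoken phrase (case-insensitive)."""
    phrase = spoken.strip().lower()
    return [text for text in link_texts if phrase in text.lower()]


def disambiguate(spoken, link_texts):
    """Return the single matching link, or a dict of numbered candidates."""
    matches = find_matches(spoken, link_texts)
    if len(matches) == 1:
        # Only one link fits, so it can be "clicked" right away.
        return matches[0]
    # Zero or several links fit: number the candidates so the user can say a number.
    return {number: text for number, text in enumerate(matches, start=1)}


# Hypothetical links on a page (invented for this example).
links = ["Read more", "Read comments", "Contact"]

print(disambiguate("contact", links))   # one match, returned directly
choices = disambiguate("read", links)   # two matches, so they get numbered
print(choices)
print(choices[2])                       # what saying "two" would pick
```

Saying "contact" matches a single link, so there is nothing to ask. Saying "read" matches two links, so each one gets a number and you pick by saying it.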

To be fair, Vista's speech recognition is not compatible with every program. Now before you cry foul about Microsoft only making their products compatible with other Microsoft products, you might be amused to find that their speech recognition isn't even fully compatible with all of their own programs. Hopefully this will be a nonissue in the coming months. But for now, I've had to abandon some of my favourite programs in order to reach that 90% plateau I mentioned earlier.

I had to give up my beloved Opera for browsing. Of course, Internet Explorer is completely compatible, but I really hate to use it. Luckily, Firefox is almost as compatible once "Enable dictation everywhere" is checked under speech options (just say "show speech options" then "options"). The only difference is that it will pop up a confirmation box whenever entering text. I had to change my RSS reader (Opera again) to GreatNews. And I had to change my e-mail program (yeah you guessed it, Opera again) to Windows Mail, which is the only one I found that worked without popping up those aforementioned confirmation boxes, which are too annoying for dictating e-mail.

Entering text in Firefox

This brings us to dictation. I wouldn't even attempt to dictate before going through the subsequent training you can do after the initial setup. For best results, I would recommend using a headset that connects through USB rather than through your computer's sound card. I bought a used Sennheiser PC165, which has the added benefit of noise cancellation technology, which doesn't hurt. If you already have a cheapish headset, try it out to see how well it works. Just know that if you get too frustrated, a better headset will improve the situation. If you don't already have a cheapish headset, don't waste your money on one.

Now if you follow the above advice, you'll find the dictation accuracy of Vista's speech recognition to be quite good. It is nowhere near perfect, but it will only improve the more you use it. If I were to judge by dictation alone, Dragon NaturallySpeaking is much more accurate at getting it just right. But with its $200 price tag along with its previously mentioned shortcomings, the gap in accuracy doesn't seem to be so wide. With a little patience and a little practice in learning how to dictate to your computer (longer phrases rather than word by word work best for me), using it will become as natural as typing. If it hasn't already occurred to you, I of course use it to dictate all of my posts.

To sum up, with Vista there is finally a speech recognition solution that can be used all the time. I have no doubt that this will be the first of many posts on this topic. I am excited that speech recognition is finally entering the mainstream. I can only hope that enough enthusiasm is garnered by this promising beginning that Microsoft and other operating system makers continue to pour resources into this important technology. (Dare I suggest that this is the real reason that Mac has postponed the release of its next OS?)

As always, feel free to comment.
