Alexa, can you understand me?

Q: Is there a voice recognition system that can replace typing that actually works well?

A: Voice recognition has advanced a great deal since Bell Labs developed “Audrey” in the 1950s, a system that could only recognize digits spoken by a single user, but it’s still far from what you’ve seen in science fiction movies.

Commands vs. Dictation

Voice command platforms, like those used by many automated phone systems, are reasonably effective because they severely limit the number of verbal commands you can use.

Natural speech recognition is what most people want, and that challenge has yet to be met well enough for the technology to be widely adopted.

We’re surrounded by options that offer some form of voice command and/or recognition from Apple, Google and Amazon, but they are far from perfect, as we all well know.

Accurate dictation has been the challenge that many very sophisticated companies, including IBM, have been trying to solve for over 60 years.

To put the problem into perspective, a system with 90 percent accuracy gets every 10th word wrong. At 95 percent accuracy, the ratio is 1 in 20 words being incorrect; even at 98 percent accuracy, we’re still looking at 1 in 50 words being incorrect.

With an average paragraph in the 100- to 150-word range, you can start to see how the time we may save in generating the text can get eaten up in editing what was captured.
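
To make the arithmetic concrete, here is a minimal Python sketch that estimates how many wrong words to expect in a paragraph at a given accuracy. The function names and the simple model (every word has the same chance of being misrecognized) are assumptions for illustration only; real dictation errors tend to cluster around unusual words, names and noisy audio.

```python
def words_per_error(accuracy_percent):
    """Average number of words dictated per misrecognized word."""
    if not 0 <= accuracy_percent < 100:
        raise ValueError("accuracy_percent must be at least 0 and below 100")
    return 100 / (100 - accuracy_percent)


def expected_errors(word_count, accuracy_percent):
    """Expected number of wrong words in a passage of word_count words.

    Assumes every word is equally likely to be wrong (a simplification).
    """
    if not 0 <= accuracy_percent <= 100:
        raise ValueError("accuracy_percent must be between 0 and 100")
    return word_count * (100 - accuracy_percent) / 100


if __name__ == "__main__":
    # Accuracy levels and paragraph lengths taken from the discussion above.
    for accuracy in (90, 95, 98):
        ratio = words_per_error(accuracy)
        short = expected_errors(100, accuracy)
        long_ = expected_errors(150, accuracy)
        print(
            f"{accuracy}% accurate: 1 error every {ratio:.0f} words, "
            f"about {short:g} to {long_:g} corrections per paragraph"
        )
```

Under these assumptions, even a 98 percent accurate system leaves two or three corrections in every paragraph, and each one has to be found before it can be fixed.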

Throw in other factors like our voices changing when we’re sick, various accents, the speed at which we speak and a host of other variables, and you start to understand how much more sophisticated a processor the human brain is.

The Context Problem

Another huge challenge is context, both in command and dictation technology. Google recently started to bridge the context gap with their latest Google Assistant technology that allows you to have more of a conversation.

For example, you can ask “Do I need an umbrella today?” and after it responds, you can follow up with, “What about tomorrow?”

Another advance in context is being made possible by what many consider the “creepy” factor of today’s technology. Our smartphones can remember virtually everything we’ve done in the past, and they know our current location and what we’ve been searching for online or in a mapping program, so they can use this additional information to better understand our verbal commands.

Tips for Being Successful

If dictation is your key need, the company that’s been at it the longest, as far as a consumer product goes, is Dragon NaturallySpeaking. 

As good as the program is, expecting to install the software and have it magically become your new way of typing will guarantee failure. In a sense, you are going to be learning a new language.

If you aren’t willing to take the necessary time to train yourself to learn how to speak to your computer, you shouldn’t bother spending the money.

You’ll also need to make sure that you have the proper hardware to be successful, such as enough processing power, RAM and a good microphone, so be sure to review the system requirements before taking the plunge.

Ken Colburn is founder and CEO of Data Doctors Computer Services. Ask any tech question on Facebook or Twitter.
