Usability issues with Speech Interfaces
Speech is a wonderful thing, but it's very difficult to design a speech interface that works well.
We usually can't tell whether a typed message was written by a person or generated by a computer, but computer-generated speech is very easy to identify.
Firstly, most of the speech recognition and delivery engines haven't quite got the logic behind them yet, so the computer often responds in a way that doesn't make sense to the user.
Secondly, the way you have to speak to the computer or telephone (slow, steady and sometimes in an American accent) may be unnatural.
Thirdly, the way the computer talks may not be smooth or have the right intonation and, again, the interaction doesn't 'sound' or feel natural.
There are a number of issues that need to be addressed in relation to the usability of speech interfaces and some of these are outlined below.
- Input vs. navigation
In general, a speech interface is best for continuous text entry, while manipulating and navigating data are best left to other modes of interaction such as pointing devices.
Can speech replace the keyboard as a text input method? Numbers are one example of the issues that are going to arise. On a keyboard it's just as easy to hit the numeral as to type out the word, but when speaking, the two have to be differentiated by adding more words (should a spoken "number nine" be written as the words "number nine", as "9" or as "number 9"?). However, as a continuous text entry method, speech is pretty good.
As an editing method, speech is a poor cousin to the pointing device. While it's easy to talk, it's hard to point with your voice. Entering text and navigating often require a dual input method (like the mouse and keyboard combination). One answer could be to combine speech text entry with a mouse, stylus, tablet, touch-screen or eye tracker, so that you highlight the text you want to change and then speak the words you want in its place.
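The numbers problem above can be made concrete with a small sketch. It lists the different ways one dictated phrase could be written down, which is the choice a dictation system has to make on the user's behalf. The `DIGITS` table and the `candidate_renderings` function are hypothetical names made up for this illustration, and treating a leading "number" as a command is an assumption, not how any particular product works.

```python
from typing import List

# Hypothetical lookup table for this sketch; a real system would need a far larger grammar.
DIGITS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def candidate_renderings(spoken: str) -> List[str]:
    """Return the distinct ways a dictated phrase could be written down."""
    words = spoken.lower().split()

    # 1. The literal words, e.g. "number nine".
    candidates = [" ".join(words)]

    # 2. The same phrase with number words written as digits, e.g. "number 9".
    with_digits = " ".join(DIGITS.get(word, word) for word in words)
    if with_digits not in candidates:
        candidates.append(with_digits)

    # 3. Assumption: a leading "number" is a spoken command meaning
    #    "write the next word as a digit", e.g. "9".
    if len(words) == 2 and words[0] == "number" and words[1] in DIGITS:
        candidates.append(DIGITS[words[1]])

    return candidates


if __name__ == "__main__":
    for rendering in candidate_renderings("number nine"):
        print(rendering)
```

A keyboard user makes this choice simply by which keys they press. A speech user has to make it with extra words, or correct the system's guess afterwards.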
- Errors
Of course, if the computer can't recognise what you are trying to say, there are a few things it can do: carry out the task with what it thinks you wanted, repeat the question (possibly in a different way) or do nothing. The first option is annoying because you have to correct the mistake, the second soon becomes tedious, and the third might leave you wondering whether the computer is just sitting there thinking.
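One way to picture the choice between these options is as a simple decision rule. The sketch below assumes the recogniser reports a confidence score between 0 and 1, which is an assumption and not something every engine exposes in this form. The thresholds and names (`ACCEPT_THRESHOLD`, `MAX_ATTEMPTS`, `choose_action`) are invented for illustration. Instead of silently doing nothing, the third branch stops and says so.

```python
from enum import Enum


class Action(Enum):
    EXECUTE = "carry out the best guess"
    REPROMPT = "ask the question again, reworded"
    GIVE_UP = "stop and tell the caller what is happening"


# Hypothetical tuning values; real values would come from testing with users.
ACCEPT_THRESHOLD = 0.8
MAX_ATTEMPTS = 2


def choose_action(confidence: float, attempts: int) -> Action:
    """Pick a response to an utterance, given recogniser confidence and prior re-prompts."""
    if confidence >= ACCEPT_THRESHOLD:
        # Confident enough that correcting a wrong guess should be rare.
        return Action.EXECUTE
    if attempts < MAX_ATTEMPTS:
        # Not confident: ask again, but only a limited number of times.
        return Action.REPROMPT
    # Out of retries: say so, so the caller is not left wondering.
    return Action.GIVE_UP


if __name__ == "__main__":
    print(choose_action(0.9, 0).value)
    print(choose_action(0.4, 0).value)
    print(choose_action(0.4, 2).value)
```

Where the thresholds sit is a design trade-off: accept too readily and users spend their time correcting mistakes, too reluctantly and they spend it repeating themselves.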
- Tuning to the target market
The success of a speech recognition system is also highly dependent upon the vocabulary of users, especially when it comes to jargon and slang. Even within Australia, certain services may be aimed at specific populations or services, allowing the recognition engine to be more finely tuned and therefore more accurate.
The classic issue with speech is accent. The early speech dictation systems that were developed in the US were tough to use in Australia, as users had to speak in an American accent. Even within countries, there are different dialects that need to be catered for.
It's important to remember that it's not only the vocabulary that is different but also the order of the words and the order in which the information is collected.
- Customer understanding
It's important to remember how the customer thinks of your products and how they might want to access them. That is: what terms might they use, what could they confuse it with, how could they mispronounce it, and how can you incorporate this information into the system so that it can compensate for variations?
- Interaction time
The speed of the system is also important to the effectiveness of the process. If it takes all day to get through the process, people are likely to use other methods such as going into a branch, or to get around the system by learning a path to the operator. This is not necessarily the most cost-effective way of dealing with the customer.
- Expert vs novice
It is important that the process is appropriate both for expert customers who may be familiar with the system and for first-time users. Some IVR systems provide two different paths for different customers based on familiarity. However, most systems should be straightforward enough for everyone to use.
Directed Dialogue is a term that Speechworks uses to describe the shaping of the information that the customers need to provide to the system. It allows the computer to elicit the information it requires, at a level that is appropriate. If you answer outside the allowable inputs, it rephrases the question and slowly cuts back the complexity of the questions. The computer works out the path the customer is taking and prompts with the questions it needs answered.
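The rephrase-and-simplify behaviour described here can be sketched in a few lines. This is a minimal illustration of the idea, not Speechworks' implementation. The prompt wording, the allowed answers and the `directed_prompt` and `listen` names are all made up for the example, and `listen` stands in for whatever plays a prompt and returns the recognised reply.

```python
from typing import Callable, Optional, Sequence, Set

# Hypothetical prompts, ordered from most open to most constrained.
PROMPTS = [
    "What would you like to do today?",
    "Would you like balances or payments?",
    "Please say just one word: 'balances' or 'payments'.",
]

ALLOWED = {"balances", "payments"}


def directed_prompt(
    prompts: Sequence[str],
    allowed: Set[str],
    listen: Callable[[str], str],
) -> Optional[str]:
    """Ask each prompt in turn, simplest last, until the reply is an allowed input."""
    for prompt in prompts:
        reply = listen(prompt).strip().lower()
        if reply in allowed:
            return reply
        # Reply fell outside the allowable inputs: fall through to the
        # next, more tightly worded prompt.
    return None  # No usable answer; the caller decides what happens next.


if __name__ == "__main__":
    # Scripted caller for demonstration: the first reply is out of grammar.
    scripted = iter(["erm, my money", "Balances"])
    print(directed_prompt(PROMPTS, ALLOWED, lambda prompt: next(scripted)))
```

An expert caller who answers the first open question correctly never hears the simpler prompts, while a novice is walked towards an answer the system can use.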
- Feedback
We can also expect feedback through speech. Spoken confirmation that something has taken place can be handy, especially if the task is happening in the background. It would be nice to be asked questions to confirm the information, but the process has to be faster than reading and the system has to be able to repeat immediately if necessary. Because speech is transient, it requires attention at a specific time, unlike reading the screen, which you can do at any time.
The Speechworks solutions also provide feedback to the customer while the computer is processing the speech and the request. This is very important as it informs the customer that something is happening. The sound itself is quite amusing and sounds like a coffee pot percolating.
The delivery speed of synthesised or concatenated speech is also important to the interaction. Too slow and it can be tedious but too fast and you have to concentrate really hard or keep pressing the hash key to repeat the menu. Some speech feedback, like the Mac OS Text to Speech, can be sped up or slowed down to suit the needs of the listener, making it a much more effective interface.
- Support information and tools
The customer experience may often include other sources of information or items that need to be taken into account. The classic example of this is the phone bill that needs to be paid, which has all the details for paying by credit card. Some companies have their pay-by-credit-card telephone number on the back while the details you need to read are on the front, meaning that you are trying to listen to the IVR while hunting for information that is spread out over both sides of the piece of paper. Orange has sorted this out and has all the information you need (including the total cost) on the back for easy reference. It's often the little things like this that may not be obvious to the companies implementing the service but quickly show up when you actually watch customers do it.
- Customer acceptance
Tim Courtright of Inflection Technologies also points out the importance of how the speech interface is introduced and sold to the customer. Pilot testing is essential but so is letting the customers know the benefits of a speech system. You want all your customers to be aware of how the system works and to buy into the service. Promotions for a service launch provide a great opportunity for companies to get back in touch with the customer, and NLSR provides a real benefit to them. And don't forget your internal customers, such as the call centre agents, who also need to be trained so that they are well aware of what the customer who gets through to them has just experienced.

Don't forget that there are hearing impaired people who cannot hear!!! I'm not impressed by this article about voice communication.
This technology will leave hearing impaired people without jobs that require voice communication and can affect them in many ways.
Hearing impaired user!