PDAs find their own voice

Text-to-speech

Many applications will require speech output as well as input. Although unified messaging providers already offer text-to-speech, these facilities are typically hosted on a server.

Force Computers, an embedded systems manufacturer, is using DecTalk text-to-speech technology in its StrongArm- and Intel-based wireless devices. DecTalk was first developed by Digital Equipment Corporation (DEC) before being acquired by Compaq, which then sold it to Force's parent company, Solectron. Force is still developing the product, says Carl Leber, product manager for DecTalk. 'One of the things that mobile vendors are concerned with is footprint size,' he says. So a product that was once shoe-horned into a PC is now being squeezed into a mobile client.

In the same way that it is unusual for two people to have the same handwriting, it is unlikely that two users' voices will be exactly the same. The software engineer can choose to design a large product that is as comprehensive as possible and resilient in the face of users with different accents and rates of speech. The alternative is a compact product that restricts the number of words it expects at any one time.

Tap-and-talk, the approach used by MiPad, provides context-sensitive speech recognition. However, even this does not reduce the software's footprint to a size that easily fits on a mobile client. With current technology, the most effective way to squeeze voice recognition into a mobile device is to trim the application to the point where it is only looking for specific, and very distinct, words.
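
As a rough illustration of why a restricted vocabulary helps, the sketch below matches a noisy transcription against only the words expected for the field the user has tapped. It is not MiPad's implementation: the field names, word lists and the `recognise` function are hypothetical, and plain string similarity from Python's standard `difflib` module stands in for real acoustic matching.

```python
import difflib

# Hypothetical per-field vocabularies. A real device would load these from
# the application, for example the contact list for a "to" field.
FIELD_VOCABULARY = {
    "to": ["alice", "bob", "carol"],
    "day": ["monday", "tuesday", "wednesday", "thursday", "friday"],
    "command": ["send", "delete", "reply", "cancel"],
}


def recognise(field, heard, cutoff=0.6):
    """Return the closest word in the tapped field's vocabulary, or None.

    `heard` is a rough transcription; string similarity is only a stand-in
    for the acoustic scoring a real recogniser would perform.
    """
    candidates = FIELD_VOCABULARY[field]
    matches = difflib.get_close_matches(heard.lower(), candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    print(recognise("day", "tewsday"))      # searched among five day names only
    print(recognise("command", "tewsday"))  # nothing in this field is close
```

Because each field exposes only a handful of distinct words, the matcher has little to search and little to confuse, which is the trade-off the compact approach relies on.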

DecTalk software originally consisted of 160,000 lines of C code, but Force says that the product is now small enough for use in mobile devices. 'We are licensing the software to people who want to put the module onto chips,' says Leber.

One advantage that text-to-speech has over speech-to-text is that text has less uncertainty, so it is easier to interpret. But speech output depends on a phonetic rule engine, which can sound lumpy and mechanical. On a PC or server this problem can be overcome, to some extent, by providing a larger dictionary and more sounds and words. It is also possible to add extra digital signal processing (DSP) hardware. Neither of these options is easy or, in many cases, possible when implementing speech on a mobile client.
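
The trade-off between rules and dictionary size can be sketched in a few lines. The example below is a toy, not DecTalk's engine: the rule tables, the phoneme labels and the `to_phonemes` function are all assumptions made for illustration. Words found in a small exception dictionary use the stored pronunciation; everything else falls back to crude letter-to-sound rules.

```python
# Toy exception dictionary: words the rules below would get wrong.
# Every extra entry improves the output but costs memory on the device.
EXCEPTIONS = {
    "one": ["W", "AH", "N"],
    "two": ["T", "UW"],
}

# Toy letter-to-sound rules. Two-letter spellings are tried first.
DIGRAPHS = {"sh": "SH", "ch": "CH", "th": "TH", "ee": "IY", "oo": "UW"}
LETTERS = {
    "a": "AE", "b": "B", "c": "K", "d": "D", "e": "EH", "f": "F",
    "g": "G", "h": "HH", "i": "IH", "j": "JH", "k": "K", "l": "L",
    "m": "M", "n": "N", "o": "AA", "p": "P", "r": "R", "s": "S",
    "t": "T", "u": "AH", "v": "V", "w": "W", "y": "Y", "z": "Z",
}


def to_phonemes(word):
    """Convert one word to a list of phoneme labels."""
    word = word.lower()
    if word in EXCEPTIONS:
        return list(EXCEPTIONS[word])
    phonemes = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            phonemes.append(DIGRAPHS[pair])
            i += 2
        elif word[i] in LETTERS:
            phonemes.append(LETTERS[word[i]])
            i += 1
        else:
            i += 1  # skip characters these toy rules do not cover
    return phonemes


if __name__ == "__main__":
    for w in ("ship", "sheep", "one"):
        print(w, to_phonemes(w))
```

A server can afford a dictionary with many thousands of entries; on a mobile client the same table competes with everything else for memory, so more words are left to the mechanical-sounding rules.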

The choice of whether to put speech software onto the client device or leave it on the server will depend on the particular application. Julia Ferguia, director of communications and product planning at AVT, a company that provides unified messaging systems, says that she sees no reason to cram all the software into the client. '[Processing] happens on our messaging server and over our voice pipe,' she says. On the other hand, this approach would not be suitable for some applications. For example, vendors such as TTPCom point to laws that require hands-free operation when mobile devices are used by drivers.
