Voice recognition: Past, present and future

Talking hubs: voice recognition hits the net

In preparation for this new wave of voice-based Web access, Dr Rolf Schwitter, a lecturer at Sydney's Macquarie University, has integrated training in VoiceXML into an introductory course on Web technology.

"We cover basic speech technologies, speech synthesis, text to speech, then expand into an introduction to VoiceXML," Schwitter explains.

Schwitter believes that, as speech technology becomes more prevalent, developers will need to understand both the engineering requirements of implementing a solution and some of the psychological and linguistic requirements of writing a dialogue flow.

"You have to ask the right questions to get the right information," Schwitter says. "If they are interested after having completed the first phase of the course, we have a unit called Interactive natural language systems, devoted to the subject."

Accordingly, the course material he has been involved with was developed in conjunction with industry partners such as Motorola and Philips.

"We are formulating the course based on what they want from their future employees, and teaching students what they will have to know," Schwitter says.

Traversing the technological plateau

While vendors are characterising the next phase of development in speech recognition technology as application- and integration-focussed, researchers in the field recognise that the technology behind such applications has largely reached a plateau.

Dr Steve Cassidy, senior lecturer in computing at Macquarie, says there is a trend in the academic literature towards discussing what the next quantum leap in the technology might be.

"As far as the vendors are concerned the technology is at the state where you can do lots of useful things with it, and while the researchers are always trying to push things as far as they can, most of the work being done is based on incremental changes," Cassidy says.

Dr David Grayden, research fellow at the Bionic Ear Institute in Melbourne, believes that alongside advances in processing power there have been three main advances in speech recognition technology.

"The first breakthrough was the introduction of data-based approaches, rather than trying to understand every little speech event. Then came dynamic time warping, which enabled the software to compare incoming speech with stored versions of the speech," says Grayden. "Next came hidden Markov models, which allowed continuous speech to be recognised and form the basis of dictation-type models."
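
For readers curious about how dynamic time warping compares incoming speech with stored versions, the following is a toy sketch only, not the method of any particular product. It assumes each utterance has already been reduced to a short list of numbers; real recognisers work on sequences of acoustic feature vectors. The function names and the sample templates are hypothetical, chosen purely for illustration.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    Elements of one sequence may be matched to several elements of the
    other, so the same pattern spoken faster or slower still aligns.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i items of a
    # with the first j items of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


def recognise(incoming, templates):
    """Return the label of the stored template closest to the incoming sequence."""
    return min(templates, key=lambda word: dtw_distance(incoming, templates[word]))


# Hypothetical stored "words" and a slower rendition of one of them.
templates = {"yes": [1, 3, 4, 3, 1], "no": [2, 2, 1, 0, 0]}
slow_yes = [1, 1, 3, 4, 4, 3, 1]
print(recognise(slow_yes, templates))
```

The stretched input still matches its template because repeated values can be absorbed by the warping path at no extra cost, which is the property that made the technique useful for speech spoken at varying speeds.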

While conceding he is probably in the minority among engineering-focussed researchers, Grayden argues that an earlier move away from integrating linguistic physiology into speech recognition research has ultimately proven detrimental.

"In the early days there was a notion that every time the research became more engineering based there was a leap forward in the technology," Grayden says. "I believe that is why it plateaued. There is now a need for a breakthrough, something new that will give us a jump in performance, and I believe it will come from a mixture of skills, including computer engineering, linguistics and physiology."

In the meantime, Cassidy is focussing on training graduates for an employment market where voice application development is likely to provide the bulk of the work opportunities.

"Customer acceptance is growing, and it is fairly inevitable that these kinds of voice systems will take off and that the possibility for a bigger voice industry is already there, even given the limitations of the current technologies," Cassidy surmises.
