This story was printed from ZDNet Australia.
Interfaces of the future
By Stephen Withers, December 10, 2002
URL: http://www.zdnet.com.au/news/business/soa/Interfaces-of-the-future/0,139023166,120270531,00.htm
How long will it be before your computer is able to read your facial expressions? Will a rude gesture become the next Control-Alt-Delete? ZDNet Australia investigates computing interfaces.
It's generally accepted that the last substantial change in user interfaces happened in the mid 1980s when Apple launched the Macintosh, bringing the WIMP (windows, icons, menus, pointer) or GUI (graphical user interface) to the mass market. The arrival of Windows 3.0 in the early 1990s cemented the shift, but since then all we've seen is incremental improvement (some say incremental change, arguing that not all the differences constitute improvements). The science fiction vision of computer systems with advanced user interfaces seems as far off as ever: Clarke and Kubrick's HAL is certainly not last year's model!
But is this implied criticism really fair? The future has a way of creeping up without us noticing. For example, the styling of many contemporary cars bears a strong resemblance to the "cars of the future" that were touted in the 1960s and '70s. And while in-car navigation systems are rarely standard equipment, as an option or aftermarket accessory they cost less than many people spend on audio gear. Similarly, various "futuristic" user interfaces such as virtual reality and speech recognition might not be part of the everyday desktop computing experience, but they have found niches in the IT ecosystem.
Virtual Reality
"The two industries that have really adopted virtual reality are defence, and mining, oil, and gas exploration," Ryner says. For example, virtual reality models built from seismic data are used to collaboratively investigate drilling options. In the military world, command and control systems manage huge amounts of data that are best represented visually.
Some sections of manufacturing are also using VR in a big way. Ryner explains that the automotive industry started by simulating car crashes (to the extent that some models aren't physically crash-tested), moved on to styling projects in VR, and then took design a step further by checking that the necessary assembly and disassembly tasks will be possible. It sounds obvious, but it was only when a new motorcycle model went into production at a once-major manufacturer that anyone noticed that a fully assembled engine couldn't be mounted in the frame.
Using VR for design makes it easier for Australian companies to play globally, Ryner says. Teams in New York, Melbourne, and London can all collaborate on the same data set, resulting in a quicker time to market. "It's become core infrastructure," he says.
Among the emerging markets for VR is "hazard perception and situation awareness," Ryner says. Derived from military work such as dogfight simulations, the idea is to provide an interactive environment for training, because that improves retention. Unlike traditional flight and driving simulators, these systems aren't limited to the physics of the situation but also model the behaviour of the people in the system. The State Rail Authority of NSW has purchased such a system to train railway staff, and the Queensland Police is using the technology for inspector-level training in siege situations. Other applications are being considered, including mine accidents and bushfires.
One reason for using VR in analytical situations is that people tend to be scared of large quantities of data.
That said, senior people aren't always comfortable with new ways of presentation, Ryner says, but the "Nintendo generation" is different. This can be seen in the military, he says, where today's generals are not comfortable with an AI-backed avatar advising them which pieces of available information are most relevant, but tomorrow's generals will be. Another interesting observation about younger people is that they accept lower visual fidelity without questioning the accuracy of the underlying simulation or VR model.
Speech Recognition
While speech recognition has a patchy record in connection with desktop applications, it is enjoying considerable, though not universal, success in phone-based systems. Industry analysts at META Group recently concluded: "Although initial integration and investment cost obstacles remain, speech recognition-enabled contact centres and internal information source capabilities have entered the mainstream and are increasingly penetrating the enterprise as a broader user interface option."
According to James Brooks, managing director of Genesys Laboratories Australasia, interest in speech recognition is largely the result of two trends. The current climate drives organisations to reduce costs at the same time as improving service, and the widespread use of mobile phones (and a propensity to use them to fill in "unproductive" time such as travelling) means speech-based services are attractive to both parties. The gaming industry has been a very successful user of speech recognition, he says, largely because from a technical standpoint, transactions are easily defined and there are not too many variables. From a client's perspective, speech makes it quick and easy to place a bet.
While there have been several very successful speech implementations in the customer contact arena, the technology doesn't have a 100 percent record. "I think some customers have picked the wrong applications to automate," says Brooks. The banks are all looking at augmenting their current tone-based phone banking systems with speech recognition, "but the difficulty is coming up with a business case," he says. The question is whether making phone banking friendlier will lead to more people using it. "All of these systems will have to offer both [tone and speech] options" due to people's need for privacy and problems with background noise, Brooks suggests.
Banks are also considering the use of voice portals to handle incoming calls.
The idea is to let customers say what they're calling about, and then ask them specific questions according to their response in order to route the call to the most appropriate person. Applying tone-based IVR requires too many menus or too many choices to give sufficiently fine granularity. One bank has already trialled such a system: "It's good, it's very good," says Brooks.
Standards such as VXML make it easier and cheaper to integrate speech-based systems with back-end applications, he says. This means much of the work carried out to deliver transactions or information via the Web can be reused, and any subsequent changes can flow through to both channels.
On the corporate side, Brooks predicts growth over the next few years in the use of speech interfaces to human resources systems. A lot of work has been done to provide intranet-based self-service HR systems, but mobile employees might not use an intranet every day. "There are huge opportunities for efficiencies," he says. Genesys is working with companies such as SAP to integrate speech with HR systems.
Dimension Data account manager Haydn Faltyn agrees. While contact centre automation and efficiency represented the "low hanging fruit" for speech recognition, it is applicable to other business issues including remote workforce administration. Dimension Data's strategies currently revolve around leveraging existing applications and infrastructure, and "speech recognition is a sweet spot" as it connects an existing phone network with applications such as SAP and PeopleSoft. Delivering functions such as timesheets, payroll, and rostering via voice as an alternative to Web access "is generating a huge amount of interest," he says. For example, businesses such as cinemas that employ young casual staff have improved on-time attendance by making the rostering system available by phone.
Each employee can be given the following week's roster in an automated phone call, and if any of the proposed shifts cause problems they can bid for a different slot or ask to talk to someone in the HR department. SMS messages can also be sent to remind workers of their rostered shift for the following day. "It's all about supercharging business processes and giving access," says Faltyn.
An organisation with 100,000 employees may have 95,000 casuals these days, and only a small fraction of those might have access to the company's intranet, he suggests. This means there is great potential to reduce costs through automation, especially if a speech interface can be added to existing systems. Other job functions where speech-based interfaces to administrative systems are proving fruitful involve truck drivers, meter readers, power line maintenance workers, and field service technicians, he adds.
Dimension Data has a portfolio of over 20 speech applications that can be quickly and relatively cheaply deployed at new sites. "You're plugging in next to the Web server," says Faltyn, exploiting standards to get interoperability. "When you get it right, speech is absolutely compelling," he says, adding "we've proven that it works."
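
The voice portal idea described earlier in this section (let callers say what they want, then route them) can be illustrated with a very rough sketch. It assumes the speech recogniser has already produced text; the queue names, keyword lists, and the `route` function are all hypothetical, and a production system would rely on a proper grammar rather than keyword counting.

```python
import re
from typing import Dict, Set

# Hypothetical queues and trigger words, invented for this sketch.
ROUTES: Dict[str, Set[str]] = {
    "cards": {"card", "credit", "stolen", "lost"},
    "loans": {"loan", "mortgage", "borrow"},
    "accounts": {"balance", "account", "transfer", "statement"},
}


def route(utterance: str, default: str = "operator") -> str:
    """Pick the queue whose keywords best match the recognised words.

    Falls back to a human operator when nothing matches.
    """
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    scores = {queue: len(words & keywords) for queue, keywords in ROUTES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default


if __name__ == "__main__":
    print(route("I've lost my credit card"))
    print(route("Hello there"))
```

Even this toy version shows why a spoken request can replace several layers of tone-based menus: one utterance selects among all destinations at once.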
Natural language processing
Speech recognition takes sounds and turns them into words, or at least identifies particular sounds and associates them with certain words. Understanding the meaning conveyed by a human language is another matter. The two are connected, as you might want a system to make sense of a stream of words generated by speech recognition software (and that is what happens in some of the systems outlined above), but they are separate functions. Just as you might put a voice interface in parallel with a Web interface to a back-end system, a natural language analyser can be fed with text produced by a speech recognition subsystem or merely typed in at a keyboard.
A feed from a speech recognition module might give better results if it was able to detect and provide information about which syllables were stressed and where any pauses fell in a particular sentence. Consider the sequence of words "Woman without her man is nothing": supposedly half the population parse it as "Woman, without her man, is nothing" while the remainder take it to mean "Woman! Without her, man is nothing". An old chestnut, admittedly, but it does illustrate the problem, because none of us would have any trouble determining which meaning was intended if the sentence was spoken naturally.
In less contrived situations, interpreting our first language seems quite straightforward to most of us; after all, we've been doing it for most of our lives. Getting a computer to do it is a different matter. One of the problems is that words and phrases can have multiple meanings, and very often the sentence containing such a word does not provide sufficient context to determine which meaning is intended. A whole class of jokes, including "the world's funniest joke" (see www.laughlab.co.uk/winner.html), relies on this, or at least on the existence of homonyms with very different meanings.
While natural language processing (NLP) is still a work in progress for computer scientists, some tools exist, such as Simplis' Zlang, that can make it easier for an application developer to support natural language queries. It's relatively easy to do NLP when the context is restricted, and that's why speech interfaces for telephone betting services have proved successful. The wider the domain covered, the harder it gets. For hands-on experience of how NLP systems can react, try MIT's START system: "What's the weather in Sydney?" is answered sensibly, but "Did Australia beat England in the test match?" isn't understood at all.
Microsoft has a substantial NLP research group, and its work has provided the technology behind the Office grammar checker, Encarta's ability to answer questions, and the IntelliShrink feature in Mobile Information Server that extracts unnecessary words and characters from a message and applies abbreviations to it before relaying the condensed message to a mobile device. The company stresses that its NLP technology uses an automated knowledge base. Just as the use of a DBMS in a conventional application separates data storage from data processing, this approach means new NLP algorithms can be grafted onto an existing knowledge base. Systems built on hand-coded data do not have this flexibility. Microsoft's experimental MindNet system contains information about interrelationships between words that it has generated from an analysis of texts including two dictionaries and Microsoft's own Encarta encyclopaedia.
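
To see why a restricted context makes NLP so much easier, consider a minimal sketch of a parser for one narrow kind of request, a spoken bet. The phrase pattern, field names, and `parse_bet` function are assumptions made up for this illustration (they are not taken from any betting system or from the products named above), and the recogniser is assumed to deliver numbers as digits.

```python
import re
from typing import Dict, Optional, Union

# Hypothetical grammar for a single, tightly bounded domain:
# "<amount> dollars <to win | for a place | each way> on number <n> in race <m>"
BET_PATTERN = re.compile(
    r"(?P<amount>\d+) dollars? (?P<kind>to win|for a place|each way) "
    r"on number (?P<runner>\d+) in race (?P<race>\d+)"
)


def parse_bet(text: str) -> Optional[Dict[str, Union[int, str]]]:
    """Return the structured bet, or None if the text is out of domain."""
    match = BET_PATTERN.search(text.lower())
    if match is None:
        return None
    return {
        "amount": int(match["amount"]),
        "kind": match["kind"],
        "runner": int(match["runner"]),
        "race": int(match["race"]),
    }


if __name__ == "__main__":
    print(parse_bet("I'd like 10 dollars to win on number 4 in race 7"))
    print(parse_bet("Did Australia beat England in the test match?"))
```

A handful of slots covers the whole domain here, whereas an open-ended question such as the cricket example gives a pattern like this nothing to hold on to.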
Other technologies
Here are some of the other fields in which interface and interaction work is being done:
Gestures. Although the idea of issuing commands by making gestures dates back to the 1970s, it was popularised by PDAs such as the Apple Newton MessagePad and to a lesser extent the far more successful range from Palm. To delete some text on a Newton, you scribble over it with an up-and-down zig-zag action. The caps lock command when entering text with the Graffiti alphabet on a Palm is a pair of up movements with the stylus. With the new Tablet PC, wiggling the stylus pen above the screen brings up the virtual keyboard.
More recently, gestures have shown up in Web browsers. Since browsing is largely mouse driven, the idea of being able to issue commands without tracking all the way to the toolbar or menu bar (or using keyboard shortcuts) is attractive. The lead in this area was set by Opera. In that program, most gestures are performed while holding down the right mouse button. The gesture for "previous page" is to move the mouse left; for "next page", move the mouse right. Slightly more complex gestures include moving up then right for "maximise window" and down then left for "minimise window". The Mozilla community has also picked up this idea: the Optimoz add-on provides gesture support for the open-source browser. Some of the gestures match those used by Opera (such as those for previous and next pages), while others have been added, for example up-down-up to refresh the current page without using the local cache. While moving the mouse left matches the left-arrow on the back button, the link between up-down-up and refresh isn't so obvious.
The concept has also been implemented at a system-wide level. Sensiva's Symbol Commander maps gestures (including letter-shapes) onto commands such as running a particular program or saving a document.
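
The stroke-based gestures described above (left for previous page, up then right for maximise, and so on) could be recognised along the following lines. This is only a sketch under stated assumptions: the 20-pixel jitter threshold, the function names, and the table layout are invented for illustration and do not reflect how Opera, Optimoz, or Symbol Commander are actually implemented.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Point = Tuple[int, int]

# Gesture-to-command table using the examples quoted in the article.
# The key format (a tuple of stroke letters) is an assumption of this sketch.
GESTURES: Dict[Tuple[str, ...], str] = {
    ("L",): "previous page",
    ("R",): "next page",
    ("U", "R"): "maximise window",
    ("D", "L"): "minimise window",
    ("U", "D", "U"): "refresh without cache",
}


def strokes(points: Iterable[Point], min_move: int = 20) -> List[str]:
    """Reduce a mouse path to a list of U/D/L/R strokes.

    Screen coordinates are assumed: x grows rightwards, y grows downwards.
    Movement smaller than min_move pixels is ignored as jitter.
    """
    path = list(points)
    result: List[str] = []
    if not path:
        return result
    last_x, last_y = path[0]
    for x, y in path[1:]:
        dx, dy = x - last_x, y - last_y
        if max(abs(dx), abs(dy)) < min_move:
            continue  # keep accumulating until the pointer has really moved
        if abs(dx) >= abs(dy):
            direction = "R" if dx > 0 else "L"
        else:
            direction = "D" if dy > 0 else "U"
        if not result or result[-1] != direction:
            result.append(direction)  # merge consecutive moves in one direction
        last_x, last_y = x, y
    return result


def recognise(points: Iterable[Point]) -> Optional[str]:
    """Look up the command for a recorded mouse path, if any."""
    return GESTURES.get(tuple(strokes(points)))


if __name__ == "__main__":
    print(recognise([(100, 100), (60, 100), (20, 100)]))
    print(recognise([(100, 100), (100, 50), (160, 50)]))
```

Reducing the path to a short sequence of directions keeps the lookup table small, which is one plausible reason simple gestures such as "left" and "up then right" are the ones browsers adopted first.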
Symbol Commander is available for Windows and Pocket PC, which is especially convenient for people who divide their time between full-size and handheld devices. Sensiva is part owned by Toshiba, and the latter will be shipping the gesture recognition software with its Portege 3500 Tablet PC.
Handwriting. Like speech recognition systems that are not dedicated to a particular task, handwriting recognition seems to work well for some people but not others. The primary answer to this problem has been the development of special alphabets for stylus-oriented systems, notably Palm's Graffiti system. For example, the letter M is drawn in the "golden arches" style, while A is drawn without a crossbar. The recent announcement of Tablet PCs is likely to renew interest in handwriting recognition for everyday applications, but it is unlikely to be a significant feature of desktop or notebook computing. The Inkwell technology in Mac OS X 10.2 supports handwriting recognition as well as command gestures via a graphics tablet.
Avatars. One approach to humanising the user interface is to give the program a human face. Animated agents or avatars appear to be a positive feature for many users, though a lot of people were glad to see the back of Microsoft Office's Clippy. The Ananova newsreader was a relatively early example of a fully animated avatar, but some organisations are working with static images bearing expressions appropriate to the program's response. An example of the latter is Eve, the virtual service agent on eGain's Web site.
Animated faces make people more comfortable, according to Zac Jacobs, director of business development at Famous3D. Apart from Web-based avatars, this Australian company's technology is also applied to animated characters for film, TV and computer games. "We can put a lot of personality into a character," says Jacobs. One way of applying facial animation is to provide a virtual assistant to help people fill in forms.
"When you make a mistake in a field, the character can spot the mistake and tell you how to correct it," he says. Characters might adopt different demeanours to suit the situation and the user demographics: contrast "Mate! You've left some digits out of your mobile number" with "That mobile number doesn't look quite right, would you check it, please?"
The technology need not be limited to canned responses. It is quite feasible for contact centre agents to conduct a conversation with someone via an avatar. Furthermore, Famous3D is working with another company to back avatars with AI. Jacobs expects the Famous3D technology to be in use on over 10 Web sites by the end of the year, including a major company in the financial sector. Avatars can also be used on intranets, and one very practical example is a "virtual receptionist" in the lift lobby of each floor in a building, ready to provide directions to particular offices or other facilities, or to summon a real person when necessary. There are over 100 licensees of the company's software.
Famous3D is also working with a Japanese company to put avatars in mobile phones to deliver information such as weather reports or the result of Internet queries. "The phone is such a visual medium these days in Japan," Jacobs says.
Research into the science behind avatars is ongoing. For example, an avatar that fails to smile naturally has a negative impact on users, according to Eduardo Chavez, associate lecturer and Web designer and developer at the University of Technology Sydney's Faculty of Information Technology. Chavez's analysis of the eye expressions that accompany a genuine smile could be used to make avatars more realistic in a way that could have a significant effect on their acceptability to users. However, the relationship between the eye and mouth movements in true smiles is not simple.
Mood recognition.
As if it wasn't enough to have software pulling your strings by displaying an appropriate facial expression, work is underway that will result in programs responding to your mood as displayed on your face. NCR's Teradata division and the University of Southern California (USC) are collaborating in a project to explore ways computers can store, interpret and use human emotions. "The idea is to capture the face, annotate it with dots and regions, so we can then process and match it with our database against a catalogue of emotions and have the system react the way a good salesperson might," says Dave Schraeder, technical expert at Teradata.
USC and Teradata see two particular applications for this technology. The first is to improve customer service by having systems react to expressions. For example, if the user is squinting at the screen, the device might switch to displaying text at a larger size. NCR's interest in the project is understandable, since it is a major manufacturer of ATMs, automated checkouts and related devices. The other area is health, where the technology might eventually be used to help diagnose various emotional and medical problems. USC clinical psychologist Dr Skip Rizzo suggests it might augment a therapist's skills when dealing with geographically remote clients via the Internet. "We still have a long way to go with this, but I believe that by tracking facial expressions it gives us added information that the therapist could use to get better insight into the patient," he says. Teradata suggests it will be three to five years before these "e-motional" systems reach the market.
Just because a computer can't see you, don't assume it can't tell how you are feeling. Mitel has patented a system for detecting a person's mood from their tone of voice, choice of language and other cues available via a phone line, such as the speed at which numbers are pressed when responding to menus.
The idea is to allow contact centre systems to direct incoming calls to operators best able to handle people in a particular frame of mind, but the concept could have broader application. For example, voice-based applications might be able to detect increasing levels of frustration when someone is struggling with a particular facility, and deliver a brief, carefully focused tutorial. But why not do a better job of designing the primary user interface in the first place (or reduce queuing times), so users don't get frustrated with your system?
Case study: Holden Special Vehicles
Virtual reality has had a significant impact in the automotive industry, simplifying international collaboration and reducing the need to construct clay models. Holden Special Vehicles (HSV) produces a range of performance cars based on models such as the Commodore and Statesman. Styling is carried out in the UK by parent company TWR's chief designer Neil Simpson. (TWR also styles entire cars for a variety of manufacturers.) When Simpson is happy with his work, the TWR VR centre in Worthing is linked to a similar SGI-equipped facility in Melbourne at either Holden or RMIT to give the HSV product group a virtual tour of the car and an opportunity to comment on the styling.
"We'll go through a couple of design iterations," says Brad Dunstan, executive in charge of advanced engineering at HSV. When everyone is satisfied, a traditional clay model of the affected parts of the car will be made. "Virtual reality is good, but it's possible to get hiccups in the digital model that you don't notice," says Dunstan. The bumper bar for the Monaro-based HSV Coupé was only rendered in clay at the very end of the design process: "It was almost completely digital," he said.
Traditional methods involve two or three clay models to get close to an acceptable design, but now a stylist can present seven or eight iterations at the first review. It only takes a matter of days to blend the best parts of each, "and we can look at it full size in the VR centre," he says. Previously, work on a new car took two years, but that has been slashed to 18 months. "We're able to deliver new variants such as the Monaro in record time," says Dunstan. Once the design is finalised, the digital models are transferred to a CAD system where a team of typically six operators convert them into finished designs that can be turned over to the toolmakers.
Speedy development is essential to HSV, as it can't start work on a new product until Holden has frozen the design of the base model. And those models now have a two-year production life rather than three, increasing the time pressure on HSV. The process saves the company money as well as time. The combination of VR and digital engineering saved HSV over $1 million on the Monaro project alone, Dunstan says.
Copyright © 2009 CBS Interactive, a CBS Company. All Rights Reserved.