The Practical Edge of Speech Technology

Moshe Yudkowsky, President, Disaggregate

Date: Thursday, October 29

Time: 11:30 - 11:45 AM

Location: Transformatorhuis

Speech technology continues to improve, and in the last year we've seen more ideas imported from the wider world of technology, including flat-rate cloud-based speech technology services integrated with the telephone network. At least one vendor now integrates speech technology with instant messaging - the same application can respond to speech or to text. And while speech technology will never be lightweight, Moore's law continues to bring more processing power and memory to hand-held and other low-end devices, which improves speech recognition for mobile applications. It's hard to measure, but I see more interest in biometrics to authenticate users, most seriously for network applications involving large sums of money.

Speech analytics provides data mining for recorded speech. Typical customers -- the ones I've spoken to think highly of analytics -- run large call centers with millions of callers. They can dig into their collection of millions of minutes of recorded calls (otherwise useless because of its size) to find ways to reduce call times, avoid the root causes of calls, or improve their agents' sales skills. Another type of analytics provides

What we can't seem to do with speech technology is provide it for free through open source software. Some text-to-speech is available; open source speech recognition software might have some rudimentary models in the near future, which would make ASR barely usable. The models and information required by speech technology are simply too great for today's open source collaborations, and for the foreseeable future commercial products will require commercial speech "engines."

