Speech comes to the web
Jul 7, 2000, by LinuxDevices Staff, from the LinuxDevices Archive
SpeechWorks International, Inc. (Boston, MA) has announced an initiative called the Open Speech Web, aimed at creating a speech-based implementation of the World Wide Web. The proposed network of interconnected speech services will allow consumers to “speech-surf” or to connect to speech-aware applications simply by speaking commands into a telephone or Internet-enabled audio input device. If successful, Open Speech Web technology could have far-reaching implications for the design of future generations of smart devices, Internet appliances, and embedded systems.
To accelerate the development and proliferation of Open Speech Web technology, SpeechWorks will soon release two key products as open source software: the SpeechWorks Speech Browser, and SpeechLinks. The two products are currently in beta.
About SpeechLinks
Based on the industry standards SIP (Session Initiation Protocol) and XML, SpeechLinks serve as connective tissue that allows users to move seamlessly among speech-enabled services, much like Web surfing. SpeechLinks are also designed to support options for passing call-specific and caller-specific data along with telephone calls.
Future releases of SpeechLinks will use SIP to add support for SpeechCookies, which will enable the initiator of a SpeechLink to store data and return it to the recipient if a caller is linked again. Unlike solutions that do not offer open connectivity among speech services, SpeechLinks are designed to let any service provider transfer calls to, or receive calls from, any other service, whether it is a VoiceXML-based application, a legacy speech application, or content not in the VoiceXML format.
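To make the idea of passing data along with a call more concrete, here is a minimal Python sketch. It is not the SpeechLinks format: the SpeechLink 1.0 specification is not quoted in this article, so the element names (speechlink, call-data, caller-data, item), the function names, and the SpeechCookieJar class are all hypothetical, and the SIP transport is left out entirely. The sketch only shows how call-specific and caller-specific data might be serialized as XML, and how an initiator might keep data to hand back when the same caller is linked to the same service again.

```python
import xml.etree.ElementTree as ET

# All element names below are hypothetical; they are not taken from
# the SpeechLink specification.


def build_transfer_payload(call_data, caller_data):
    """Serialize call-specific and caller-specific data as one XML string."""
    root = ET.Element("speechlink")
    for tag, data in (("call-data", call_data), ("caller-data", caller_data)):
        section = ET.SubElement(root, tag)
        for name, value in sorted(data.items()):
            ET.SubElement(section, "item", name=name).text = value
    return ET.tostring(root, encoding="unicode")


def parse_transfer_payload(xml_text):
    """Return (call_data, caller_data) dictionaries from the XML string."""
    root = ET.fromstring(xml_text)

    def items(tag):
        section = root.find(tag)
        return {i.get("name"): (i.text or "") for i in section.findall("item")}

    return items("call-data"), items("caller-data")


class SpeechCookieJar:
    """Hypothetical initiator-side store, keyed by caller and target service."""

    def __init__(self):
        self._cookies = {}

    def store(self, caller_id, service, data):
        # Keep a copy so later changes by the caller of store() do not leak in.
        self._cookies[(caller_id, service)] = dict(data)

    def recall(self, caller_id, service):
        # An unknown caller/service pair yields an empty dictionary.
        return dict(self._cookies.get((caller_id, service), {}))
```

In this sketch, the initiating service would call recall() before a transfer and merge the result into the caller-specific data it sends to the recipient.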
About the SpeechWorks Speech Browser
The Speech Browser is an open source, standards-based browser that allows any speech application to access Web-based content that is VoiceXML-enabled, thereby extending the information and transactional capabilities available for speech services, such as speech portals.
The Speech Browser allows users to access voice-enabled applications on websites using a common format, the emerging VoiceXML standard. Users can access a Speech Browser-enabled site, listen and respond to voice prompts linked to URLs, and navigate to any compatible site they select.
The SpeechWorks Speech Browser consists of the following components:
- VoiceXML Engine: This component includes the VoiceXML interpreter and adds Java and HTTP elements as infrastructure that supports linking to web sites and handles the grammars and prompts as defined by VoiceXML.
- SNAPS Layer: Speech Navigation And Presentation Scripts include hot word customization and event log history. Base VoiceXML files are provided so developers can get started quickly with prompt-to-URL menu creation.
- Speech Recognition Interface: Allows integrators to incorporate SpeechWorks speech recognition and Text-to-Speech (TTS) from a variety of vendors. This published interface will support integration with other speech recognition engines.
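As an illustration of what a prompt-to-URL menu looks like, the following Python sketch generates a small VoiceXML document. It uses the standard VoiceXML menu, prompt, and choice elements, with each choice's next attribute pointing at the URL to load when the caller speaks that phrase. The function name build_menu and the example.com URLs are assumptions made for this sketch; it does not reproduce the base VoiceXML files shipped with the SNAPS Layer.

```python
import xml.etree.ElementTree as ET


def build_menu(prompt_text, choices):
    """Return a VoiceXML menu document as a string.

    choices is a list of (spoken phrase, target URL) pairs.
    """
    vxml = ET.Element("vxml", version="1.0")
    menu = ET.SubElement(vxml, "menu")
    ET.SubElement(menu, "prompt").text = prompt_text
    for phrase, url in choices:
        # Each <choice> links a spoken phrase to the document to load next.
        choice = ET.SubElement(menu, "choice", next=url)
        choice.text = phrase
    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    # Placeholder URLs, for illustration only.
    print(build_menu(
        "Say weather or news.",
        [
            ("weather", "http://example.com/weather.vxml"),
            ("news", "http://example.com/news.vxml"),
        ],
    ))
```

A speech browser reading such a document would play the prompt, listen for one of the phrases, and fetch the matching URL, which is the voice equivalent of following a hyperlink.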
Speech Browser and SpeechLinks software will be available as open source via the Carnegie Mellon Open Source Speech Initiative download site. Kevin Lenzo, Director of the Open Source Speech Initiative at Carnegie Mellon University, said: “CMU plans to not only become the focal point for free access to this software, but also intends to take advantage of the Speech Browser's open architecture by integrating Carnegie Mellon's SPHINX open source speech recognition.”
Through a recently announced strategic partnership with AT&T, SpeechWorks also plans to market Text-to-Speech (TTS) and Large Vocabulary speech recognition technologies developed over the past 25 years by Bell and AT&T Labs.
SpeechLinks and the SpeechWorks Speech Browser will be distributed as open source software to all interested developers, including other speech software companies. Both SpeechLinks and the Speech Browser are currently in beta test at a limited number of partners and customers. The SpeechLink 1.0 specification will be available for review at www.speechworks.com on July 15, 2000. SpeechLinks product code is expected to be available via web download in 3Q2000, and the Speech Browser in 4Q2000. Carnegie Mellon University will be the host site for obtaining Speech Browser open source code. Additionally, SpeechLinks will be available in pre-packaged versions embedded in SpeechSite and the SpeechWorks platform.
Related story:
CMU Sphinx Open Source Speech Recognition Engines
This article was originally published on LinuxDevices.com and has been donated to the open source community by QuinStreet Inc. Please visit LinuxToday.com for up-to-date news and articles about Linux and open source.