 |
| |
| Audio and Speech Technologies |
| |
Sensory offers world-class speech technologies on both hardware and software platforms. The technologies listed below can be implemented on the platforms depicted by the product symbols here:
|
| |
|
|
| |
| Technology Demo Videos Available:
Beat Prediction, BlueGenie VUI, LipSync, NanoLock SV, Real-Time LipSync, SonicNet, SoundSource, TimeSet
|
| |
|
| Speech Recognition: |
|
|
| |
| Continuous Digits |
 |
For entering phone numbers and digit strings
This technology is ideally suited for voice dialing applications such as handsets, personal dialers, mobile phones, and hands-free kits. It can also be used for applications such as time setting, or anywhere that a string of digits must be recognized.
|
|
| |
| Speaker-Dependent Speech Recognition |
 |
Flexible vocabulary, any language, any accent
Speaker-dependent (SD) recognition is desirable where user-specific or language-specific vocabularies are required. Each recognition word is trained just once by the user to create voice "templates", each of which requires up to 200 bytes of memory (which can be on-chip or external). Vocabularies in excess of 100 words are possible, although there are often practical reasons for keeping recognition sets under 50 words. With proper design, Sensory's SD technology can yield highly accurate recognition for any user, regardless of language or accent. |
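The template figures above translate directly into a memory budget. The following is a minimal sketch of that arithmetic, assuming the worst case of 200 bytes per template; the function name and the sample vocabulary sizes are illustrative and not part of any Sensory tool.

```python
# Worst-case template size quoted in the text ("up to 200 bytes").
TEMPLATE_BYTES_MAX = 200


def template_memory_bytes(num_words: int,
                          bytes_per_template: int = TEMPLATE_BYTES_MAX) -> int:
    """Upper bound on storage for one user's speaker-dependent vocabulary."""
    if num_words < 0:
        raise ValueError("num_words must be non-negative")
    return num_words * bytes_per_template


# Hypothetical vocabulary sizes chosen for illustration.
for words in (10, 50, 100):
    print(f"{words} words -> at most {template_memory_bytes(words)} bytes")
```

Under this assumption, a 50-word set needs at most 10,000 bytes of template storage per user.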
|
| |
| Speaker Verification |
 |
Biometric security through voice
While similar to speaker-dependent recognition, Speaker Verification can determine whether a password is spoken by the individual who originally trained it. The user trains 1-4 passwords (the more passwords, the better the security), which can provide voice access to any product. Equal error rates (where the probability of an incorrect acceptance equals that of an incorrect rejection) range from 0.01% to 7%, depending on the number of words and whether the passwords are known to the imposter.
NanoLock SV Demo
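To make the equal error rate definition above concrete, here is a small sketch that finds the threshold at which the false-accept and false-reject rates are closest. It illustrates the metric only and is not Sensory's verification algorithm; the score lists are made-up sample data, and the convention that a higher score means a better match is an assumption.

```python
def error_rates(genuine, impostor, threshold):
    """Return (false_accept_rate, false_reject_rate) at a score threshold.

    Assumes a score >= threshold is accepted and both lists are non-empty.
    """
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Try every observed score as a threshold; return (threshold, eer)
    at the point where the two error rates are closest."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = error_rates(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, t, (far + frr) / 2)
    _, threshold, eer = best
    return threshold, eer


# Made-up match scores for illustration only.
genuine_scores = [0.9, 0.8, 0.75, 0.6, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.5, 0.65]
threshold, eer = equal_error_rate(genuine_scores, impostor_scores)
print(f"threshold={threshold}, EER={eer:.0%}")
```

With so few sample scores the resulting rate is far coarser than the figures quoted above; a meaningful estimate needs many genuine and impostor trials.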
|
|
| |
| Options |
 |
Continuous Listening
Always on - no button press required
For applications that require hands-free use, Continuous Listening (CL) technology enables products to respond to specific, discrete commands (surrounded by relative quiet) without pressing a button or waiting for a prompt. Sensory offers both speaker-dependent and speaker-independent continuous listening technologies.
Word Spotting
Recognize a word in the middle of a sentence
Word Spotting extracts keywords from normal conversation. This technology promises to improve the human-to-machine interface by making it more natural, and by its nature it is more immune to noise. |
|
|
| Audio |
|
|
| |
| Music and DTMF Synthesis |
 |
MIDI-like music can even accompany speech synthesis
Sensory's music synthesis technology can produce up to 14 voices simultaneously for harmonizing instruments. Custom libraries are available with a choice of instruments and pitch ranges. Because the music is synthesized rather than digitally recorded, the off-chip memory required for each additional 2-3 minute song is under 5 Kbytes. In telephony applications, this feature is useful for generating DTMF tones so that the RSC processor can perform the dialing function directly.
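As background for the DTMF feature, each key is signaled by summing one low-group and one high-group sine tone from the standard DTMF frequency grid. The sketch below generates such samples in plain Python; it is a generic illustration rather than RSC firmware, and the function names, 8 kHz sample rate, and 100 ms duration are assumptions.

```python
import math

# Standard DTMF frequency grid (Hz): one row tone plus one column tone per key.
DTMF_ROWS_HZ = (697, 770, 852, 941)
DTMF_COLS_HZ = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_frequencies(key: str):
    """Return the (low_hz, high_hz) pair for a single keypad character."""
    if len(key) != 1:
        raise ValueError("key must be a single character")
    for r, row in enumerate(KEYPAD):
        c = row.find(key)
        if c != -1:
            return DTMF_ROWS_HZ[r], DTMF_COLS_HZ[c]
    raise ValueError(f"not a DTMF key: {key!r}")


def dtmf_samples(key: str, duration_s: float = 0.1, sample_rate: int = 8000):
    """Generate float samples in [-1, 1] for one key press.

    duration_s and sample_rate are illustrative defaults, not values
    taken from the RSC documentation.
    """
    low, high = dtmf_frequencies(key)
    n = round(duration_s * sample_rate)
    return [
        0.5 * (math.sin(2 * math.pi * low * i / sample_rate)
               + math.sin(2 * math.pi * high * i / sample_rate))
        for i in range(n)
    ]


print(dtmf_frequencies("5"), len(dtmf_samples("5")))
```

Dialing a number then amounts to emitting one such burst per digit, separated by short silences.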
|
|
| |
| Low Power Audio Wakeup |
 |
Wake from low power mode capability
One challenge for hands-free, battery-operated products is that always-on, always-listening operation drains batteries rapidly. Sensory has created a low-power technology that listens for an audio trigger (a whistle or claps), wakes from low-power mode, and then begins listening for speech recognition commands. This technology can extend the battery life of such products from weeks to years and improves usability by making them truly hands-free.
|
|
| |
| Record and Playback |
 |
Store messages and play back--voice messaging capability
Compressed digital sound reproduction
Sensory's RSC Family processors can record audio to off-chip RAM or Flash at data rates of under 14 Kbits per second for custom greetings, phones and answering machines, voice pitch changers, and hand-held recording devices. On-chip compression levels can be varied depending on the quantity and quality of playback desired. Automatic silence removal can also be applied to reduce memory requirements.
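The data rate above gives a quick way to estimate how much audio fits in a given memory. This is a rough sketch of that arithmetic, assuming "14K" means 14,000 bits per second and modeling silence removal as a simple fraction of the recording that is never stored; the function name and the 512 KiB memory size are hypothetical.

```python
def recording_seconds(memory_bytes: int,
                      bits_per_second: int = 14_000,
                      silence_fraction: float = 0.0) -> float:
    """Estimate the recording time that fits in memory_bytes.

    bits_per_second defaults to the upper bound quoted in the text, so the
    result is a conservative estimate. silence_fraction (an assumption of
    this sketch) is the share of the recording dropped by silence removal.
    """
    if not 0.0 <= silence_fraction < 1.0:
        raise ValueError("silence_fraction must be in [0, 1)")
    stored_seconds = memory_bytes * 8 / bits_per_second
    return stored_seconds / (1.0 - silence_fraction)


flash_bytes = 512 * 1024  # hypothetical 512 KiB external Flash
print(f"{recording_seconds(flash_bytes):.0f} s without silence removal")
print(f"{recording_seconds(flash_bytes, silence_fraction=0.25):.0f} s "
      "if a quarter of the audio is silence")
```

Lower compression data rates increase the figure further, at the cost of playback quality.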
|
|
|
| Interactive / Robotic |
|
| Beat Prediction |
 |
The chip identifies the recurring beat in the audio so that it can anticipate upcoming beats and act on them, which makes it well suited to dancing and motion-oriented products.
Beat Prediction Demo
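One simple way to picture beat prediction is to average the spacing between recently detected beats and extrapolate forward. The sketch below does exactly that; it is an illustrative simplification and not a description of how the Sensory chip works, and the function name and onset times are invented for the example.

```python
def predict_next_beats(onset_times_s, count=4):
    """Extrapolate future beat times from past beat onsets (in seconds).

    Uses the mean inter-onset interval as the beat period; real music
    with tempo changes would need something more robust.
    """
    if len(onset_times_s) < 2:
        raise ValueError("need at least two onsets to estimate a period")
    intervals = [b - a for a, b in zip(onset_times_s, onset_times_s[1:])]
    period = sum(intervals) / len(intervals)
    last = onset_times_s[-1]
    return [last + period * k for k in range(1, count + 1)]


# Invented onsets at a steady 120 beats per minute.
print(predict_next_beats([0.0, 0.5, 1.0, 1.5], count=2))
```

A product could schedule a motor movement or light flash at each predicted time so that the motion lands on the beat instead of lagging behind it.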
|
|
| |
| LipSync |
 |
Allows a product to match robotic mouth movements to pre-recorded speech.
LipSync Demo
|
|
| |
| Peak Detection |
 |
Detects the amplitude of different sounds in the room as they occur and reacts to them with a movement or display function.
|
|
| |
| Pitch Detection |
 |
A human voice can be analyzed by the RSC processor to determine the pitches being sung. |
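A common textbook approach to pitch detection is autocorrelation: the signal is compared with delayed copies of itself, and the delay that matches best gives the period. The following sketch shows that general idea in plain Python. Whether the RSC processor uses this method is not stated in the source, and the function name, search range, and test tone are assumptions.

```python
import math


def estimate_pitch_hz(samples, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate the fundamental frequency of a voiced frame.

    Searches lags corresponding to fmin..fmax (an assumed singing range)
    and returns sample_rate / best_lag, or None if nothing correlates.
    """
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(samples) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation at this lag.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None


# Synthetic 200 Hz test tone, 0.1 s at 8 kHz (illustrative values).
rate = 8000
tone = [math.sin(2 * math.pi * 200 * i / rate) for i in range(800)]
print(estimate_pitch_hz(tone, rate))
```

The resolution is limited to whole-sample lags, so higher pitches are estimated more coarsely than lower ones at a given sample rate.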
|
| |
| Real-Time LipSync |
 |
Allows a product to match robotic mouth movements to speech heard in real time, much like a ventriloquist dummy.
Real-Time LipSync Demo
|
|
| |
| Sing Back |
 |
Combining talkback and pitch detection allows a robotic creature or avatar to imitate a person singing.
See the Windows Media demo
Non-human voices can also imitate a person singing. This demo features a dog barking and matching pitches from a human voice.
See the Windows Media demo
|
|
| |
| SonicNet |
 |
Communicates data via encoded sound in short bursts between enabled units. Products with integrated speech that already include an RSC-4x chip, microphone and speaker can implement this for free, and can interact with each other, potentially doubling demand.
SonicNet Demo
Interactive Multimedia Windows Media demo - requires Demo/Eval Board |
|
| |
| Sound Sourcing |
 |
Adding a second microphone allows the RSC processor to locate the direction of a human voice.
SoundSource Demo
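With two microphones, direction is commonly estimated from the difference in arrival time of the sound at each microphone. The sketch below applies the standard far-field relation, angle = asin(c * delay / spacing). It is a geometric illustration only and is not taken from Sensory's implementation; the function name, microphone spacing, and delay values are assumptions.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air at 20 C


def bearing_degrees(delay_s: float, mic_spacing_m: float) -> float:
    """Angle of the source relative to straight ahead, in degrees.

    delay_s is the arrival-time difference between the two microphones;
    0 means the source is directly in front. Assumes the source is far
    away compared with the microphone spacing.
    """
    if mic_spacing_m <= 0:
        raise ValueError("mic_spacing_m must be positive")
    ratio = SPEED_OF_SOUND_M_S * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # guard against measurement noise
    return math.degrees(math.asin(ratio))


# Hypothetical setup: microphones 10 cm apart, 0.1 ms measured delay.
print(f"{bearing_degrees(0.0001, 0.10):.1f} degrees")
```

Two microphones alone cannot distinguish front from back, so the result is best read as an angle to one side or the other of the microphone axis.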
|
|
| |
| Talk Back |
 |
The RSC can respond to a user's speech or questions with speech of its own, giving the impression of a conversation with a non-human creature. |
|
| |
| TimeSet |
 |
Set digital time clocks using natural phrases.
TimeSet Demo
|
|
|
| Voice Recognition for Bluetooth Products |
|
| BlueGenie™ Voice User Interface |
 |
Speech Recognition for Headsets, Music Players, Hands-Free Kits & More
Sensory’s BlueGenie Voice Interface software suite runs on CSR's BC-5 MM Kalimba DSP, and enables manufacturers of Bluetooth products to integrate full voice control and synthetic speech output without the need for visual displays or complex user interfacing. It frees designers to pack functionality onto small form factor Bluetooth devices and answers consumer demand for a truly hands-free experience.
BlueGenie Voice User Interface Demo
|
|
|
| |
| Technology Matrix |
| |
RSC-4x Family: |
SC-6x Family: |
FluentSoft Speech Recognition: |
BlueGenie™ Voice User Interface: |
Speaker Independent
with T2SI™ |
 |
|
 |
 |
| Continuous Digits |
|
|
 |
 |
| Speaker Dependent |
 |
|
coming soon |
|
| Speaker Verification |
 |
|
coming soon |
|
| Speech Synthesis |
 |
 |
 |
 |
| Music and DTMF Synthesis |
 |
|
coming soon |
|
| Low Power Audio Wakeup |
 |
|
|
|
| Record and Playback |
 |
|
|
|
| Interactive/Robotic |
 |
|
|
|
|
| |