Sensory, Inc. - Leaders in Speech Technology for Consumer Products

 
 
Audio and Speech Technologies
 

Sensory offers world class speech technologies on both hardware and software platforms. The technologies listed below can be implemented on the platforms depicted by the product symbols here:

 
Hardware:
  RSC-4x Family with FluentChip™ Technology
  SC-6x Speech Synthesis Slave Processor

Software:
  FluentSoft Speech Recognition
  BlueGenie Voice User Interface

 
Technology Demo Videos Available:

 Beat Prediction    BlueGenie VUI    LipSync    NanoLock SV
    Real-Time LipSync    SonicNet    SoundSource    TimeSet

 

Speech Recognition:

  Speaker Independent w/T2SI™
     

No Training Required!
Speaker-independent (SI) recognition works right out of the box and requires no end-user training. SI technology is designed for a specific language, and can handle thousands of words in a single set for FluentSoft, or over 40 words in a single set for the FluentChip/RSC-4x combo. The number of sets is limited only by the amount of memory in your system. With proper design, Sensory's SI technology will yield highly accurate recognition. Sensory's Quick T2SI (text-to-SI) is the first-ever GUI tool that lets product designers create their own speaker-independent set and run recognition on chip within minutes!

 
  Continuous Digits
 

For entering phone numbers and digit strings
This technology is ideally suited for voice dialing applications such as handsets, personal dialers, mobile phones, and hands-free kits. It can also be used for applications such as time setting, or anywhere that a string of digits is used for recognition.

 
  Speaker-Dependent Speech Recognition

Flexible vocabulary, any language, any accent
Speaker-dependent (SD) recognition is desirable where user-specific or language-specific vocabularies are required. Each recognition word is trained just once by the user to create voice "templates", each of which requires up to 200 bytes of memory (which can be on-chip or external). Vocabularies in excess of 100 words are possible, although there are often practical reasons for keeping recognition sets under 50 words. With proper design, Sensory's SD technology can yield highly accurate recognition for any user, regardless of language or accent.
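
As a rough sizing aid, the template figure above can be turned into a memory budget. The sketch below simply multiplies the vocabulary size by the quoted upper bound of 200 bytes per template; the function name is illustrative and not part of any Sensory toolkit.

```python
BYTES_PER_TEMPLATE = 200  # upper bound per template quoted in the text


def template_memory_bytes(num_words: int,
                          bytes_per_template: int = BYTES_PER_TEMPLATE) -> int:
    """Worst-case template storage for a speaker-dependent vocabulary."""
    if num_words < 0:
        raise ValueError("num_words must be non-negative")
    return num_words * bytes_per_template


if __name__ == "__main__":
    for words in (10, 50, 100):
        print(f"{words} words -> up to {template_memory_bytes(words)} bytes")
```

By this estimate a 50-word set needs at most 10,000 bytes, and a 100-word set at most 20,000 bytes, of on-chip or external memory.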

 
  Speaker Verification

Biometric security through voice
While similar to speaker-dependent recognition, Speaker Verification can determine whether a password was spoken by the individual who originally trained it. The user trains 1-4 passwords (the more passwords, the better the security) to enable voice access to any product. Equal error rates (where the probability of an incorrect acceptance equals that of an incorrect rejection) range from 0.01% to 7%, depending on the number of words and whether the passwords are known to the impostor.
 NanoLock SV Demo
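
To make the equal error rate concrete, here is a minimal sketch of how it could be estimated from two lists of match scores. It is a generic illustration, not Sensory's evaluation method: the score lists, the "accept when score >= threshold" rule, and the function names are all assumptions.

```python
def error_rates(genuine, impostor, threshold):
    """Return (false accept rate, false reject rate) at one threshold.

    Assumes a higher score means a better match and that a speaker is
    accepted when score >= threshold.
    """
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Sweep candidate thresholds and return (eer, threshold) where the
    false accept and false reject rates are closest to each other."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = error_rates(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    genuine_scores = [0.9, 0.8, 0.7, 0.4]
    impostor_scores = [0.1, 0.2, 0.3, 0.5]
    eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
    print(f"EER = {eer:.2%} at threshold {threshold}")
```

With real data the two rates rarely cross exactly at a sampled threshold, which is why the sketch reports the midpoint of the closest pair.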

 
  Options

Continuous Listening
Always on - no button press required
For applications desiring hands-free usage, Continuous Listening (CL) technology enables products to respond to specific, discrete commands (surrounded by relative quiet) without pressing a button or waiting for a prompt. Sensory offers both continuously listening speaker-dependent and speaker-independent technologies.

Word Spotting
Recognize a word in the middle of a sentence
Word Spotting offers the ability to extract keywords from normal conversation. This technology promises to improve the human-to-machine interface by enabling a more natural language interface, and by its nature it is more immune to noise.


Audio

  Speech Synthesis
 
 

Perfect for Voice Prompts and/or Speech Output
High-quality speech and sound effects can be played back by Sensory’s ICs and software products. Sensory's compression technology uses proprietary time- and frequency-domain approaches that can compress speech and sound effects to as little as 1000 bits per second. Sensory’s text-to-speech (TTS) synthesis and Hybrid Speech can reduce speech playback data rates even further. Speech output creates the opportunity for natural dialog with a product and can reduce reliance on an instruction manual.

 
  Music Synthesis and (DTMF) Synthesis

MIDI-like music can even accompany speech synthesis
Sensory's music synthesis technology can produce up to 14 voices simultaneously for harmonizing instruments. Custom libraries are available with a choice of instruments and pitch ranges. By using synthesis rather than digital recording, the off-chip memory required for each additional 2-3 minute song is under 5 Kbytes. In telephony applications, this feature is useful for generating DTMF tones, enabling the RSC processor to perform the dialing function directly.
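
For readers unfamiliar with DTMF, each key is signaled by summing two sine waves, one from a low "row" group and one from a high "column" group. The sketch below generates such a tone using the standard DTMF frequency pairs; it is a generic illustration, not the RSC firmware, and the 8 kHz sample rate and 0.1 s duration are assumed defaults.

```python
import math

ROW_HZ = (697, 770, 852, 941)        # standard DTMF low-group frequencies
COL_HZ = (1209, 1336, 1477, 1633)    # standard DTMF high-group frequencies
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_frequencies(key: str) -> tuple:
    """Return the (low, high) frequency pair in Hz for one keypad symbol."""
    if len(key) != 1:
        raise ValueError("key must be a single character")
    for r, row in enumerate(KEYPAD):
        c = row.find(key)
        if c != -1:
            return ROW_HZ[r], COL_HZ[c]
    raise ValueError(f"not a DTMF key: {key!r}")


def dtmf_samples(key: str, duration_s: float = 0.1,
                 sample_rate: int = 8000) -> list:
    """Return float samples in [-1, 1] for the tone of one key."""
    low, high = dtmf_frequencies(key)
    n = round(duration_s * sample_rate)
    return [
        0.5 * (math.sin(2 * math.pi * low * i / sample_rate)
               + math.sin(2 * math.pi * high * i / sample_rate))
        for i in range(n)
    ]


if __name__ == "__main__":
    print(dtmf_frequencies("5"))
    print(len(dtmf_samples("5")), "samples")
```

Dialing a number then amounts to playing one such tone per digit with a short silence in between.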

 
  Low Power Audio Wakeup

Wake from low power mode capability
One of the challenges for hands-free, battery-operated products is that if they are always on and always listening, the batteries drain rapidly. Sensory has created a low-power technology that listens for audio (a whistle or claps), wakes the product from low-power mode, and begins listening for speech recognition commands. This technology can extend the battery life of such products from weeks to years and improve usability by making them truly hands-free.

 
  Record and Playback

Store messages and play back--voice messaging capability
Compressed digital sound reproduction
Sensory's RSC Family processors can record audio to off-chip RAM or Flash at data rates of under 14K bits per second for custom greetings, phones and answering machines, voice pitch changers, and hand-held recording devices. On-chip compression levels can be varied depending on the quantity and quality of playback desired. Automatic silence removal can also be done to reduce memory requirements.
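
To get a feel for what these data rates mean for memory sizing, the sketch below converts a memory size and bit rate into recording time. The 128 KB memory size in the usage lines is a hypothetical example, not a Sensory specification, and the function name is illustrative.

```python
def recording_seconds(memory_bytes: int, bits_per_second: float) -> float:
    """Seconds of audio that fit in memory_bytes at a given data rate."""
    if bits_per_second <= 0:
        raise ValueError("bits_per_second must be positive")
    return memory_bytes * 8 / bits_per_second


if __name__ == "__main__":
    flash_bytes = 128 * 1024  # hypothetical 128 KB external Flash
    for rate in (14_000, 1_000):
        secs = recording_seconds(flash_bytes, rate)
        print(f"{rate} bit/s -> {secs:.1f} s")
```

At the 14K bits per second recording rate, the assumed 128 KB holds roughly 75 seconds of audio; lower compression rates and silence removal stretch that further.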


Interactive / Robotic

  Beat Prediction

The chip identifies the recurring beat so it knows how to act going forward; great for dancing and motion-oriented products.
 Beat Prediction Demo
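
The general idea of beat prediction can be sketched in a few lines: estimate the beat period from past beat times, then extrapolate. The example below uses the median interval between detected beats; this is an assumed, simplified approach for illustration and is not a description of Sensory's algorithm.

```python
from statistics import median


def predict_next_beats(beat_times, count=4):
    """Extrapolate future beat times (seconds) from detected ones.

    beat_times must be an increasing list with at least two entries.
    The median interval is used so one missed or extra beat does not
    skew the estimated period.
    """
    if len(beat_times) < 2:
        raise ValueError("need at least two beats to estimate a period")
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:])]
    period = median(intervals)
    last = beat_times[-1]
    return [last + period * k for k in range(1, count + 1)]


if __name__ == "__main__":
    # Hypothetical beat onsets at 120 beats per minute.
    print(predict_next_beats([0.0, 0.5, 1.0, 1.5], count=2))
```

A motion-oriented product could schedule its next movements at the predicted times rather than reacting after each beat is heard.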

 
  LipSync

Allows a product to match robotic mouth movements to pre-recorded speech.
 LipSync Demo

 
  Peak Detection

Picks up the amplitude of different sounds in the room as they occur and reacts to them with a movement or display function.
 

 
  Pitch Detection

A human voice can be analyzed by the RSC processor to determine the pitches being sung.
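
One common way to estimate a sung pitch is autocorrelation: the signal is compared with delayed copies of itself, and the delay that matches best corresponds to one pitch period. The sketch below shows that textbook technique under assumed parameters (80-1000 Hz search range); it is not taken from the RSC implementation.

```python
import math


def detect_pitch_hz(samples, sample_rate, min_hz=80.0, max_hz=1000.0):
    """Estimate the fundamental frequency via plain autocorrelation.

    Returns None when no positive correlation peak is found.
    """
    min_lag = int(sample_rate / max_hz)
    max_lag = int(sample_rate / min_hz)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Correlate the signal with itself shifted by `lag` samples.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    if best_lag == 0:
        return None
    return sample_rate / best_lag


if __name__ == "__main__":
    rate = 8000
    # Synthetic 200 Hz test tone, 0.25 s long.
    tone = [math.sin(2 * math.pi * 200 * i / rate) for i in range(2000)]
    print(detect_pitch_hz(tone, rate))
```

Real voices contain harmonics and noise, so a practical detector would typically add normalization and smoothing over successive frames.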

 
  Real-Time LipSync

Allows a product to match robotic mouth movements to speech heard in real time, much like a ventriloquist's dummy.
 Real-Time LipSync Demo

 
  Sing Back

Combining talkback and pitch detection allows a robotic creature or avatar to imitate a person singing.
See the Windows Media demo

Non-human voices can also imitate a person singing. This demo features a dog barking and matching pitches from a human voice.
See the Windows Media demo

 
  SonicNet

Communicates data via encoded sound in short bursts between enabled units. Products with integrated speech that already include an RSC-4x chip, microphone and speaker can implement this for free, and can interact with each other, potentially doubling demand.
 SonicNet Demo

Interactive Multimedia Windows Media demo - requires Demo/Eval Board

 
  Sound Sourcing

Adding a second microphone allows the RSC processor to locate the direction of a human voice.
 SoundSource Demo
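
A standard way to find direction with two microphones is to measure how much later the sound reaches one microphone than the other, then convert that delay to an angle. The sketch below illustrates this time-difference-of-arrival idea with a brute-force cross-correlation; the technique, the far-field assumption, and all names and parameters are assumptions for illustration, not details of Sensory's implementation.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate, in air at room temperature


def best_delay_samples(left, right, max_lag):
    """Lag in samples by which `right` trails `left` (negative if it leads)."""
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def bearing_degrees(delay_samples, sample_rate, mic_spacing_m):
    """Convert an inter-microphone delay to an angle from straight ahead."""
    ratio = SPEED_OF_SOUND_M_S * delay_samples / (sample_rate * mic_spacing_m)
    ratio = max(-1.0, min(1.0, ratio))  # clamp against measurement noise
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    # Hypothetical signals: the right channel is the left delayed by 3 samples.
    left = [0.0] * 10 + [1.0, 2.0, 1.0] + [0.0] * 10
    right = [0.0] * 13 + [1.0, 2.0, 1.0] + [0.0] * 7
    delay = best_delay_samples(left, right, max_lag=5)
    print(delay, bearing_degrees(delay, sample_rate=16000, mic_spacing_m=0.1))
```

The sign of the delay tells which side the voice is on, and the magnitude tells how far off-center it is.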

 
  Talk Back

The RSC can respond to a user's speech or questions with speech that sounds like conversation from a non-human creature.

 
  TimeSet

Set digital time clocks using natural phrases.
 TimeSet Demo


Voice Recognition for Bluetooth Products

  BlueGenie™ Voice User Interface

Speech Recognition for Headsets, Music Players, Hands-Free Kits & More
Sensory’s BlueGenie Voice Interface software suite runs on CSR's BC-5 MM Kalimba DSP, and enables manufacturers of Bluetooth products to integrate full voice control and synthetic speech output without the need for visual displays or complex user interfacing. It frees designers to pack functionality onto small form factor Bluetooth devices and answers consumer demand for a truly hands-free experience.
 BlueGenie Voice User Interface Demo


 
Technology Matrix
  RSC-4x Family | SC-6x Family | FluentSoft Speech Recognition | BlueGenie™ Voice User Interface
Speaker Independent with T2SI
 
Continuous Digits    
Speaker Dependent   coming soon  
Speaker Verification   coming soon  
Speech Synthesis
Music and DTMF Synthesis   coming soon  
Low Power Audio Wakeup      
Record and Playback      
Interactive/Robotic      
 
Copyright © 2007 Sensory Inc. All Rights Reserved.