Cars Controlled By Speech

Being a speech recognition guy, I'm looking for a car with speech recognition included. It sounds strange to select a car just because of that, and I'm only kidding. So far the list is:

  • Honda Accord
  • Any Ford 2011
  • Mazda 6
I'm not listing anything expensive like BMW or Mercedes. Hm, it looks like almost everyone is doing that. Any others? Which is the most advanced one?

Some details on particular implementations:

Ford SYNC 2011

Quite an advanced system. Command-based. It supports many types of commands, from controlling the DVD player to getting baseball scores. It supports user profiles but doesn't seem to have a specific training procedure. With current speaker recognition capabilities it could, in theory, adapt to users automatically without profiles.

Mazda 6 2011

A pretty interesting system, but limited compared to the previous one. According to the owner's manual, it supports a very limited list of commands to manage calls and get incoming messages. Among its interesting capabilities, it supports training and voice entry for contacts. Three languages: English, French, Spanish. It looks like it uses a single microphone. It also looks like the voice navigation system has a separate speech recognition subsystem.

Honda Fit 2009

Many commands, mostly related to navigation, but no user adaptation and no profiles. Alphanumeric entry serves as a backup to vocabulary search. This one is very simple.

Mitsubishi/Hyundai 2011

I didn't manage to find the manuals for them. The feature name "Bluetooth hands-free phone system with voice recognition and phonebook download" makes me think it's the same system as in the Mazda.


It doesn't seem like this is deployed, but the presentation looks impressive.


According to SpeechTechMag, Microsoft and Kia co-developed the UVO multimedia and infotainment system, which the Korean automaker rolled out in its new Sportage, Sorento, and Optima models late last year. UVO lets users access media content and connect with people through quick voice commands without having to navigate hierarchical menus.

ICASSP 2011 Part 1 - Thoughts

It seems like ICASSP this year was a great event; it is a pity I missed it. Just comparing the keynote lists, ICASSP beats Interspeech 4:0. ICASSP is very technical, while Interspeech is for linguists. Compare the two:

Making Sense of a Zettabyte World vs Neural Representations of Word Meanings

New session formats like technical tracks and trend discussions are interesting, though I am not sure how they worked in practice.

So this was the reason to spend a few days reading. 1000 papers on speech technology! Huh. Thanks to all the authors for their hard work! Well, I did find several duplicates in the end.

The main thing I noted is that the research topics are covered very unevenly, for example:

  • Everyone does speaker recognition. An appealing problem statement here is detecting a synthetic speaker. The paper titled "DETECTION OF SYNTHETIC SPEECH FOR THE PROBLEM OF IMPOSTURE" by De Leon et al. hints that there is no solution for that.
  • I got tired of skipping pursuits, bandits and compressive sensing.
  • On the other hand, the increased share of papers on non-speech signals, the cocktail party problem and signal recovery is very interesting to read.
  • Things like DBN features or the SCARF decoder are widely represented. You can read about applications of CRFs ranging from g2p algorithms to dialogs. But traditional topics like search algorithms and adaptation are barely covered.
  • It was surprising to find a session dedicated to multimedia security, which must be a gold mine of ideas, in particular if you need a topic for a paper. Is there a company selling such products?

Overall, I found several original problem statements as well as inspiring ideas covering very important technology issues. For example, it would be nice to implement a meeting transcription application that records with several iPhones, combines the streams and later transcribes them using multichannel environment compensation. Several meeting transcription setups and channel separation methods are described in the conference proceedings.

After reading a fair number of papers, I found that conference papers are too short. When you see a nice title and an abstract, you expect a detailed insight into the problem, with a historical overview and everything explained in detail: a deep investigation of the problem. But you get just a description of the technology and a few figures from experiments. On the other hand, I would not be able to read 100 papers of 20 pages each.

It is very interesting that this year's awards are not related to speech technology. That will be the content of Part 2. I just need to go through the last 50 papers.