For years my hobby project has been a Babel device. After a relatively short time I am not far from a working sample -- so where is my competition?

When I was ten or eleven I read one of the first books that changed my life -- The Hitchhiker's Guide to the Galaxy. Although later in life I tended to shy away from Douglas Adams's point of view, the concept of the Babel Fish was perhaps one of the most interesting and exciting things I had ever read about, even if it was made up.

The Babel Fish was a very silly creature adventurers would put in their ears. Once in your ear, the fish would interpret audible language from other creatures and quasi-telepathically put the meaning in your head.  The idea was so elegant and succinct that AltaVista later named its automatic translation service Babel Fish.

Several years ago I started working on my own Babel Fish -- a device you could actually put in your ear and get live translation of the world around you.  I thought as long as I had a blog on DailyTech, I could document the progress of my ultimate hobby project. Even back in 2000 we had more or less the components needed to do something like this:

  • Microphone attached to some sort of mobile device
  • Speech to Text Software, a la Dragon NaturallySpeaking (and others)
  • Automatic Text Translation, AV Babel Fish, Google Translation, etc
  • Text to Speech Software, a la AT&T Natural Voices
  • Bluetooth headsets

The components have gotten a little better over the years, but the basic elements are more or less the same (though maybe I would opt for one of these bad boys instead of a headset). However, since 2000 processing power has increased dramatically.  My 200MHz Qtek 9100 PDA has no problem going on the Internet and actively translating text through a parser.

Unfortunately, there are some horrible walls I've run into while creating my Babel Fish. The first is the delay in transmission. Even when my speech to text software picks up a good block of conversation, there is a half second or so delay before it reaches my text parser. From there it is another several seconds while the parser attempts to determine the language, connects to the Internet and translates the text back to English. The conversion from written English to audible English takes another second or two. After all is said and done, the turnaround time for a basic translation is upwards of ten seconds.
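To make that arithmetic concrete, here is a rough latency budget in Python. It is only a sketch: the stage names are mine, and the 3 to 7 second range for translation is an assumed reading of "several seconds", not a measurement. The time the recognizer itself spends listening is not counted, which is part of why the real turnaround ends up higher.

```python
# Rough per-stage delay in seconds, as (low, high) estimates.
# The translation range is an ASSUMED reading of "several seconds".
STAGES = {
    "speech-to-text handoff": (0.5, 0.5),
    "detect language + online translation": (3.0, 7.0),
    "text-to-speech": (1.0, 2.0),
}


def total_latency(stages):
    """Sum the low and high estimates across all stages."""
    low = sum(lo for lo, _ in stages.values())
    high = sum(hi for _, hi in stages.values())
    return low, high


def dominant_stage(stages):
    """Return the stage with the largest worst-case delay."""
    return max(stages, key=lambda name: stages[name][1])


if __name__ == "__main__":
    low, high = total_latency(STAGES)
    print(f"Estimated turnaround: {low:.1f} to {high:.1f} seconds")
    print(f"Biggest contributor: {dominant_stage(STAGES)}")
```

Under these assumed numbers the online translation step dominates, so that is the first place to look for savings.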

There are some considerable steps I can take to reduce this -- moving the translation software onto the PDA, threading the input stream and increasing the processor speed.  I am working on all of these and I will report my progress in the next few months.
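As an illustration of what threading the input stream could look like, here is a minimal Python sketch that runs each stage in its own thread and passes work between them through queues. The three stage functions are hypothetical stubs standing in for real speech recognition, translation and speech synthesis; this is not my actual code, just the shape of the idea.

```python
import queue
import threading

_STOP = object()  # sentinel that tells each stage to shut down


def stage(worker, inbox, outbox):
    """Pull items from inbox, process them, push results to outbox."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)  # pass the shutdown signal downstream
            return
        outbox.put(worker(item))


# Hypothetical stand-ins for the real components.
def recognize(audio_chunk):
    return f"text({audio_chunk})"


def translate(text):
    return f"en({text})"


def synthesize(text):
    return f"audio({text})"


def run_pipeline(chunks):
    """Feed chunks through all three stages, one thread per stage."""
    workers = [recognize, translate, synthesize]
    queues = [queue.Queue() for _ in range(len(workers) + 1)]
    threads = [
        threading.Thread(target=stage, args=(w, queues[i], queues[i + 1]))
        for i, w in enumerate(workers)
    ]
    for t in threads:
        t.start()
    for chunk in chunks:
        queues[0].put(chunk)
    queues[0].put(_STOP)

    results = []
    while True:
        item = queues[-1].get()
        if item is _STOP:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(["chunk1", "chunk2"]))
```

The point of this layout is that the recognizer can start on the next block of speech while the previous one is still waiting on the network. It does not make any single sentence come back faster, but it should keep the delays from piling up over a longer conversation.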

The much larger problem I have run into is getting the speech to text software to recognize languages correctly. While I have no problem "selecting" a module for Chinese or a module for Japanese beforehand, every speech to text package I have dealt with has a horrible recognition rate for Asian languages (which, quite frankly, are the only ones I am interested in developing this software for). Spanish, so far, has been the easiest. I have heard good things about the Microsoft Speech SDK, but the majority of my programming is already done on Linux. More on this problem in the next few months as well.

Even with all the hurdles I still need to overcome, the progress I have made has come relatively easily for a single programmer working in spare time. If Sony or Samsung decided to put some effort into a project like this, everyone would have Babel Fish devices within a year. I wonder why no major vendor has picked this up and run with it.

"The Space Elevator will be built about 50 years after everyone stops laughing" -- Sir Arthur C. Clarke

Copyright 2017 DailyTech LLC.