Donald D Henson wrote:
Does anyone know of an Open Source application to accept continuous speech and convert it to text? I've found a couple of proprietary apps but you have to use Voice mail as an input. Any suggestions appreciated.
Don Henson
A couple of weeks ago it was suggested that I try a product from Nuance called Dragon NaturallySpeaking. As this was a non-open source product, I had to pay for it. Bummer. However, my problem was serious enough that I decided to go with a product that I had to pay for. I also promised the list that I would post a review after using the product for a couple of weeks. Here's the review.
How is it on resources, i.e. memory and CPU? The last time I used drag-in-dict was ages ago -- before it allowed continuous speech recognition -- and even then, after training, it was pretty slow. I eventually migrated to IBM's ViaVoice Pro-USB 10.0. Unfortunately, IBM stopped offering the product and sold or gave a resale license to Nuance, but as far as I know, the source didn't migrate with the resale license, and no new work has been done on it since it first came out ~5 years ago. It was the first to offer continuous dictation -- Dragon was in financial woes at the time, and it took roughly 2-3 years before they recovered and had a continuous speech product.
The thing with VVPro is that it is resource intensive. I'd say 1GB under XP is a minimum, and a 1GHz Pentium III was too slow to be usable. 3GB and a 2GHz Core Duo was "ok", but it grabs onto the system input mechanism and slows down all input/output -- even when it is "asleep" or in the microphone 'off' state. I'm on a 3GB, 3.2GHz machine now, and if I'm dictating into Word, it's pretty good for recognition and speed.
For application integration, though, IBM only added full integration for MS Office and IE. It can blindly type text into a non-integrated application, but that can be painful. A nice feature, which I consider 'essential', is that when you dictate into Word or its SpeakPad, it stores the voice sessions with the document. This allows later re-editing in the case of Word (clicking on a word, you hear your voice) -- and when you correct words, it 're-learns' what the word should have been based on what you said. So the fully integrated applications allow the speech recognizer to be trained at the same time you are dictating -- it will learn new vocabulary and learn your nuances of changing pronunciation.
IBM released a development pack for Linux, but nothing ever happened with it, and it was too primitive to make use of in the general case -- it would have required specific apps to include and call their API. That is a benefit of the MS platform, where most programs go through common APIs (though not Firefox or T-bird).
About 2-3 years ago, IBM announced their latest voice technology -- requiring no training -- but did not announce any products with it. The "product" they were demoing for their announcement was a foreign speech translation program -- specifically, the plans were to sell the product to the US armed forces for use in the field in Iraq, where it had already been field tested with some success to allow soldiers to communicate and understand basic phrases in the local language. I tried to find out more info, including when something might be released for consumers (at the time it was projected that something might be available for consumers that summer, 2006). I never heard anything after that, but have heard occasional stories that the tech is still being used.
Purely a guess, but maybe the military thought it worked "too well" and bought up the entire product for military/government use only. Maybe they didn't want such easy-to-use translation technology in the hands of possible enemies... or maybe they just wanted to keep civilians from being able to easily access such translation technologies. Obviously IBM continued their voice recognition and synthesis development, but it seems they dropped consumer-level offerings off their map -- probably selling expensive custom business and government systems was far more profitable than trying to sell and support end users.
Anyway -- as computers have gotten faster, their original tech is still pretty good. It required minimal training, ~10-30 minutes. Occasionally I still see the product for sale, but the price has not gone down -- it was best in class and retail was $200.
They sold medical- and legal-specific vocabularies for an additional ~$200 each. No competition or 3rd-party sellers ever came into the market to reduce the prices. Très sad.
Linda