Please note: This code comes with no warranty, nor support, whatsoever. None. Zip. Nada. If your talking robots become self-aware and enslave humanity, then I will not be held responsible. But if you’re in the mood for tinkering, here’s how it’s strung together:
- The sound is generated using Linear Predictive Coding (“LPC”).
- First, some Python code: The analyze_lpc.py script analyzes phonemes.dat (which is just a headerless version of phonemes.aif). Individual phonemes are separated by moments of silence, so the script splits the sound file on those. Each phoneme is converted to LPC data, using code that I ported from the rt_lpc project. I felt like I understood the mathematics 3 years ago, but I doubt I could explain it today.
- Now, in Flash: Launch the DictCompressor application, and watch the trace messages. Click the screen to open a file-browse window, then select your cmudict___.txt pronouncing dictionary. (You can obtain the latest CMUdict here.) Flash will convert this to a (smaller) cmudict.dat file, which is what LPCsynth.swf loads.
- LPCsynth is the application that talks. The LPCSynthHarness.sayItNow() method creates an array of LPCFrames, which are “spoken” in the sampleData() method. This was never intended for public distribution, so the code is not exactly stellar (the talking bit should be extracted into its own class).
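For the curious, the analysis step above (splitting the recording on silence, then fitting LPC coefficients) can be sketched in Python roughly like this. The function names and thresholds here are my own for illustration, not taken from analyze_lpc.py; the real script's math comes from the rt_lpc port, but the classic approach is autocorrelation plus the Levinson-Durbin recursion:

```python
def split_on_silence(samples, threshold=0.01, min_gap=1000):
    """Split a sample list into chunks separated by long runs of near-silence."""
    chunks, current, quiet = [], [], 0
    for s in samples:
        if abs(s) < threshold:
            quiet += 1
        else:
            # A loud sample after a long quiet run closes the previous chunk.
            if quiet >= min_gap and current:
                chunks.append(current)
                current = []
            quiet = 0
        current.append(s)
    if current and any(abs(s) >= threshold for s in current):
        chunks.append(current)
    return chunks

def autocorrelate(frame, order):
    """Autocorrelation of a frame at lags 0..order."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(order + 1)]

def levinson_durbin(r):
    """Solve for predictor coefficients a[1..p] from autocorrelation r[0..p],
    so that x[n] is approximated by sum(a[k] * x[n-k])."""
    p = len(r) - 1
    a = [0.0] * (p + 1)
    e = r[0]
    for i in range(1, p + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / e                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        e *= (1 - k * k)                 # prediction error shrinks each order
    return a[1:], e                      # coefficients, residual energy
```

In practice you'd window each frame (e.g. Hamming) before autocorrelating; I've left that out to keep the sketch short.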
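The DictCompressor step boils down to parsing CMUdict's plain-text entries into something compact. I don't have the ActionScript handy, so here's a hypothetical Python equivalent of the parsing half; the helper name is mine, but the line format (two spaces between word and phonemes, `;;;` comments, `(1)`-style alternates) is standard CMUdict:

```python
def parse_cmudict_line(line):
    """Parse one CMUdict entry like 'ROBOT  R OW1 B AA2 T' into
    (word, [phonemes]); returns None for comments and alternate entries."""
    line = line.strip()
    if not line or line.startswith(';;;'):
        return None                      # comment line
    word, _, phones = line.partition('  ')
    if '(' in word:
        return None                      # alternate pronunciation, e.g. WORD(1)
    return word, phones.split()
```

Since ARPAbet has well under 256 phoneme symbols, each phoneme can be stored as a single byte index, which is presumably why the binary cmudict.dat comes out smaller than the text file.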
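As for the "spoken in sampleData()" part: each LPCFrame is essentially a set of filter coefficients plus gain and pitch, and synthesis means driving an all-pole filter with a pulse-train excitation. A minimal Python sketch (the function name and parameters are illustrative, not the actual LPCSynthHarness API):

```python
def synthesize_frame(coeffs, gain, pitch_period, n_samples, history):
    """Excite the all-pole LPC filter y[n] = e[n] + sum(a[k] * y[n-k])
    with a pulse train of the given period. `history` carries the last
    len(coeffs) output samples so consecutive frames join smoothly."""
    out = []
    for n in range(n_samples):
        excitation = gain if n % pitch_period == 0 else 0.0
        y = excitation + sum(a * h for a, h in zip(coeffs, history))
        history = [y] + history[:-1]     # shift the output history
        out.append(y)
    return out, history
```

Carrying `history` across frames is the detail that matters: resetting the filter state at each frame boundary would produce audible clicks. In the Flash version, the sampleData() handler would write samples like these into the event's ByteArray.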
Is this interesting? Did your Flash Player become self-aware after hearing its own voice? Let me know!
It’s funny: originally this was intended for the controlzinc.com website. The robot voice would sing as you clicked, crooning about your mousing habits. I still can’t decide if that idea was brilliant or terrible.