It was a joke, back in September. A goofy idea, amidst a brainstorming session of merely silly ideas. It’s a heavenly harp! And when you turn it upside-down, it becomes a Devil Harp! Ha, ha.
The YouTube trailer would probably look something like this:
I hacked Angel Harp together in my spare time. Four long months! The plan was to finish by Halloween of 2011, but it took considerably longer than expected. The synthesis was completed in one week, the sound effects in another week. Standing on the shoulders of Twang, Angel Harp produces somewhat-realistic tones (like an actual harp! Complex filtering!) And it has 3+ dozen strings, for serious plucking power!
And, the graphics… Let’s talk about that.
Once the Halloween deadline became improbable, I decided to hack each feature until it was “good enough.” If any feature became an eyesore, then I’d revisit it — either for version 1.0, or a future release. The clouds were redone a couple of times. I had grand plans for the harp itself, using an (awful, buggy) harp modeling tool; in a future version, you may be able to draw your own harps and skin them with fancy materials.
Please note: This code comes with no warranty and no support whatsoever. None. Zip. Nada. If your talking robots become self-aware and enslave humanity, I will not be held responsible. But if you’re in the mood for tinkering, here’s how it’s strung together:
First, some Python code: The analyze_lpc.py script analyzes phonemes.dat (which is just a headerless version of phonemes.aif). Individual phonemes are separated by moments of silence, so the script splits the sound file on those. Each phoneme is converted to LPC data, using code that I ported from the rt_lpc project. I felt like I understood the mathematics 3 years ago, but I doubt I could explain it today.
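If you want the gist of the silence-splitting step without reading the script, here’s a minimal Python sketch. The threshold and silence-length values are made up for illustration; analyze_lpc.py may use different heuristics entirely:

```python
import struct

def split_on_silence(raw_bytes, threshold=500, min_silence=2000):
    """Split headerless 16-bit mono PCM into chunks separated by silence.

    threshold (sample amplitude) and min_silence (sample count) are
    illustrative values, not the ones analyze_lpc.py actually uses.
    """
    n = len(raw_bytes) // 2
    samples = struct.unpack("<%dh" % n, raw_bytes[: n * 2])
    chunks, current, quiet = [], [], 0
    for s in samples:
        if abs(s) < threshold:
            quiet += 1
            if quiet >= min_silence and current:
                chunks.append(current)   # long enough gap: close this phoneme
                current = []
            elif current:
                current.append(s)        # short lull inside a phoneme
        else:
            quiet = 0
            current.append(s)
    if current:
        chunks.append(current)
    return chunks
```

Each resulting chunk would then be handed to the LPC analysis step.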
Now, in Flash: Launch the DictCompressor application, and watch the trace messages. Click the screen to open the browser window, then select your cmudict___.txt pronouncing dictionary. (You can obtain the latest CMUdict here.) Flash will convert this to a (smaller) cmudict.dat file, which is what LPCsynth.swf loads.
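If you’re curious what the conversion involves, the text side is simple. Here’s a Python sketch of parsing CMUdict-style lines into a lookup table; the real DictCompressor is ActionScript and writes a packed binary format, so this is only illustrative:

```python
def parse_cmudict(lines):
    """Parse CMUdict-style lines ('HELLO  HH AH0 L OW1') into {word: phonemes}.

    Skips comment lines (';;;') and alternate pronunciations like 'READ(2)'.
    """
    d = {}
    for line in lines:
        if line.startswith(";;;") or not line.strip():
            continue
        word, _, phones = line.partition("  ")   # word and phonemes are
        if "(" in word:                          # separated by two spaces
            continue                             # alternate pronunciation
        d[word] = phones.split()
    return d
```

The .dat file Flash produces is essentially this table, packed into fewer bytes so it loads faster.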
LPCsynth is the application that talks. The LPCSynthHarness.sayItNow() method creates an array of LPCFrames, which are “spoken” in the sampleData() method. This was never intended for public distribution, so the code is not exactly stellar (the talking bit should be extracted into its own class).
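Stripped of the ActionScript specifics, the sayItNow()/sampleData() pipeline boils down to running an excitation signal through an all-pole filter, one frame at a time. Here’s a minimal Python sketch of that idea; the field names and the pulse-train excitation are my assumptions, not the actual LPCFrame layout:

```python
def synthesize_frame(coeffs, gain, pitch_period, n_samples):
    """Render one LPC frame: excite an all-pole filter with a pulse train.

    coeffs, gain, and pitch_period are hypothetical stand-ins for whatever
    an LPCFrame actually carries. A voiced frame uses a periodic pulse;
    an unvoiced frame would use noise instead.
    """
    out = []
    history = [0.0] * len(coeffs)   # most recent outputs, newest first
    for i in range(n_samples):
        excitation = gain if (pitch_period and i % pitch_period == 0) else 0.0
        # All-pole filter: current sample depends on past output samples.
        s = excitation - sum(a * h for a, h in zip(coeffs, history))
        history = [s] + history[:-1]
        out.append(s)
    return out
```

sampleData() would concatenate frames like this into the audio buffer that Flash plays back.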
Is this interesting? Did your Flash Player become self-aware after hearing its own voice? Let me know!
It’s funny, originally this was intended for the controlzinc.com website. The robot voice would sing as you clicked, crooning about your mousing habits. I still can’t decide if that idea was brilliant, or terrible.
You can push your formant sequence to the Yamaha FS1R, using software such as K_Take’s FS1R Editor. Click the “Save .syx” button, and follow the instructions in K_Take’s documentation. This is a lot of fun, and breathes new life into the FS1R.
This project became much deeper than anticipated! The code includes FFT analysis (thanks, Gerry Beauregard), pitch detection, a formant detection algorithm, and an AIFF parser. The interface was a challenge to design and implement, and many features remain unfinished.
My energy is shifting to other work, so I’ll enhance fseq-flash when time permits.
Drag and resize the blue blocks to change the filter frequency and width.
This sequencer doesn’t use expensive bandpass filters. The oscillators are sine waves, frequency-modulated with white noise. It may not sound inherently musical, but you can produce great hi-hats, bass thuds, and airy pitched noises.
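Each oscillator works something like this sketch (Python rather than ActionScript; `noisy_sine` and its parameters are illustrative, not the sequencer’s actual code):

```python
import math
import random

def noisy_sine(freq, noise_width, n, sr=44100):
    """Sine oscillator whose frequency is modulated by white noise.

    Each sample, the frequency is jittered by up to +/- noise_width Hz.
    A wider noise_width smears the tone toward filtered-noise territory,
    which is what makes the hi-hat and thud sounds possible.
    """
    phase, out = 0.0, []
    for _ in range(n):
        f = freq + random.uniform(-noise_width, noise_width)
        phase += 2 * math.pi * f / sr   # advance phase at the jittered rate
        out.append(math.sin(phase))
    return out
```

A small noise_width gives a slightly breathy pitch; a huge one sounds like band-limited noise centered on freq, which is roughly what a bandpass filter on white noise would give you, at a fraction of the cost.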
Here’s the source code. (Requires Flash CS5 to compile.) Have fun!
First, do you know Lev Grossman? He’s an incredibly talented author who recently toured Portland. If you haven’t read his book The Magicians, then stop whatever you’re doing and procure a copy immediately. Without trying to spoil anything, the major college in the book is named the Brakebills College for Magical Pedagogy. Lev saw the Brakebills T-shirt that I designed for my sweetheart’s birthday present:
Lev blogged some of my other work, too (“the guy who does this has the enviably fake-sounding name of Zach Archer”). It’s true, I have an awesome pro wrestler name.
Second, my new iPhone app has landed in the App Store:
Twang is a handheld guitar. It’s easier to play than a real guitar, and is very expressive. Instead of using audio samples, Twang uses physical modeling techniques to create a more natural, dynamic sound. No two plucks are identical. Watch my grainy first video if you disbelieve.
In the next version of Twang, left-handed people will be able to switch Twang’s orientation, and serious musicians can dampen or mute strings with their fingers. And probably more! This version is already in development, and may be submitted in a week or two? Follow Control Z, Inc on Twitter if you have a ravenous thirst for updates!
Here’s something from the vaults. Aquasound was built with these requirements in mind:
Generate sounds that aquatic animals might make
Sounds can be “combined” somehow
Sounds can emote
This was never used in production. I wonder if I could turn this into something? Like a paid iPhone app? ;)
Double-click the envelopes to add or remove control points. Drag lines up and down to change their curvature. The best feature is the “Combine With” dropdown, which splices the current sound with your selection. The “Emote” menu plays sounds with different expressions.
The audio algorithm is reverse-engineered from my beloved FS1R. I generated formants in two ways (toggle the “Tonal” checkbox to hear both); the “atonal” version is closer to ring modulation than actual formants. It’s more fun if you don’t understand what the controls are doing, but if you insist:
Pitch controls the overall pitch of the sound.
Freq controls the center frequency of the formant (like a bandpass filter).
LFOFreq and LFOWeight control a low-frequency sine wave, which can be applied to other controls via their “___LFOAmt” curves.
Amp is amplitude.
Width is formant width (think: width of the bandpass filter).
Skirt adds distortion.
Each voice has two formant generators; check “Formant Active” to enable them.
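For the curious, here is roughly the flavor of algorithm a single formant could use. This is a Python guess, not the actual Aquasound code: `formant_sample` is a hypothetical helper, the tonal branch uses a pitch-synchronous damped sine (a wider Width meaning faster decay, as in real formant bandwidths), and the atonal branch is plain ring modulation, matching the description above:

```python
import math

def formant_sample(t, pitch, freq, width=40.0, tonal=True, sr=44100):
    """One sample of a crude formant, loosely after the Aquasound controls.

    tonal:  a damped sine burst at the formant frequency, restarted every
            pitch period (wider width decays faster, like a wider bandpass).
    atonal: plain ring modulation of the pitch and formant oscillators.
    """
    if tonal:
        period = sr / pitch
        t_in_cycle = t % period                      # restart each pitch period
        env = math.exp(-width * t_in_cycle / sr)     # per-burst decay envelope
        return env * math.sin(2 * math.pi * freq * t_in_cycle / sr)
    return (math.sin(2 * math.pi * pitch * t / sr)
            * math.sin(2 * math.pi * freq * t / sr))
```

Summing two of these per voice, with the LFO wobbling pitch, freq, or amp, gets you surprisingly aquatic territory.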
My first iPhone app has been submitted to the App Store for review! Metal Mouth is a text-to-speech synthesizer that mimics the talking devices of the ’80s (Speak & Spell, “Wizard needs food, badly”, etc.) The functionality is similar to my Synthetic Speech In Flash demo, but with many new features (male & female voices, auto-tune, pitch & time scratching) and a snappy interface with talking robots.
This took about 5 weeks to develop. Meanwhile, I’ve started another app, and I envision releasing Metal Mouth 2.0 in a few months, with more voices, and the ability to record audio.
In 1998, the Yamaha Corporation unleashed a product that was convoluted and bizarre like no other: The FS1R Synthesizer.
Like the era-defining DX7, the FS1R is an FM Synthesizer, but it boasts a massive 8 operators per voice, compared to 6 in the DX. And the FS1R sports a new toy, Formant Synthesis, capable of mimicking voices, human and otherwise! Waves and formants can modulate each other in 88 different configurations. Top that off with LFOs, filters, on-board effects… It’s so flexible, and so complicated. So much power.