Synthetic Speech in Flash: the Source Code

Remember the Flash synthetic speech demo, which turned into a talking robot app? Here’s the source code: Download it!

SUPERCALIFRAGILISTICEXPEALIDOSHUS

Please note: This code comes with no warranty, nor support, whatsoever. None. Zip. Nada. If your talking robots become self-aware and enslave humanity, then I will not be held responsible. But if you’re in the mood for tinkering, here’s how it’s strung together:

  • The sound is generated using Linear Predictive Coding (“LPC”).
  • First, some Python code: The analyze_lpc.py script analyzes phonemes.dat (which is just a headerless version of phonemes.aif). Individual phonemes are separated by moments of silence, so the script splits the sound file on those. Each phoneme is converted to LPC data, using code that I ported from the rt_lpc project. I felt like I understood the mathematics 3 years ago, but I doubt I could explain it today.
  • Now, in Flash: Launch the DictCompressor application, and watch the trace messages. Click the screen to open the browser window, then select your cmudict___.txt pronouncing dictionary. (You can obtain the latest CMUdict here.) Flash will convert this to a (smaller) cmudict.dat file, which is what LPCsynth.swf loads.
  • LPCsynth is the application that talks. The LPCSynthHarness.sayItNow() method creates an array of LPCFrames, which are “spoken” in the sampleData() method.  This was never intended for public distribution, so the code is not exactly stellar (the talking bit should be extracted into its own class).
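
For tinkerers who want a feel for the math before opening the zip, here is a minimal Python sketch of textbook LPC: the autocorrelation method with the Levinson-Durbin recursion, plus the all-pole filter that turns coefficients back into sound. This is not the code from analyze_lpc.py or rt_lpc. The function names, the frame handling, and the coefficient sign convention are all assumptions made for illustration.

```python
def autocorrelate(frame, max_lag):
    """Return autocorrelation values r[0..max_lag] for one frame of samples."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for predictor coefficients a[1..order] and the residual energy.

    Assumed convention: x[n] is predicted as sum(a[k] * x[n - k]).
    """
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:
            break  # silent or degenerate frame, nothing left to predict
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error  # reflection coefficient for this step
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= 1.0 - k * k
    return a[1:], error


def synthesize(coeffs, excitation, gain=1.0):
    """Feed an excitation (buzz or noise) through the all-pole LPC filter."""
    history = [0.0] * len(coeffs)  # most recent output sample first
    out = []
    for e in excitation:
        y = gain * e + sum(c * h for c, h in zip(coeffs, history))
        out.append(y)
        history = [y] + history[:-1]
    return out
```

In a speech synthesizer, each phoneme would be cut into short frames, each frame reduced to one coefficient set, and the excitation would be a pulse train for voiced sounds or noise for unvoiced ones.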

Is this interesting? Did your Flash Player become self-aware after hearing its own voice? Let me know!

It’s funny: originally this was intended for the controlzinc.com website. The robot voice would sing as you clicked, crooning about your mousing habits. I still can’t decide if that idea was brilliant, or terrible.

Son of Strange Attractors

I rewrote my strange attractor generator, in Flash:

Try it. Click to generate new attractors.

The attractor coefficients are still chosen randomly. But now, attractors that explode/collapse are rejected. Also, attractors that create “boring” shapes (by drawing the same pixels repeatedly) are discarded. It’s a little slow, but I’m sure the speed could be improved using Pixel Bender.
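
The rejection rules above can be sketched in Python. This is an illustration, not the Flash source: it assumes a 2D quadratic map with 12 coefficients (a common choice for random attractors), and the thresholds, grid size, and names are all hypothetical.

```python
def quadratic_map(x, y, c):
    """One step of a 2D quadratic map with 12 coefficients (assumed form)."""
    nx = c[0] + c[1] * x + c[2] * x * x + c[3] * x * y + c[4] * y + c[5] * y * y
    ny = c[6] + c[7] * x + c[8] * x * x + c[9] * x * y + c[10] * y + c[11] * y * y
    return nx, ny


def classify(coeffs, iterations=5000, warmup=100, grid=64,
             min_cells=100, escape=1e6):
    """Return 'explodes', 'collapses', 'boring', or 'keep'."""
    x, y = 0.05, 0.05
    points = []
    for i in range(iterations):
        x, y = quadratic_map(x, y, coeffs)
        if abs(x) > escape or abs(y) > escape:
            return "explodes"
        if i >= warmup:  # skip the transient before the orbit settles
            points.append((x, y))

    min_x = min(p[0] for p in points)
    max_x = max(p[0] for p in points)
    min_y = min(p[1] for p in points)
    max_y = max(p[1] for p in points)
    span_x, span_y = max_x - min_x, max_y - min_y
    if span_x < 1e-9 and span_y < 1e-9:
        return "collapses"  # the orbit shrank to a single point

    # Count distinct grid cells ("pixels") that the orbit touches.
    cells = set()
    for px, py in points:
        gx = int((px - min_x) / span_x * (grid - 1)) if span_x > 0 else 0
        gy = int((py - min_y) / span_y * (grid - 1)) if span_y > 0 else 0
        cells.add((gx, gy))
    return "keep" if len(cells) >= min_cells else "boring"
```

A generator would call `classify` on random coefficient sets in a loop and only draw the ones that come back as "keep".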

Also, here’s the source code. (Compile with Flash CS5.)

A Formant Sequencer in Flash

I started an open source project: fseq-flash is a formant sequence editor. Current features include:

  • Import AIFF files
  • Audition and edit formant sequences in real time
  • Vowel- and function-drawing tools
  • Export .syx files for the Yamaha FS1R

Click to launch. Press the space bar to play the sound.

You can push your formant sequence to the Yamaha FS1R, using software such as K_Take’s FS1R Editor. Click the “Save .syx” button, and follow the instructions in K_Take’s documentation. This is a lot of fun, and breathes new life into the FS1R.

This project became much deeper than anticipated! The code includes FFT analysis (thanks Gerry Beauregard), pitch detection, a formant detection algorithm, and an AIFF parser. The interface was a challenge to design and implement, and there are still many unfinished features.

My energy is shifting to other work, so I’ll enhance fseq-flash when time permits.

Filtered Noise Sequencer

Here’s something fun. I made a 16-step sequencer in Flash that plays filtered noise (or sine waves, when the filter is narrow):



Drag and resize the blue blocks to change the filter frequency and width.

This sequencer is not using expensive bandpass filters. The oscillators are sine waves, which are frequency modulated with white noise. It may not sound inherently musical, but you can produce great hihats, bass thuds, and airy pitched noises.
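
Here is a rough Python sketch of the idea, assuming the simplest version: each sample, the oscillator’s frequency is nudged by uniform white noise scaled by the “filter width”. The function name, parameters, and noise distribution are guesses for illustration; the Flash source may do this differently.

```python
import math
import random


def noisy_sine(center_hz, width_hz, num_samples, sample_rate=44100, rng=None):
    """Sine oscillator whose frequency is modulated by white noise.

    width_hz = 0 gives a pure sine wave; larger widths smear the energy
    around center_hz, which sounds like band-limited noise.
    """
    rng = rng or random.Random()
    phase = 0.0
    out = []
    for _ in range(num_samples):
        # A new random frequency offset on every sample (assumed scheme).
        freq = center_hz + width_hz * rng.uniform(-1.0, 1.0)
        phase += 2.0 * math.pi * freq / sample_rate
        out.append(math.sin(phase))
    return out
```

Since this costs one `sin` call and one random number per sample, it is much cheaper than running noise through a real bandpass filter.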

Here’s the source code. (Requires Flash CS5 to compile.) Have fun!

Aquatic Sound Generator in Flash

Here’s something from the vaults. Aquasound was built with these requirements in mind:

  • Generate sounds that aquatic animals might make
  • Sounds can be “combined” somehow
  • Sounds can emote

This was never used in production. I wonder if I could turn this into something? Like a paid iPhone app? ;)

Double-click the envelopes to add/remove control points. Drag lines up & down to change their curvature. The best feature is the “Combine With” dropdown, which splices the current sound with your selection. Also, the “Emote” menu will play sounds with different expression.

The audio algorithm is reverse-engineered from my beloved FS1R. I generated formants in two ways (toggle the “Tonal” checkbox to hear both). The “atonal” version is closer to ring modulation than actual formants. It’s more fun if you don’t understand what the controls are doing, but if you insist: Pitch controls the overall pitch of the sound. Freq controls the center frequency of the formant (like a bandpass filter). LFOFreq and LFOWeight control a low-frequency sine wave, which can be applied to other controls via their “___LFOAmt” curves. Amp is amplitude, Width is formant width (think: width of the bandpass filter), and Skirt adds distortion. Each voice has two formant generators; check “Formant Active” to enable them.

May all your bloops and crackles be happy ones!

Flash 3D: A change of heart?

Yesterday, I posted a damning critique of Flash’s native 3D.

Today I noticed that if you right-click on yesterday’s SWF and show the redraw regions, you can see that it’s redrawing the contents of the entire stage, even though I put the scene in a scrollRect. Is it seriously rendering a scene that’s thousands of pixels wide before displaying it?!?!?!? Oh, no. No they DIDN’T.

Today I ported the scene to the Away3D rendering engine. Here’s the result:

It’s beautiful, and provides access to low-level drawing routines, light sources, normal maps, … It was speedy at first, then slowed down considerably when I added the glowing floors. (Each glow is 16+ triangles right now, for various reasons including: I can’t render objects in my own custom order.) This makes yesterday’s version look performant, I’m reluctant to admit.

Possible next steps:

  • Reduce the native 3D rendering area, see if performance improves?
  • Grow beyond Flash, embrace the future and try Unity 3D?
  • Dump this project, finish that iPhone game I started, make a million dollars in 2 weeks?

Flash 3D makes me sad

I’ve been dabbling with Flash 10’s native 3D support. Try my engine:

Click to set the focus. Use the arrow keys to move. Touch blocks to illuminate.

I’m disappointed with two things:

1). How much time is required to create a 3D engine, even a grid-based one like mine. I’ve been wrestling with this project for 4+ hours every day, for a week. I feel like I must be lagging behind, but there are ten thousand things that can go wrong when developing in 3D. The paradigm is uniquely punishing: there are always edge cases where some polygons aren’t drawn correctly. This project hasn’t been a joy.

Also:

2). Flash’s native 3D is not suited for a high-performance application like this one. It would be fine if I were only spinning a few DisplayObjects in space. However, the scene above displays up to 125 Bitmaps simultaneously. (To see this, light all 25 bulbs (3 Bitmaps each), then stand in the corner facing them and the 25-segment walls.) 125 Bitmaps would be child’s play in OpenGL. But after you light a few blocks, Flash Player chokes pretty hard.

Here’s another version that uses a BlendMode on the lightbulbs. It looks great, but its performance is even less acceptable.

Here’s an early version that uses my own 3D computations, and the Graphics API. Also it has a limited field of view, which I widened for the latest builds. The performance is surprisingly high. I abandoned my custom 3D when I reached this point; drawing lines around each cube face was expensive, so I switched to Bitmaps, and the native 3D.

The cube faces are set to a width & height of 100. However, the bitmaps are higher resolution: a 200×200 region is shown. They’re being downsampled to 100×100 before they’re rendered, not by my choice.

At runtime, I get periodic warnings like these:

Warning: 3D DisplayObject will not render. Its dimensions (8238, 1628) are too large to be drawn.

What?! How is this happening? I swear that any blocks behind the camera are being removed from the Stage. (Actually, this is difficult to verify. If I shrink the scene, Flash magically applies the 3D perspective with a weird projection, and distorts everything in Lovecraftian dimensions.) Please, Adobe, tell me that you’re not rendering the scene at 8000 pixels wide, then scaling it down to my 700×400 window, frame after frame?

Also note that you, the developer, are responsible for drawing the DisplayObjects in the correct depth order (farthest to nearest). Flash doesn’t handle it automatically. This is known as “2.5D”, and it’s wildly inconvenient.
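
For anyone who hasn’t done manual depth sorting before, the chore looks roughly like this. It’s a language-neutral sketch in Python rather than my ActionScript, and the dictionary layout and function name are made up for illustration. The idea is the painter’s algorithm: sort by distance from the camera, then draw the farthest objects first.

```python
def sort_back_to_front(objects, camera):
    """Return objects ordered farthest-first relative to the camera.

    Each object is assumed to be a dict with a 'position' (x, y, z) tuple.
    """
    def distance_squared(obj):
        # Squared distance is enough for ordering, so skip the sqrt.
        return sum((o - c) ** 2 for o, c in zip(obj["position"], camera))

    return sorted(objects, key=distance_squared, reverse=True)


blocks = [
    {"name": "near", "position": (0.0, 0.0, 1.0)},
    {"name": "far", "position": (0.0, 0.0, 9.0)},
    {"name": "middle", "position": (2.0, 0.0, 4.0)},
]
draw_order = [b["name"] for b in sort_back_to_front(blocks, (0.0, 0.0, 0.0))]
```

In Flash, the sorted list would then be applied to the display list (for example by reassigning child indices), and it has to be redone whenever the camera moves. Sorting by object centers also breaks down when large faces intersect or overlap, which is part of why this is so inconvenient.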

So, I’m pretty disappointed with Flash 10’s native 3D. Even with my limited 3D experience, I dislike how it renders the scene (I’m not alone in this) and the performance is obviously sub-par. This technology will not bring 3D games to the web, it cannot.

I need to decide whether to endure its shortcomings for 4 more weeks, or if I should abandon this project altogether. There are moments when you realize you’ve outgrown something you used to love, and this may be one of mine.

Synthetic Speech in Flash

Recently, I learned about Linear Predictive Coding (“LPC”). This technique is used in classic arcade games (such as Gauntlet) and the Speak & Spell to synthesize speech.

Here’s my first attempt at LPC speech in Flash: (click & explore)

It’s great, except for one tiny problem: It sounds horrific. Can you feel the cold, robotic love? This voice will stalk your nightmares.

The phonemes were derived from an unrehearsed recording of my voice. I’m confident that it can be improved. Note that direct LPC encodings of my voice, such as this one, sound more acceptable.

EDIT: I made an iPhone version, “Metal Mouth”, with lots of features. Here it is on YouTube and the iTunes Store!

EDIT #2: The source code is available here.

Fractal Transform, made with Pixel Bender

I made a Pixel Bender filter that performs Julia Set transformations on images. It looks great when it animates: the colors morph and twist like mathematical slime. Try it (Flash Player 10 required): JuliaTile.swf

Source code: julia_tile_src.zip. Pixel Bender code is in src/shader/.

The default image is Seattle’s Space Needle. You can upload custom images. Very large images may set your processor on fire.
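
If you’re curious what a “Julia Set transformation” on an image might look like without opening the zip, here is a small Python sketch. It is a guess at the general idea, not the actual kernel in src/shader/: each output pixel is treated as a complex number z, pushed through one step of z² + c, and the result is used to look up a pixel in the source image, wrapping at the edges so the image tiles. The function name, the `zoom` parameter, and the single iteration are all assumptions.

```python
import math


def julia_tile(src, c, zoom=1.5):
    """Warp an image (a list of rows of pixel values) through z -> z*z + c."""
    height = len(src)
    width = len(src[0])
    out = []
    for py in range(height):
        row = []
        for px in range(width):
            # Map the pixel to the complex plane, centered on the image.
            z = complex((px / width - 0.5) * 2.0 * zoom,
                        (py / height - 0.5) * 2.0 * zoom)
            z = z * z + c
            # Map back to pixel coordinates; modulo makes the source tile.
            sx = math.floor((z.real / (2.0 * zoom) + 0.5) * width) % width
            sy = math.floor((z.imag / (2.0 * zoom) + 0.5) * height) % height
            row.append(src[sy][sx])
        out.append(row)
    return out
```

Animating `c` over time (say, moving it slowly around a circle) is what would produce the morphing and twisting.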
