TC-Helicon Advances Voice Processing Technology with MATLAB and Simulink

“When the time comes to turn our algorithm ideas into prototype systems, Simulink is our main tool. Simulink makes it possible for us to create working prototypes that are very DSP friendly. We have found that the transition to DSP implementations from Simulink models is much easier than the transition from C/C++ implementations.”

Challenge

Develop voice processing technology that would match the sound of a human voice

Solution

Use MathWorks tools to develop algorithms, build a physical model of human sound production, and test the model in real time

Results

  • Accelerated development
  • Cost-effective implementation
  • Creative solutions validated in a matter of hours
The VoiceCraft Human Voice Modeling Card (top) and the VoicePrism vocal processor (bottom).

Voice processing technology offers both amateur and professional singers hundreds of ways to alter the way they sound—from adding breath, growl, and vibrato to altering resonance or inflection.

TC-Helicon is a joint venture of TC Electronics A/S and IVL Technologies Ltd. that is dedicated to advancing voice transformation technology. Its latest product is the VoiceCraft Human Voice Modeling Card, an upgrade card for the company’s debut product, the VoicePrism vocal processor. The VoiceCraft contains algorithms developed exclusively in MATLAB® and Simulink®. It produces dramatic changes in the human voice, such as giving it the raspy sound produced by a night of drinking whiskey and smoking cigarettes.

Challenge

TC-Helicon engineers set out to develop voice processing algorithms that could be run on a DSP. They needed to make the output authentic enough to bypass people’s built-in “artifact detectors” for the human voice. “Humans have specialized brain circuitry to deal with both generating and understanding vocalizations,” explains Dr. Peter Lupini, research manager at IVL. “I can make huge changes to a saxophone sound and people will tell me that they hear different saxophones. But if I make one tiny change to a human voice, people will instantly cry out, ‘That doesn’t sound human!’”

To ensure maximum authenticity, the TC-Helicon team needed to develop a physical model of sound production in humans. This model would comprise a source (the vocal cords) and a filter (the vocal tract), which could each be manipulated independently. It would become a template for the DSP coders to use. Creating the model is a complex process. “The voice is incredibly flexible, much more flexible than any man-made instrument,” Lupini explains. “It takes thousands of synchronized muscle movements just to make a single vowel sound.”
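The source-filter split described above can be illustrated with a minimal Python sketch. It is not TC-Helicon's model: it assumes an impulse train as a crude stand-in for vocal cord pulses and a single two-pole resonator as a stand-in for one vocal tract resonance. The function names, sample rate, and parameter values are all hypothetical, chosen only to show that pitch (source) and resonance (filter) can be set independently.

```python
import math


def impulse_train(f0_hz, sample_rate, n_samples):
    """Source: one unit impulse per pitch period (crude glottal pulse stand-in)."""
    period = max(1, round(sample_rate / f0_hz))
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


def resonator(signal, center_hz, bandwidth_hz, sample_rate):
    """Filter: a two-pole resonator (crude stand-in for one vocal tract resonance)."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius, < 1 so the filter is stable
    theta = 2.0 * math.pi * center_hz / sample_rate      # pole angle sets the resonant frequency
    a1 = 2.0 * r * math.cos(theta)
    a2 = -r * r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x + a1 * y1 + a2 * y2  # y[n] = x[n] + a1*y[n-1] + a2*y[n-2]
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize(f0_hz, formant_hz, sample_rate=8000, n_samples=800,
               bandwidth_hz=100.0):
    """Run the source through the filter; the values here are illustrative assumptions."""
    source = impulse_train(f0_hz, sample_rate, n_samples)
    return resonator(source, formant_hz, bandwidth_hz, sample_rate)
```

In this sketch, going from `synthesize(100, 700)` to `synthesize(200, 700)` changes only the pitch, while going from `synthesize(100, 700)` to `synthesize(100, 1100)` changes only the resonance. That separation is the property the paragraph attributes to the source and filter.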

The engineers also needed to be able to test the model in real time. “There is simply no substitute for being able to listen in real time to an audio effect during algorithm development,” says Lupini. Many problems encountered in the development of professional audio products cannot be anticipated without real-time analysis: the singer needs to pick up the mike and try it out. In the past, this requirement made DSP implementation very expensive, because the DSP code had to be reimplemented several times during the design cycle.

Solution

In the early stages of algorithm development, TC-Helicon used MATLAB and Signal Processing Toolbox™ to explore the underlying principles of voice production. “The MATLAB programming environment makes it very easy to extend the command set,” Lupini says. “Also, using the excellent visualization tools within MATLAB, we have found it fairly straightforward to create GUIs, some of which have become core tools within the research group. For example, we have one GUI that enables us to visualize the evolution of a human vocal tract response over time.”

During the critical testing period, the engineers used Simulink and Signal Processing Blockset™ to get the algorithms working in real time, enabling them to match the sound to that of a human.

Because making changes to a Simulink model is fast and easy, debugging time was considerably reduced. For example, engineers found that the vocal “growl” effect did not sound natural enough. From his observations of real singers, one of the engineers guessed that the growl was produced by a tightening of the singer’s throat. Using Simulink, he was able to test his theory in just one day. He then changed the model so that the onset of the growl effect was coupled with a corresponding constriction of the vocal tract model. The resulting output was significantly more authentic than that of the existing implementation, and the changes were put into effect on the DSP.
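The idea of coupling the growl onset to a vocal tract constriction can be sketched in a few lines of Python. This is only an illustration of the coupling, not the VoiceCraft algorithm: the parameter names (`source_roughness`, `tract_constriction`), the linear mapping, and the `max_constriction` value are all hypothetical assumptions.

```python
def coupled_controls(growl, max_constriction=0.5):
    """Map one growl amount in [0, 1] to both a source and a filter setting.

    Hypothetical mapping: the constriction of the vocal tract model rises
    linearly with the growl amount, so the two always start together.
    """
    g = min(1.0, max(0.0, growl))  # clamp the control to [0, 1]
    return {
        "source_roughness": g,                       # source side: growl depth
        "tract_constriction": max_constriction * g,  # filter side: follows the growl
    }


def growl_onset(n_frames):
    """Linear growl ramp from 0 to 1, with the constriction tracking it frame by frame."""
    steps = max(1, n_frames - 1)
    return [coupled_controls(i / steps) for i in range(n_frames)]
```

With this mapping, no frame has growl without some constriction, which is the behavior the engineer's change introduced. How the constriction actually alters the vocal tract filter is not described in the source and is left out here.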

“DSP implementation is expensive, and trying out new ideas at that late stage is rarely approved,” says Lupini. “In the past, if a DSP engineer thought of an interesting improvement or enhancement to a product as it neared the final stages of development, it would have been nearly impossible to try the idea out. With Simulink, we can quickly test ideas, sometimes in a matter of hours. Our DSP engineers can be more creative, and as a result, the products are better.”

After real-time testing, the engineers used the Simulink model as a template to write the DSP code for the VoiceCraft Human Voice Modeling Card. VoiceCraft has received excellent customer feedback and is being used by several major bands.

Results

  • Accelerated development. Without Simulink, it is unlikely that the improvement in the vocal growl effect would have made it into this release of the product.

  • Cost-effective implementation. The ability to experiment with algorithms in real time during the development phase was essential to the successful creation of VoiceCraft. With Simulink, engineers were able to change control parameters as they worked, without having to return to the design and implementation stages.

  • Creative solutions validated in a matter of hours. “Simulink enhances the ability of engineers to be creative,” says Lupini. “We are able to try all kinds of things that would have been very, very difficult before.”