Monday, January 30, 2012
Wednesday, November 9, 2011
mask-bot: another talking head
Researchers in Germany and Japan (see Institute for Cognitive Systems) have collaborated to create this slightly creepy and cinematic talking head. The facial image is projected onto a mask to create the 3D likeness. Mask-bot doesn't respond through artificial intelligence, but through text-to-speech, voicing words typed into a keyboard.
Mask-bot is quite like Joseph Faber's Talking Machine, which I've written about previously. It's fascinating to see such similar results culled from technologies that are more than a century apart.
Labels: history, synthesized voices, text-to-speech, voice technology
Sunday, July 17, 2011
A Voice Like Chocolate
Chocolate is an oft-used metaphor for voice, suggesting a sound that's sultry, sexy, alluring. It's a romantic notion, a chocolate voice.
But what about an actual chocolate voice--a voice mediated by chocolate?
Found, a Scottish band, recently released their new single "Anti Climb Paint" on both vinyl and chocolate.
You only get a moment of the chocolatey music in this video, but it sounds surprisingly good. The non-chocolate song probably sounds better:
A chocolate record is obviously a novelty, but all I can say is that I love this idea of sound shaped by chocolate. A related intersection of food and sound is Heston Blumenthal's "Sound of the Sea", where a seafood dish is served with an iPod, the sounds of the sea evoking an experience, a memory. Chocolate will certainly transform sound, but I imagine that sound (as mediated by chocolate) would also transform the taste of that chocolate. Maybe a love song would make chocolate taste just a little sweeter?

The chocolate record is not a new novelty. In 1903 (and for a few years afterward) the Stollwerck Chocolate Company made chocolate records, and the phonograph to play them. It's irresistible, I guess, this meeting of music and chocolate. I am confident that my voice would improve if mediated by chocolate.
Finally, another chocolate record maker. Pay particular attention to his mention of trying to make records from sausage. Hmmm. . .
Labels: musical voices
Monday, July 11, 2011
Euphonia--Joseph Faber's Talking Machine
A recent podcast from my beloved RadioLab explored a 16th-century automaton, a monk that moves in circles, bowing and moving his mouth in silent prayer. (You can see the monk in action and learn more about his fascinating history by following the link.) The monk's praying got me wondering about speaking automatons, early machines that vocalized. Turns out there is a long history of speaking automatons, a history that is intricately connected with Bell's development of the telephone and Edison's creation of the phonograph. That is far too much to write about now, but I wanted to briefly describe one specific talking machine.
Joseph Faber's Amazing Talking Machine, also known as the Euphonia, attempted to recreate the physiology of the voice: the lungs, larynx, lips, tongue. Alfred Mayer, in an 1878 Popular Science article, described the Euphonia's various components:
A vibrating ivory reed, of variable pitch, forms its vocal chords. There is an oral cavity, whose size and shape can be rapidly changed by depressing the keys on a key-board. A rubber tongue and lips make the consonants; a little windmill, turning in its throat, rolls the letter R, and a tube is attached to its nose when it speaks French. This is the anatomy of this really wonderful piece of mechanism.
The Euphonia was only a partial automaton, as it required some manipulation of keyboard and bellows. The Euphonia could laugh, sing, talk, hiss (all with a distinctly German accent, as various reports on the invention were sure to note).
A London theater manager, John Hollingshead, described the Euphonia's voice as a sort of specter:
A hoarse sepulchral voice came from the mouth of the figure, as if from the depths of a tomb. It wanted little imagination to make the very few visitors believe that the figure contained an imprisoned human--or half human--being, bound to speak slowly when tormented by the unseen power outside.
The Euphonia was part sideshow and part science. It was certainly a marvel of technical ingenuity, an invention that influenced Alexander Graham Bell's experiments with voice technology. Yet, it was greeted with a lackluster response by the public. Faber destroyed his machine in frustration over the poor reaction it received. He did later rebuild it to tour with P.T. Barnum.
To get a sense of what the Euphonia may have sounded like, listen to this recording of Martin Riches' Talking Machine (1989-1991). Riches' machine is computer-controlled, but in its effort to re-create the human voice through bellows and shapes it seems reminiscent of Faber's Euphonia.
Read More
Hankins, Thomas L. and Robert J. Silverman. "Vox Mechanica: The History of the Speaking Machine." Instruments and the Imagination. p. 178-220.
Lindsay, David. "Talking Head." American Heritage of Invention & Technology. Summer 1997, p. 57-63.
Mayer, Alfred M. "On Edison's Talking Machine." Popular Science Monthly. 12 April 1878.
Labels: history, mechanical voices, voice technology
Wednesday, June 8, 2011
the laryngophone and the singing guitar
Watch this clip. You won't be sorry.
Let's forget for a moment that Stringy, the singing guitar, would have made a perfect sidekick for Chucky. Let's instead talk about his voice. Stringy is an early (maybe the earliest?) example of a "talk box" technique--applying vocalization to an instrument, in this case a pedal steel guitar (by way of Stringy).
The man playing the steel guitar here is Alvino Rey, who was an early innovator of electronic instrumentation. He created a pickup prototype that was eventually used for Gibson's first electric guitar. (He is also the grandfather of Will and Win Butler from Arcade Fire. And for all of you Salt Lakers, he died here in 2004.)
Stringy's "voice" is a combination of Rey's guitar and his wife, Luise King Rey, who is standing offstage mouthing the lyrics. The vibrations of her vocal cords are being amplified through a laryngophone, or throat mic. With a throat mic, as the name suggests, a small microphone held to the throat picks up vibrations directly from the larynx.
The laryngophone was originally intended to allow for telephone communication in noisy environments: factories, night clubs, that sort of thing. By pressing a mic to the larynx, sound could be conveyed without interference from ambient noise. The laryngophone was adapted for military use and diving. (It's currently very popular with paintball players.) Recently, Life included the laryngophone as one of "30 Dumb Inventions." Maybe the creepy Stringy would confirm this assessment, but I think it's a bit short-sighted. I can see myself benefiting from a throat mic on occasion. Perhaps the "dumb" label is linked to our discomfort with mechanized voices, at least in everyday conversation. The image from Life shows a very ordinary-looking phone, which is more unexpected than alternative communications for the military and the like.
We have more tolerance for synthetic voices in music, it seems. In music, the voice is allowed to become an instrument rather than representative of a person. And so there is more room for experimentation, mechanization, play. I'll be writing more on music and voice technology in the future, but for now here is a bit more throat mic/talk box action from Alvino Rey (without Stringy, I'm sorry to say).
Labels: musical voices, voice technology
Friday, June 3, 2011
Honest Voices?
I just finished recording an introductory video for students in my online writing class. This will be the first interaction I have with my students, and I can't help worrying about how I will be perceived, how my voice will be perceived.
According to research from MIT's Human Dynamics Lab, I should be worried. Researchers at the lab, including its director Alex (Sandy) Pentland, explore the "honest" signals that underlie everyday conversations: intonation, pace, amplitude. The argument is that these nonverbal vocal cues can tell us more about a person's mood and intention than their words can.
The research has some interesting applications, including the spin-off company Cogito Health, which is developing software to gather diagnostic information from speech behavior. A healthcare provider could monitor a patient's mood by telephone in between visits, allowing for early intervention for depression. According to researchers, vocal signals "can be used to infer mood and predict behavior."
In part, this research seems intuitive. Certainly there are times when I've noted sadness in a friend's voice or been persuaded by the sound of enthusiasm. But these are "normal" voices. The research assumes consistency and reliability, linking the voice more to cognition and emotion than to physiology. What about the damaged voice?
More than once, a fight (or should I say disagreement?) with my husband has escalated because I've sounded aggressive or dismissive. But the perceived emotion had little to do with my actual state of mind. The perception was the result of my failing voice, a voice that can't always convey nuances of intonation or amplitude. My husband knows me, knows my voice and its limitations. Yet, in those tense moments, it's easy to assume that my voice conveys more than my words can. The assumption is human, as the MIT research indicates, but does it follow that those nonverbal, vocal signals are more honest?
I find Pentland's research intriguing, but I'm curious about its limitations. The research trusts perception and the honesty of biology, but what happens when biology is impaired?
Labels: voice and relationships, voice research
Thursday, April 21, 2011