How the brain distinguishes music from speech

Summary: A new study reveals how our brain differentiates between music and speech using simple acoustic parameters. The researchers found that slower, steady sounds are perceived as music, while faster, irregular sounds are perceived as speech.

These findings could optimize therapeutic programs for language disorders such as aphasia. The research provides a deeper understanding of auditory processing.

Key facts:

  • Simple parameters: The brain uses basic acoustic parameters to distinguish music from speech.
  • Therapeutic potential: The findings could improve therapies for language disorders such as aphasia.
  • Research details: The study included more than 300 participants who listened to synthesized audio clips.

Source: NYU

Music and speech are among the most common types of sounds we hear. But how do we tell, seemingly without effort, which is which?

An international team of researchers mapped this process through a series of experiments – yielding insights that offer a potential means of optimizing therapeutic programs that use music to help people with aphasia regain the ability to speak.


This language disorder affects more than 1 in 300 Americans each year, including Wendy Williams and Bruce Willis.

“Although music and speech differ in many ways, from pitch to timbre to sound structure, our results show that the auditory system uses surprisingly simple acoustic parameters to distinguish between music and speech,” explains Andrew Chang, a postdoctoral researcher in New York University’s Department of Psychology and lead author of the paper, which appears in the journal PLOS Biology.

“Overall, slower and steady sound clips of pure noise sound more like music, while faster and irregular clips sound more like speech.”

Scientists measure the speed of signals with precise units of measurement: Hertz (Hz). A higher number of Hz means more occurrences (or cycles) per second than a lower number. For example, humans typically walk at a pace of 1.5 to 2 steps per second, which is 1.5-2 Hz.

Stevie Wonder’s 1972 hit “Superstition” has a rhythm of approximately 1.6 Hz, while Anna Karina’s 1967 hit “Roller Girl” clocks in at 2 Hz. Speech, in contrast, is usually two to three times faster than that, at 4-5 Hz.
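As a quick worked example of the arithmetic behind these figures (the tempi quoted above are approximate), a musical tempo in beats per minute converts to Hz by dividing by 60:

```python
def bpm_to_hz(bpm: float) -> float:
    """Convert a tempo in beats per minute to cycles per second (Hz)."""
    return bpm / 60.0

print(bpm_to_hz(96))   # 1.6 Hz, roughly the rate cited for "Superstition"
print(bpm_to_hz(120))  # 2.0 Hz, the rate cited for "Roller Girl"
```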

It has been well documented that a song’s volume, or loudness, over time – what is known as “amplitude modulation” – is relatively steady at 1-2 Hz. In contrast, the amplitude modulation of speech is typically 4-5 Hz, meaning its volume changes far more frequently.
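To make "amplitude modulation" concrete, here is a minimal sketch (not the authors' stimulus-generation code; the function name and parameter values are our own) of how noise with a chosen modulation rate can be synthesized – broadband noise multiplied by a slow, non-negative sinusoidal envelope:

```python
import numpy as np

def am_noise(rate_hz, duration_s=4.0, sr=16000, seed=0):
    """Synthesize white noise whose amplitude is modulated at rate_hz.

    A sinusoidal envelope (offset so it stays between 0 and 1)
    multiplies broadband noise -- a rough stand-in for the kind of
    music-like (~1-2 Hz) vs. speech-like (~4-5 Hz) clips described above.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(int(duration_s * sr)) / sr
    envelope = 0.5 * (1 + np.sin(2 * np.pi * rate_hz * t))  # ranges 0..1
    return envelope * rng.standard_normal(t.size)

music_like = am_noise(1.5)   # slow, steady modulation
speech_like = am_noise(4.5)  # fast modulation
```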

Despite the ubiquity and familiarity of music and speech, scientists previously lacked a clear understanding of how we effortlessly and automatically identify sound as music or speech.

To better understand this process, Chang and colleagues conducted a series of four experiments for their PLOS Biology study in which more than 300 participants listened to audio segments of synthesized music- and speech-like noise with varying rates and regularities of amplitude modulation.

The noise clips allowed participants to perceive only volume and speed. Participants were asked to judge whether these ambiguous clips, which they were told were music or speech masked in noise, sounded like music or speech.

Observing the pattern by which participants sorted hundreds of noise clips as either music or speech revealed how much modulation rate and regularity each affected their judgments. It is the auditory version of “seeing faces in the clouds,” the researchers conclude: if a sound wave contains a feature that matches listeners’ idea of what music or speech should sound like, even a clip of white noise can sound like music or speech.

The results showed that our auditory system uses surprisingly simple, basic acoustic parameters to distinguish between music and speech: to participants, clips with slower rates (<2 Hz) and more regular amplitude modulation sounded more like music, while clips with higher rates (~4 Hz) and more irregular amplitude modulation sounded more like speech.
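The decision rule this finding points to can be sketched in code. This is an illustration only, not the authors' analysis: it extracts a crude amplitude envelope (rectification plus a 50 ms moving average), takes the envelope's dominant frequency from an FFT, and splits at an assumed 3 Hz threshold between the ~2 Hz "music" and ~4 Hz "speech" regimes. The function names and the threshold are our own choices, not values from the study.

```python
import numpy as np

def am_peak_frequency(signal, sr, max_hz=20.0):
    """Estimate the dominant amplitude-modulation frequency (Hz).

    Crude envelope: full-wave rectification, then a 50 ms moving
    average; the FFT peak of the mean-removed envelope below max_hz
    is taken as the AM rate.
    """
    env = np.abs(signal)
    win = int(0.05 * sr)                        # 50 ms smoothing window
    env = np.convolve(env, np.ones(win) / win, mode="same")
    env = env - env.mean()                      # drop the DC component
    spectrum = np.abs(np.fft.rfft(env))
    freqs = np.fft.rfftfreq(env.size, d=1.0 / sr)
    band = freqs <= max_hz
    return freqs[band][np.argmax(spectrum[band])]

def guess_category(signal, sr, threshold_hz=3.0):
    """Toy rule of thumb: slow AM -> 'music', fast AM -> 'speech'."""
    return "music" if am_peak_frequency(signal, sr) < threshold_hz else "speech"
```

For example, broadband noise modulated by a 1.5 Hz envelope would be labeled “music” by this rule, and noise modulated at 4.5 Hz would be labeled “speech.”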

Knowing how the human brain differentiates between music and speech could potentially benefit people with hearing or language disorders such as aphasia, the authors note.

For example, melodic intonation therapy is a promising approach to train people with aphasia to sing what they want to say, using their intact “musical mechanisms” to bypass impaired speech mechanisms. Therefore, knowing what makes music and speech similar or different in the brain can help design more effective rehabilitation programs.

Additional authors of the paper were Xiangbin Teng of the Chinese University of Hong Kong, M. Florencia Assaneo of the National Autonomous University of Mexico (UNAM), and David Poeppel, a professor in NYU’s Department of Psychology and executive director of the Ernst Strüngmann Institute for Neuroscience in Frankfurt, Germany.

Funding: The research was supported by a grant from the National Institute on Deafness and Other Communication Disorders, part of the National Institutes of Health (F32DC018205), and Leon Levy Scholarships in Neuroscience.

About this auditory neuroscience research news

Author: James Devitt
Source: NYU
Contact: James Devitt – NYU
Image: The image is credited to Neuroscience News

Original Research: Open access.
“The Human Auditory System Uses Amplitude Modulation to Distinguish Music from Speech” by Andrew Chang et al. PLOS Biology


Abstract

The human auditory system uses amplitude modulation to distinguish music from speech

Music and speech are complex and distinct auditory signals that are both fundamental to human experience. The mechanisms underlying each domain are widely investigated.

But what perceptual mechanism transforms sound into music or speech, and what basic acoustic information is necessary to distinguish between them, remain open questions.

Here, we hypothesized that amplitude modulation (AM) of sound, a fundamental temporal acoustic property driving the auditory system across levels of processing, is critical for distinguishing between music and speech.

Specifically, in contrast to paradigms using naturalistic acoustic signals (which can be difficult to interpret), we used a noise-probing approach to unravel the underlying auditory mechanism: if AM frequency and regularity are critical for the perceptual discrimination of music and speech, judgments of artificially noise-synthesized ambiguous audio signals should be consistent with their AM parameters.

In 4 experiments (N = 335), signals with a higher AM peak frequency tended to be judged as speech, and those with a lower peak frequency as music. Interestingly, this principle is consistently used by all listeners for speech judgments, but only by musically sophisticated listeners for music judgments.

In addition, signals with more regular AM are judged as music over speech, and this feature is more critical for music judgments, regardless of musical sophistication.

The data suggest that the auditory system can rely on a low-level acoustic property as basic as AM to distinguish music from speech, a simple principle that prompts both neurophysiological and evolutionary experiments and speculation.
