Voice Perception: Basic Parameters (2012-2015)

Stefan R. Schweinberger
Coworkers: Verena G. Skuk

The human voice carries a wealth of social information, including emotion, gender, age, and person identity, yet relatively little research has been devoted to the processes that mediate auditory perception of people via their voices. In the first funding period, we initially explored the role of attention in explicit and implicit memory for famous voices. We then conducted a substantial series of experiments on adaptation-induced aftereffects in voice perception. Building on this successful research, on further directly relevant work conducted in the first funding period, and on the methodological expertise acquired by the researchers in the project, we will pursue four main issues in voice perception. (1) First, exploiting the fact that the new voice morphing software TANDEM-STRAIGHT permits independent morphing across each of five acoustic parameters (F0, formant frequencies, spectrum level information, aperiodicity, and time), we will investigate the differential contribution of these acoustic parameters to the perception of speaker gender and age. (2) Second, to delineate the individual contributions of basic low-level information to adaptation, we will use adaptor voices modified in a single parameter to create aftereffects in the perception of speaker gender and age. (3) Third, because systematic research on the recognition of personally familiar voices in larger samples is almost non-existent, we will test twelfth-grade secondary school pupils as a homogeneous group to create a unique database. This database will allow us to assess the relative contribution of acoustic parameters, speech type, perceived voice characteristics (such as the rated distinctiveness of a voice), and personal contact to the accuracy of individual voice recognition. The sample will also allow us to probe gender differences (at both the speaker and the listener level) and to assess own-voice recognition. (4) Finally, we plan to continue earlier work on voice averaging, using full sentences to test a prototype account of familiar voice representation, and we will perform an EEG study investigating induced oscillatory responses as potential correlates of voice familiarity. Overall, we expect this project to continue to substantially improve our understanding of the basic acoustic, perceptual, and neuronal processes involved in human voice perception.

  • Skuk, V. G., & Schweinberger, S. R. (2014). Influences of fundamental frequency, formant frequencies, aperiodicity and spectrum level on the perception of voice gender. Journal of Speech, Language, and Hearing Research, 57(1), 285-296, doi:10.1044/1092-4388(2013/12-0314)
  • Skuk, V. G., & Schweinberger, S. R. (2013). Gender differences in familiar voice identification. Hearing Research, 296, 131-140, doi:10.1016/j.heares.2012.11.004
  • Kawahara, H., Morise, M., Banno, H., & Skuk, V. G. (2013, October 29-November 1). Temporally variable multi-aspect N-way morphing based on interference-free speech representations. Paper presented at the 2013 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), Kaohsiung. doi:10.1109/APSIPA.2013.6694355

Figure: Waveform and spectrogram along a male-female voice morph continuum of the utterance /aba/, and face morphs along a male-female morph continuum. All continua are morphed from male to female in steps of 20% (morph level).