RAS Physiology / Сенсорные системы (Sensory Systems)

  • ISSN (Print) 0235-0092
  • ISSN (Online) 3034-5936

Identification of speaker gender by voice characteristics under background of multi-talker noise

PII
10.31857/S0235009224020041-1
DOI
10.31857/S0235009224020041
Publication type
Article
Status
Published
Authors
Volume/ Edition
Volume 38 / Issue number 2
Pages
54-61
Abstract
Psychophysical methods were used to study how listeners identify a speaker's gender from voice characteristics under speech-like interference with headphone presentation. We used the set of speech signals and multi-talker noise from earlier experiments in a free sound field, i.e. a spatial scene (Andreeva et al., 2019). The set comprised 8 disyllabic words spoken by 4 speakers: 2 male and 2 female voices with mean fundamental frequencies of 117, 139, 208 and 234 Hz. The multi-talker noise was obtained by mixing all of the audio files (8 words × 4 speakers). The signal-to-noise ratio was 1:1, which subjectively corresponded to the maximum noise level in the spatial scene (SNR = –14 dB). Adult subjects aged 17 to 57 years (n = 42) took part in the experiments. Three age subgroups were additionally distinguished: 18.6 ± 1.5 years (n = 27), 28 ± 4.1 years (n = 7) and 46 ± 5.4 years (n = 8). All subjects had normal hearing. The results, compared with the data of the cited work, confirmed the importance of voice characteristics for the auditory analysis of complex spatial (free sound field) and non-spatial (headphone) scenes and demonstrated the role of masking and binaural perception mechanisms, in particular the high-frequency mechanism of spatial hearing. A relationship was also found between the perceptual assessment of gender from voice in noise and both the age of the subjects and the gender of the speakers (male/female voice). The results are of practical importance for organizing hearing-speech training, for early detection of impaired noise immunity of speech perception, and for developing noise-robust automatic speaker verification systems and hearing aid technologies.
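The stimulus construction described in the abstract (mixing all recordings into multi-talker noise and presenting a word at a given signal-to-noise ratio) can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not say how the ratio was measured, so the RMS-based definition of SNR, the sampling rate, and the sine tones standing in for the 32 recordings are all assumptions made only for illustration.

```python
import math
import random

def rms(x):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def mix_babble(recordings):
    """Sum equal-length recordings sample by sample into multi-talker noise."""
    return [sum(samples) for samples in zip(*recordings)]

def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels, computed from RMS levels (assumed definition)."""
    return 20.0 * math.log10(rms(signal) / rms(noise))

def scale_noise_to_snr(signal, noise, target_db):
    """Rescale the noise so that the signal-to-noise ratio equals target_db."""
    gain = rms(signal) / (rms(noise) * 10.0 ** (target_db / 20.0))
    return [gain * v for v in noise]

# Hypothetical stand-ins for the 8 words x 4 speakers = 32 recordings:
# short sine tones at the four mean fundamental frequencies from the abstract.
RATE = 8000                  # assumed sampling rate, Hz
F0S = [117, 139, 208, 234]   # mean fundamental frequencies, Hz
rng = random.Random(0)       # fixed seed so the sketch is reproducible

def tone(f0, n=RATE // 4):
    """A 0.25 s sine tone with a random starting phase (placeholder for a word)."""
    phase = rng.uniform(0.0, 2.0 * math.pi)
    return [math.sin(2.0 * math.pi * f0 * t / RATE + phase) for t in range(n)]

recordings = [tone(f0) for f0 in F0S for _word in range(8)]
babble = mix_babble(recordings)
target = recordings[0]

noise_0db = scale_noise_to_snr(target, babble, 0.0)     # 1:1 ratio
noise_m14 = scale_noise_to_snr(target, babble, -14.0)   # level quoted for the spatial scene
stimulus = [s + n for s, n in zip(target, noise_0db)]   # word presented in noise

print(round(snr_db(target, noise_0db), 2))
print(round(snr_db(target, noise_m14), 2))
```

Under this definition a 1:1 ratio is 0 dB, while –14 dB corresponds to noise with roughly five times the RMS amplitude of the signal, which gives a sense of the difference between the headphone condition and the spatial-scene level it is compared with.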
Keywords
perception, voice, gender characteristics, simulation of a complex scene, noise, multi-talker babble, spatial acoustic scene
Date of publication
14.09.2025
Year of publication
2025

References

  1. Balyakova A.A., Labutina O.V., Medvedev I.S., Pak S.P., Ogorodnikova E.A. Features of speech signal recognition under voice competition in normal hearing and in hearing-speech disorders. Sensornye sistemy [Sensory Systems]. 2023. V. 37. № 4. P. 342–347. (In Russian). DOI: 10.31857/S0235009223040029
  2. Koroleva I.V. Fundamentals of audiology and hearing aid fitting. St. Petersburg: KARO, 2022. 448 p. (In Russian).
  3. Koroleva I.V., Ogorodnikova E.A., Pak S.P., Levin S.V., Balyakova A.A., Shaporova A.V. Methodological approaches to assessing the dynamics of auditory-speech perception development in children with cochlear implants. Rossiiskaya otorinolaringologiya [Russian Otorhinolaryngology]. 2013. № 3. P. 75–85. (In Russian).
  4. Lopotko A.I., Berdnikova I.P., Boboshko M.Yu., Zhuravleva T.A., Zhuravskii S.G., Kvasova T.V., Lomovatskaya L.G., Mal'tseva N.V., Molchanov A.P., Ryndina A.M., Savenko I.V., Slesarenko N.P., Soldatova G.Sh. Practical guide to surdology. St. Petersburg: Dialog, 2008. 273 p. (In Russian).
  5. Lyashevskaya O.N., Sharov S.A. Frequency dictionary of the modern Russian language (based on the Russian National Corpus). Moscow: Azbukovnik, 2009. 1090 p. (In Russian).
  6. Ogorodnikova E.A., Labutina O.V., Andreeva I.G., Gvozdeva A.P., Baulin Yu.A. The prosody factor in the perception of a communicative scene with spatial separation of speech sources and speech-like interference. Abstracts of the International Conference "Linguistic Forum 2020: Language and Artificial Intelligence" / Eds A.A. Kibrik, V.Yu. Gusev, D.A. Zalmanov. Moscow: Institute of Linguistics RAS, 2020. P. 127–128. (In Russian).
  7. Sapogova E.E. Psychology of human development. Moscow: Aspekt Press, 2001. 460 p. (In Russian).
  8. Khukhlaeva O.V. Developmental psychology. Youth, maturity, old age. Moscow: Akademiya, 2006. 208 p. (In Russian).
  9. Andreeva I.G. Spatial selectivity of hearing in speech recognition in speech-shaped noise environment. Hum. Physiol. 2018. V. 44(2). P. 226–236. https://doi.org/10.1134/S0362119718020020
  10. Andreeva I.G., Dymnikowa M., Gvozdeva A.P., Ogorodnikova E.A., Pak S.P. Spatial separation benefit for speech detection in multi-talker babble-noise with different egocentric distances. Acta Acustica united with Acustica. 2019. V. 105. № 3. P. 484–491. https://doi.org/10.3813/AAA.919330
  11. Balling L.W., Mølgaard L.L., Townend O., Nielsen J.B.B. The collaboration between hearing aid users and artificial intelligence to optimize sound. Seminars in Hearing. 2021. № 42(3). P. 282–294. https://doi.org/10.1055/s-0041-1735135
  12. Bharathi R., Nalina H.D. Survey of Recent Advances in Hearing Aid Technologies and Trends. International Research Journal on Advanced Engineering Hub. 2024. V. 2. I. 2. P. 303–308. https://doi.org/10.47392/IRJAEH.2024.0046
  13. Bregman A.S. Auditory scene analysis: the perceptual organization of sound. Cambridge: MIT Press, 1990.
  14. Bronkhorst A.W. The cocktail-party problem revisited: Early processing and selection of multi-talker speech. Attention, Perception & Psychophysics. 2015. V. 77(5). P. 1465–1487. https://doi.org/10.3758/s13414-015-0882-9
  15. Cherry E.C. Some experiments on the recognition of speech, with one and with two ears. J. Acoust. Soc. Am. 1953. V. 25. № 5. P. 975.
  16. Darwin C.J., Brungart D.S., Simpson B.D. Effects of fundamental frequency and vocal-tract length changes on attention to one of two simultaneous talkers. J. Acoust. Soc. Am. 2003. V. 114. P. 2913–2922.
  17. Davis A., McMahon C.M., Pichora-Fuller K.M., Russ S., Lin F., Olusanya B.O., Chadha S., Tremblay K.L. Aging and Hearing Health: The Life-course Approach. Gerontologist. 2016. № 56 (Suppl 2). P. 256–267. https://doi.org/10.1093/geront/gnw033
  18. Fostick L., Ben-Artzi E., Babkoff H. Aging and speech perception: beyond hearing threshold and cognitive ability. J. Basic Clin. Physiol. Pharmacol. 2013. № 24(3). P. 175–183. https://doi.org/10.1515/jbcpp-2013-0048
  19. Gutschalk A., Dykstra A.R. Functional imaging of auditory scene analysis. Hear. Res. 2014. V. 307. P. 98.
  20. Lesica N.A., Mehta N., Manjaly J.G., Deng L., Wilson B.S., Zeng F.-G. Harnessing the power of artificial intelligence to transform hearing healthcare and research. Nat. Mach. Intell. 2021. № 3. P. 840–849. https://doi.org/10.1038/s42256-021-00394-z
  21. Moore B.C.J. An Introduction to the Psychology of Hearing. Leiden: Brill, 2012. 442 p.
  22. Musiek F.E., Chermak G.D. Handbook of central auditory processing disorder. San Diego: Plural Publishing, 2014. V. 1. Auditory neuroscience and diagnosis. 768 p.
  23. Pernet C.R., Belin P. The Role of Pitch and Timbre in Voice Gender Categorization. Front. Psychol. 2012. Sec. Perception Science. V. 3. https://doi.org/10.3389/fpsyg.2012.00023
  24. Popper A.N., Fay R.R. (Eds). Perspectives on auditory research. Springer handbook of auditory research. 2014. 680 p.
  25. Shamma S.A., Elhilali M., Micheyl C. Temporal coherence and attention in auditory scene analysis. Trends Neurosci. 2011. V. 34. P. 114.
  26. Smirnova V.A., Labutina O.V., Gvozdeva A.P. Chapter 9: Speech detection in spatially distributed speech-like noise. In: Neural Networks and Neurotechnologies (eds: Yu. Shelepin, E. Ogorodnikova, N. Solovyev, E. Yakimova). St. Petersburg, VVM, 2019. P. 52–60.
  27. Weston P., Hunter M.D., Sokhi D.S., Wilkinson I. Discrimination of voice gender in the human auditory cortex. NeuroImage. 2014. V. 105. P. 208–214. https://doi.org/10.1016/j.neuroimage.2014.10.056