Blog Archives

Detecting Inaudible Vocal Organ Changes Through Glottal Inverse Filtering

The aim of this study was to investigate whether objective quantities extracted from speech pressure waveforms underlie inaudible changes in the symptoms of the vocal organ. To this end, 180 voice samples were analyzed, obtained from nine subjects (five females and four males) before and after exposure to a placebo substance (lactose) and an organic dust substance. Acoustical analysis of the voice samples was performed using glottal inverse filtering. Results showed that the values of primary open quotient and primary speed quotient changed significantly (P < 0.05), as did the amplitude quotient (P < 0.01). Exposure to lactose resulted in significant changes in secondary open quotient (P < 0.05), but in the opposite direction to the effects found for exposure to organic dust. Modeling of the vocal tract into cross-sectional planes revealed that the plane immediately above the vocal folds correlates inversely with the feeling that the voice is tense or that speaking requires effort, as well as with a feeling of shortness of breath or the need to gasp for air. Such results may point to acoustically detected subclinical changes in the vocal organ that the subject him/herself feels while they remain perceptually undetected by others.

from the Journal of Voice

Filtering to Match Hearing Aid Insertion Gain to Individual Ear Acoustics

When hearing aid gain is prescribed by software, gain is calculated based on the average acoustics for the patient's age, gender, mold type, and so on. The acoustics of the individual’s ear often vary from the average values, so there will be a mismatch between the prescribed gain and the real-ear gain. Real-ear measurement can be used to verify the gain and adjust it to meet targets, but the quality of the match will be limited by the number of channels and the flexibility of the hearing aid. A potential way to improve this process is to generate a filter that compensates for variations in real-ear insertion gain due to individual ear acoustics. Such a filter could be included in the processing path of a digital hearing aid. This article describes how such a filter can be generated using the windowing method, and the principle is demonstrated in a real ear. The approach requires communication between the real-ear measurement software and the hearing aid programming software. A finite impulse response filter with a group delay of just over 2 ms matched insertion gain to target values within the acceptable tolerance defined by the British Society of Audiology guidelines.
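To make the windowing method mentioned in the abstract concrete, here is a minimal sketch of how a linear-phase FIR correction filter might be designed from a gain curve. Everything specific is an assumption for illustration and not taken from the article: the 16 kHz sample rate, the 67 taps (chosen because a symmetric filter of that length has a group delay of 33 samples, just over 2 ms at that rate), the Hamming window, the function names, and the correction values in dB.

```python
import math

def interp_db(freqs_hz, gains_db, f):
    """Linearly interpolate a gain curve, holding the end values outside its range."""
    if f <= freqs_hz[0]:
        return gains_db[0]
    if f >= freqs_hz[-1]:
        return gains_db[-1]
    for i in range(len(freqs_hz) - 1):
        f0, f1 = freqs_hz[i], freqs_hz[i + 1]
        if f0 <= f <= f1:
            g0, g1 = gains_db[i], gains_db[i + 1]
            return g0 + (g1 - g0) * (f - f0) / (f1 - f0)

def design_fir_window(freqs_hz, gains_db, fs, num_taps, grid=512):
    """Windowing method: sample the desired magnitude, inverse-DFT it, then window."""
    if num_taps % 2 == 0:
        raise ValueError("use an odd number of taps for a symmetric filter")
    half = grid // 2
    # Desired linear magnitude on the bins 0 .. fs/2 of a length-`grid` DFT.
    mag = [10 ** (interp_db(freqs_hz, gains_db, k * fs / grid) / 20)
           for k in range(half + 1)]
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        m = n - mid  # index relative to the centre tap
        # Inverse DFT of a real, even spectrum gives a zero-phase impulse response.
        acc = mag[0] + mag[half] * math.cos(math.pi * m)
        acc += 2 * sum(mag[k] * math.cos(2 * math.pi * k * m / grid)
                       for k in range(1, half))
        # A Hamming window truncates the response smoothly (assumed window choice).
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(acc / grid * window)
    return taps

def magnitude_db(taps, f, fs):
    """Gain of the FIR filter at frequency f, in dB."""
    w = 2 * math.pi * f / fs
    re = sum(h * math.cos(w * n) for n, h in enumerate(taps))
    im = -sum(h * math.sin(w * n) for n, h in enumerate(taps))
    return 20 * math.log10(math.hypot(re, im))

def group_delay_ms(num_taps, fs):
    """A symmetric FIR filter delays all frequencies by (N - 1) / 2 samples."""
    return (num_taps - 1) / 2 / fs * 1000

if __name__ == "__main__":
    fs, num_taps = 16000, 67                      # assumed values
    freqs = [250.0, 500.0, 1000.0, 2000.0, 4000.0, 6000.0]
    correction_db = [-2.0, 0.0, 3.0, 5.0, 2.0, 0.0]  # hypothetical target minus measured gain
    taps = design_fir_window(freqs, correction_db, fs, num_taps)
    print(f"group delay: {group_delay_ms(num_taps, fs):.4f} ms")
    for f, g in zip(freqs, correction_db):
        print(f"{f:6.0f} Hz  wanted {g:+.1f} dB  got {magnitude_db(taps, f, fs):+.2f} dB")
```

The trade-off in such a design is between delay and frequency resolution: a longer filter follows the correction curve more closely, particularly at low frequencies, but adds proportionally more group delay.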

from Trends in Amplification

What do male singers mean by modal and falsetto register? An investigation of the glottal voice source

The voice source differs between modal and falsetto registers, but singers often try to reduce the associated timbral differences, some even doubting that there are any. A total of 54 vowel sounds sung in falsetto and modal register by 13 male choir singers of varying experience were analyzed by inverse filtering and electroglottography. Closed quotient, maximum flow declination rate, peak-to-peak airflow amplitude, normalized amplitude quotient, and level difference between the two lowest source spectrum partials were determined, and systematic differences were found in all singers, regardless of singing experience. The observations seem compatible with previous observations of thicker vocal folds in modal register.

from Logopedics Phoniatrics Vocology

Perception of Emotional Valences and Activity Levels from Vowel Segments of Continuous Speech

This study aimed to investigate the role of voice source and formant frequencies in the perception of emotional valence and psychophysiological activity level from short vowel samples (150 milliseconds). Nine professional actors (five males and four females) read a prose passage simulating joy, tenderness, sadness, anger, and a neutral emotional state. The stress-carrying vowel [a:] was extracted from the Finnish word [ta:k:ahan] in continuous speech and analyzed for duration, fundamental frequency (F0), equivalent sound level (Leq), alpha ratio, and formant frequencies F1–F4. Alpha ratio was calculated by subtracting the Leq (dB) in the range 50 Hz–1 kHz from the Leq in the range 1–5 kHz. The samples were inverse filtered by Iterative Adaptive Inverse Filtering, and the resulting glottal flow estimates were parameterized with the normalized amplitude quotient (NAQ = f_AC/(d_peak · T)). Fifty listeners (mean age 28.5 years) identified the emotional valences from the randomized samples. Multinomial logistic regression analysis was used to study the interrelations of the parameters for perception. It appeared possible to identify valences from vowel samples of short duration (150 milliseconds). NAQ tended to differentiate between the valences and activity levels perceived in both genders. Voice source may not only reflect variations of F0 and Leq, but may also have an independent role in expression, reflecting phonation types. To some extent, formant frequencies appeared to be related to valence perception, but no clear patterns could be identified. Coding of valence tends to be a complicated multiparameter phenomenon with wide individual variation.
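The two formulas quoted in the abstract can be sketched in a few lines. This is an illustrative reading and not the authors' code. It assumes the usual interpretation of the NAQ symbols: fAC is the peak-to-peak amplitude of the glottal flow, dpeak is the magnitude of the negative peak of the flow derivative, and T is the fundamental period. The function names, the first-difference derivative, and the toy triangular pulse are all hypothetical.

```python
def alpha_ratio_db(leq_1k_5k_db, leq_50_1k_db):
    """Alpha ratio: Leq in the 1-5 kHz band minus Leq in the 50 Hz-1 kHz band (dB)."""
    return leq_1k_5k_db - leq_50_1k_db

def naq_from_flow(flow, fs):
    """NAQ = f_AC / (d_peak * T) for one cycle of an estimated glottal flow.

    flow: samples covering exactly one glottal cycle (assumed input format)
    fs:   sample rate in Hz
    """
    period = len(flow) / fs                      # T, in seconds
    f_ac = max(flow) - min(flow)                 # peak-to-peak flow amplitude
    # First difference scaled by fs approximates the flow derivative.
    derivative = [(b - a) * fs for a, b in zip(flow, flow[1:])]
    d_peak = abs(min(derivative))                # steepest closing slope
    return f_ac / (d_peak * period)

if __name__ == "__main__":
    # Toy triangular pulse: slow opening, fast closing, then a closed phase.
    pulse = [0.0, 0.25, 0.5, 0.75, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0]
    print("NAQ:", naq_from_flow(pulse, fs=1000))
    print("alpha ratio (dB):", alpha_ratio_db(60.0, 72.0))
```

Because the amplitude quotient f_AC/d_peak has units of time, dividing by T makes NAQ dimensionless, so values can be compared across different fundamental frequencies.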

from the Journal of Voice

Monopitched Expression of Emotions in Different Vowels

from Folia Phoniatrica et Logopaedica

Fundamental frequency (F0) and intensity are known to be important variables in the communication of emotions in speech. In singing, however, pitch is predetermined and yet the voice should convey emotions. Hence, other vocal parameters are needed to express emotions. This study investigated the role of voice source characteristics and formant frequencies in the communication of emotions in monopitched vowel samples [a:], [i:] and [u:]. Student actors (5 males, 8 females) produced the emotional samples simulating joy, tenderness, sadness, anger and a neutral emotional state. Equivalent sound level (Leq), alpha ratio [SPL (1–5 kHz) − SPL (50 Hz–1 kHz)] and formant frequencies F1–F4 were measured. The [a:] samples were inverse filtered and the estimated glottal flows were parameterized with the normalized amplitude quotient [NAQ = f_AC/(d_peak · T)]. Interrelations of acoustic variables were studied by ANCOVA, considering the valence and psychophysiological activity of the expressions. Forty participants listened to the randomized samples (n = 210) for identification of the emotions. The capacity of monopitched vowels for conveying emotions differed. Leq and NAQ differentiated activity levels. NAQ also varied independently of Leq. In [a:], filter (formant frequencies F1–F4) was related to valence. The interplay between voice source and F1–F4 warrants a synthesis study.