Advances in Non-Linear Modeling for Speech Processing includes advanced topics in non-linear
estimation and modeling techniques along with their applications to speaker recognition.
A non-linear aeroacoustic modeling approach is used to estimate important fine-structure
speech events that are not revealed by the short-time Fourier transform (STFT). This
aeroacoustic modeling approach provides the impetus for the high-resolution Teager energy
operator (TEO). This operator is characterized by a time resolution that can track rapid signal
energy changes within a glottal cycle. Cepstral features such as linear prediction cepstral
coefficients (LPCC) and mel-frequency cepstral coefficients (MFCC) are computed from the
magnitude spectrum of the speech frame, and the phase spectrum is neglected. To overcome
this loss of phase information, the speech production system can be represented as an
amplitude modulation-frequency modulation (AM-FM) model. The energy separation algorithm
(ESA) and the Hilbert transform demodulation (HTD) algorithm are discussed as ways to
demodulate the speech signal and estimate its amplitude envelope and instantaneous
frequency components. Different
features derived using the above non-linear modeling techniques are used to develop a speaker
identification system. Finally, it is shown that the fusion of speech production and speech
perception mechanisms can lead to a robust feature set.
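
As a rough illustration of the TEO and energy separation ideas mentioned above, the following is a minimal sketch rather than the book's implementation. It assumes the common discrete form of the Teager energy operator, Psi[x(n)] = x(n)^2 - x(n-1) x(n+1), and uses the DESA-2 variant of the ESA, which is an assumption because the text does not say which variant is used. The function names teager_energy and desa2 are hypothetical, and the sketch omits the guards against zero or negative energy values that real speech would require.

```python
import numpy as np

def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The output is two samples shorter than the input; element i
    corresponds to input sample i + 1.
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]

def desa2(x):
    """DESA-2 energy separation (assumed variant, illustrative only).

    Returns estimates of the instantaneous frequency (radians/sample)
    and the amplitude envelope, each four samples shorter than x.
    Valid for frequencies between 0 and pi/2 radians/sample.
    """
    x = np.asarray(x, dtype=float)
    y = x[2:] - x[:-2]                # symmetric difference x[n+1] - x[n-1]
    psi_x = teager_energy(x)[1:-1]    # trimmed to align with psi_y
    psi_y = teager_energy(y)
    ratio = np.clip(1.0 - psi_y / (2.0 * psi_x), -1.0, 1.0)
    omega = 0.5 * np.arccos(ratio)
    amplitude = 2.0 * psi_x / np.sqrt(psi_y)
    return omega, amplitude

# Usage on a synthetic tone (amplitude 0.5, frequency 0.3 rad/sample).
n = np.arange(200)
tone = 0.5 * np.cos(0.3 * n + 0.1)
omega_est, amp_est = desa2(tone)
```

For a pure sinusoid A cos(Omega n + phi), the operator returns the constant A^2 sin^2(Omega), which is why the ratio of two Teager energies can separate the amplitude from the frequency using only a few neighboring samples.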