In this book we introduce the background and mainstream methods of probabilistic modeling and discriminative parameter optimization for speech recognition. The specific models treated in depth include the widely used exponential-family distributions and the hidden Markov model. A detailed study is presented on unifying the common objective functions for discriminative learning in speech recognition, namely maximum mutual information (MMI), minimum classification error (MCE), and minimum phone/word error (MPE/MWE). The unification is presented with rigorous mathematical analysis in a common rational-function form. This common form enables the use of the growth-transformation (or extended Baum-Welch) optimization framework in discriminative learning of model parameters.
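To make the unification concrete, here is a minimal sketch of the rational-function form and the growth-transformation step; the notation (Lambda for the model parameters, G and H for the numerator and denominator functions) is generic to the discriminative-training literature rather than a verbatim excerpt from the book. Each of the MMI, MCE, and MPE/MWE objectives can be written as a ratio of two functions of the parameters, with the denominator kept positive,

O(\Lambda) = \frac{G(\Lambda)}{H(\Lambda)},

and a growth transformation is obtained by maximizing an auxiliary function of the form

F(\Lambda; \Lambda') = G(\Lambda) - O(\Lambda')\,H(\Lambda) + C,

where \Lambda' holds the current parameter values and C is a constant. Since F(\Lambda'; \Lambda') = C, any \Lambda with F(\Lambda; \Lambda') > C satisfies G(\Lambda) > O(\Lambda')\,H(\Lambda) and hence O(\Lambda) > O(\Lambda'), so iterating the step increases the objective monotonically.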
In addition to the necessary background and tutorial material on the subject, we also include technical details on the derivation of the parameter-optimization formulas for exponential-family distributions, discrete hidden Markov models (HMMs), and continuous-density HMMs in discriminative learning.
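As a flavor of what such derivations yield, the following is the widely cited extended Baum-Welch re-estimate of a Gaussian mean in a continuous-density HMM; the symbols here (gamma for occupancy statistics accumulated over the numerator "num" and denominator "den" terms of the objective, D_i for a smoothing constant) follow common usage in the literature and are not quoted from the book:

\hat{\mu}_i = \frac{\sum_t \gamma_i^{\mathrm{num}}(t)\, x_t \;-\; \sum_t \gamma_i^{\mathrm{den}}(t)\, x_t \;+\; D_i\,\mu_i}{\sum_t \gamma_i^{\mathrm{num}}(t) \;-\; \sum_t \gamma_i^{\mathrm{den}}(t) \;+\; D_i},

where x_t are the observation vectors, \mu_i is the current mean of Gaussian i, and D_i controls the step size: the larger D_i is, the closer the update stays to the current estimate.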
Selected experimental results obtained firsthand by the authors are presented to show that discriminative learning can lead to superior speech recognition performance over conventional parameter learning. Details on major algorithmic implementation issues of practical significance are provided to enable practitioners to carry the theory developed in the earlier part of the book into engineering practice.
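One representative implementation issue of this kind is the choice of the smoothing constant D_i in the extended Baum-Welch updates: a common heuristic in the discriminative-training literature is to set D_i proportional to the denominator occupancy, e.g. D_i = E \cdot \sum_t \gamma_i^{\mathrm{den}}(t) with E typically between 1 and 2, floored at whatever value keeps the updated variances positive. This is the standard recipe from the literature, given here for orientation rather than as the book's exact prescription.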
Table of Contents:
Introduction and Background
Statistical Speech Recognition: A Tutorial
Discriminative Learning: A Unified Objective Function
Discriminative Learning Algorithm for Exponential-Family Distributions
Discriminative Learning Algorithm for Hidden Markov Model
Practical Implementation of Discriminative Learning
Selected Experimental Results
Epilogue
Major Symbols Used in the Book and Their Descriptions
Mathematical Notation
Bibliography