Immediately following the Second World War, between 1947 and 1955, several classic papers
quantified the fundamentals of human speech information processing and recognition. In 1947,
French and Steinberg published their classic study on the articulation index. In 1948, Claude
Shannon published his famous work on the theory of information. In 1950, Fletcher and Galt
published their theory of the articulation index, a theory that Fletcher had worked on for 30
years, which integrated his classic works on loudness and speech perception with models of
speech intelligibility. In 1951, George Miller wrote the first book, Language and
Communication, analyzing human speech communication with Claude Shannon's just-published theory
of information. Finally, in 1955, George Miller published the first extensive analysis of phone
decoding in the form of confusion matrices as a function of the speech-to-noise ratio. This
work extended the Bell Labs speech articulation studies with ideas from Shannon's information
theory. Both Miller and Fletcher showed that speech, as a code, is incredibly robust to the
mangling distortions of filtering and noise. Regrettably, much of this early work was forgotten.
While the key science of information theory blossomed, other than the work of George Miller, it
was rarely applied to aural speech research. The robustness of speech, which is the most
amazing thing about the speech code, has rarely been studied.

It is my belief (i.e.,
assumption) that we can analyze speech intelligibility with the scientific method. The
quantitative analysis of speech intelligibility requires both science and art. The scientific
component requires an error analysis of spoken communication which depends critically on the
use of statistics, information theory, and psychophysical methods. The artistic component
depends on knowing how to restrict the problem in such a way that progress may be made. It is
critical to tease out the relevant from the irrelevant and dig for the key issues. This will
focus us on the decoding of nonsense phonemes, with no visual component, that have been mangled
by filtering and noise.

This monograph is a summary and theory of human speech recognition. It
builds on and integrates the work of Fletcher, Miller, and Shannon. The long-term goal is to
develop a quantitative theory for predicting the recognition of speech sounds. In Chapter 2, the
theory is developed for maximum entropy (MaxEnt) speech sounds, also called nonsense speech. In
Chapter 3, context is factored in. The book is largely reflective and quantitative, with a
secondary goal of providing an historical context along with the many deep insights found in
these early works.