This book provides an overview of how comparable corpora can be used to overcome the lack of
parallel resources when building machine translation systems for under-resourced languages and
domains. It presents a wealth of methods and open tools for building comparable corpora from
the Web, evaluating comparability, and extracting parallel data that can be used for the machine
translation task. It is divided into several sections, each covering a specific task such as
building, processing, and using comparable corpora, focusing particularly on under-resourced
language pairs and domains. The book is intended for anyone interested in data-driven machine
translation for under-resourced languages and domains, especially for developers of machine
translation systems, computational linguists, and language workers. It offers a valuable
resource for specialists and students in natural language processing, machine translation,
corpus linguistics, and computer-assisted translation, and promotes the broader use of
comparable corpora in natural language processing and computational linguistics.