Course: Natural Language Processing

Prof. Jacques Savoy & Olena Zubaryeva
University of Neuchatel
Computer Science Department

Main objective

The main objective of this course is to introduce students to the underlying problems encountered when working with natural language data.

  • Representation and standards;
  • Statistical methods for natural language processing;
  • Prolog and parsing;
  • Markov models.

Practical exercises will complete the theoretical presentation.


Introduction to Perl and XML; introduction to linguistics (morphology, syntax, semantics); simple statistical approaches (KWIC, concordances); automata and natural language (FSTN, RTN, ATN); spelling error detection and correction; statistical models (counting words, bigrams, entropy); Markov chains; hidden Markov chains; information retrieval.

The final mark is based on both a final written exam and the results of the practical exercises.


    • Eugene Charniak: Statistical Language Learning. The MIT Press, Cambridge (MA).
    • Ian H. Witten, David Bainbridge: How to Build a Digital Library. Morgan Kaufmann, 2003.
    • Maxime Crochemore, Christophe Hancart, Thierry Lecroq: Algorithms on Strings. Cambridge University Press (UK), 2007.

