This is an overview of a course in Statistical approaches to Natural Language Processing (in Swedish: Språkstatistik) that has been taught in the autumn term at the Department of Linguistics at the University of Uppsala. You can take a look at summaries of the results of the midcourse evaluation and the final evaluation.
      date  time   room  subject
  1.  0905  10-12  K224  Introduction / Probability theory 1
  2.  0907  12-14  B125  Probability theory 2
  3.  0912  10-12  K412  Exercises 1
  4.  0914  12-14  B125  Combinatorics, stochastic variables
  5.  0919  10-12  K224  Functions of stochastic variables
  6.  0921  12-14  B139  Exercises 2
  7.  0926  14-16  K412  Binomial and normal distribution
  8.  0928  12-14  B139  Statistical tests, confidence intervals
  9.  1002  12-14  B125  Exercises 3
 10.  1003  14-16  B125  Noisy channel model and applications
 11.  1005  12-14  H327  Practical exercise 1
 12.  1009  12-14  H339  Clustering
 13.  1010  14-16  H327  Practical exercise 2
 14.  1012  10-12  A122  Statistical grammars
 15.  1017  10-12  H327  Practical exercise 3
 16.  1019  12-14  K224  Data-oriented parsing / Entropy
 17.  1023  12-14  H327  Practical exercise session
      1030  09-13  Polb  Test
 18.  1106  14-16  H327  Test evaluation
 19.  1117  14-16  H327  Practical exercise session
There are no obligatory sessions. However, students are advised to attend the first nine classes and the practical exercises, because the homework assignments handed out during those sessions partly determine their grade.
Students will receive a grade between 0 and 10 for the nine homework assignments, for the three practical assignments and for the test. To pass the course, a student must obtain an average grade of 6.0 or higher for the homework assignments, an average grade of 6.0 or higher for the practical assignments and a grade of 6.0 or higher for the final test. To receive the high pass grade for the course, the student must pass the course and his or her average over the three grades must be 8.0 or higher.
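The grading rule can be written out as a small calculation. The following Python sketch is only an illustration and not part of the course material. It assumes that the "average for the three grades" is the unweighted mean of the homework average, the practical average and the test grade, and that every assignment has been graded. The function name course_result and the result labels are hypothetical.

```python
from statistics import mean

PASS_MARK = 6.0       # minimum for each of the three components
HIGH_PASS_MARK = 8.0  # minimum overall average for a high pass


def course_result(homework, practical, test):
    """Return "fail", "pass" or "high pass" for one student.

    homework  -- grades (0-10) for the nine homework assignments
    practical -- grades (0-10) for the three practical assignments
    test      -- grade (0-10) for the final test
    """
    hw_avg = mean(homework)
    pr_avg = mean(practical)

    # Each component has to reach the pass mark on its own.
    if hw_avg < PASS_MARK or pr_avg < PASS_MARK or test < PASS_MARK:
        return "fail"

    # Assumption: the three component grades are weighted equally.
    overall = mean([hw_avg, pr_avg, test])
    return "high pass" if overall >= HIGH_PASS_MARK else "pass"


if __name__ == "__main__":
    print(course_result([7] * 9, [6, 6, 6], 6.5))
    print(course_result([9] * 9, [8, 8, 8], 7.0))
    print(course_result([9] * 9, [9, 9, 9], 5.5))
```

Under these assumptions a student with strong homework and practical grades still fails if the test grade is below 6.0, since each component is checked separately before the overall average is considered.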
A practice test with answers is available.
The material covered in this course is presented in the following literature. Students are recommended to purchase either Krenn & Samuelsson 95 (available on the web) or Freedman et al. 91. The book by Gunnar Blom was recommended by Fredrik Olsson as a good introductory book for students who prefer reading in Swedish.
Note that the 1995 edition of this course does not deal with the topic of Hidden Markov Models because the 1995 students already dealt with this in the previous year's Corpus Linguistics course.