Språkstatistik

This is an overview of a course in Statistical Approaches to Natural Language Processing (in Swedish: Språkstatistik) that has been taught in the autumn term at the Department of Linguistics at Uppsala University. Summaries of the results of the midcourse evaluation and the final evaluation are available.

Schedule

    date time  room  subject
 1. 0905 10-12 K224  Introduction / Probability theory 1
 2. 0907 12-14 B125  Probability theory 2
 3. 0912 10-12 K412  Exercises 1
 4. 0914 12-14 B125  Combinatorics, stochastic variables
 5. 0919 10-12 K224  Functions of stochastic variables
 6. 0921 12-14 B139  Exercises 2   
 7. 0926 14-16 K412  Binomial and normal distribution
 8. 0928 12-14 B139  Statistical tests, confidence intervals
 9. 1002 12-14 B125  Exercises 3
10. 1003 14-16 B125  Noisy channel model and applications
11. 1005 12-14 H327  Practical exercise 1
12. 1009 12-14 H339  Clustering
13. 1010 14-16 H327  Practical exercise 2
14. 1012 10-12 A122  Statistical grammars
15. 1017 10-12 H327  Practical exercise 3
16. 1019 12-14 K224  Data-oriented parsing / Entropy
17. 1023 12-14 H327  Practical exercise session
    1030 09-13 Polb  Test
18. 1106 14-16 H327  Test evaluation
19. 1117 14-16 H327  Practical exercise session

There are no obligatory sessions. However, students are advised to attend the first nine classes and the practical exercises, because the homework assignments handed out during those sessions partly determine their grade.

Students will receive a grade between 0 and 10 for each of the nine homework assignments, for each of the three practical assignments and for the test. To pass the course, a student must obtain an average grade of 6.0 or higher for the homework assignments, an average grade of 6.0 or higher for the practical assignments and a grade of 6.0 or higher for the final test. To receive the high pass grade, the student must pass the course and the average of these three grades must be 8.0 or higher.
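The grading rule can be written out as a short calculation. The following Python sketch is only an illustration and not an official grading tool. It assumes that the "three grades" are the homework average, the practical average and the test grade, that all averages are unweighted, and that every assignment has a grade. The function name course_result and the labels "fail", "pass" and "high pass" are hypothetical.

```python
from statistics import mean

PASS_THRESHOLD = 6.0       # minimum for each of the three component grades
HIGH_PASS_THRESHOLD = 8.0  # minimum overall average for a high pass


def course_result(homework, practical, test):
    """Return "fail", "pass" or "high pass" (hypothetical labels).

    homework  -- the nine homework grades, each between 0 and 10
    practical -- the three practical assignment grades, each between 0 and 10
    test      -- the grade for the final test, between 0 and 10
    """
    homework_avg = mean(homework)    # assumption: unweighted average
    practical_avg = mean(practical)  # assumption: unweighted average

    # All three component grades must reach the pass threshold.
    components = [homework_avg, practical_avg, test]
    if any(grade < PASS_THRESHOLD for grade in components):
        return "fail"

    # Assumption: the high pass average is taken over the three components.
    if mean(components) >= HIGH_PASS_THRESHOLD:
        return "high pass"
    return "pass"
```

Under these assumptions a student with a homework average of 9, a practical average of 8 and a test grade of 7 has an overall average of 8.0 and would receive a high pass, while a test grade below 6.0 leads to a fail regardless of the other grades.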

A practice test with answers is available.

Literature

The material covered in this course is presented in the following literature. Students are recommended to purchase either Krenn & Samuelsson 95 (available on the web) or Freedman et al. 91. The book by Gunnar Blom was recommended by Fredrik Olsson as a good introductory book for students who prefer reading Swedish.

Note that the 1995 edition of this course does not deal with the topic of Hidden Markov Models, because the 1995 students had already covered it in the previous year's Corpus Linguistics course.


Last update: March 18, 1998. erikt@stp.ling.uu.se