Språkstatistik HT96

This is an overview of the course Statistical Approaches to Natural Language Processing (in Swedish: Språkstatistik), which was taught in the autumn term of 1996 at the Department of Linguistics at Uppsala University.

You can take a look at summaries of the results of the midcourse evaluation and the final evaluation.


       date time  room  subject
 1. Mo 0902 14-16 B125  Lecture: ch. 3, 4
 2. Tu 0903 12-14 B125  Lecture: ch. 5, 6
 3. We 0904 14-16 B147  Exercise class 1
 4. Th 0905 10-12 B125  Lecture: ch. 8, 9
 5. Mo 0907 14-16 H413  Lecture: ch. 10
 6. We 0909 14-16 B125  Exercise class 2
 7. Th 0910 14-16 B125  Lecture: ch. 13, 14
 8. Mo 0916 14-16 B125  Lecture: ch. 15
 9. We 0918 14-16 B125  Exercise class 3
10. Th 0919 12-14 B125  Lecture: ch. 16, 17
11. Mo 0923 14-16 A138  Lecture: ch. 18
12. We 0925 14-16 F318  Exercise class 4
13. Th 0926 14-16 B125  Lecture: Simple Corpus Processing 1
14. Mo 0930 14-16 B125  Lecture: Simple Corpus Processing 2
15. Tu 1001 09-13 H327  Practical exercise session 1
16. Mo 1007 14-16 B125  Lecture: Part of Speech Tagging 1
17. Mo 1007 16-17 A138  Exercise wrap-up class
18. We 1009 14-16 B125  Lecture: Part of Speech Tagging 2
19. Th 1010 13-17 H327  Practical exercise session 2
20. Mo 1014 14-16 B125  Lecture: Clustering
21. We 1016 14-16 B135  Lecture: Statistical grammars
22. Th 1017 13-17 H327  Practical exercise session 3
23. Mo 1021 14-16 B125  Lecture: Aligning Parallel Texts 1
24. We 1023 14-16 B125  Lecture: Aligning Parallel Texts 2
25. Th 1024 10-16 H327  Practical exercise session 4
    Tu 1029!09-13 PS2   Test
26. Mo 1111 14-16 A138  Evaluation session

The course will consist of two parts. The first part contains eight lectures about general statistics and four exercise sessions. The book used for this part is Freedman et al. 91. The second part consists of eight lectures about statistics applied to natural language processing and four practical exercise sessions. No session is obligatory. However, students are advised to attend all eight exercise sessions, because the assignments they will receive during those sessions will determine their final grade.

Students will receive a grade between 0 and 10 for each of four homework assignments, three practical assignments and the final written test. To pass the course, a student must get an average grade of 6.0 or higher for the homework assignments, an average grade of 6.0 or higher for the practical assignments and a grade of 6.0 or higher for the final test. To receive the high pass grade for the course, the average of these three grades must be 8.0 or higher and none of the three grades may be lower than 7.0.
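The grading rule can be checked mechanically. The sketch below is only an illustration of the rule as stated above, not an official grading tool. It assumes that the "total average" is the unweighted mean of the three component grades (homework average, practical average, test grade) and that no rounding is applied; the function name `course_result` and the constants are hypothetical.

```python
from statistics import mean

# Thresholds taken from the grading rule above.
PASS_MIN = 6.0        # minimum for each of the three component grades
HIGH_PASS_AVG = 8.0   # minimum average of the three grades for a high pass
HIGH_PASS_MIN = 7.0   # minimum for each component grade for a high pass


def course_result(homework, practical, test):
    """Return 'fail', 'pass' or 'high pass'.

    homework and practical are lists of grades between 0 and 10,
    test is the grade for the final written test.
    Assumption: the overall average is the unweighted mean of the
    three component grades, without rounding.
    """
    components = [mean(homework), mean(practical), test]
    if min(components) < PASS_MIN:
        return "fail"
    if mean(components) >= HIGH_PASS_AVG and min(components) >= HIGH_PASS_MIN:
        return "high pass"
    return "pass"


if __name__ == "__main__":
    print(course_result([7, 8, 6, 7], [6, 7, 8], 6.5))
    print(course_result([9, 9, 8, 8], [8, 9, 9], 7.5))
    print(course_result([9, 9, 9, 9], [9, 9, 9], 5.5))
```

Note that a student with perfect assignment grades but a test grade of 6.5 would pass without the high pass grade, because one of the three grades is below 7.0.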

Because of the number of students, the practical sessions will be split into two groups. Students may choose whether they want to be in the first or the second group, provided that the group is not already full. Students are allowed to work together on the homework exercises and the practical exercises. However, everyone must hand in his/her own answers or reports.


The material that will be dealt with in this course is presented in the following list.

Main Literature

The first part of the course will use the book Freedman et al. 91. Students of the course are advised to try to obtain this book. The second half of the course will be based on various papers, of which I have listed the most important here. The papers are often difficult to obtain and therefore reading them is not obligatory. Students who want to read an unavailable paper from this list can contact me.

Additional Literature

The literature in this list will not be used in the course. It has been listed here because it might contain interesting additional material for the students.

