This is an overview of the course in Statistical Approaches to Natural Language Processing (in Swedish: Språkstatistik) that was taught in the autumn term of 1996 at the Department of Linguistics at the University of Uppsala.
You can take a look at summaries of the results of the midcourse evaluation and the final evaluation.
     date      time   room  subject
 1.  Mon 0902  14-16  B125  Lecture: ch. 3, 4
 2.  Tue 0903  12-14  B125  Lecture: ch. 5, 6
 3.  Wed 0904  14-16  B147  Exercise class 1
 4.  Thu 0905  10-12  B125  Lecture: ch. 8, 9
 5.  Mon 0907  14-16  H413  Lecture: ch. 10
 6.  Wed 0909  14-16  B125  Exercise class 2
 7.  Thu 0910  14-16  B125  Lecture: ch. 13, 14
 8.  Mon 0916  14-16  B125  Lecture: ch. 15
 9.  Wed 0918  14-16  B125  Exercise class 3
10.  Thu 0919  12-14  B125  Lecture: ch. 16, 17
11.  Mon 0923  14-16  A138  Lecture: ch. 18
12.  Wed 0925  14-16  F318  Exercise class 4
13.  Thu 0926  14-16  B125  Lecture: Simple Corpus Processing 1
14.  Mon 0930  14-16  B125  Lecture: Simple Corpus Processing 2
15.  Tue 1001  09-13  H327  Practical exercise session 1
16.  Mon 1007  14-16  B125  Lecture: Part of Speech Tagging 1
17.  Mon 1007  16-17  A138  Exercise wrap-up class
18.  Wed 1009  14-16  B125  Lecture: Part of Speech Tagging 2
19.  Thu 1010  13-17  H327  Practical exercise session 2
20.  Mon 1014  14-16  B125  Lecture: Clustering
21.  Wed 1016  14-16  B135  Lecture: Statistical grammars
22.  Thu 1017  13-17  H327  Practical exercise session 3
23.  Mon 1021  14-16  B125  Lecture: Aligning Parallel Texts 1
24.  Wed 1023  14-16  B125  Lecture: Aligning Parallel Texts 2
25.  Thu 1024  10-16  H327  Practical exercise session 4
     Tue 1029  09-13  PS2   Test
26.  Mon 1111  14-16  A138  Evaluation session
The course will consist of two parts. The first part contains eight lectures about general statistics and four exercise sessions. The book that will be used here is Freedman et al. 91. The second part consists of eight lectures about statistics applied to natural language processing and four practical exercise sessions. No session is obligatory. However, students are advised to attend the eight exercise sessions, because the assignments handed out during those sessions will determine their final grade.
Students will receive a grade between 0 and 10 for each of four homework assignments, three practical assignments and the final written test. To pass the course, a student must obtain an average grade of 6.0 or higher for the homework assignments, an average grade of 6.0 or higher for the practical assignments and a grade of 6.0 or higher for the final test. To receive the high pass grade for the course, a student must obtain an overall average of 8.0 or higher across these three grades, with none of the three grades lower than 7.0.
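The grading rule above can be written out as a short Python sketch. This is only an illustration, not part of the course material: the function name `course_result` and the labels "fail", "pass" and "high pass" are made up for this example, and it assumes that the total average is the unweighted mean of the three component grades (homework average, practical average, test grade), which the text does not state explicitly.

```python
from statistics import mean


def course_result(homework, practical, test):
    """Apply the grading rule to lists of homework and practical grades
    and a single test grade (all on the 0-10 scale).

    Hypothetical helper: name and return labels are assumptions.
    """
    hw_avg = mean(homework)    # average of the homework assignments
    pr_avg = mean(practical)   # average of the practical assignments
    parts = (hw_avg, pr_avg, test)

    # Each of the three component grades must reach 6.0 to pass.
    if min(parts) < 6.0:
        return "fail"

    # High pass: overall average of at least 8.0 (assumed unweighted)
    # and no component grade below 7.0.
    if mean(parts) >= 8.0 and min(parts) >= 7.0:
        return "high pass"

    return "pass"
```

For example, homework grades of 9, 9, 9, 9, practical grades of 8, 8, 8 and a test grade of 7 give component grades 9, 8 and 7, so the call `course_result([9, 9, 9, 9], [8, 8, 8], 7)` yields a high pass under these assumptions, while a test grade of 6.5 with otherwise perfect scores yields only a pass.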
Because of the number of students, the practical sessions will be split into two groups. Students may choose whether to join the first or the second group, provided that the group is not already full. Students are allowed to work together on the homework exercises and the practical exercises. However, everyone should hand in his or her own answers or reports.
The material that will be dealt with in this course is presented in the following list.
The first part of the course will use the book Freedman et al. 91. Students of the course are advised to try to obtain this book. The second half of the course will be based on various papers, of which I have listed the most important here. The papers are often difficult to obtain, and therefore reading them is not obligatory. Students who want to read an unavailable paper from this list can contact me.
The literature in this list will not be used in the course. It has been listed here because it might contain interesting additional material for the students.