
Dokumenthantering VT97:08

These are the exercises and references for the eighth class of the course Dokumenthantering.


Exercises

The results of the exercises marked with * must be handed in. The exercises marked with ? are elective: you only have to hand in the results of one of them.

  1. * Convert the ispell word list for Swedish to a lexicon in which each entry contains 30 characters for the word, 4 frequency bytes and 4 inverted file bytes. The word list can be found in the file:

    /home/staff/erikt/P/st97/lrtlab/words.swedish

    The frequency bytes and the inverted file bytes may be filled with anything you want. Now encode this lexicon using front coding with word groups of size four, but keep the eight extra bytes in each lexicon entry. Compare the size of the resulting lexicon with the size of the intermediate lexicon, that is, the one with the extra bytes but without front coding.

    Tip: perform your tests with a fraction of the word list.
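
The two lexicon layouts in exercise 1 could be sketched in Python as follows. This is only one possible reading of the exercise, not a reference solution, and it rests on several assumptions that are not stated in the exercise text: the words are sorted, they are encoded as Latin-1, words longer than 30 bytes are truncated, the fixed-width word field is padded with zero bytes, the eight extra bytes are zero-filled, the first word of each group is stored as a length byte followed by the full word, and the other three words are stored as a prefix-length byte, a suffix-length byte and the suffix. All function names are hypothetical.

```python
WORD_WIDTH = 30   # fixed word field width from the exercise
EXTRA = 8         # 4 frequency bytes + 4 inverted file bytes
GROUP = 4         # words per front-coding group


def fixed_lexicon(words):
    """Intermediate lexicon: 30 + 8 = 38 bytes per entry."""
    out = bytearray()
    for w in words:
        raw = w.encode("latin-1")[:WORD_WIDTH]  # assumption: truncate long words
        out += raw.ljust(WORD_WIDTH, b"\0") + bytes(EXTRA)
    return bytes(out)


def common_prefix_len(a, b):
    """Number of leading bytes shared by a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def front_coded_lexicon(words):
    """Front-coded lexicon; every entry keeps its 8 extra bytes."""
    out = bytearray()
    prev = b""
    for i, w in enumerate(words):
        raw = w.encode("latin-1")[:WORD_WIDTH]
        if i % GROUP == 0:
            # first word of a group: length byte + full word
            out += bytes([len(raw)]) + raw
        else:
            # other words: prefix length, suffix length, suffix
            p = common_prefix_len(prev, raw)
            out += bytes([p, len(raw) - p]) + raw[p:]
        out += bytes(EXTRA)
        prev = raw
    return bytes(out)


def decode_words(data):
    """Recover the word list from a front-coded lexicon (round-trip check)."""
    words, pos, i, prev = [], 0, 0, b""
    while pos < len(data):
        if i % GROUP == 0:
            n = data[pos]
            raw = data[pos + 1:pos + 1 + n]
            pos += 1 + n
        else:
            p, s = data[pos], data[pos + 1]
            raw = prev[:p] + data[pos + 2:pos + 2 + s]
            pos += 2 + s
        pos += EXTRA  # skip the frequency and inverted file bytes
        words.append(raw.decode("latin-1"))
        prev = raw
        i += 1
    return words


sample = ["bil", "bilar", "bilen", "bok", "boken"]  # made-up sorted sample
print(len(fixed_lexicon(sample)), len(front_coded_lexicon(sample)))
```

Comparing `len(fixed_lexicon(words))` with `len(front_coded_lexicon(words))` on a slice of the word list gives the size comparison the exercise asks for; the decoder is included only to check that the encoding loses nothing.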


References


Last update: May 21, 1997. erikt@stp.ling.uu.se