These are the exercises for the second lab session of the course Dokumenthantering VT98. There are 8 exercises this week and 2 of them are obligatory. The obligatory exercises have been marked with a *.
Write a report about the obligatory exercises. The report should fulfill the same requirements as the report for the first lab. The deadline for handing in the report for this week's exercises is Wednesday February 17, 1998.
printf "octal: %o\n" 224
printf "hexadecimal: %x\n" 224
In the format string (the first argument), %d prints a decimal number, %o prints an octal number and %x (or %X) prints a hexadecimal number. The number after the format string (in this case 224) is interpreted as a decimal number. Use this printf command to verify the numbers shown in the table in section 3.1. The same can be done with the following Perl script:
#!/usr/local/bin/perl -w
# Print each listed number in decimal, octal and hexadecimal.
for (4,10) {
    printf "decimal %3d; octal %3o; hexadecimal %3x\n", $_, $_, $_;
}
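The check can also be run in the other direction. The following is a minimal sketch, assuming a POSIX shell whose printf accepts C-style integer constants as arguments (a leading 0 for octal, a leading 0x for hexadecimal). The helper name to_decimal is made up for this illustration.

```shell
# Hypothetical helper: print a C-style integer constant as a decimal number.
# A leading 0 marks an octal constant, a leading 0x a hexadecimal one.
to_decimal() {
    printf '%d\n' "$1"
}

to_decimal 0340   # octal 340
to_decimal 0xe0   # hexadecimal e0
```

Both calls should print 224, the number used in the printf commands above.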
#!/usr/local/bin/perl -w
printf "%c%c%c%ca%cb%c\n", 7, 10, 13, 127, 8;
Each %c prints the character whose decimal ASCII value is given after the format string. The result depends on the configuration of the terminal you are using: the a or the b may be deleted by the control character that follows it, and you may hear a bell. Try running the command and piping its output to the more command. This should suppress the bell and show ^G instead.
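Because the visible result depends on the terminal, it can help to look at the bytes themselves. The sketch below assumes a POSIX shell with od, tr and sed. The helper name show_bytes is made up for this illustration, and the byte sequence is a shortened example rather than the exact output of the Perl script.

```shell
# Hypothetical helper: show the bytes read from stdin as hexadecimal values,
# separated by single spaces.
show_bytes() {
    od -An -tx1 | tr -s ' \n' '  ' | sed -e 's/^ //' -e 's/ $//'
}

# BEL (7), a, BS (8), b, DEL (127) and a newline, written as octal escapes.
printf '\007a\010b\177\n' | show_bytes
# prints: 07 61 08 62 7f 0a
```

Unlike the terminal, od shows every control character, so nothing is hidden by a backspace or delete.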
You can get an overview of the characters that are being used in a window by typing asciiTable.
Each window can display only one character set. You can start a window with a different character set (font) by starting the window program on the command line with arguments -fn FONT where FONT is an X windows font. You can use the command xlsfonts to get an overview of the available fonts.
Example:
xterm -fn -urw-courier-medium-r-normal--13-100-100-100-m-80-iso8859-5
starts an xterm window with an ISO 8859-5 font of size 13.
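The list printed by xlsfonts is usually long. A small filter can narrow it down to one character set. This is a sketch under the assumption that the font names follow the pattern of the example above and end in iso8859-X. The helper name fonts_for_charset is made up for this illustration.

```shell
# Hypothetical helper: keep only the font names on stdin that end in
# -iso8859-N, where N is the ISO 8859 part number given as first argument.
fonts_for_charset() {
    grep -i -e "-iso8859-$1\$"
}

# Usage in an X session (not run here because it needs an X display):
#   xlsfonts | fonts_for_charset 5
```

Any name printed by the filter can then be passed to xterm after -fn.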
You can also work with the different character sets by starting one of the programs aixterm1, aixterm2, and so on, or emacs1, emacs2, and so on. These programs start either aixterm or emacs with the font ISO 8859-X, in which X is the digit in the program name. Again you can get an overview of the characters that are being used by typing asciiTable in one of the aixterm windows.
Use these overviews and the web page mentioned in the references to choose the best ISO 8859 character set for displaying an aligned text in Swedish and Slovenian. The character set should include as many as possible of the characters with diacritics of both languages. For Swedish the lower case characters with diacritics are å, ä, ö and é. For Slovenian these are:
č, š and ž.
The available checking programs are sgml-ncheck (for the TEI Lite DTD) and html-ncheck (for several HTML DTDs).
Use html-ncheck for checking if the HTML file in section 4.2 uses correct HTML. Don't forget to insert the following extra SGML header line as a first line in this file before you check it (see section 4.3):
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
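Inserting the header line by hand works, but it can also be scripted. The following is a minimal sketch, assuming a POSIX shell. The helper name with_doctype and the file names in the usage note are made up for this illustration.

```shell
# Hypothetical helper: print the file named by the first argument to stdout,
# preceded by the SGML header line for HTML.
with_doctype() {
    printf '%s\n' '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">'
    cat -- "$1"
}

# Usage (file names are placeholders):
#   with_doctype page.html > page-with-header.html
#   html-ncheck page-with-header.html
```

Writing to a new file keeps the original unchanged, which is convenient when the first line has to be replaced again later.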
Now replace the first line of the file by:
<!DOCTYPE html [ ]>
and add DTD definitions between the square brackets until html-ncheck accepts the file. You need to make your own DTD for the file. This is an alternative way of specifying DTDs: in the documents themselves. If you want to see how html-ncheck has analyzed the file, add the -o option between the command and the filename.
The program is nothing more than a script which calls James Clark's nsgmls program. There is a manual for the latter if you want more information about it.
The HTML DTD can be found in /usr/local/lib/html-check/lib/html.dtd (look at the source if you do not see anything readable).
Write a program that simulates htmlize (see the manual page for information).
Your program only has to simulate conversion of four characters: å (&aring;), ä (&auml;), ö (&ouml;) and é (&eacute;). It has to be able to convert these characters from ISO Latin 1 to SGML entities and back (-r option) for an arbitrary number of files.
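One possible shape for such a program is sketched below as a shell function built on sed. It is not the htmlize program itself. The function name latin1_entities is made up, and writing the result to standard output (instead of changing the files) is an assumption. The four ISO Latin 1 bytes are written as octal escapes (345, 344, 366, 351) so that the script itself stays plain ASCII.

```shell
# Hypothetical sketch: convert the four characters between ISO Latin 1 bytes
# and SGML entities. Reads the named files (or stdin when none are given)
# and writes the result to stdout. With -r the conversion is reversed.
latin1_entities() {
    aring=$(printf '\345'); auml=$(printf '\344')
    ouml=$(printf '\366'); eacute=$(printf '\351')
    if [ "${1:-}" = "-r" ]; then
        shift
        # Entities back to ISO Latin 1 bytes.
        cat -- "$@" | LC_ALL=C sed \
            -e "s/&aring;/$aring/g" -e "s/&auml;/$auml/g" \
            -e "s/&ouml;/$ouml/g" -e "s/&eacute;/$eacute/g"
    else
        # ISO Latin 1 bytes to entities; \& is a literal ampersand in sed.
        cat -- "$@" | LC_ALL=C sed \
            -e "s/$aring/\\&aring;/g" -e "s/$auml/\\&auml;/g" \
            -e "s/$ouml/\\&ouml;/g" -e "s/$eacute/\\&eacute;/g"
    fi
}

# Usage (file names are placeholders):
#   latin1_entities a.txt b.txt > converted.txt
#   latin1_entities -r converted.txt > restored.txt
```

Setting LC_ALL=C makes sed treat the input as single bytes, which matches ISO Latin 1 and avoids trouble in a multibyte locale.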