The programs corresponding to these exercises can be found in the appendix. Each exercise adds some extra functionality to the program texttool.
count x: counts the lines in text x and shows the result.
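The number of lines in a text can be obtained by splitting the text into lines and taking the number of elements in the result. Because split drops trailing empty lines, the program in the appendix counts newline characters instead. Example run: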
    > read test
    abc
    def
    > count test
    2
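The two counting strategies give different answers when a text ends in empty lines. The following is a minimal sketch, assuming the text is held in a plain scalar; the helper names count_by_split and count_by_newlines are hypothetical and are not part of texttool.

```perl
use strict;
use warnings;

# Count lines by splitting on newlines. split drops trailing empty
# fields, so empty lines at the end of the text are not counted.
sub count_by_split {
    my ($text) = @_;
    my @lines = split(/\n/, $text);
    return scalar(@lines);
}

# Count lines by counting newline characters, so every line that is
# terminated by "\n" is counted, whether it is empty or not.
sub count_by_newlines {
    my ($text) = @_;
    my $count = () = $text =~ /\n/g;
    return $count;
}

print count_by_split("abc\ndef\n"), "\n";      # 2
print count_by_newlines("abc\ndef\n"), "\n";   # 2
print count_by_split("\n\n\n"), "\n";          # 0
print count_by_newlines("\n\n\n"), "\n";       # 3
```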
grep string x y: selects all lines containing the string string from text x and puts these in text y. The string may not contain white space.
This function looks like read but it reads lines from another text and only adds a line if it contains the specified string. Example run:
    > read test
    abc
    def
    > grep e test result
    > print result
    def
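The line selection can also be sketched as a standalone function built on Perl's own grep operator. This is an illustration rather than the appendix code: the name select_lines is hypothetical, and the \Q...\E quoting, which makes the search string match literally instead of as a pattern, is an added assumption.

```perl
use strict;
use warnings;

# Return the lines of $text that contain $string, each ending in "\n".
# \Q...\E quotes regex metacharacters so that $string is matched
# literally (assumption: the appendix code uses it as a pattern).
sub select_lines {
    my ($string, $text) = @_;
    my @lines    = split(/\n/, $text);
    my @selected = grep { /\Q$string\E/ } @lines;
    return join("", map { "$_\n" } @selected);
}

print select_lines("e", "abc\ndef\n");   # prints the line "def"
```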
cat x y: appends text x to text y. The result is stored in text y.
We have allowed the second text to be undefined. In that case cat x y means copy x to y. Example run:
    > read first
    abc
    def
    > read last
    ghi
    jkl
    > cat first last
    > print last
    ghi
    jkl
    abc
    def
chars x y: divides text x into characters and puts these in text y with each character on a separate line. The white space characters in x should be replaced by hash symbols (#).
We store the first text in a temporary variable, replace white space by hashes, split it into a list of characters and join the list with newline characters as separators. After adding a final newline, we obtain the required result. Example run:
    > read test
    ab c
    > chars test result
    > print result
    a
    b
    #
    c
    #
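The four steps described above (copy, replace, split, join) can be tried in isolation. A minimal sketch, assuming the text is passed in as a scalar; the function name to_chars is hypothetical.

```perl
use strict;
use warnings;

# Put every character of $text on its own line, with white space
# (including the newlines in $text) shown as "#".
sub to_chars {
    my ($text) = @_;
    my $tmp = $text;                    # work on a copy
    $tmp =~ s/\s/#/g;                   # white space becomes "#"
    my @chars = split(//, $tmp);        # one element per character
    return join("\n", @chars) . "\n";   # add the final newline
}

print to_chars("ab c\n");   # prints a, b, #, c, # on separate lines
```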
replace string1 string2 x y: replaces all occurrences of string string1 by string string2 in text x and puts the result in text y. The strings may not contain white space.
We copy the first text to the second and replace the strings. Example run:
    > read test
    abc
    dbf
    > replace b e test result
    > print result
    aec
    def
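Copying first and substituting afterwards keeps the source text unchanged. The sketch below shows this as a function; the name replace_all is hypothetical, and the \Q...\E quoting of the search string is an added assumption so that characters such as "." are taken literally.

```perl
use strict;
use warnings;

# Return a copy of $text in which every occurrence of $from has been
# replaced by $to. The original text is left untouched.
sub replace_all {
    my ($from, $to, $text) = @_;
    my $result = $text;               # copy the first text
    $result =~ s/\Q$from\E/$to/g;     # replace in the copy only
    return $result;
}

my $source = "abc\ndbf\n";
print replace_all("b", "e", $source);   # prints "aec" and "def"
print $source;                          # still "abc" and "dbf"
```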
delete x: deletes text x.
The Perl function undef can be used for removing a variable. Example run:
    > read test
    abc
    > delete test
    > print test
    text variable does not exist
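For a hash element, undef and delete behave slightly differently: undef clears the value but keeps the key, while delete removes the key as well. Both work for a program that tests texts with defined. A small sketch with a hypothetical %text hash:

```perl
use strict;
use warnings;

my %text = (test => "abc\n", other => "def\n");

undef($text{test});      # the value becomes undef, the key remains
delete($text{other});    # the key and its value are removed

print defined($text{test})  ? "defined" : "undefined", "\n";   # undefined
print exists($text{test})   ? "exists"  : "missing",   "\n";   # exists
print defined($text{other}) ? "defined" : "undefined", "\n";   # undefined
print exists($text{other})  ? "exists"  : "missing",   "\n";   # missing
```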
paste x y z: puts the lines of text y next to those of text x and places the result in text z. The lines should be separated by a single space.
The two source texts are converted to lists of lines and the lines are added to a text until the lists are empty. In case one of the texts is shorter than the other, the final lines of the new text will only contain elements of the longer source text. Example run:
    > read test1
    abc
    def
    > read test2
    ghi
    jkl
    > paste test1 test2 result
    > print result
    abc ghi
    def jkl
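The description above (take lines from both lists until both are empty) can be sketched with shift. This is an illustration under the assumption that both texts are plain scalars; the appendix code uses an index instead, and the name paste_texts is hypothetical.

```perl
use strict;
use warnings;

# Put the lines of $right next to those of $left, separated by one
# space. When one text runs out, its side of the line stays empty.
sub paste_texts {
    my ($left, $right) = @_;
    my @lines0 = split(/\n/, $left);
    my @lines1 = split(/\n/, $right);
    my $result = "";
    while (@lines0 or @lines1) {
        my $l = @lines0 ? shift(@lines0) : "";
        my $r = @lines1 ? shift(@lines1) : "";
        $result .= "$l $r\n";
    }
    return $result;
}

print paste_texts("abc\ndef\n", "ghi\njkl\n");   # "abc ghi" and "def jkl"
```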
tail number x y: copies the lines of text x to text y starting from the line specified with number.
This is a variant of grep, but the restriction on which lines to include is based on their position in the text. The error-checking code required extra attention because of the presence of a number argument. We have to make sure that the number argument is a positive integer and that it is not larger than the number of lines in the text. Example run:
    > read test
    abc
    def
    ghi
    > tail 2 test result
    > print result
    def
    ghi
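The two checks on the number argument and the selection itself can be combined in one function. A minimal sketch: the name tail_from is hypothetical, and returning undef for an invalid number is a simplification for this example, whereas texttool reports an error message instead.

```perl
use strict;
use warnings;

# Return the lines of $text from line $number onwards (1-based), or
# undef when $number is not a positive integer or exceeds the number
# of lines.
sub tail_from {
    my ($number, $text) = @_;
    return undef unless $number =~ /^[0-9]+$/ and $number >= 1;
    my @lines = split(/\n/, $text);
    return undef if $number > @lines;
    return join("", map { "$_\n" } @lines[$number - 1 .. $#lines]);
}

print tail_from(2, "abc\ndef\nghi\n");   # prints "def" and "ghi"
```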
tokenize x y: divides text x into tokens and puts the result in text y, one token per line.
We have used the tokenize code provided by the teacher with the solution of exercise 3.5*. Example run:
    > read test
    Oh no! Mr. John's parrot died?
    > tokenize test result
    > print result
    Oh
    no
    !
    Mr.
    John
    's
    parrot
    died
    ?
uniq x: counts how often lines occur in text x and prints the result, with the most frequent lines first.
We need three loops. The first counts the number of occurrences of each line. The second prints each different line with the number of times that it has occurred. The third is embedded in the second. It selects the most frequent line from the lines that have not been printed yet. Example run:
    > read test
    abc
    def
    abc
    > chars test result
    > uniq result
    3 #
    2 a
    2 b
    2 c
    1 d
    1 e
    1 f
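As an alternative to the second and third loop, the counted lines can be ordered with Perl's sort. This is a different technique from the three-loop solution described above, shown only for comparison. The name frequency_report is hypothetical, and breaking ties alphabetically is an added assumption.

```perl
use strict;
use warnings;

# Count how often each line occurs and return a report with the most
# frequent lines first. Ties are ordered alphabetically (assumption).
sub frequency_report {
    my ($text) = @_;
    my %freq;
    $freq{$_}++ for split(/\n/, $text);
    my @ordered = sort { $freq{$b} <=> $freq{$a} or $a cmp $b } keys %freq;
    return join("", map { "$freq{$_} $_\n" } @ordered);
}

print frequency_report("a\nb\nc\n#\nd\ne\nf\n#\na\nb\nc\n#\n");
```

For the character list from the example run, this prints the same seven lines, starting with "3 #".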
The commands in texttool require at least one text argument. Modify the program in such a way that when commands are entered without this text argument, a default text is processed.
We have allowed omitting any text argument in any command. Before the error checking code, we have inserted code that checks whether a text variable has been omitted. All missing text variables are replaced by a default text name, for which we have chosen the empty name since this name will never occur. The remainder of the program has been left unchanged. Example run:
    > print
    text variable does not exist
    > read
    abcd
    > paste
    > chars
    > count
    10
    > print
    a
    b
    c
    d
    #
    a
    b
    c
    d
    #
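The padding of missing text arguments can also be written as a table lookup. The sketch below is an illustration, not the appendix code: the table %arity and the function fill_defaults are hypothetical names. For each command the table stores the number of leading non-text arguments (strings or a number) and the total number of arguments, taken from the command descriptions above.

```perl
use strict;
use warnings;

# command => [number of leading non-text arguments, total arguments]
my %arity = (
    read    => [0, 1], print => [0, 1], count    => [0, 1],
    delete  => [0, 1], uniq  => [0, 1],
    cat     => [0, 2], chars => [0, 2], tokenize => [0, 2],
    paste   => [0, 3],
    grep    => [1, 3], tail  => [1, 3],
    replace => [2, 4],
);

# Append the default text name "" for every missing text argument.
# Missing non-text arguments are not filled in.
sub fill_defaults {
    my ($command, @args) = @_;
    my $entry = $arity{$command};
    return @args unless defined $entry;
    my ($leading, $total) = @$entry;
    if (@args >= $leading) {
        push(@args, "") while @args < $total;
    }
    return @args;
}

my @filled = fill_defaults("paste");
print scalar(@filled), "\n";                       # 3
print join("|", fill_defaults("grep", "e")), "\n"; # e||
```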
alias command1 command2: creates an alias for command command1. From that moment on, command1 can be executed by entering command2.
We have created a hash containing all the commands as keys with the same command as value, for example $alias{"cat"} = "cat". The alias command will bind a new name to a known command, for example $alias{"concatenate"} = "cat". In the previous version, texttool would read a command and attempt to execute it. In this version, between reading and executing a command, the program looks the command up in the alias table and executes the command found there. Because of this, the code that was written in the earlier exercises could be re-used without changes. Example run:
    > read test
    abc
    > alias alias rename
    > rename print show
    > rename show display
    > display test
    abc
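The lookup step can be tried outside texttool. A minimal sketch, assuming the same command names as above; the helper functions resolve and add_alias are hypothetical. It replays the alias chain from the example run.

```perl
use strict;
use warnings;

# Every known command starts out as an alias for itself.
my %alias = map { $_ => $_ } qw(alias cat chars count delete grep paste
                                print read replace tail tokenize uniq);

# Translate an entered command name to the real command.
# Unknown names are returned unchanged.
sub resolve {
    my ($command) = @_;
    return defined($alias{$command}) ? $alias{$command} : $command;
}

# Bind $new to the real command behind $known.
# Returns 0 when $known is not a known command.
sub add_alias {
    my ($known, $new) = @_;
    return 0 unless defined($alias{$known});
    $alias{$new} = $alias{$known};
    return 1;
}

add_alias("alias", "rename");    # > alias alias rename
add_alias("print", "show");      # > rename print show
add_alias("show", "display");    # > rename show display
print resolve("rename"), "\n";   # alias
print resolve("display"), "\n";  # print
```

Because each alias stores the real command rather than the name it was derived from, a chain such as display, show, print is resolved in a single lookup.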
    # added error checking code for count
    if ($command eq "count" and @args != 1) { $errorNbr = 1; }
    if ($command eq "count" and @args == 1 and
        not(defined($text{$args[0]}))) { $errorNbr = 2; }

    # added processing code for count
    elsif ($command eq "count") {
       $tmp = 0;
       while ($text{$args[0]} =~ /\n/g) { $tmp++; }
       print "$tmp\n";
       # earlier solution removed trailing empty lines:
       # @lines = split(/\n/,$text{$args[0]});
       # print $#lines+1,"\n";
       # returns 0 for text containing any number of empty lines
    }
    # added error checking code for grep
    if ($command eq "grep" and @args != 3) { $errorNbr = 1; }
    if ($command eq "grep" and @args == 3 and
        not(defined($text{$args[1]}))) { $errorNbr = 2; }

    # added processing code for grep
    elsif ($command eq "grep") {
       @lines = split(/\n/,$text{$args[1]});
       $text{$args[2]} = "";
       for ($i=0;$i<@lines;$i++) {
          if ($lines[$i] =~ /$args[0]/) {
             $text{$args[2]} .= $lines[$i] . "\n";
          }
       }
    }
    # added error checking code for cat
    if ($command eq "cat" and @args != 2) { $errorNbr = 1; }
    if ($command eq "cat" and @args == 2 and
        not(defined($text{$args[0]}))) { $errorNbr = 2; }
    # we allow the second text of cat to be undefined

    # added processing code for cat
    elsif ($command eq "cat") {
       if (defined($text{$args[1]})) {
          $text{$args[1]} .= $text{$args[0]};
       } else {
          $text{$args[1]} = $text{$args[0]};
       }
    }
    # added error checking code for chars
    if ($command eq "chars" and @args != 2) { $errorNbr = 1; }
    if ($command eq "chars" and @args == 2 and
        not(defined($text{$args[0]}))) { $errorNbr = 2; }

    # added processing code for chars
    elsif ($command eq "chars") {
       $tmpText = $text{$args[0]};
       $tmpText =~ s/\s/#/g;
       @chars = split(//,$tmpText);
       $text{$args[1]} = join("\n",@chars) . "\n";
    }
    # added error checking code for replace
    if ($command eq "replace" and @args != 4) { $errorNbr = 1; }
    if ($command eq "replace" and @args == 4 and
        not(defined($text{$args[2]}))) { $errorNbr = 2; }

    # added processing code for replace
    elsif ($command eq "replace") {
       $text{$args[3]} = $text{$args[2]};
       $text{$args[3]} =~ s/$args[0]/$args[1]/g;
    }
    # added error checking code for delete
    if ($command eq "delete" and @args != 1) { $errorNbr = 1; }
    if ($command eq "delete" and @args == 1 and
        not(defined($text{$args[0]}))) { $errorNbr = 2; }

    # added processing code for delete
    elsif ($command eq "delete") {
       undef($text{$args[0]});
    }
    # added error checking code for paste
    if ($command eq "paste" and @args != 3) { $errorNbr = 1; }
    if ($command eq "paste" and @args == 3 and
        (not(defined($text{$args[0]})) or
         not(defined($text{$args[1]})))) { $errorNbr = 2; }

    # added processing code for paste
    elsif ($command eq "paste") {
       @lines0 = split(/\n/,$text{$args[0]});
       @lines1 = split(/\n/,$text{$args[1]});
       $tmpText = "";
       $i = 0;
       while ($i < @lines0 or $i < @lines1) {
          if ($i < @lines0) { $tmpText .= $lines0[$i]; }
          $tmpText .= " ";
          if ($i < @lines1) { $tmpText .= $lines1[$i]; }
          $tmpText .= "\n";
          $i++;
       }
       $text{$args[2]} = $tmpText;
    }
    # added error checking code for tail
    $errorMsg[3] = "expected number argument is not a positive integer";
    $errorMsg[4] = "number argument exceeds maximum value";
    if ($command eq "tail" and @args != 3) { $errorNbr = 1; }
    if ($command eq "tail" and @args == 3 and
        not(defined($text{$args[1]}))) { $errorNbr = 2; }
    if ($command eq "tail" and @args == 3 and
        defined($text{$args[1]})) {
       # 0 is rejected as well: it is not a positive integer and
       # would make the processing loop start at index -1
       if ($args[0] !~ /^[0-9]+$/ or $args[0] < 1) { $errorNbr = 3; }
       else {
          @lines = split(/\n/,$text{$args[1]});
          if ($args[0] > @lines) { $errorNbr = 4; }
       }
    }

    # added processing code for tail
    elsif ($command eq "tail") {
       @lines = split(/\n/,$text{$args[1]});
       $text{$args[2]} = "";
       for ($i=$args[0]-1;$i<@lines;$i++) {
          $text{$args[2]} .= $lines[$i] . "\n";
       }
    }
    # added error checking code for tokenize
    if ($command eq "tokenize" and @args != 2) { $errorNbr = 1; }
    if ($command eq "tokenize" and @args == 2 and
        not(defined($text{$args[0]}))) { $errorNbr = 2; }

    # added processing code for tokenize
    elsif ($command eq "tokenize") {
       $_ = $text{$args[0]};
       # tokenize code from exercise 3.5* by erikt
       s/\s+/\n/g;
       s/^\n//;
       s/([.,!?:;,])\n/\n$1\n/g;
       s/\n(["'`])([^\n])/\n$1\n$2/g;
       s/([^\n])(["'`])\n/$1\n$2\n/g;
       s/([^\n])([.,])\n/$1\n$2\n/g;
       s/\n([A-Z])\n\./\n$1./g;
       s/\n\.\n([^"A-Z])/\.\n$1/g;
       s/(\.[A-Z]+)\n\.\n/$1.\n/g;
       s/([^\n])'s\n/$1\n's\n/g;
       s/([^\n])n't\n/$1\nn't\n/g;
       s/([^\n])'re\n/$1\n're\n/g;
       s/\n\$([^\n])/\n\$\n$1/g;
       s/([^\n])%\n/$1\n%\n/g;
       s/Mr\n\.\n/Mr.\n/g;
       # end of tokenize code
       $text{$args[1]} = $_;
    }
    # added error checking code for uniq
    if ($command eq "uniq" and @args != 1) { $errorNbr = 1; }
    if ($command eq "uniq" and @args == 1 and
        not(defined($text{$args[0]}))) { $errorNbr = 2; }

    # added processing code for uniq
    elsif ($command eq "uniq") {
       @lines = split(/\n/,$text{$args[0]});
       %freq = ();
       $differentLines = 0;
       # count the occurrences of the lines and store results in %freq
       foreach $line (@lines) {
          if (defined($freq{$line})) { $freq{$line}++; }
          else {
             $freq{$line} = 1;
             $differentLines++;
          }
       }
       for ($i=0;$i<$differentLines;$i++) {
          $freqMostFrequent = 0;
          $mostFrequent = "";
          # select most frequent line
          foreach $line (keys %freq) {
             if (defined($freq{$line}) and
                 $freq{$line} > $freqMostFrequent) {
                $freqMostFrequent = $freq{$line};
                $mostFrequent = $line;
             }
          }
          # print it and remove it
          print "$freq{$mostFrequent} $mostFrequent\n";
          undef($freq{$mostFrequent});
       }
    }
    # no extra error checking code was required

    # added processing code for default processing
    $defaultTextName = "";
    if (@args == 0 and
        ($command eq "read" or $command eq "print" or
         $command eq "count" or $command eq "delete" or
         $command eq "cat" or $command eq "chars" or
         $command eq "paste" or $command eq "tokenize" or
         $command eq "uniq")) {
       $args[0] = $defaultTextName;
    }
    if (@args == 1 and
        ($command eq "cat" or $command eq "chars" or
         $command eq "grep" or $command eq "paste" or
         $command eq "tail" or $command eq "tokenize")) {
       $args[1] = $defaultTextName;
    }
    if (@args == 2 and
        ($command eq "replace" or $command eq "grep" or
         $command eq "paste" or $command eq "tail")) {
       $args[2] = $defaultTextName;
    }
    if (@args == 3 and ($command eq "replace")) {
       $args[3] = $defaultTextName;
    }
    # added error checking code for alias
    $errorMsg[5] = "cannot make alias for an unknown command";
    if ($command eq "alias" and @args != 2) { $errorNbr = 1; }
    if ($command eq "alias" and @args == 2 and
        not(defined($alias{$args[0]}))) { $errorNbr = 5; }

    # initial alias table
    $alias{"cat"} = "cat";
    $alias{"grep"} = "grep";
    $alias{"read"} = "read";
    $alias{"tail"} = "tail";
    $alias{"uniq"} = "uniq";
    $alias{"alias"} = "alias";
    $alias{"chars"} = "chars";
    $alias{"count"} = "count";
    $alias{"paste"} = "paste";
    $alias{"print"} = "print";
    $alias{"delete"} = "delete";
    $alias{"replace"} = "replace";
    $alias{"tokenize"} = "tokenize";

    # converting alias to real command
    $command = $alias{$command} if (defined($alias{$command}));

    # added processing code for alias
    elsif ($command eq "alias") {
       $alias{$args[1]} = $alias{$args[0]};
    }