Testing for rank.pl
-------------------

Satanjeev Banerjee
bane0025@d.umn.edu

1. Introduction:
----------------

We have tested rank.pl, a component of Bigram Statistics Package
version 0.5. Following is a description of the aspects of rank.pl that
we have tested. We provide the scripts and files used for testing so
that later versions of rank.pl can be tested for backward
compatibility.

2. Phases of Testing:
---------------------

We have divided the testing into two main phases:

Phase 1: Testing rank.pl's behaviour under normal conditions.
Phase 2: Testing rank.pl's response to erroneous conditions.

2.1. Phase 1: Testing rank.pl's behaviour under normal conditions.
-------------------------------------------------------------------

The script for this phase is 'normal-op.sh'. To run the tests
contained in this script, type "normal-op.sh" at the command prompt.

This script performs several subtests:

2.1.1. Subtest 1:
-----------------

This subtest checks that rank.pl reports a value of 1 when it is given
the same ngram frequency file twice, and a value of -1 when it is given
two files that rank the same ngrams in exactly reverse order.

Subtest a compares two files with ngrams in reverse order.
Subtest b compares the first file of subtest a against itself.
Subtest c compares the second file of subtest a against itself.

2.1.2. Subtest 2:
-----------------

This subtest checks what happens when one of the files has tied ranks.
The ranks that statistic.pl gives to the ngrams following a group of
tied ngrams differ from the ranks that rank.pl requires, so rank.pl
reranks the ngrams itself. This subtest checks that the reranking is
done correctly.

2.1.3. Subtest 3:
-----------------

This subtest checks what happens when one file has more ngrams than
the other. Only those ngrams that occur in both files are used, and
these are reranked. This subtest checks that this happens correctly.

2.1.4. Subtest 4:
-----------------

This subtest checks the switch --precision.

Subtest a does not use --precision.
Subtest b uses the switch --precision 10.
Subtest c uses the switch --precision 0.
Subtest d uses the switch --precision 1.

2.1.5. Subtest 5:
-----------------

This subtest runs rank.pl on two files, first in one order and then in
the opposite order. The output should be the same both times.

2.2. Phase 2: Testing rank.pl's response to erroneous conditions:
------------------------------------------------------------------

The script for this phase is 'error-handling.sh'. To run the tests
contained in this script, type "error-handling.sh" at the command
prompt.

This script performs several subtests:

2.2.1. Subtest 1:
-----------------

This subtest checks the response of rank.pl when the source file
cannot be opened.

2.2.2. Subtest 2:
-----------------

This subtest checks the response of rank.pl when only one source file
is provided.

3. Evaluation of execution time of rank.pl on big files:
--------------------------------------------------------

The following experiment was conducted on machine hh33809 at the Univ
of Minnesota Duluth, Computer Science Department laboratory. Each
input file had 15,181 bigrams to be compared.

    1> time rank.pl report.mi report.dice
    2.560u 0.010s 0:02.58 99.6% 0+0k 0+0io 320pf+0w

4. Conclusions:
---------------

We have tested program rank.pl and conclude that it behaves correctly
on all of the tests described above. We have also provided the test
scripts so that future versions of rank.pl can be compared to the
current version against these scripts.
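
The properties checked in Phase 1 (Subtests 1, 2, 3 and 5) can be
illustrated with a small sketch. It is not rank.pl's code. It assumes,
without confirmation from the text above, that the reported value is
Spearman's rank correlation coefficient and that tied ngrams share the
average of the ranks they span. The function names and the score
dictionaries are hypothetical; the sketch does not parse statistic.pl
output.

```python
def average_ranks(scores):
    """Rank ngrams by descending score; tied ngrams share the mean rank.

    Assumption: this is the kind of reranking Subtest 2 refers to.
    """
    ordered = sorted(scores, key=lambda ngram: -scores[ngram])
    ranks = {}
    i = 0
    while i < len(ordered):
        # Extend j to the last ngram tied with the one at position i.
        j = i
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        mean_rank = ((i + 1) + (j + 1)) / 2
        for ngram in ordered[i:j + 1]:
            ranks[ngram] = mean_rank
        i = j + 1
    return ranks


def rank_correlation(scores_a, scores_b):
    """Spearman-style coefficient over the ngrams common to both inputs."""
    common = set(scores_a) & set(scores_b)
    n = len(common)
    if n < 2:
        raise ValueError("need at least two ngrams in common")
    # Rerank using only the shared ngrams (the situation in Subtest 3).
    ranks_a = average_ranks({ng: scores_a[ng] for ng in common})
    ranks_b = average_ranks({ng: scores_b[ng] for ng in common})
    d_squared = sum((ranks_a[ng] - ranks_b[ng]) ** 2 for ng in common)
    return 1 - 6 * d_squared / (n * (n * n - 1))
```

Under these assumptions, comparing a dictionary with itself gives 1,
comparing it with a reversed ordering gives -1, and swapping the two
arguments does not change the result, which mirrors Subtests 1 and 5.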