Lab 6 -- Parsing a Text File
Overview
Here is the jar file for this lab.
In this lab you will develop a program that reads a data file into an array and processes the array in several ways. The files we will work with represent the grades of a class on three exams. The data files are saved in text format -- you could make or modify them in emacs. You will start with a program that reads the file a line at a time and prints out that line as a string. You need to add a facility to parse each input line into its separate fields, so the data can be saved in a structure. You will then add some statistical procedures that make use of the parsed data. By the end of the lab you should be able to print out the data in nice columns, calculate the high and low scores on the first exam, and find the student with the highest overall average. This should give you a lot of practice working with strings and arrays.
The code you need to write for this lab is not terribly lengthy, so devote some attention to writing good code. Use informative variable and method names, use white space to lay the code out clearly, and give comments to explain what you are doing. It should be possible to follow your code by reading it.
Part 0: The jar file and the structure of the program.
If you expand the jar file for this lab you will find the following files:
Gradebook.java: This contains the main() method for the program. Class Gradebook has only two methods:
Gradebook() // The constructor. This calls routines to open up the data file and to create a new Statistician object. It currently uses the Statistician only to print the file; in the later parts of this lab you will add calls to perform the other operations.
main(String args[]) // all that this does is call the Gradebook constructor.
The only change you will make to Gradebook is in Part 2, to add calls to the new methods you will add to class Statistician.
FReader.java: This defines a class for file reading. You do not need to modify this file. The constructor for FReader reads the lines of the input file one at a time, and calls the constructor for GradeRecord with each line. The GradeRecord object that is constructed is stored in an array. Altogether, this class has only three methods, all of which are complete:
FReader(String fileName) // the constructor
GradeRecord[] getTable() // returns the array built by the FReader constructor
int getCount() // returns the size of the table built by the FReader constructor, which is also the number of lines in the input file.
GradeRecord.java: This file is at present almost empty. It defines a "GradeRecord" class that stores all of the information from one line of the data file: a name, an exam 1 score, an exam 1 letter grade, and so forth. At present the constructor assigns default values to all of these variables and prints the line. Your first task, in Part 1, is to parse the line of input that is sent to the constructor and to use this information to assign real values to all of the instance variables. You will eventually add more methods to this class, such as a print method.
Statistician.java: This class handles all of the computations of the lab. At present it has only two methods:
Statistician(FReader classgrades) // This constructor takes the list of GradeRecords that were found by the FReader and saves it to variables in the Statistician class. This constructor is complete and doesn't need to be altered.
print() // This method is here as a placeholder; the body of it is empty. The method needs to walk through the list of GradeRecords and print each of them, using the GradeRecord print method.
SimpleInput.java: This is David Barnes' input class that we have used in several previous labs.
grades.txt: This is a data file. You can look at it with emacs, but don't add lines to it. Emacs likes to replace <tab> characters with spaces, which will destroy the data formatting.
Part 1: Parsing the input and printing the file.
A. Parsing.
The first step is to add to the GradeRecord constructor code that parses a line of input. This constructor is called (by FReader) with a String variable line. You need to pull out of this line a string for the name, three integers for exam1, exam2, and exam3, and three letter grades. Here is how to think of this:
Each line of the data file has the following format:
name <tab> exam_1_field <tab> exam_2_field <tab> exam_3_field
such as
Woody Woodpecker 95 A 85 B 92 A
Note that the fields are all separated by <tab> characters, which are denoted in java by '\t'.
The three exam fields have the format
numerical_grade<space>letter_grade
where "numerical_grade" is an integer between 0 and 100 and "letter_grade" is a single character, such as 'A'.
You can assume that every line of the file will have this format; there are no extra spaces or missing grades.
To parse this you need to use two methods of class String:
line.indexOf( char c) returns the position of the first instance of character c in String line.
line.substring(int i, int j) returns a String consisting of the characters of String line starting at position i and extending to but not including position j.
line.substring(int i) returns a String consisting of all of the characters of line starting at position i.
For instance, the following code will pull off the name at the start of the line:
int index = line.indexOf('\t');
name = line.substring(0,index);
line = line.substring(index+1);
The following code will then get the score on the first exam as a string.
index = line.indexOf( ' ' );
String numberString = line.substring(0, index);
Here is an easy way to convert a string of digits to its numerical value. There is a class called Integer that takes a string for its constructor. This class has a method called intValue() that gives its numeric value. So we have
Integer t = new Integer(numberString);
exam1 = t.intValue();
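The Integer constructor used above has been deprecated since Java 9. If you are compiling with a newer Java version, a minimal sketch of an alternative is the static method Integer.parseInt, which returns the int directly. The class name ParseScoreDemo and the helper toInt are hypothetical and are not part of the lab's jar file.

```java
class ParseScoreDemo {
    // Hypothetical helper: converts a string of digits such as "95" to its int value.
    static int toInt(String numberString) {
        return Integer.parseInt(numberString);
    }

    public static void main(String[] args) {
        int exam1 = toInt("95");
        // exam1 is an ordinary int, so arithmetic works on it directly.
        System.out.println(exam1 + 1);
    }
}
```

Either approach throws a NumberFormatException if the string contains anything other than digits (for example a stray space), so make sure the substring holds only the number.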
We get the letter grade for the first exam and remove the exam1 field from the string with
ex1Letter = line.charAt(index+1);
line = line.substring(index+3);
Altogether, the following code takes a String line that fits our format and grabs the name and the first exam info:
int index = line.indexOf('\t');
name = line.substring(0,index);
line = line.substring(index+1);
index = line.indexOf( ' ' );
String numberString = line.substring(0, index);
Integer t = new Integer(numberString);
exam1 = t.intValue();
ex1Letter = line.charAt(index+1);
line = line.substring(index+3);
You need to extend this to extract all of the information in a line of the data file.
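One possible way to extend the code is sketched below. It is only an illustration under some assumptions: it stores the scores and letters in small arrays instead of separate variables such as exam1 and ex1Letter, it uses Integer.parseInt for the conversion, and the class name LineParserDemo is hypothetical. It also assumes the line ends right after the last letter grade, which is why the final substring(index + 3) is skipped for the third exam: there is no tab left to step over, and calling it would run past the end of the string.

```java
class LineParserDemo {
    String name;
    int[] scores = new int[3];      // assumption: arrays instead of exam1, exam2, exam3
    char[] letters = new char[3];   // assumption: arrays instead of ex1Letter, ...

    LineParserDemo(String line) {
        // The name is everything before the first tab.
        int index = line.indexOf('\t');
        name = line.substring(0, index);
        line = line.substring(index + 1);

        for (int i = 0; i < 3; i++) {
            // Each exam field looks like "95 A": digits, one space, one letter.
            index = line.indexOf(' ');
            scores[i] = Integer.parseInt(line.substring(0, index));
            letters[i] = line.charAt(index + 1);
            // Step over the space, the letter, and the tab.
            // After the last field there is no tab, so nothing is left to skip.
            if (i < 2) {
                line = line.substring(index + 3);
            }
        }
    }

    public static void main(String[] args) {
        LineParserDemo r = new LineParserDemo("Woody Woodpecker\t95 A\t85 B\t92 A");
        System.out.println(r.name + ": " + r.scores[0] + r.letters[0] + " "
                + r.scores[1] + r.letters[1] + " " + r.scores[2] + r.letters[2]);
    }
}
```

Because the score is taken as everything up to the space, one-digit scores are handled the same way as two-digit ones.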
B. Printing
The next step is to fill out the print() methods for GradeRecord and Statistician. The basic idea of printing a GradeRecord object is simple: you just print each of its fields, all on the same line. Do this, add the print() method to Statistician (print the list the same way you print any list, with a for loop), and delete the print line in the GradeRecord constructor. Your program should still print the file, but this time it does so with your print methods rather than with the System.out.print() call in the constructor. Unfortunately, this still does not produce a very nice output. Since the name fields vary in length and you want to print the names first, you have to do something to make the exam scores appear in nice columns. A nice way to do this is to think of the name field as taking up a fixed number of spaces, regardless of the actual size of the name. If the name is too short, insert an appropriate number of blanks to space it out. It is easy to do this if you write a tab method:
String tab(int n) { // returns a string made up of n spaces
Then print the three exam fields (both number and letter) and they should appear in columns. Finally, since this represents a gradebook, you should print each student's average score from the three exams. An integer-valued average (just add the scores and divide by 3) will suffice for this.
Altogether, by the end of Part 1 you should have output similar to this:
Daffy Duck 56 D 65 C 72 C 64
Donald Duck 84 B 72 C 86 B 80
Sylvester 46 F 33 F 45 F 41
Beware! A few of the grades are 1 digit instead of two (none are more than 2). You need to allow for this; the letter grades at the end should line up in columns.
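The column layout described above can be sketched as follows. This is only one way to do it, under stated assumptions: the widths NAME_WIDTH and SCORE_WIDTH are arbitrary choices, and the names ColumnDemo, padName, padScore, and formatRow are hypothetical. Only tab(int n) comes from the lab text. Scores are padded on the left so that one-digit and two-digit scores take the same number of columns.

```java
class ColumnDemo {
    static final int NAME_WIDTH = 20;  // assumption: wider than any name in the file
    static final int SCORE_WIDTH = 3;  // assumption: room for scores from 0 to 100

    // Returns a string made up of n spaces (empty if n is zero or negative).
    static String tab(int n) {
        StringBuilder spaces = new StringBuilder();
        for (int i = 0; i < n; i++) {
            spaces.append(' ');
        }
        return spaces.toString();
    }

    // Pads the name on the right so it fills width columns.
    static String padName(String name, int width) {
        return name + tab(width - name.length());
    }

    // Pads the score on the left so every score fills SCORE_WIDTH columns.
    static String padScore(int score) {
        String s = Integer.toString(score);
        return tab(SCORE_WIDTH - s.length()) + s;
    }

    // Builds one output row: name, three score/letter pairs, integer average.
    static String formatRow(String name, int[] scores, char[] letters) {
        StringBuilder row = new StringBuilder(padName(name, NAME_WIDTH));
        int total = 0;
        for (int i = 0; i < scores.length; i++) {
            row.append(padScore(scores[i])).append(' ').append(letters[i]).append("  ");
            total += scores[i];
        }
        row.append(padScore(total / scores.length));
        return row.toString();
    }

    public static void main(String[] args) {
        System.out.println(formatRow("Daffy Duck", new int[] {56, 65, 72}, new char[] {'D', 'C', 'C'}));
        System.out.println(formatRow("Sylvester", new int[] {46, 3, 45}, new char[] {'F', 'F', 'F'}));
    }
}
```

In the lab itself this logic would live in the print() method of GradeRecord, working on that class's own fields instead of on arrays passed in as parameters.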
Part 2: Highs and lows
At this point you have done the hard work of the lab and are ready to reap the fruits of your labors. Add a routine to the Statistician class to print the high and low scores on Exam 1. This routine will walk through the list, just as print() does. This time, instead of printing each GradeRecord object, compare its Exam 1 score with the best and worst scores you have seen so far. If you find a new best or a new worst, remember it, and then go on through the rest of the list. At the end you should print out the highest and lowest scores you have found:
The high score on Exam 1 was 97; the low score was 23.
Note that the individual fields of the GradeRecord class are private. You MAY NOT change these to public. If you want other classes to access these values you can add accessor methods to the GradeRecord class.
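The walk through the list can be sketched as below. This is a hedged illustration, not the required design: StudentRecord is a hypothetical stand-in for GradeRecord reduced to a single field, getExam1() is an assumed name for the accessor, and highAndLow is a hypothetical helper. It assumes the table holds at least one record.

```java
class HighLowDemo {
    // Hypothetical stand-in for GradeRecord, reduced to the one field needed here.
    static class StudentRecord {
        private final int exam1;   // stays private, as the lab requires

        StudentRecord(int exam1) {
            this.exam1 = exam1;
        }

        // Accessor method: lets other classes read the score without making the field public.
        int getExam1() {
            return exam1;
        }
    }

    // Returns {high, low} for Exam 1 over the first count records (count must be at least 1).
    static int[] highAndLow(StudentRecord[] table, int count) {
        int high = table[0].getExam1();
        int low = high;
        for (int i = 1; i < count; i++) {
            int score = table[i].getExam1();
            if (score > high) {
                high = score;   // new best so far
            }
            if (score < low) {
                low = score;    // new worst so far
            }
        }
        return new int[] {high, low};
    }

    public static void main(String[] args) {
        // Made-up scores for illustration only.
        StudentRecord[] table = {
            new StudentRecord(56), new StudentRecord(97),
            new StudentRecord(23), new StudentRecord(84)
        };
        int[] result = highAndLow(table, table.length);
        System.out.println("The high score on Exam 1 was " + result[0]
                + "; the low score was " + result[1] + ".");
    }
}
```

Starting high and low from the first record, rather than from fixed values such as 0 and 100, means the result is correct whatever range the scores happen to fall in.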
Part 3: Highest average
For this method you need to read through the list again, looking for the name and average of the student with the highest average score on the three exams. When you get to the end of the list you should print this information.
Tweety Bird had the highest exam average: 97.
Note that if you add an averageScore() method to GradeRecord you can keep your code simple and easy to read while computing this information.
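A minimal sketch of this search, assuming an averageScore() method that returns the integer average. StudentRecord is again a hypothetical stand-in for GradeRecord, getName() is an assumed accessor name, and the scores are the three sample rows shown in Part 1. When two students tie, this version keeps the one that appears first in the list.

```java
class TopAverageDemo {
    // Hypothetical stand-in for GradeRecord with just the fields needed here.
    static class StudentRecord {
        private final String name;
        private final int exam1, exam2, exam3;

        StudentRecord(String name, int exam1, int exam2, int exam3) {
            this.name = name;
            this.exam1 = exam1;
            this.exam2 = exam2;
            this.exam3 = exam3;
        }

        String getName() {
            return name;
        }

        // Integer-valued average of the three exams.
        int averageScore() {
            return (exam1 + exam2 + exam3) / 3;
        }
    }

    // Returns the record with the highest average among the first count records (count >= 1).
    static StudentRecord highestAverage(StudentRecord[] table, int count) {
        StudentRecord best = table[0];
        for (int i = 1; i < count; i++) {
            if (table[i].averageScore() > best.averageScore()) {
                best = table[i];
            }
        }
        return best;
    }

    public static void main(String[] args) {
        StudentRecord[] table = {
            new StudentRecord("Daffy Duck", 56, 65, 72),
            new StudentRecord("Donald Duck", 84, 72, 86),
            new StudentRecord("Sylvester", 46, 33, 45)
        };
        StudentRecord best = highestAverage(table, table.length);
        System.out.println(best.getName() + " had the highest exam average: "
                + best.averageScore() + ".");
    }
}
```

Keeping a reference to the best record, rather than separate name and average variables, means both pieces of information are available when it is time to print.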