Text Files
Lab 8 -- Reading and searching text files
In this lab you will write a program that reads a text file (the same sort of unformatted file that you can create or edit in emacs), loads its contents into an arraylist, and then searches the arraylist for lines that contain strings supplied by the user.
Here is the lab's jar file. This contains a few text files for you to use as tests. It also has a small test program that will help you determine if you have written the first part of this lab correctly. It does not contain any of the program you need to write except a copy of Michael Barnes' SimpleInput class. Everything else is to be written by you.
Part 1: Your program structure
The program you will write contains two classes, in addition to SimpleInput. These are
class TextFileReader: This uses the SimpleInput class to open a text file, and dump it line by line into an arraylist of Strings. This is very similar to the class FReader we used in lab 6; you can check the code there if you get stuck. Class TextFileReader has two instance variables:
SimpleInput file; // a SimpleInput object to read the file
ArrayList lines; // this contains the lines of the input file
TextFileReader contains the following methods:
public TextFileReader(String fname): The constructor for TextFileReader makes a new SimpleInput object using the fname value passed to it for the name of the file. It assigns this SimpleInput object to its file instance variable.
public void read(): This reads the file and adds each line as an entry of the lines instance variable.
public ArrayList getLines(): This returns the lines instance variable.
class StringFinder: This processes the arraylist. It has only one instance variable, which it gets from the TextFileReader object:
ArrayList lines;
StringFinder has the following methods:
public StringFinder(): This asks the user for the name of the input file and makes a new TextFileReader object with this name. It then reads the file, and uses the accessor methods of class TextFileReader to give values to the arraylist.
public void searcher(): This repeatedly asks the user for strings, and searches the arraylist for lines containing these strings. The input loop ends when it gets an empty string (one whose length is 0).
public void find(String s): This performs the central action for method searcher. It looks through the arraylist for any line containing String s, and prints out the line and its index.
public static void main(String args[] ): This makes a new StringFinder and calls its searcher method.
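The input loop described for searcher() can be sketched on its own, before any file handling is in place. The sketch below is only an illustration under stated assumptions: it uses java.util.Scanner over a fixed string as a stand-in for the SimpleInput keyboard object, and the class name SearcherLoopSketch and method collectQueries are hypothetical names, not part of the lab. It collects the strings instead of searching for them, so that the stopping rule (an empty string ends the loop) is easy to see.

```java
import java.util.ArrayList;
import java.util.Scanner;

// Hypothetical sketch: Scanner stands in for the lab's SimpleInput keyboard object.
class SearcherLoopSketch {
    // Reads strings until it sees an empty one (length 0) or runs out of input.
    // Returns the non-empty strings in the order they were entered.
    static ArrayList<String> collectQueries(Scanner keyboard) {
        ArrayList<String> queries = new ArrayList<String>();
        while (keyboard.hasNextLine()) {
            String s = keyboard.nextLine();
            if (s.length() == 0) {
                break; // an empty string ends the input loop
            }
            queries.add(s); // the real searcher() would call find(s) here
        }
        return queries;
    }

    public static void main(String[] args) {
        // Simulated user input: two queries, then an empty line, then ignored text.
        Scanner keyboard = new Scanner("cat\ndog\n\nbird\n");
        System.out.println(collectQueries(keyboard));
    }
}
```

In the lab itself the body of the loop would call find() with each string rather than storing it.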
Part 2: Class TextFileReader
Make a file TextFileReader.java to hold the declarations of this class. Remember that since this class uses one of Java's collection classes you need to have the following line at the top of this file:
import java.util.*;
You need to add the instance variables:
SimpleInput file; // a SimpleInput object to read the file
ArrayList lines; // this contains the lines of the input file
The constructor for TextFileReader:
public TextFileReader(String fname)
makes a SimpleInput object to read the file whose name is contained in the string fname and assigns this object to instance variable file, and makes a new arraylist for instance variable lines. Note that constructing the arraylist is a different matter from putting data into it. Here you make the list; in method read() you read the file and put its lines into the arraylist as strings.
The accessor method is easy:
public ArrayList getLines()
returns the appropriate class instance variable.
The only method in class TextFileReader that should cause you to think carefully is
public void read()
This needs to read the lines of the file one at a time with the nextLine() method of class SimpleInput; you will have a line of code that says something like
line = file.nextLine();
where "line" is a variable of type String. You want to do this inside a loop that reads lines of the file and adds them to the arraylist, continuing until it gets to the end of the file. The problem is determining when we get to the end of the file.
An exception is an event that occurs while a program is running that is outside the normal course of activities for a running program. Common exceptions are caused by trying to use data or methods of an object that doesn't exist, trying to open a file that doesn't exist, or trying to access data past the end of an array. Java uses the terminology throw and catch -- an unexpected event causes an exception to be thrown. The program can register a block of code to be executed if an exception is thrown; this is called catching the exception. If nothing catches an exception the program will crash.
The Java catch mechanism looks like this:
try {
    ... code that might cause an exception to be thrown
}
catch (ExceptionType e) {
    ... code to handle the exception
}
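As a concrete instance of the pattern, here is a small sketch that catches one of the common exceptions mentioned above: accessing data past the end of an array. The class name CatchDemo, the method safeGet, and the choice of -1 as the fallback value are all made up for this illustration.

```java
// Hypothetical illustration of try/catch; not part of the lab code.
class CatchDemo {
    // Returns data[i], or -1 if i is outside the bounds of the array.
    static int safeGet(int[] data, int i) {
        try {
            return data[i]; // throws if i is not a valid index
        } catch (ArrayIndexOutOfBoundsException e) {
            return -1; // the handler runs instead of the program crashing
        }
    }

    public static void main(String[] args) {
        int[] data = {10, 20, 30};
        System.out.println(safeGet(data, 1)); // valid index
        System.out.println(safeGet(data, 5)); // index past the end
    }
}
```

Without the try and catch blocks, the second call would end the program with an uncaught exception.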
For example, the following loop will read a file to its end, discarding each line after it is read:
boolean endOfFile = false;
do {
    try {
        String line = file.nextLine();   // throws when there are no more lines
    }
    catch (RuntimeException e) {
        endOfFile = true;                // reaching the end of the file stops the loop
    }
} while (!endOfFile);
You need to do something like this, adding to the arraylist each String as you find it.
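To see the whole pattern run without SimpleInput, the sketch below uses java.util.Scanner as a stand-in. That substitution is an assumption made only for illustration: Scanner's nextLine() throws NoSuchElementException (a kind of RuntimeException) when no lines remain, which gives the same loop shape. The class name ReadLoopSketch and method readAll are hypothetical, and the input is a fixed string rather than a real file.

```java
import java.util.ArrayList;
import java.util.NoSuchElementException;
import java.util.Scanner;

// Hypothetical sketch: Scanner stands in for the lab's SimpleInput class.
class ReadLoopSketch {
    // Reads every remaining line into a new list, using the exception
    // thrown by nextLine() as the end-of-input signal.
    static ArrayList<String> readAll(Scanner in) {
        ArrayList<String> lines = new ArrayList<String>();
        boolean endOfFile = false;
        do {
            try {
                String line = in.nextLine();
                lines.add(line); // keep the line instead of discarding it
            } catch (NoSuchElementException e) {
                endOfFile = true; // no more lines to read
            }
        } while (!endOfFile);
        return lines;
    }

    public static void main(String[] args) {
        Scanner in = new Scanner("first line\nsecond line\nthird line\n");
        ArrayList<String> lines = readAll(in);
        System.out.println(lines.size() + " lines read");
    }
}
```

Note that add() is inside the try block, after nextLine(), so nothing is added on the pass where the exception is thrown.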
When you finish implementing class TextFileReader you can check your implementation with class TestReader. This has a little program that uses a TextFileReader to read a file. It prints the number of lines in the file, then the first 5 lines. If both of these are correct with several different files, you can assume that your implementation is working correctly. By the way, you can use the Unix wc program to find the number of lines of a text file. This prints the number of lines, the number of words, and the number of bytes (usually the same as the number of characters) in the file. For example
wc words.txt
gives output 234937 234937 2486824, which means that file "words.txt" has 234,937 lines and words, and 2,486,824 characters.
The jar file also contains 3 data files to help you with these tests:
sample.txt: A brief file with 5 lines.
words.txt: A list of all 234,937 entries in Webster's 2nd International Dictionary (which is long out of copyright).
alice29.txt: The Project Gutenberg etext of Alice in Wonderland.
Part 3: Class StringFinder
Make a new file StringFinder.java to hold class StringFinder. This has one instance variable to hold the arraylist of strings. Part 1 names it lines; in the descriptions below I call it list. Either name works as long as you use it consistently. Class StringFinder has three public methods in addition to main(). These are
public StringFinder(): This constructor first asks the user for the name of the data file, then reads the name (so you will need a SimpleInput keyboard object). It then passes this name to a constructor to make a new TextFileReader. It calls the read() method for the TextFileReader class with the object this constructor makes. Finally, after reading it calls the getLines() method for this TextFileReader object, and assigns the value it returns to its list instance variable.
public void searcher(): This also makes use of a SimpleInput keyboard object. In a loop it repeatedly asks the user for a string; if the string it gets has length greater than zero, it calls the find() method to print out all of the lines of the list that contain this string.
public void find(String s): This is called by the searcher() method. It walks through the list one entry at a time. For each entry it calls the entry's indexOf() method with argument s. A return value of -1 means the entry does not contain s; a value >= 0 means it does, and find() prints the entry and its index. Finally, it should keep track of how many entries it prints for s; if none, it should print a message that no lines contain s.
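The indexOf() test at the heart of find() can be tried out in isolation. The following is a minimal sketch, not the required implementation: it takes the list as a parameter instead of using an instance variable, and it returns the match count (the lab's find() is void) so that the result is easy to check. The class name FindSketch and the output format are assumptions for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;

// Hypothetical sketch of the search step; names and output format are made up.
class FindSketch {
    // Prints each entry of list that contains s, with its index,
    // and returns how many entries matched.
    static int find(ArrayList<String> list, String s) {
        int count = 0;
        for (int i = 0; i < list.size(); i++) {
            String entry = list.get(i);
            if (entry.indexOf(s) >= 0) { // -1 would mean s does not occur
                System.out.println(i + ": " + entry);
                count++;
            }
        }
        if (count == 0) {
            System.out.println("No lines contain \"" + s + "\"");
        }
        return count;
    }

    public static void main(String[] args) {
        ArrayList<String> list = new ArrayList<String>(
                Arrays.asList("white rabbit", "mad hatter", "rabbit hole"));
        find(list, "rabbit"); // matches entries 0 and 2
        find(list, "queen");  // matches nothing
    }
}
```

Remember that indexOf() is case sensitive, so searching for "Rabbit" would not match "white rabbit".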
Class StringFinder contains the main() method for this program. This calls the StringFinder() constructor to make a new StringFinder object, and calls the searcher() method of this object.
Together, the TextFileReader and StringFinder classes make up the entire program. You can test this program with the three data files that were included in the jar file for this lab: sample.txt, words.txt, and alice29.txt.