Dr. Paul A. G. Sivilotti
Associate Professor
Computer Science and Engineering
Site Navigation
Lab 7: Collections Framework
-
Dense Multiset
A multiset is a set that allows repetition of individual elements. Thus, adding an element always increases the cardinality by one, regardless of whether the element is already present. In this lab, you will implement a multiset of char.
Write a class that implements the MultiSetOfChar interface. The javadoc (html) documentation is available here.
Your implementation should be called DenseMultiSetOfChar and should make use of the Collections Framework. In addition, you should obtimize your implementation for containing many repetitions of relatively few distinct characters. That is, a DenseMultiSetOfChar should not use significantly (any?) more memory if it contains 100,000 repetitions of 'a' vs just one.
For your convenience, the test program TestDenseMultiSetOfCharSimple.java is provided. Your DenseMultiSetOfChar class should compile with this test program. This test program documents the expected behavior so it should help you ensure you have implemented the correct method behaviors as well as signatures.
Your submission for this part of the lab should include:
DenseMultiSetOfChar.java, the class implementing the MultiSetOfChar interface).DenseMultiSetOfChar.html, the Javadoc for the implementation class.MultiSetOfCharTest.java, a JUnit test fixture testing your component.
-
Compressing a Library of Books
How many characters are in the text of "Alice in Wonderland"? How many times does the character 'A' appear? In this part of the lab, you will implement a class that could be used to keep track of this information for every book in a library.
Consider the following skeleton (ie undocumented) interface, FrequencyLibrary:
interface FrequencyLibrary { int size(); boolean contains(String target); MultiSetOfChar getFrequencies(String target); void add(String name, char element); boolean remove(String name, char element); char randomUniformChoose(String name); }An informal summary of the semantics (behavior) of these methods is given below:
-
size - Returns the number of books in the library.
-
contains - Returns true if and only if the argument is already a book title in the library.
-
getFrequencies - Returns the MultiSetOfChar that represents the occurrences of the individual characters in the text of the book indiciated by the argument.
-
add - Modifies the character occurrences associated with name to include one more occurrence of element.
-
remove - Modifies the character occurrences associated with name to include one less occurrence of element. If this removal results in no elements being associated with name, name is removed from the library.
-
randomUniformChoose - Returns a random character, chosen from the same distribution as the characters appear in the book. For example, if 5% of the characters in "Alice in Wonderland" are an 'A', then this method should return an 'A' about 5% of the time.
First, improve the FrequencyLibrary interface by carefully documenting it using Javadoc. Your documentation should reflect the informal semantics given above, but should be somewhat more precise.
Next, write a class that implements the FrequencyLibrary interface. Your implementation should be called DenseFrequencyLibrary and should make use of your solution to part 1 as well as the Collections Framework.
For your convenience, the test program TestDenseFrequencyLibrarySimple.java is provided. Your DenseFrequencyLibrary class should compile with this test program. This test program documents the expected behavior so it should help you ensure you have implemented the correct method behaviors as well as signatures.
Your submission for this part of the lab should include:
FrequencyLibrary.java, the documented interface.FrequencyLibrary.html, the Javadoc for the documented interface.DenseFrequencyLibrary.java, the class implementing the FrequencyLibrary interface).DenseFrequencyLibrary.html, the Javadoc for the implementation class.FrequencyLibraryTest.java, a JUnit test fixture testing your component.
-
-
Performance
Compare the run-time performance of your DenseMultiSetOfChar using two different choices for collection on which it is built. For example, if you wrote DenseMultiSetOfChar to use a HashSet, compare it with using a TreeSet instead. (I am not suggesting or advising you to use a HashSet for your solution! This is just an illustrative example.)
Note: you should be able to change between two implementations (such as HashSet and TreeSet) by changing a single line of code in your implementation.
For your convenience, a basic timing program Profiler.java is provided. Expand and modify this program until you have gathered enough performance data to observe the impact that choice of collection implementation makes on your code. For example, you might want to look at the performance of different methods under different circumstances (e.g., small/large cardinality or few/many repetitions). Include a text file in your submission that describes your findings. (Only a brief discussion is needed: a single paragraph.)
Your submission for this part of the lab should include:
PerformanceDiscussion.txt, a text file describing your findings (only a brief discussion is needed: a single paragraph.)Profiler.java, your modified profiling program used to obtain the data that justifies your claims and observations in your discussion.