The following is based on discussions with Dr. Paul Sivilotti about why some students who used Eclipse were ending up with such large submissions for the Tokenizer project. Paul suggested the following possibilities:
1. Perhaps they are looking at their whole workspace (maybe zipping it up for submission), including its hidden directories?
2. The Eclipse "workspace" is a directory in which projects are stored as subdirectories. Projects (i.e., subdirectories of the workspace) then include the various source files, compiled files, links to other resources, etc.
But the main "workspace" directory also contains a hidden directory called .metadata. That directory holds a great deal of preference and history information. This directory is Eclipse-specific. If students are writing Java code, they only need to turn in the project directories.
3. Another possibility is that they have put a big library jar file in the project directory itself.
4. Javadoc results in many files, but they aren't very big. Also, Eclipse won't generate Javadoc automatically; you have to explicitly run "Generate Javadoc...".
5. I can't think of anything else that would make an Eclipse project look big.
Hopefully that gives you all some ideas on what to look for (especially when it is time to submit the interpreter project).
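If you want to check where the space is going before you submit, something like the following might help. It is only a rough sketch, not an official course tool: the class name SubmissionSizeCheck and its helper methods are made up for this note. It lists each entry directly under a directory (hidden ones such as .metadata included) with its total size, largest first, so an oversized .metadata directory or a large jar should stand out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Stream;

// Hypothetical helper (the name is an assumption, not part of any course material).
public class SubmissionSizeCheck {

    /** Total size in bytes of all regular files under root, hidden ones included. */
    static long sizeOf(Path root) {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .mapToLong(p -> p.toFile().length())
                        .sum();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Formats a byte count as B, KB, or MB with one decimal place. */
    static String formatSize(long bytes) {
        if (bytes < 1024) {
            return bytes + " B";
        }
        if (bytes < 1024 * 1024) {
            return String.format(Locale.ROOT, "%.1f KB", bytes / 1024.0);
        }
        return String.format(Locale.ROOT, "%.1f MB", bytes / (1024.0 * 1024.0));
    }

    public static void main(String[] args) {
        // Directory to inspect: first argument, or the current directory.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");

        // Files.list does not skip dot-directories, so .metadata shows up too.
        List<Path> children = new ArrayList<>();
        try (Stream<Path> entries = Files.list(root)) {
            entries.forEach(children::add);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        // Compute each size once, then sort largest first.
        Map<Path, Long> sizes = new HashMap<>();
        for (Path child : children) {
            sizes.put(child, sizeOf(child));
        }
        children.sort((a, b) -> Long.compare(sizes.get(b), sizes.get(a)));

        for (Path child : children) {
            System.out.printf("%10s  %s%n", formatSize(sizes.get(child)), child.getFileName());
        }
    }
}
```

Run it once on the workspace directory and once on the project directory you actually intend to turn in (for example, java SubmissionSizeCheck path/to/workspace). If most of the size sits in .metadata, that points to possibility 2 above; if it sits in a single jar inside the project, that points to possibility 3.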