Saturday, July 9, 2005

Tools for managing documents

Reading and citing papers and books are everyday activities for researchers. There are many commercial tools for managing documents. In this note I recommend three pieces of free software.

JabRef is an open source BibTeX database manager. It is written in Java and has a neat GUI. If you are familiar with LaTeX and BibTeX, the software is very easy to use. It can also import data from most reference sources. For Word users, it can generate RTF according to an output template, either one included in the release package or one you define yourself.
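For readers who have not seen a BibTeX database, it is just a text file of entries. Below is a minimal Python sketch that renders one entry in the usual BibTeX layout. The function name format_bibtex, the citation key, and all field values are made up for illustration; this is not JabRef code.

```python
def format_bibtex(entry_type, key, fields):
    """Render one BibTeX entry as text (hypothetical helper, not part of JabRef)."""
    lines = [f"@{entry_type}{{{key},"]
    for name, value in fields.items():
        # Each field is written as: name = {value},
        lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


# Made-up example entry; the key and field values are placeholders.
entry = format_bibtex(
    "article",
    "doe2005example",
    {
        "author": "Jane Doe",
        "title": "An Example Title",
        "journal": "Journal of Examples",
        "year": "2005",
    },
)
print(entry)
```

A file made of entries like this one is the kind of database that a BibTeX manager such as JabRef works with.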

For those who have thousands of documents saved on local disks, entering every document into a database by hand is impractical. It is better to have a tool index the documents automatically. The current version of Google Desktop Search can index and search Word, Excel, PDF, and PowerPoint documents, and even the caches of most popular web browsers. The search results are presented in the same way as Google web search results.
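To give a rough idea of what automatic indexing means, here is a minimal Python sketch of an inverted index, a common structure behind full-text search. It is only an illustration under simple assumptions (plain-text input, whole-word matching); it says nothing about how Google Desktop Search works internally, and the names build_index and search and the sample file names are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return the documents that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Made-up sample documents standing in for files on disk.
docs = {
    "notes.txt": "BibTeX database management",
    "paper.txt": "Indexing PDF documents for search",
    "todo.txt": "Search the BibTeX database",
}
index = build_index(docs)
print(sorted(search(index, "bibtex database")))
```

The index is built once, and after that each query only looks up a few sets instead of rereading every file, which is why such a tool is practical for thousands of documents.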

Docco is part of the open source Tockit project. It can index the text, HTML, XML, PDF, and Word documents in a folder and generate a graph whose nodes are the search keywords. The user sees how often each keyword, and each combination of keywords, occurs in the document collection. Indexing can take a very long time if the folder contains many documents. However, never set the indexing priority to the highest, because the system will become unresponsive.
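The idea of counting keywords and their combinations can be sketched in a few lines of Python. This is only my illustration of the concept, assuming plain-text documents and whole-word matching; it is not Docco's actual algorithm, and the function name combination_counts and the sample data are hypothetical.

```python
from itertools import combinations


def combination_counts(documents, keywords):
    """Count how many documents contain each keyword and each keyword combination."""
    doc_words = [set(text.lower().split()) for text in documents]
    counts = {}
    for size in range(1, len(keywords) + 1):
        for combo in combinations(keywords, size):
            # A document matches a combination only if it contains all its keywords.
            counts[combo] = sum(
                1 for words in doc_words if all(k in words for k in combo)
            )
    return counts


# Made-up sample collection.
docs = [
    "latex bibtex reference",
    "bibtex database",
    "latex template",
]
counts = combination_counts(docs, ["latex", "bibtex"])
for combo, n in counts.items():
    print(" + ".join(combo), "->", n)
```

Each key in the result corresponds to one node of such a graph: a single keyword or a combination, together with the number of matching documents. Note that the number of combinations grows quickly with the number of keywords, so this naive version only suits short queries.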