Clusters a set of documents into a user-specified number of bins. For each bin, identifies the defining topics/keywords.
Use topic extraction to explore the major topics in a diverse batch of documents. The algorithms require no training (no document annotation is needed), so topic extraction can serve as a rapid assessment tool.
Your browser will display a topic table showing the number of documents in each cluster and the defining keywords (topic signature) of each cluster. In addition, the output CSV available for download will contain all the original columns of the input CSV, plus a "Topic" column indicating the bin to which each document belongs. The output CSV will also contain the topic table described above, appended to the right of the Topic column.
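The output layout described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the function name `build_topic_output`, its parameters, and the "Documents" and "Keywords" field names are hypothetical, the bin labels are assumed to be already computed, and raw word frequency stands in for whatever keyword scoring the tool really uses.

```python
import csv
import io
from collections import Counter


def build_topic_output(input_csv_text, text_column, topics, top_n=3):
    """Add a "Topic" column to every row and build a per-topic summary table.

    `topics` holds one bin label per input row. Computing those labels
    (the clustering itself) is outside the scope of this sketch.
    """
    reader = csv.DictReader(io.StringIO(input_csv_text))
    rows = list(reader)
    if len(rows) != len(topics):
        raise ValueError("need exactly one topic label per document")

    doc_counts = Counter(topics)   # number of documents per bin
    word_counts = {}               # per-bin word frequencies
    for row, topic in zip(rows, topics):
        row["Topic"] = topic
        words = row[text_column].lower().split()
        word_counts.setdefault(topic, Counter()).update(words)

    # Topic table: document count and most frequent words for each bin.
    # Assumption: plain word frequency is used as the "topic signature".
    topic_table = [
        {
            "Topic": topic,
            "Documents": doc_counts[topic],
            "Keywords": " ".join(w for w, _ in word_counts[topic].most_common(top_n)),
        }
        for topic in sorted(doc_counts)
    ]

    # Output CSV: all original columns followed by the new "Topic" column.
    out = io.StringIO()
    fieldnames = list(reader.fieldnames) + ["Topic"]
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue(), topic_table
```

Calling `build_topic_output(csv_text, "text", labels)` returns the augmented CSV text and the topic table as a list of dictionaries. For simplicity the sketch returns the table separately instead of appending it to the right of the Topic column.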
Overview of Input and Output file formats.
Classifies a set of documents as relevant or non-relevant to a topic. Assigns discrete (1-6) priority scores to each document.
Classifies a set of documents as relevant or non-relevant to a topic. Assesses the probability of a document being relevant to the topic of interest.
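The difference between the two classifier outputs, a discrete 1-6 priority score versus a relevance probability, can be illustrated with a short sketch. The descriptions above do not say how priority scores are derived or whether the two are related, so everything here is an assumption made purely for illustration: the function names, the 0.5 relevance threshold, the equal-width binning, and the convention that a higher score means higher estimated relevance.

```python
def is_relevant(probability, threshold=0.5):
    """Binary relevant / non-relevant decision (threshold is an assumption)."""
    return probability >= threshold


def priority_from_probability(probability, levels=6):
    """Map a probability in [0, 1] to a discrete score from 1 to `levels`.

    Assumption: equal-width bins, with `levels` as the most relevant bin.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    # int() truncates toward zero; min() keeps probability == 1.0 in the top bin.
    return min(int(probability * levels) + 1, levels)
```

A probability keeps the full ranking information, so documents can be sorted finely, while a discrete score trades that resolution for a small set of review tiers that is easier to act on.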