Topical Guide

Demo

Explore the 2011-2012 Republican Presidential Candidate Debates using the Topical Guide.

Download

The code is in a Git repository hosted on GitHub. It can be obtained using the following command:

git clone https://github.com/BYU-NLP-Lab/topicalguide.git

Please see Getting Started for information on how to use the code.

The Topical Guide is a web application that facilitates the discovery of topical patterns and trends in large document collections. It relies on probabilistic topic models, such as LDA, to reveal the semantic content of such corpora. Many individualized visualizations of topic models have been reported in the literature, showing the potential of topic models to give valuable insight into a corpus. However, good, general, interactive tools for browsing the entire output of a topic model along with the analyzed corpus have been lacking. The Topical Guide is an interactive tool that combines prior work on displaying topic models with some novel ideas that make it easier to discover trends in the data analyzed by these models.

The Topical Guide is a Django app. After inferring a topic model from a set of documents and importing the data into the browser, you use the tool by running a locally hosted web server and exploring the data through a web browser.
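As a rough illustration of the kind of topic model output such a tool lets you browse, the sketch below summarizes token-level topic assignments into the top words for each topic and the topic proportions for each document. It is not taken from the Topical Guide code base: the toy data and the function names `top_words` and `topic_proportions` are assumptions made up for this example.

```python
from collections import Counter, defaultdict

# Hypothetical toy data (not from the Topical Guide): each token carries the
# topic that an LDA-style model assigned to it, as (document id, word, topic id).
ASSIGNMENTS = [
    ("debate1", "tax", 0),
    ("debate1", "budget", 0),
    ("debate1", "tax", 0),
    ("debate1", "border", 1),
    ("debate2", "border", 1),
    ("debate2", "immigration", 1),
    ("debate2", "tax", 0),
    ("debate2", "border", 1),
]


def top_words(assignments, n=2):
    """Return the n most frequent words assigned to each topic."""
    counts = defaultdict(Counter)
    for _doc, word, topic in assignments:
        counts[topic][word] += 1
    return {
        topic: [word for word, _ in counter.most_common(n)]
        for topic, counter in counts.items()
    }


def topic_proportions(assignments):
    """Return, for each document, the fraction of its tokens in each topic."""
    counts = defaultdict(Counter)
    for doc, _word, topic in assignments:
        counts[doc][topic] += 1
    return {
        doc: {topic: k / sum(counter.values()) for topic, k in counter.items()}
        for doc, counter in counts.items()
    }


if __name__ == "__main__":
    print(top_words(ASSIGNMENTS))
    print(topic_proportions(ASSIGNMENTS))
```

Summaries like these (a short word list per topic, a topic mix per document) are the sort of view a topic browser typically builds on; see the documentation page for what the Topical Guide itself provides.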

Papers

Our system paper describing the capabilities of the Topical Guide:

Matthew J. Gardner, Joshua Lutes, Jeff Lund, Josh Hansen, Dan Walker, Eric Ringger, Kevin Seppi. "The Topic Browser: An Interactive Tool for Browsing Topic Models". Proceedings of the Workshop on Challenges of Data Visualization, held in conjunction with the 24th Annual Conference on Neural Information Processing Systems (NIPS 2010). December 11, 2010. Whistler, BC, Canada.

Documentation

For information about many aspects of the Topical Guide, including how to import data into it, how to run the server locally, how to use the browser once it is running, and how to add new features, visit our documentation page.

License

The code for the Topical Guide is released under the terms of the AGPLv3 or (at your choice) any later version of that license. If for any reason you wish to use the code under other terms, please contact the Copyright Licensing Office, Brigham Young University, 3760 HBLL, Provo, UT 84602, (801) 422-9339 or 422-3821, Email: copyright AT byu DOT edu.

We also ask that, if you use this code for academic purposes, you cite the Gardner et al. paper referenced above in any resulting papers.

Contributions to the code are welcome. Currently the best way to contribute is to email a patch to textmining AT cs DOT byu DOT edu. Because of licensing issues, we ask that you assign the copyright of any patch that you contribute to BYU.

Credits

Project Leaders: Eric Ringger and Kevin Seppi

Project Members: Jeff Lund, Chris Tensmeyer, Joey Cozza, Craig Jacobson

Alumni: Jared Forsyth, Matt Gardner, Josh Hansen, Tobias Kin Hou Lei, Joshua Lutes, Dan Walker

Original Author: Joshua Lutes

Third-party software: See the list of third-party software used by the Topical Guide.