Ashwini Badgujar, Sheng Cheng, Andrew Wang, Kai Yu, Paul Intrevado and David Guy Brizan, University of San Francisco, USA
In this project, we continuously collect data from the RSS feeds of traditional news sources. We apply several pre-trained implementations of named entity recognition (NER) tools and quantify the success of each implementation. We also perform sentiment analysis of each news article at the document, paragraph and sentence levels, with the goal of creating a corpus of tagged news articles that is publicly available through a web interface. Finally, we show how the data in this corpus could be used to identify bias in news reporting.
Content Analysis, Named Entity Recognition, Sentiment Analysis.