A Novel System for Regional Twitter Hate Speech Analysis and Detection using Deep Learning Models and Web Scraping

Authors

Nicole Ma¹, Yu Sun², ¹Sage Hill School, USA, ²California State Polytechnic University, USA

Abstract

Instances of hate speech on popular social media platforms such as Twitter are becoming increasingly common and intense. However, comprehensive deep learning models to combat Twitter hate speech are still lacking. In this project, a comprehensive detection and reporting platform, entitled “TweetWatch,” was created to address this gap. A binary classification CNN (Convolutional Neural Network) was created to detect hate speech in real-time Twitter data, and a multi-class CNN was created to classify tweets containing hate speech into five categories. The binary classification model has an AUC score of 98.95% and an F1 score of 97.88%. The multi-class classification model has an AUC score of 89.46%. All metrics exceeded the targeted 5% improvement over previous models reported in multiple papers, validating the proposed solution. Additionally, the only real-time choropleth map for hate speech in the United States was successfully created.
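
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how an F1 score and an AUC score can be computed for a binary classifier. It is an illustration only and is not the evaluation code used for TweetWatch. The function names `f1_score` and `roc_auc`, the toy labels and scores, and the 0.5 decision threshold are all assumptions introduced here. The sketch covers the binary case only; how the multi-class AUC was averaged is not stated in this abstract.

```python
def f1_score(y_true, y_pred):
    """F1 is the harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are both 0 (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = hate speech, 0 = not hate speech.
labels = [1, 0, 1, 1, 0, 0]
# Hypothetical model outputs (probability of the positive class).
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]
# Assumed decision threshold of 0.5 to turn scores into hard predictions.
preds = [1 if s >= 0.5 else 0 for s in scores]

if __name__ == "__main__":
    print("F1:", f1_score(labels, preds))
    print("AUC:", roc_auc(labels, scores))
```

Note that F1 depends on the chosen threshold, while AUC is computed from the raw scores and is threshold-independent, which is one reason both are commonly reported together.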

Keywords

Web scraping, Natural language processing, Deep learning, Neural networks.

CS&IT Conference Proceedings
