Computational Political Science

This project creates a general research platform to study civil protests, international conflict, and civil unrest using texts from around the world in English, Spanish, and Arabic. Event data extracted from text is becoming one of the most important new sources of information for quantitative political science, but most publicly available event datasets are limited to English-language sources.

In this project, we build pipelines to extract political events and create tools to organize teams to efficiently create gold-standard events.

Code

Fajita: Event data crowd-sourcing tool
Biryani: Natural language processing event data pipeline
Birdcage: A distributed framework for generating event codings with geolocation

Publications

Andrew Halterman, Christan Grant, Jill A. Irvine, Yan Liang, Manar Landis. Arabic Event Data: A New, Machine-Coded Dataset from Arabic Text. Annual Meeting of the International Studies Association (ISA). San Francisco, CA. 2018.

Yan Liang, Andy Halterman, Phanindra Jalla, Solaimani Mohiuddin, Manar Landis, Jill Irvine, Christan Grant. Adaptive Scalable Pipelines for Political Event Data Generation. The IEEE International Workshop on Benchmarking, Performance Tuning and Optimization for Big Data Applications (BPOD 2017). Boston, Massachusetts. 2017.