Program Day 2

January 7 (Wednesday)

Session 5: Poster Session (9:00–10:30 am) Zoom
	Breakout Room 1	Breakout Room 2
	Moderator: Yuki Shiraito	Moderator: Ben Goldsmith
	Seo-young Silvia Kim, Bernard L. Fraga, Bradley Spahn, Alan N. Yan. When Do Voter Files Accurately Measure Turnout? How Transitory Voter File Snapshots Impact Research and Representation.	Jiongyi Cao, Kosuke Imai, Michael Lingzhi Li. Experimental Evaluation of Dynamic Individualized Treatment Rules. In recent years, machine learning algorithms have been used to develop individualized treatment rules (ITRs). Applications of such methodology include personalized medicine and micro-targeting in business and politics. What is lacking in the literature, however, is a robust way to evaluate the empirical performance of ITRs before implementing them in practice. Recently, Imai and Li (2021) introduced an experimental evaluation methodology that relies only upon the randomization of treatment assignment and random sampling of units without making any modeling assumptions. Thus, the methodology is applicable to ITRs that are derived using any generic machine learning algorithm. We extend this methodology to dynamic ITRs in sequential multiple assignment randomized trials (SMART). We introduce an evaluation metric that decomposes the performance of a dynamic ITR into separate time periods while accounting for a budget constraint. We propose an unbiased estimator of this evaluation metric and derive its finite-sample variances. We conduct simulation studies to show that the confidence intervals based on the proposed finite-sample variance estimator have good coverage even with a small sample size. Finally, we apply our methodology to the experimental data from Tennessee’s Student Teacher Achievement Ratio (STAR) project.
	Michio Umeda. Aggregating qualitative district-level campaign assessments to forecast the 2021 LH general election results in Japan. Poll aggregation is now widely used for electoral forecasting in the USA and European democracies. In Japan, however, this has not been the case because the news media report on electoral campaigns with qualitative assessments rather than poll numbers, although these assessments are based on extensive polling. Our study applies the approach Umeda (2021) developed, which aggregates the qualitative district-level election campaign coverage using Item Response Theory, to the coming 2021 general election for Japan’s Lower House (LH) of the National Diet. Umeda (2021) applied the method to forecast the 2017 LH general election outcomes based on the media coverage available before the voting day. We examine the effectiveness of the approach by assessing the accuracy of the forecast against the actual results of the 2021 LH general election.	Huaitian Lu, Xun Pang. A Causal-Predictive Machine Learning Method with Temporal Convolutional Networks for Panel Data. Recent developments in integrated causal-predictive machine learning models increase the learned representation of control units for counterfactual prediction while reducing model dependence. They have been increasingly used for causal inference with panel data to overcome the identification challenges arising from temporal and spatial dependencies. This research adopts the potential outcome framework and proposes a deep learning method using Temporal Convolutional Networks (TCNs) for causal inference with longitudinal data. This method is related to latent factor models and doubly-robust estimators, but it does not require proper model specifications or assume a linear history of the data-generating process (DGP).
Compared to other machine learning approaches such as Recurrent Neural Networks (RNNs), our TCN-based method not only exploits temporal dependence in the data but also utilizes spatial relationships to learn a representation of control units. It is also more computationally efficient because the convolutional architecture allows parallel computing. We test the performance of the proposed method with simulated data. In the demonstration using empirical data, we re-analyze the example in Poulos and Zeng (2021): an RNN-based approach and its application estimating the causal effect of U.S. homestead policy on public school spending.
	Matthew P. Robertson. Bringing Big Data to Chinese Politics: Learning From the People’s Daily Corpus. Over the past decade, the ability to collect massive datasets on political phenomena has allowed political scientists to ask and answer new questions. In the sub-field of Chinese politics, acquiring and analyzing large-scale text corpora is likely to become increasingly important given restrictions on fieldwork and threats to scholars’ safety. Here, I present a new resource for the study of Chinese politics: a dataset of 75 years of articles in the People’s Daily, comprising over 2 million records. I discuss the significance of this dataset in developing and testing theory about Chinese politics, introduce the data schema, and use the dataset to contribute new empirics to ongoing debates in Chinese politics: (1) I examine the salience of Xi Jinping in the People’s Daily versus previous Communist Party leaders, including Mao Zedong, and (2) I propose a forecasting model to explore whether characteristics of People’s Daily editorials and news reports are predictive of imminent political purges. I then suggest other questions in Chinese politics the data could be used to address. The dataset will be made publicly available and continuously updated on a public repository.	Hsu Yumin Wang. A Latent Measure of Mass Threats in Nondemocracies. Mass threats are a critical factor in explaining regime change and various political outcomes of authoritarian politics. However, the literature to date is divided over how to measure them in cross-national settings. To measure mass threats, numerous prior studies rely on measures related to economic grievances, whereas others emphasize the organizational capacity for mass mobilization. Additionally, substantial missing data remains a common problem in the existing measures of mass threats. In this paper, I propose a more comprehensive, latent measure of mass threats in non-democracies that seeks to bridge the divide.
Utilizing a Bayesian dynamic latent variable approach, the model synthesizes information on manifest indicators from the two facets, generating time-series cross-sectional data on mass threats covering 138 authoritarian countries from 1970 to 2008. I conduct several checks to demonstrate the validity of the new measure and use it to replicate Svolik’s (2013) central results of the inverted U-shaped relationship between mass threats and military intervention.
	Lucie Lu. We Hear You: How do State-run Media Pay Attention to Online Public Opinion? Winning citizens’ hearts and minds has long preoccupied autocrats, and scholars seem to agree that they have not been very skilled at it. While generations of scholars have pushed for a more nuanced view of elite communication in mobilizing voters in democracies, we know surprisingly little about how the dynamics of top-down communication shape the public in autocracies in the information age. How do autocrats propagandize on social media so that audiences will not simply ignore, resist or ridicule their messages? I argue that autocrats are attentive to the audiences’ non-political interests even at the expense of promoting their political messages. Based on original data collection on a Twitter-like Chinese social media platform, Weibo, I attempt to show under what conditions autocrats use state-controlled media to respond to the online public. I use a combination of machine learning algorithms (Spectral Clustering and Random Forests) to classify and analyze over 100,000 trending searches and social media posts from key state-controlled media outlets. I show that specific social and entertainment topics are more likely to attract responses from the state-controlled media, while state-controlled media also engineer casual topics for public discussions. The state-controlled media’s political goals of disseminating propaganda are secondary to economic goals of gaining readership on social media. I demonstrate that 1) the state-controlled media leverage soft news to broaden their influence on social media; 2) autocrats take cues from the online public and demonstrate responsiveness across issue spaces. The results challenge the traditional view of the state-controlled news outlets’ propaganda roles in authoritarian regimes.	Gechun Lin. When Counting Fails to Discover: Examining Short Text Similarity with A Meaning-Based Approach.
Political scientists have employed a variety of automatic methods to construct variables of interest from texts. The mainstream approaches rely on word counting to discover lexical or thematic similarity of corpora. However, they remain inaccurate for short texts because the limited number of words provides little context. In recent years, short texts have been increasingly used in political communication, delivering information crucial to modern politics. Given the need and the challenge of analyzing short texts, I propose a meaning-based approach to examine short texts based on semantic similarity. To implement this, I import a deep-learning model, GAN-BERT, which precisely predicts the semantic relationship of text pairs. This model takes advantage of contextualized text representations produced by BERT and utilizes a semi-supervised learning framework called GAN, which reduces the percentage of labeled data required for good performance. As a demonstration, I apply the model to obtain pairwise similarity of news headlines of US Supreme Court decisions. The GAN-BERT predictions show that similar news headlines are correlated with unanimous decisions. Other traditional approaches (including cosine similarity and Smith-Waterman alignment scores) fail to detect such a relationship. The contributions are twofold. First, this paper identifies a neglected dimension of text similarity, which is useful for studying short texts. Second, I introduce to political science a GAN framework for conducting text analysis effectively. The GAN-BERT model, which measures semantic text similarity, will enable a broad set of studies in political science, such as polarization and information spread on social media.

Break (10:30–10:40 am) Zoom
	Tea break
Session 6: Deep Learning and Principal Strata Effect (10:40 am–12:10 pm) Zoom
10:40 am-	Zhenhua Wang, Olanrewaju Akande, Jason Poulos, and Fan Li. Are deep learning models superior for missing data imputation in surveys? Evidence from an empirical comparison. (paper)
11:10 am-	Cyrus Samii, Ye Wang, Junlong Aaron Zhou. Generalizing Covariate-tightened Trimming Bounds for Principal Strata Effects Using Adaptive Kernels. (paper)
11:40 am-	Discussant: Kentaro Fukumoto (Gakushuin University)
Break (12:10–1:10 pm) Zoom
	Lunch break
Session 7: Big Data and Causal Ordering (1:10–2:40 pm) Zoom
1:10 pm-	Haohan Chen, Yiqiang Wang, Tony Zirui Yang. Emotional Propaganda: An Audio-As-Data Approach to Chinese State-Run Cable News. (paper)
1:40 pm-	David Carlson, Abdulhakim Özcan. Estimating Causal Orderings and Relationships with Causal Gaussian Processes. (paper)
2:10 pm-	Discussant: Inbok Rhee (KDI School of Public Policy and Management)
Break (2:40–2:50 pm) Zoom
	Tea break
Session 8: Analysis of Treaty Text and Preference (2:50–4:20 pm) Zoom
2:50 pm-	Soo Yeon Kim, Thiyaghessan s/o Poongundranar. The Language of Institutional Design: Text Similarity in Preferential Trade Agreements. (paper)
3:20 pm-	ByeongHwa Choi and Yesola Kweon. Age and Trade Policy Preferences in an Aging Society: Evidence from Japan. (paper)
3:50 pm-	Discussant: Sung Eun Kim (Korea University)
Concluding Remarks (4:20 pm) Zoom
	Concluding Remarks