Can an algorithm predict a politician’s future just by analyzing their tweets? | Karina Petrova, PsyPost

A new statistical model has successfully sorted members of the U.S. Congress into distinct political and legislative groups based solely on their interaction patterns on the social media platform X. Published in the Journal of Computational and Graphical Statistics, the study also identified a small number of outliers whose online behavior appeared to signal ambitions for higher office.

Politicians use public platforms to communicate their principles and policy stances to voters. Beyond formal statements, many also cultivate a personal brand through their choices in language and style. With a great deal of political messaging now occurring online, researchers have been investigating how elected officials position themselves by strategically associating with others in their coalition.

The new study was conducted by Benjamin Leinwand, an assistant professor of mathematical sciences at Stevens Institute of Technology, and Vince Lyzinski, a mathematics professor at the University of Maryland. They specialize in network science, a field that analyzes the connections within complex systems. They sought to determine if a model could deduce the underlying structure of a political network by observing only the interactions, without being given any information about political affiliations or which chamber of Congress a member belongs to.

To understand the social landscape of Congress, the researchers needed a tool that could map out the complex web of online interactions. At its core, any such statistical model attempts to calculate a simple value for every pair of individuals: the probability that they will connect. This produces a blueprint of the network, showing which connections are likely and which are not.

Some widely used network models approach this task by combining a few key factors.
For instance, a model might estimate the probability of a connection by multiplying a person’s individual “sociability” score with a score representing how interactive their group is. This method works well in many scenarios, but it can break down in networks with extreme variations.

The issue arises in densely connected communities where a few individuals are exceptionally active. In such cases, the model might assign very high sociability scores to these active people and a high interaction score to their group. When these high scores are multiplied together, the resulting probability can exceed 1, or 100 percent. This is a mathematical impossibility that signals the model is failing to accurately represent the underlying social dynamics.

The new model developed by Leinwand and Lyzinski is built on a different mathematical foundation specifically designed to avoid this problem. Its internal calculations are constructed in a way that guarantees the final output for any pair of politicians is always a valid probability, a number between 0 and 1. This ensures the model produces a coherent and logical map of the network, even in its most active and complex regions.

Beyond just preventing errors, this new approach offers greater flexibility. It does not assume that the patterns of connection are the same across the entire network. For example, some models might implicitly assume that the most socially active members of one group are most likely to connect with the most active members of another. The new model, however, can detect more intricate patterns. It could, for instance, find a situation where moderate members of two different political parties interact frequently, while the most partisan members of those same parties interact very little. It can also recognize that an individual’s tendency to form connections might change depending on the community they are interacting with, providing a more detailed and realistic portrait of political communication.
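The overflow problem described above, where multiplied scores produce a "probability" greater than 1, can be seen in a few lines of Python. This is only an illustrative sketch: the function names, the choice to multiply both individuals' scores, and the numeric values are assumptions made for the example. The bounded alternative shown here (treating the product as odds rather than as a probability) is a generic textbook device, not the construction Leinwand and Lyzinski use, which the article does not spell out.

```python
def product_probability(sociability_i, sociability_j, group_rate):
    """Multiplicative rule: nothing in the formula caps the result at 1."""
    return sociability_i * sociability_j * group_rate


def bounded_probability(sociability_i, sociability_j, group_rate):
    """Generic bounded alternative (an assumption for illustration only):
    read the same product as odds, so the result x / (1 + x) always lies
    strictly between 0 and 1 for positive scores."""
    odds = sociability_i * sociability_j * group_rate
    return odds / (1.0 + odds)


# Hypothetical scores for two very active members of a dense community.
active_a = 1.8
active_b = 1.5
dense_group_rate = 0.6

naive = product_probability(active_a, active_b, dense_group_rate)
bounded = bounded_probability(active_a, active_b, dense_group_rate)

print("multiplicative rule:", naive)
print("bounded rule:", bounded)
```

With these made-up inputs the multiplicative rule returns a value above 1, while the bounded rule stays inside the valid range by construction.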
Using this model, Leinwand and Lyzinski analyzed the public activity of 475 members of the 117th U.S. Congress. Their dataset included every member who posted at least 100 tweets during a four-month period, from February 9, 2022, to June 9, 2022. The model defined a connection between any two politicians if one of them tweeted at or retweeted the other during this time frame. “We call two people ‘connected,’ if either one in the pair tweeted at the other one or retweeted the other one during this period,” Leinwand explained.

The model was not provided with any information about a politician’s party, their chamber, or their policy positions. It was tasked with sorting the 475 individuals into groups based only on the web of their digital connections.

The model identified three primary communities. These algorithmically defined groups fell along familiar political lines. The first group was composed almost entirely of Senators. The second community consisted mainly of Democratic members of the House of Representatives, and the third was made up largely of Republican members of the House.

The analysis showed that politicians within these three groups tended to interact most frequently with members of their own community. “Republican congresspeople talked among themselves a lot, and Democratic congresspeople talked among themselves a lot, though Democratic congresspeople were somewhat more likely to interact with Senators than their Republican counterparts,” said Leinwand.

He offered a potential explanation for this pattern. At the time of the observation, Democrats held the majority in the Senate. As a result, “one could imagine that Democratic congresspeople might be incentivized to amplify senate leadership messaging in addition to their allies in the House,” he continued. ...
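The connection rule Leinwand describes (two members count as connected if either one tweeted at or retweeted the other) amounts to turning a directed record of interactions into an undirected set of pairs. A minimal sketch of that step follows. The function name, the input format, and the account labels are hypothetical, and whether the study discarded self-interactions is not stated in the article, so dropping them here is an assumption.

```python
def build_connections(interactions):
    """Collapse directed (source, target) interactions into undirected pairs.

    `interactions` is an iterable of (source, target) tuples, one per
    mention or retweet. A pair is connected if an interaction exists in
    either direction; repeated interactions collapse to a single pair.
    """
    connections = set()
    for source, target in interactions:
        if source == target:
            # Assumption: self-mentions carry no network information.
            continue
        connections.add(frozenset((source, target)))
    return connections


# Hypothetical interaction log using placeholder account labels.
log = [
    ("member_a", "member_b"),  # a retweets b
    ("member_b", "member_a"),  # b mentions a: same undirected pair
    ("member_c", "member_a"),  # c mentions a
    ("member_a", "member_a"),  # self-mention, ignored
]

for pair in sorted(sorted(p) for p in build_connections(log)):
    print(pair)
```

Because each pair is stored as a `frozenset`, the direction of the original tweet is deliberately discarded, which matches the "either one in the pair" wording of the quoted definition.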