As now-president Donald Trump prepared his inauguration speech about making America great again, we decided to aggregate and visualize the game of tweets between the two US presidential candidates, Hillary Clinton and Donald Trump, in the final stretch of the campaign.
We were specifically interested in tweets where they mentioned each other or other people. To add a little spice, we played around with some simple sentiment analysis too. Scroll down for our key findings.
The game of tweets in one Tableau Public dashboard
How many times did the candidates tweet about each other?
We set up a timeline and counted, day by day, the number of tweets in which they mentioned each other. It turned out that Hillary Clinton tweeted about “the Donald” more than twice as often on average as he tweeted about her. This seems to support the theory that Hillary’s campaign team and pro-Democratic media outlets had a huge positive impact on spreading the messages of “the Donald”, despite their negative articles about him.
There are also significant peaks in both candidates’ timelines, so we looked at what happened around those dates. The number of tweets typically spiked on the days following the presidential debates and after Trump’s surprisingly harsh roast of Hillary at the annual Al Smith charity dinner in New York.
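The daily count described above can be sketched in a few lines of Python. This is only an illustration, not the KNIME workflow we actually built: the function name, the keyword list and the placeholder tweets are all assumptions made up for the example.

```python
from collections import Counter
from datetime import datetime


def daily_mention_counts(tweets, keywords):
    """Count, per calendar day, the tweets whose text contains any keyword.

    `tweets` is an iterable of (timestamp, text) pairs and `keywords` a list
    of lowercase substrings; both are hypothetical inputs for this sketch.
    """
    counts = Counter()
    for created_at, text in tweets:
        lowered = text.lower()
        if any(keyword in lowered for keyword in keywords):
            counts[created_at.date()] += 1
    return counts


# Placeholder tweets, not real data.
sample = [
    (datetime(2016, 10, 20, 9, 0), "placeholder tweet about hillary"),
    (datetime(2016, 10, 20, 21, 30), "placeholder tweet about a rally"),
    (datetime(2016, 10, 21, 8, 15), "placeholder tweet tagging @HillaryClinton"),
]
counts = daily_mention_counts(sample, ["hillary", "clinton"])
```

Substring matching is deliberately crude here; a stricter version would match on tagged handles or whole words only.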
What kind of words did they use and how many times?
We filtered the results for each candidate’s 10 most frequently used positive and 10 most frequently used negative words. The Donald seems to have used significantly more emotional words in his tweets than Hillary, regardless of polarity. Also, despite his wide and inconsistent range of proposals, he was very consistent in his choice of adjectives, as shown by the frequency of the top one or two words in his top 10s (“crooked Hillary”, for example).
How did they tweet about other people?
It looks like Hillary mostly tagged people when she had something positive to say (the Donald being the exception), in contrast to Trump’s usual roasts of pro-Democratic media outlets like CNN or NYT. One surprising fact in his mentions is that he hate-tweeted more about both CNN and NYT than about his opponent. We wonder if it was an intentional way to get them to write more about him and spread his messages for free, or if it was a bug coded into his personality that accidentally turned into a feature during the campaign. Maybe we should do an analysis on that later…
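For readers who prefer code to dashboards, a top-10 filter like the one above could look roughly like this in Python. It is a sketch under assumptions: the word counts and the polarity lookup are hypothetical dictionaries, and in our project this filtering was done in Tableau, not in code.

```python
from collections import Counter


def top_words(word_counts, polarity_of, polarity, n=10):
    """Return the n most frequent words that carry the given polarity.

    `word_counts` maps word -> frequency, `polarity_of` maps word ->
    "positive" or "negative"; both are hypothetical inputs.
    """
    filtered = Counter(
        {word: count for word, count in word_counts.items()
         if polarity_of.get(word) == polarity}
    )
    return filtered.most_common(n)


# Made-up numbers, only to show the shape of the data.
example_counts = {"great": 5, "bad": 3, "win": 2, "the": 9}
example_polarity = {"great": "positive", "win": "positive", "bad": "negative"}
top_positive = top_words(example_counts, example_polarity, "positive", n=10)
```

Words missing from the polarity lookup (like "the" above) are simply ignored, which mirrors using a dictionary that only knows positive and negative expressions.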
Since this is public information, based on publicly available data, we only used free tools. For extracting and enriching the Twitter data we chose KNIME, because it has a built-in Twitter connector and a simple user interface for data manipulation. To get the most beautiful dashboard at the end, we used Tableau Public for visualization.
Data extraction and enrichment with KNIME
We started by extracting all timeline data of the two candidates, which turned out to be quite easy with KNIME. We dropped in a Twitter API connection node to handle the authentication and connected it to two timeline-extractor nodes, one for each candidate. Then, on one branch, we extracted the tagged users from every tweet and counted how many times each Twitter user was tagged by a candidate in the “mentions” metanodes. On the other branch we counted the frequency of the words used by each candidate, and while doing that we had some fun with sentiment analysis to see which candidate beefed the other harder.
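The counting inside the “mentions” branch can be approximated outside KNIME as well. The Python sketch below is an assumption-laden stand-in, not the actual metanode: it pulls handles out of the raw tweet text with a regular expression, and the function name and sample strings are invented for illustration.

```python
import re
from collections import Counter

# "@" followed by up to 15 word characters; the lookbehind skips
# e-mail-like strings such as "a@b.com".
MENTION_RE = re.compile(r"(?<!\w)@(\w{1,15})")


def count_mentions(tweets):
    """Count how often each handle is tagged across a list of tweet texts."""
    counts = Counter()
    for text in tweets:
        for handle in MENTION_RE.findall(text):
            counts[handle.lower()] += 1  # handles are case-insensitive
    return counts


# Placeholder texts, not real tweets.
mentions = count_mentions([
    "placeholder text tagging @CNN and @nytimes",
    "another placeholder tagging @cnn",
    "write to a@b.com",
])
```

Running this once per candidate gives one mention table each, which is the shape of data the dashboard needs for the "who did they tweet about" view.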
As a next step, we looked up a sentiment dictionary online. We were not aiming for the most comprehensive sentiment model the world has seen, so we used a dictionary that only distinguishes between positive and negative expressions, without weighting them on any scale. Once we had our dictionary, we counted the positive and negative expressions separately in each tweet and put the counts in two new columns so that we could use them for calculations later in Tableau.
We also mapped the countries mentioned by each candidate in a third branch, but we did not use that data in our Tableau dashboard.
At the end of our KNIME job we had three CSV files: one containing the words used by the candidates, their polarity and how many times each word was used by each candidate, and two more, one per candidate, containing all of that candidate’s tweets with additional fields such as positive word count and negative word count.
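A dictionary-based count like this is easy to reproduce in code. Here is a minimal Python sketch, assuming two plain word sets as the dictionary; the tiny word lists and the tokenizing regex are our own illustrative choices, not the dictionary or the KNIME nodes we actually used.

```python
import re

WORD_RE = re.compile(r"[a-z']+")

# Tiny stand-in word lists, purely illustrative.
POSITIVE = {"great", "win", "proud"}
NEGATIVE = {"crooked", "bad", "failing"}


def polarity_counts(text, positive=POSITIVE, negative=NEGATIVE):
    """Return (positive_count, negative_count) for one tweet text.

    Every dictionary hit counts as 1; there is no weighting.
    """
    words = WORD_RE.findall(text.lower())
    pos = sum(1 for word in words if word in positive)
    neg = sum(1 for word in words if word in negative)
    return pos, neg


pos_count, neg_count = polarity_counts("Great, great night. Bad press!")
```

The two returned numbers correspond to the two extra columns per tweet; any net score can then be computed later in the visualization layer.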
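To make the output format more concrete, here is how the word table could be written with Python’s standard `csv` module. The column names are our guess at a sensible layout and are assumptions; the real files came out of KNIME and may be structured differently.

```python
import csv
import io

# Hypothetical column names for the word-level file.
WORD_FIELDS = ["candidate", "word", "polarity", "count"]


def write_word_table(rows, fileobj):
    """Write word-frequency rows (dicts keyed by WORD_FIELDS) as CSV."""
    writer = csv.DictWriter(fileobj, fieldnames=WORD_FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Write to an in-memory buffer; a real run would pass an open file instead.
buffer = io.StringIO()
write_word_table(
    [{"candidate": "A", "word": "great", "polarity": "positive", "count": 3}],
    buffer,
)
csv_lines = buffer.getvalue().splitlines()
```

The two per-candidate tweet files would follow the same pattern, with one row per tweet and the positive and negative word counts as extra columns.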
Tableau Vizzard: Ivett Kovacs
KNIME Ninja: Matyas Sereg
Concept: Laszlo Kovacs