
Do. Data.


Interested in getting more out of your data? Our Do. Data event series is for anyone interested in tackling data-related challenges, sharing their experiences with others, and enjoying an Apéro on us!

Our previous Do. Data events

Big Data & Twitter | 19 Oct 2017

We compiled data from three sources: tweets from @realDonaldTrump (Twitter), Trump’s connections (BuzzFeed), and the world’s richest people, most powerful people, and top 2000 companies (Forbes lists). For both storage and analysis, we picked the right tool for each job: Python to download the data from the Twitter APIs, a document database (MongoDB) to store the JSON files, Tableau to analyze and visualize the tweet patterns, and a graph database (neo4j) to analyze and visualize the connections and relationships. One insight, from visualizing his tweeting patterns in Tableau, was that Trump started to use paid marketing on Twitter at a time when his approval ratings dropped significantly. Another insight came from a K-means clustering of Trump’s Twitter friends and followers. Surprisingly, not all of his followers were conservative-leaning gun-rights advocates. Rather, there were five distinct groups, including urbanites/professionals.
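
For readers curious about the clustering step mentioned above, here is a rough sketch of K-means (Lloyd's algorithm) in plain Python. The event write-up does not say which features or tooling were used, so the function name `kmeans`, the two-dimensional feature vectors, and the sample points are all assumptions made up for illustration.

```python
import random

def kmeans(points, k, iterations=50, seed=0):
    """Minimal K-means (Lloyd's algorithm); returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # empty cluster: keep old centroid
        if new_labels == labels and new_centroids == centroids:
            break  # nothing changed, so the algorithm has converged
        labels, centroids = new_labels, new_centroids
    return centroids, labels

# Made-up feature vectors (hypothetical, not the Twitter data).
sample_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(sample_points, k=2)
```

With real follower data, each point would be a vector of numeric attributes per account, and `k` would be chosen by inspecting how well the clusters separate.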

For more insights, check out our Tableau workbook below, download our presentation, and try out some of the Cypher queries in neo4j. Contact us if you’d like a copy of the Trump World neo4j database that includes all the data from Twitter, BuzzFeed, and Forbes lists.

Big Data Twitter

New York City Taxis & Uber | 18 May 2017

In New York City, with a population of roughly 8.5M, there are over 750K taxi rides every day (incl. Uber, Lyft, etc.). The market leader is Yellow Taxis with 42% of the market, followed by Uber with 30% (as of 2017). We downloaded 25M records for Dec 2016 from NYC OpenData/Taxi & Limo Commission and visualized them in Tableau. One surprising insight is that Uber has a higher share of rides in upper Manhattan and the outer boroughs and a lower share in lower Manhattan, likely because it faces stiff competition there from the Yellow Taxis (lower Manhattan is designated the “Yellow Zone” by the NYC Taxi & Limo Commission).
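
The per-area comparison above comes down to counting rides by provider within each zone. A minimal sketch of that calculation follows. The function name `market_share_by_zone`, the zone labels, and the sample rides are hypothetical and are not taken from the Taxi & Limo Commission data.

```python
from collections import Counter, defaultdict

def market_share_by_zone(rides):
    """rides: iterable of (zone, provider) pairs -> {zone: {provider: share}}."""
    counts = defaultdict(Counter)
    for zone, provider in rides:
        counts[zone][provider] += 1
    # Divide each provider's ride count by the zone total to get its share.
    return {
        zone: {p: n / sum(c.values()) for p, n in c.items()}
        for zone, c in counts.items()
    }

# Made-up sample rides for illustration only.
sample_rides = [
    ("Lower Manhattan", "Yellow"), ("Lower Manhattan", "Yellow"),
    ("Lower Manhattan", "Yellow"), ("Lower Manhattan", "Uber"),
    ("Upper Manhattan", "Uber"), ("Upper Manhattan", "Uber"),
    ("Upper Manhattan", "Yellow"), ("Upper Manhattan", "Lyft"),
]
shares = market_share_by_zone(sample_rides)
```

In Tableau the same result would typically come from a percent-of-total calculation partitioned by zone.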

See our Tableau workbook below for the full story. Disclaimer: The data has been limited to 16–31 Dec 2016 due to the Tableau Public limit of 15M records.

Open Data New York Taxis Uber

Australia’s Gender Pay Gap | 23 Mar 2017

The Australian Taxation Office published tax data for the years 2013-2014, with average income broken out by gender and occupation. Not surprisingly, males had higher average incomes than females in the same occupation, in almost all occupations. Exceptions included futures traders and mountain guides. While the data offers some interesting insights, on its own it doesn’t tell us whether females and males differ in average hourly wage.
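
As a small illustration of the comparison described above, the gap for each occupation can be expressed as a fraction of the male average income. The helper `pay_gap`, the occupation labels, and the income figures are placeholders invented for this sketch, not values from the Australian Taxation Office data.

```python
def pay_gap(male_avg, female_avg):
    """Gap as a fraction of male average income; positive means males earn more."""
    return (male_avg - female_avg) / male_avg

# Placeholder figures (male average, female average), not ATO values.
incomes = {
    "Occupation A": (80_000, 60_000),
    "Occupation B": (50_000, 55_000),
}
gaps = {occ: pay_gap(m, f) for occ, (m, f) in incomes.items()}

# Occupations where females have the higher average income.
exceptions = [occ for occ, gap in gaps.items() if gap < 0]
```

Because the inputs are average incomes, a gap computed this way says nothing about hours worked, which is why it cannot be read as an hourly-wage gap.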

Peruse our Tableau workbook for charts to inspire you, including a boxplot, scatterplot, dumbbell chart (using female/male icons), and distribution.

gap analysis