Network Analysis

Techlab-Feb-19.mov

A proof of concept that uses social media data (e.g. Twitter) to create network graphs.

The notebooks summarize and demonstrate what jupyter notebooks can do as tools for data science. This example takes a closer look at creating interactive graphs with plotly that show how Twitter users interact around Greenpeace activity. Sneak peek:

Files

jupyter.ipynb gives a very short summary of what jupyter notebooks do and don't do.
networkAnalysis.ipynb is the notebook that creates the graph shown above (among other things).
networkAnalysis.html is the above notebook exported as an .html file. It can be downloaded and viewed in your browser (JavaScript required) and is still interactive!
networkAnalysis-Google-Cloud-DataLab.ipynb is a version of the above notebook that works directly in Google DataLab. Some adjustments were needed to make it run on Python 3.5, and it contains two lines that update packages in DataLab, since the preinstalled versions are outdated.
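The notebook itself is not reproduced here. As a rough sketch of the general idea (not the notebook's actual code), the snippet below counts who mentions whom in a handful of tweets and turns those counts into a plotly figure laid out by networkx. The sample tweets, the account names, and the helper names mention_edges and build_figure are all made up for illustration, and real Twitter data would need more careful parsing than a single regular expression.

```python
import re
from collections import Counter

# Hypothetical sample data: (author, tweet text) pairs, invented for this sketch.
SAMPLE_TWEETS = [
    ("alice", "Great work @Greenpeace and @bob!"),
    ("bob", "@alice thanks"),
    ("bob", "@bob note to self"),
]

MENTION = re.compile(r"@(\w+)")


def mention_edges(tweets):
    """Count (author, mentioned_user) pairs; names are lower-cased."""
    edges = Counter()
    for author, text in tweets:
        for mentioned in MENTION.findall(text):
            if mentioned.lower() != author.lower():  # skip self-mentions
                edges[(author.lower(), mentioned.lower())] += 1
    return edges


def build_figure(edges):
    """Lay out the mention graph with networkx and draw it with plotly."""
    # Imported here so mention_edges() can be tried without the plotting stack.
    import networkx as nx
    import plotly.graph_objs as go

    graph = nx.DiGraph()
    for (source, target), count in edges.items():
        graph.add_edge(source, target, weight=count)

    positions = nx.spring_layout(graph)  # node -> (x, y)

    # One line trace for all edges; None separates the individual segments.
    edge_x, edge_y = [], []
    for source, target in graph.edges():
        x0, y0 = positions[source]
        x1, y1 = positions[target]
        edge_x += [x0, x1, None]
        edge_y += [y0, y1, None]
    edge_trace = go.Scatter(x=edge_x, y=edge_y, mode="lines",
                            line=dict(width=1, color="#888"),
                            hoverinfo="none")

    # One marker per user; marker size grows with the number of connections.
    nodes = list(graph.nodes())
    node_trace = go.Scatter(
        x=[positions[n][0] for n in nodes],
        y=[positions[n][1] for n in nodes],
        mode="markers",
        text=["{} ({} connections)".format(n, graph.degree(n)) for n in nodes],
        hoverinfo="text",
        marker=dict(size=[10 + 4 * graph.degree(n) for n in nodes]),
    )

    hidden_axis = dict(showgrid=False, zeroline=False, showticklabels=False)
    layout = go.Layout(showlegend=False, hovermode="closest",
                       xaxis=hidden_axis, yaxis=hidden_axis)
    return go.Figure(data=[edge_trace, node_trace], layout=layout)
```

With networkx and plotly installed, build_figure(mention_edges(SAMPLE_TWEETS)) returns a figure that plotly.offline.plot(fig, filename="network.html", auto_open=False) can write to a standalone, interactive HTML file. The direction of a mention is stored in the graph but not drawn in this sketch.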

Setup of jupyter

The simplest way of setting up jupyter and a functioning Python installation on Windows and Mac is Anaconda: https://www.anaconda.com/distribution/

Download the Python 3.7 version. There is also a download for Linux, but as a Linux user I would recommend installing the necessary packages and versions with your distribution's built-in package manager instead. Anaconda comes with its own package manager, conda, which is in turn different from Python's own package manager, pip. To avoid confusion about installed packages and versions, stick to one.

Once everything is installed, you should be able to open the jupyter Notebook either from your applications or from the command line with $ jupyter notebook, which should open a new tab or window in your default browser.

Package versions

Packages like numpy are required by pandas anyway and will be installed as dependencies.

package      version
pandas       0.23.4
plotly       3.2.1
networkx     2.2
matplotlib   3.0.2
python       >=3.6
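To see how a local installation compares with the versions listed above, a small check along the following lines may help. It is only a sketch: the helper names version_tuple and check_packages are made up here, and it assumes each package exposes a __version__ attribute, which is common but not guaranteed.

```python
import importlib
import re

# Versions taken from the list above.
EXPECTED = {
    "pandas": "0.23.4",
    "plotly": "3.2.1",
    "networkx": "2.2",
    "matplotlib": "3.0.2",
}


def version_tuple(version):
    """Turn '0.23.4' into (0, 23, 4); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def check_packages(expected=EXPECTED):
    """Return {package: (installed_version_or_None, status)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = (None, "missing")
            continue
        installed = getattr(module, "__version__", None)
        if installed is None:
            status = "unknown"
        elif version_tuple(installed) == version_tuple(wanted):
            status = "same"
        elif version_tuple(installed) > version_tuple(wanted):
            status = "newer"
        else:
            status = "older"
        report[name] = (installed, status)
    return report
```

Calling check_packages() in a notebook cell returns one entry per package with the installed version and whether it is the same as, newer than, or older than the listed one. A newer version is not necessarily a problem, but it is a hint to look there first if a notebook cell fails.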

jupyter in Google DataLab

For a quick start on how to use jupyter notebooks in the Google Cloud Platform, follow this link: https://cloud.google.com/datalab/docs/quickstart

Download

You can download the jupyter notebook used at the show and tell from GitHub.
