Flight data visualisation with Pandas and Matplotlib

Learn how to quickly plot aircraft flight routes on a map for data visualisation.

Hugo Larcher
Published in Coding Entropy
Oct 19, 2015

I recently had to work with aircraft flight data to assess CO2 emissions along flight routes. With the data points in hand, I was a little puzzled about how to communicate the results easily. Data tables are boring and do not help the reader understand the connections between the different routes. I tried a bubble chart overlay but only got a bloated map. I then remembered a famous Facebook visualisation that drew all friend connections as “flight paths” and, in doing so, created a world map. It looks really nice, and path colours make it possible to represent a third dimension. In this example I’ll plot some aircraft routes and colour them by the number of flights per day.

Importing data with Pandas

I can haz CSV? — (By Isa2886)

When it comes to data manipulation, Pandas is the library for the job. It makes manipulating structured data easy and performs well. My dataset being quite small, I directly used Pandas’ CSV reader: I called the read_csv() function to import my dataset as a Pandas DataFrame object. I just needed to skip the first row, which contained some headers, and to define the delimiter (autodetection does not work here…):
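A minimal sketch of such an import, assuming a semicolon-delimited file whose first row is a title line to skip. The sample rows, the delimiter and the column names are hypothetical stand-ins, not the original dataset:

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real CSV file:
# one header row to skip, then one route per line.
raw = io.StringIO(
    "Flight routes export\n"
    "CDG;49.0097;2.5479;JFK;40.6413;-73.7781;12\n"
    "CDG;49.0097;2.5479;NRT;35.7720;140.3929;5\n"
)

# Assumed column layout: airport code, latitude, longitude for both
# ends of the route, then the number of flights per day.
columns = ["dep", "dep_lat", "dep_lon", "arr", "arr_lat", "arr_lon", "nb_flights"]

# skiprows=1 drops the header row; sep sets the delimiter explicitly.
routes = pd.read_csv(raw, skiprows=1, sep=";", names=columns)
print(routes)
```

With a real file, the `io.StringIO` object would simply be replaced by the path to the CSV.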

Using GCMap

A little googling led me to GCMap, a great tool ported to Python by paulgb and based on Facebook visualisations from 2010.

My dataset only contains the departure and arrival airports of each route. Luckily, GCMap computes the flight routes using a great-circle calculation. This is not the real aircraft trajectory, but it is perfect for our use.
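To make the great-circle idea concrete, here is a small standalone sketch. It is not GCMap's code: it assumes a spherical Earth, uses the haversine formula for the distance and spherical interpolation for the intermediate points, and the function names are chosen for illustration only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical Earth assumption


def great_circle_distance(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def great_circle_points(lat1, lon1, lat2, lon2, n=50):
    """Return n (lat, lon) points, n >= 2, evenly spaced along the great circle.

    Not defined for antipodal points, where the path is ambiguous.
    """
    phi1, lmb1, phi2, lmb2 = map(math.radians, (lat1, lon1, lat2, lon2))
    d = great_circle_distance(lat1, lon1, lat2, lon2) / EARTH_RADIUS_KM  # angular distance
    if d == 0:
        return [(lat1, lon1)] * n
    points = []
    for i in range(n):
        f = i / (n - 1)  # fraction of the way along the route
        a = math.sin((1 - f) * d) / math.sin(d)
        b = math.sin(f * d) / math.sin(d)
        x = a * math.cos(phi1) * math.cos(lmb1) + b * math.cos(phi2) * math.cos(lmb2)
        y = a * math.cos(phi1) * math.sin(lmb1) + b * math.cos(phi2) * math.sin(lmb2)
        z = a * math.sin(phi1) + b * math.sin(phi2)
        lat = math.atan2(z, math.hypot(x, y))
        lon = math.atan2(y, x)
        points.append((math.degrees(lat), math.degrees(lon)))
    return points


# Example: a quarter of the equator, sampled at three points.
quarter = great_circle_distance(0, 0, 0, 90)
path = great_circle_points(0, 0, 0, 90, n=3)
```

Plotting the list of points as a line gives the curved "flight path" look, even though the real aircraft track will differ.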

The GCMap script seemed to work, but with my first dataset I could only see a few lines on a black background! GCMap does not plot country borders, so it needs a dataset with enough world coverage for the map to emerge from the routes themselves. Trying with another dataset I got a much more decent result.

Worldwide flights during 24 hours in September. Color scale is based on the number of flights

Back to my original dataset I also needed a way to plot country borders. This is not implemented in GCMapper.

According to the documentation, GCMap also uses an algorithm which does not allow comparison between two coordinate pairs. This is clearly not suitable for accurate data analysis.

Matplotlib to the rescue

Matplotlib is the default choice for data visualisation in Python. It comes with a handy Basemap plotting toolkit which makes it easy to add country boundaries. It also includes a useful function to compute the great-circle path between two geographic points: neat!

The script is similar to GCmap: it estimates the flight path between departure and arrival airports using great circle distance and plots it with a colour depending on the number of flights. I also implemented a little hack that detects when a route intersects the edge of the map: matplotlib’s default behaviour is to link the two opposite points, resulting in a straight line crossing the whole map.
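One possible way to handle the map-edge problem is sketched below. This is an assumption about how such a hack could work rather than the exact code of the script: a route is a list of (lat, lon) points, and it is cut into separate segments whenever the longitude jumps by more than 180° between two consecutive points. The function name is hypothetical.

```python
def split_at_map_edge(points, max_jump=180.0):
    """Split a list of (lat, lon) points where the path wraps around the map edge.

    A jump of more than max_jump degrees in longitude between two consecutive
    points is treated as a crossing of the +/-180 degree meridian.
    """
    if not points:
        return []
    segments = [[points[0]]]
    for prev, curr in zip(points, points[1:]):
        if abs(curr[1] - prev[1]) > max_jump:
            segments.append([])  # start a new segment on the other side
        segments[-1].append(curr)
    return segments


# A trans-Pacific route crossing the antimeridian between 178 and -175.
route = [(35.0, 170.0), (36.0, 178.0), (37.0, -175.0), (38.0, -168.0)]
segments = split_at_map_edge(route)
```

Each segment can then be plotted as its own line, so no straight line is drawn across the whole map.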

I implemented two approaches to compute the route colour. The first one is based on the absolute value of the number of flights. It is perfect for comparison, but your distribution must not have an excessive standard deviation; otherwise you’ll only see the routes with the greatest number of flights. In that case, use a power-law normaliser to compute the colour (matplotlib PowerNorm). The result is great even with small datasets:

Visualisation of flights for Air France during 24 hours
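The effect of the power-law normaliser can be seen on a handful of numbers. The sketch below compares matplotlib's `Normalize` and `PowerNorm` on a skewed set of flight counts; the counts, the `gamma` value and the black-to-pink colormap are assumptions picked for illustration.

```python
from matplotlib.colors import LinearSegmentedColormap, Normalize, PowerNorm

# Hypothetical flight counts: one very busy route dominates the others.
flights = [1, 2, 4, 9, 400]

# Black-to-pink colormap, similar in spirit to the pictures in this post.
cmap = LinearSegmentedColormap.from_list("flights", ["black", "deeppink"])

linear = Normalize(vmin=min(flights), vmax=max(flights))
power = PowerNorm(gamma=0.3, vmin=min(flights), vmax=max(flights))

for n in flights:
    # With the linear norm the small routes all sit near 0 (almost black);
    # the power-law norm spreads them out over the colour scale.
    print(n, round(float(linear(n)), 3), round(float(power(n)), 3))

# RGBA colour of each route using the power-law normalisation.
colors = [cmap(power(n)) for n in flights]
```

A `gamma` below 1 lifts the low values; the closer it is to 1, the closer the result is to the plain linear scale.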

The second approach sorts the routes by their number of flights and assigns one colour to each route, ranging from black (fewer flights) to pink (more flights). The result is quite similar to GCMap, with country boundaries added:

Visualisation of the dataset with matplotlib using relative coloring
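The relative colouring can be sketched in a few lines of plain Python. This is an illustrative version under stated assumptions, not the script's code: each route gets a position between 0 and 1 based only on its rank, and that position is blended between black and an assumed pink RGB value. Both function names are hypothetical.

```python
def rank_positions(counts):
    """Map each count to a position in [0, 1] according to its rank only."""
    order = sorted(range(len(counts)), key=lambda i: counts[i])
    positions = [0.0] * len(counts)
    for rank, idx in enumerate(order):
        positions[idx] = rank / (len(counts) - 1) if len(counts) > 1 else 0.0
    return positions


def blend(t, start=(0.0, 0.0, 0.0), end=(1.0, 0.08, 0.58)):
    """Linear blend between two RGB colours (black to an assumed pink)."""
    return tuple(s + t * (e - s) for s, e in zip(start, end))


# Hypothetical flight counts for three routes.
counts = [400, 1, 9]
positions = rank_positions(counts)
colors = [blend(t) for t in positions]
```

Because only the rank matters, the route with 9 flights lands exactly in the middle of the scale, however far it is from 400. That is why this mode gives pleasant pictures but says little about absolute differences.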

Take care: if you want to compare multiple visualisations, you must use the same normalisation for all of them.

Here is the script I used:

Pink-on-black visualisations are really nice for screen display but are clearly toner cartridge killers. Let’s save the planet and add a printing colour mode:

Saving the planet

There we are: we can easily create a nice and really comprehensive visualisation, either for on-screen display or in a printer-friendly version. You can now use this script to plot more useful data (like the CO2 impact of each route).

Get the code on GitHub and feel free to comment!
