I recently started computing statistical properties of the NYC subway system. To create meaningful visualizations of the data, it is useful to be able to plot a map of the subway lines and stations in NYC. Using GeoViews, GeoPandas, and geographic data on subway lines and subway stations provided by the City of New York, this daunting task becomes fairly trivial. GeoViews is an extension of HoloViews that adds several geographic plot types.
First, download the geographic data as GeoJSON files and store them in a convenient folder. Then load them into GeoPandas data frames:
    import geopandas as gpd
    from cartopy import crs

    # GeoJSON coordinates are longitude/latitude (WGS 84); GeoPandas reads
    # the coordinate reference system from the file, so no override is needed.
    lines = gpd.read_file('lines.geojson')
    stations = gpd.read_file('stations.geojson')
Each row of the “lines” object now refers to a small portion of a subway track:
lines.head()
| | name | url | rt_symbol | objectid | id | shape_len | geometry |
|---|---|---|---|---|---|---|---|
| 0 | G | http://web.mta.info/nyct/service/ | G | 753 | 2000393 | 2438.20024902 | LINESTRING (-73.99487524803018 40.680203546062… |
| 1 | G | http://web.mta.info/nyct/service/ | G | 754 | 2000394 | 3872.83441063 | LINESTRING (-73.97957543205142 40.659930695530… |
| 2 | Q | http://web.mta.info/nyct/service/ | N | 755 | 2000469 | 1843.36633108 | LINESTRING (-73.97585637503069 40.575974505394… |
| 3 | M | http://web.mta.info/nyct/service/ | B | 756 | 2000294 | 1919.5592029 | LINESTRING (-73.92414355434533 40.752290926571… |
| 4 | M | http://web.mta.info/nyct/service/ | B | 757 | 2000296 | 2385.69853589 | LINESTRING (-73.91344685471373 40.756171576368… |
The geographic location of each track segment is defined in the “geometry” column by a Shapely LineString. The other columns provide additional information; for example, the “name” column lists the subway lines that usually run along each segment.
The stations object contains information on the subway stations in New York City:
stations.head()
| | name | url | line | objectid | notes | geometry |
|---|---|---|---|---|---|---|
| 0 | Astor Pl | http://web.mta.info/nyct/service/ | 4-6-6 Express | 1 | 4 nights, 6-all times, 6 Express-weekdays AM s… | POINT (-73.99106999861966 40.73005400028978) |
| 1 | Canal St | http://web.mta.info/nyct/service/ | 4-6-6 Express | 2 | 4 nights, 6-all times, 6 Express-weekdays AM s… | POINT (-74.00019299927328 40.71880300107709) |
| 2 | 50th St | http://web.mta.info/nyct/service/ | 1-2 | 3 | 1-all times, 2-nights | POINT (-73.98384899986625 40.76172799961419) |
| 3 | Bergen St | http://web.mta.info/nyct/service/ | 2-3-4 | 4 | 4-nights, 3-all other times, 2-all times | POINT (-73.97499915116808 40.68086213682956) |
| 4 | Pennsylvania Ave | http://web.mta.info/nyct/service/ | 3-4 | 5 | 4-nights, 3-all other times | POINT (-73.89488591154061 40.66471445143568) |
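For orientation, each row of these tables should correspond to one feature in the downloaded GeoJSON file. The following is a minimal sketch using only the standard library. It assumes the file is an ordinary GeoJSON FeatureCollection; the single feature is hand-built from the first row of the stations table rather than read from the real file, and the variable names `sample` and `collection` are made up for illustration.

```python
import json

# Hand-built stand-in for one entry of stations.geojson (assumption:
# the real file is a FeatureCollection of Point features).
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"name": "Astor Pl", "line": "4-6-6 Express"},
            "geometry": {
                "type": "Point",
                # GeoJSON stores coordinates as [longitude, latitude].
                "coordinates": [-73.99106999861966, 40.73005400028978],
            },
        }
    ],
}

# Round-trip through JSON text, as if the data had been read from disk.
collection = json.loads(json.dumps(sample))

for feature in collection["features"]:
    lon, lat = feature["geometry"]["coordinates"]
    print(feature["properties"]["name"], lon, lat)
```

GeoPandas turns the "properties" of each feature into ordinary columns and the "geometry" entry into the Shapely object shown in the last column.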
Let’s add new columns to the data frames to control the color of the plotted lines and points. We will set all stations to be drawn in blue and all lines in grey. The entries in these columns can later be manipulated to communicate additional information — for example, if a train is delayed at a station, then that station’s color code could be set to “red”.
    stations['color'] = 'blue'
    lines['color'] = 'grey'
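The delayed-train idea mentioned above could be sketched as follows. This is only an illustration: `stations_demo` is a plain pandas data frame standing in for the real `stations` GeoDataFrame (which supports the same indexing), the station names come from the table above, and the `delayed` list is hypothetical.

```python
import pandas as pd

# Stand-in for the `stations` GeoDataFrame (geometry column omitted).
stations_demo = pd.DataFrame({"name": ["Astor Pl", "Canal St", "50th St"]})
stations_demo["color"] = "blue"

# Hypothetical list of stations currently reporting a delay.
delayed = ["Canal St"]

# A boolean mask with .loc recolors only the matching rows.
is_delayed = stations_demo["name"].isin(delayed)
stations_demo.loc[is_delayed, "color"] = "red"

print(stations_demo)
```

Because the color lives in an ordinary column, the plotting code does not need to change when the set of delayed stations does.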
We can then plot the data frame using GeoViews. The color is set by declaring the color column as a value dimension and passing its name to the color option of “opts”.
    import geoviews as gv
    gv.extension('bokeh')

    # Draw the track segments, colored by the 'color' column,
    # in a Lambert conformal projection.
    lines = gv.Path(lines, vdims=['color']).opts(
        projection=crs.LambertConformal(), height=500, width=500, color='color')
    lines
We can also overlay the subway stations over this map:
    stations = gv.Points(stations, vdims=['color']).opts(color='color')
    lines * stations
Finally, it would be nice if individual stations were easy to identify from the plot. We can do this by constructing a custom hover tool and passing it to the plot. Try hovering your mouse over a subway station in the plot below and you should see a tooltip with the station’s name appear.
    from bokeh.models import HoverTool

    # Show the value of the 'name' column when hovering over a station.
    hover = HoverTool(tooltips=[("station", "@name")])
    stations = gv.Points(stations, vdims=['color', 'name']).opts(tools=[hover], color='color')
    lines * stations
To enable the tooltip, we had to specify an additional value dimension (‘name’) and instruct the hover tool to look up the value of this column in the data frame (‘@name’).
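The same mechanism should extend to more than one field. As a hedged sketch, the tooltip below adds a second row that looks up the “line” column shown in the stations table; the name `hover_multi` and the label "lines" are chosen here for illustration only.

```python
from bokeh.models import HoverTool

# Each tuple is (label shown in the tooltip, "@column" to look up).
# "@line" assumes the data contains a column called "line",
# as in the output of stations.head().
hover_multi = HoverTool(
    tooltips=[("station", "@name"), ("lines", "@line")]
)

print(hover_multi.tooltips)
```

For the second row to be filled in, ‘line’ would also have to be listed among the value dimensions passed to `gv.Points`, just as ‘name’ was.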