Visualizations with Dash Leaflet Series - Part 1: Setup and Configuration
Posted by Haw-minn Lu in Visualizations
Introduction
This tutorial series came about when the Data Science team here at the West Health Institute was asked by the West Health Policy Center and the University of Pittsburgh School of Pharmacy to provide a highly interactive map visualization for their study: Access to Potential COVID-19 Vaccine Administration Facilities: A Geographic Information Systems Analysis. We'll describe more about the visualization later.
During the evaluation process, Dash Leaflet proved to be a powerful though underdocumented library. As a result, many sources of documentation, as well as some source code sleuthing, were required to acquire sufficient expertise to implement the visualization. This tutorial series serves as a vehicle to share this expertise with the community at large.
What is Dash Leaflet?
Dash Leaflet is a lightweight wrapper around the popular Leaflet.js library that also includes supercluster, MapBox's open source clustering library. The wrapper presents the various React components of Leaflet.js as dash components that are easy to configure for those comfortable with the dash framework.
Why Dash Leaflet?
There are other map visualization libraries and Python wrappers for Leaflet.js available, so why pick Dash Leaflet among these choices? In our case, it came down to our requirements, and Dash Leaflet was the only one that met them.
At a minimum, our requirements were that the library include a reasonably fast clustering solution, be able to handle data on the order of 80,000 features, be low cost or free, and be deployable as a standalone app.
MapBox was a leading candidate. MapBox provides map visualization capabilities, has a very fast clustering algorithm which runs client side, and even has components in dash. However, MapBox works on a freemium model and could become too costly for a non-profit project such as the aforementioned study. Additionally, using MapBox would require key management to be incorporated. MapBox might have remained the option of choice had we not found Dash Leaflet.
Jupyter's ipyleaflet was another potential candidate. In truth, like MapBox, it has a lot more customizations baked in (such as predefined marker shapes). The Achilles heel for ipyleaflet is that there was no standard method for deploying it as a standalone app. The documentation recommended the use of voila for deployment, but for our particular use case, this resulted in a startup time of up to 15 minutes per session, which simply wasn't acceptable.
For all its benefits, Dash Leaflet is very barebones and does not include many of the conveniences of some other packages, so some additional effort is required to perform customizations.
Resources
Documentation for Dash Leaflet is somewhat scant. The bulk of it consists of usage examples with references to the Leaflet.js documentation. Since many of the Dash Leaflet components mirror those of Leaflet.js, one can get a majority of the needed information this way. One deficiency is that it is often unclear which properties are exposed to dash. Figuring this out often requires a little sleuthing in the source code, in particular in the src/lib/components/ directory. The code itself is sufficiently documented to make this determination.
A Complex Example
We use the aforementioned COVID-19 vaccination visualization as an example. The purpose of this section is to give an overview of the custom features and explain at a high level how the features were implemented. If you are interested in the subject matter, please refer to the study mentioned above.
If this visualization takes some time to start up initially, it is a consequence of our use of AWS Lambda functions to deploy the app, not a consequence of the Dash Leaflet framework. You can deploy to a dedicated server if you want to avoid this issue.
The visualization has two active modes. The overview mode presents the national view. After a state is selected, the visualization enters a detail mode. Depending on the options selected, the detail mode can show either a state view or a county view.
When the visualization is first shown, a national view is presented. Each state has a custom marker indicating the number of potential vaccination centers as projected by the study. Over each marker is a tooltip which breaks down the vaccination centers into four categories. These markers are precalculated at this level and do not use clusters. They are intended to give a similar look and feel to the clusters that are presented in the state and county views. To this end, the markers are placed at the centroid of each state. Hovering over a state highlights it. Clicking on a state, or selecting it from the dropdown menus on the lower left, selects the state and switches to a state view.
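As a rough illustration of the centroid placement described above, the sketch below computes the area-weighted centroid of a single polygon ring using the shoelace formula. The function name and the sample rectangle are hypothetical, and the coordinates are treated as planar; a real workflow would more likely rely on a library such as geopandas, and this is not necessarily how the original app computes its marker positions.

```python
def polygon_centroid(ring):
    """Return the (x, y) centroid of a simple polygon.

    `ring` is a sequence of (x, y) vertices; it may be open or closed.
    Coordinates are treated as planar (an assumption for illustration).
    """
    ring = list(ring)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # close the ring so every edge is visited

    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross

    if area2 == 0:
        raise ValueError("degenerate polygon with zero area")
    # Centroid = sum / (6 * area), and 6 * area == 3 * area2.
    return cx / (3 * area2), cy / (3 * area2)


# Hypothetical 4 x 2 rectangle standing in for a state outline.
print(polygon_centroid([(0, 0), (4, 0), (4, 2), (0, 2)]))
```

For irregular shapes the centroid can fall outside the polygon, so a marker placed this way may need manual adjustment.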
Under the hood, clicking on a state triggers a callback which sets the value of the state dropdown menu. Because of this, zooming is governed by the dropdown menu's callback, so the automatic zooming features built into leaflet cannot be used.
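Because the viewport has to be set by the dropdown's callback, that callback needs some way to work out where to zoom. A minimal sketch of one possible ingredient, assuming the selected state is available as GeoJSON-style coordinate rings: compute a bounding box in the [[south, west], [north, east]] form that Leaflet uses for bounds. The function name and the sample coordinates are hypothetical, and this is not necessarily how the original app does it.

```python
def bounds_from_rings(rings):
    """Return [[south, west], [north, east]] for a set of polygon rings.

    `rings` is an iterable of rings, each a list of [lon, lat] positions
    (GeoJSON order). Leaflet expects [lat, lon], hence the swap below.
    Shapes that cross the antimeridian are not handled.
    """
    lons = [pos[0] for ring in rings for pos in ring]
    lats = [pos[1] for ring in rings for pos in ring]
    return [[min(lats), min(lons)], [max(lats), max(lons)]]


# Hypothetical rectangular outline, for illustration only.
outline = [[
    [-109.05, 37.0], [-102.05, 37.0], [-102.05, 41.0],
    [-109.05, 41.0], [-109.05, 37.0],
]]
print(bounds_from_rings(outline))
```

The swap from [lon, lat] to [lat, lon] is the detail most easily missed when moving between GeoJSON data and Leaflet.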
Once the state is selected, the visualization enters the detail mode. At this point, the counties for the selected state are shown. For basic navigation, clicking on a county activates the dropdown menu's callback, as is the case for each state. Hovering over any county displays an information box in the upper right corner with statistics about the county. Clicking on a neighboring state selects that state. At the detail level, potential vaccine centers are clustered using MapBox's supercluster library. An automatic feature of the library is that clicking on a cluster marker zooms in on the cluster until it separates into smaller clusters or individual sites. A custom tooltip is created from the contents of a given cluster.
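To make the idea of a tooltip derived from a cluster's contents concrete, here is a small sketch that tallies the features in a cluster by category and formats the result as text. It only illustrates the aggregation: the property name facility_type and the category labels are hypothetical, and in the actual app this logic presumably runs in the browser alongside supercluster rather than in Python.

```python
from collections import Counter


def cluster_tooltip(features, key="facility_type"):
    """Summarize GeoJSON-style features as a multi-line tooltip string.

    `key` names the property holding each site's category; the default
    is a made-up name used only for this illustration.
    """
    counts = Counter(f["properties"][key] for f in features)
    lines = [f"{len(features)} sites"]
    lines += [f"{name}: {n}" for name, n in sorted(counts.items())]
    return "\n".join(lines)


# Hypothetical cluster contents.
cluster = [
    {"properties": {"facility_type": "Pharmacy"}},
    {"properties": {"facility_type": "Hospital"}},
    {"properties": {"facility_type": "Pharmacy"}},
]
print(cluster_tooltip(cluster))
```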
Finally, the individual potential vaccine sites have different symbols for markers. The marker legend is also a filter which activates the dropdown callback for facility type.
All of these features and customizations will be explained in subsequent blog articles.
Getting Set Up
Our GitHub repository contains Jupyter Notebooks for all the articles.
If you are skilled in Python, Jupyter Notebooks are not required.
Prerequisites
For this tutorial series, it is expected that the reader is moderately skilled in programming in Python and using plotly dash, and is comfortable in the Jupyter environment.
While not required, familiarity with pandas, geopandas and JavaScript is helpful.
Required Libraries
To run the notebooks, the following Python packages must be installed:
dash
jupyter-dash
dash-leaflet
pandas
dash_extensions
geopandas
If you are using conda or mamba, please note that at the time of writing, conda's dependency resolution has an issue when installing jupyter-dash. Dash must be installed with conda before jupyter-dash, or an obsolete version of dash will get installed.
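One way to confirm that the required libraries are present, and to spot an unexpectedly old dash, is to query the installed distribution versions from Python. The sketch below uses the standard library's importlib.metadata; the helper name is hypothetical, and the distribution names are assumed to be the hyphenated forms used on PyPI (for example dash-extensions for dash_extensions).

```python
from importlib import metadata

# Distribution names as assumed to be published on PyPI.
REQUIRED = [
    "dash",
    "jupyter-dash",
    "dash-leaflet",
    "pandas",
    "dash-extensions",
    "geopandas",
]


def installed_versions(packages=REQUIRED):
    """Map each distribution name to its installed version, or None."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Running this in the same kernel as the notebooks shows what that environment actually resolved to, which is what matters when conda and pip are mixed.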
A Dockerfile is provided as well, which enables the reader to run the Jupyter environment inside a docker container. If you use docker, it is recommended that you install jupyter-server-proxy.
Getting Started
To ensure your environment is properly set up, the notebook contains the "Getting Started" example from the Dash-Leaflet documentation, ported to the Jupyter environment. This notebook is designed to be compatible with the docker environment. If you are not running in the docker environment, the following line is not necessary and can be commented out:
JupyterDash.infer_jupyter_proxy_config()
This line enables JupyterDash to use Jupyter Server Proxy to proxy all connections through a single exposed port. Although JupyterDash is a wrapper that allows dash to run inside a notebook, Dash still opens a local flask server on a selected port.
Additionally, JupyterDash replaces Dash in the creation of the app. The final change is that JupyterDash's app takes a mode parameter when it is run. By setting mode to 'inline', the visualization will display inside the notebook.
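Put together, the changes described above might look like the following sketch. It is not a copy of the notebook: the function name make_app, the use_proxy flag, and the starting center and zoom are assumptions made for illustration, while JupyterDash, infer_jupyter_proxy_config, dl.Map and dl.TileLayer come from the jupyter-dash and dash-leaflet packages.

```python
def make_app(use_proxy=False):
    """Build a minimal Dash Leaflet app for display inside a notebook."""
    # Imported here so that defining the function does not fail on a
    # machine where the libraries have not been installed yet.
    import dash_leaflet as dl
    from jupyter_dash import JupyterDash

    if use_proxy:
        # Only needed behind Jupyter Server Proxy, e.g. in the docker setup.
        JupyterDash.infer_jupyter_proxy_config()

    app = JupyterDash(__name__)  # takes the place of dash.Dash
    app.layout = dl.Map(
        dl.TileLayer(),
        center=[39, -98],  # arbitrary starting view (assumption)
        zoom=4,
        style={"width": "1000px", "height": "500px"},
    )
    return app


# In a notebook cell, one would then run something like:
#   app = make_app()
#   app.run_server(mode="inline", port=8050)
```

The run_server call is left as a comment because it starts a server and only makes sense inside a running notebook.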
Finally, one might notice that the port number is randomly selected. This permits running simultaneous dash apps within any container. Remember that while JupyterDash inlines dash, it still uses a port, even if that port is internal to the container. With a randomly selected port, a port number conflict is unlikely.
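A minimal sketch of random port selection is shown below. The port range and the function name are assumptions, and the extra step of trying to bind the port to confirm it is free goes slightly beyond what the text describes; the notebook may simply draw a random number.

```python
import random
import socket


def pick_port(low=8050, high=8999, attempts=20):
    """Return a random port in [low, high] that is currently free locally."""
    for _ in range(attempts):
        port = random.randint(low, high)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind(("127.0.0.1", port))
            except OSError:
                continue  # port in use (or not bindable); try another
            return port
    raise RuntimeError("no free port found in the requested range")


port = pick_port()
print(f"Using port {port}")
```

The check is best effort: another process could still claim the port between this test and the moment the dash server starts.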
Outline
The following parts are tentatively scheduled for this blog series. This section will be updated as the series is published.
- Part 2 - Basics - Review of Dash, Geopandas and Dash-Leaflet
- Part 3 - Feature Outline Maps
- Part 4 - Scatter Plot Maps
- Part 5 - Cluster Maps
- Part 6 - Complex Visualizations Using Multiple Layers
- Part 7 - Special Tricks and Hacks