This visualisation, again built with relatively little code, uses two range sliders to select postcodes by their SEIFA Index of Education and Occupation percentile and by their private health cover percentage.
What interesting correlations can you find?
from bokeh.plotting import figure, show
from bokeh.layouts import layout
from bokeh.io import output_notebook
from bokeh.models import GeoJSONDataSource, LinearColorMapper, ColumnDataSource, CDSView, BooleanFilter, RangeSlider, CustomJS
output_notebook()
# switch to mercator projection to match OpenStreetMap background
pc_sf_mercator = pc_sf.to_crs(epsg=3857)
pc_sf_mercator['long'] = pc_sf_mercator.centroid.x
pc_sf_mercator['lat'] = pc_sf_mercator.centroid.y
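# TaxableIncome_dollarspr is included so the 'ti' tooltip resolves on the scatter points
# (assumes pc_sf carries this column, as the tooltip definition below implies)
pc_sf_data = pc_sf_mercator[['Details','PrivateHealth_percentpp','ieo_percentile','TaxableIncome_dollarspr','Postcode','long','lat']]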
source = ColumnDataSource(pc_sf_data)
geo_source = GeoJSONDataSource(geojson=pc_sf_mercator.to_json())
# one range slider per variable, initialised to the full range of the data
range_sliderieo = RangeSlider(
    title='SEIFA Index of Education and Occupation',
    start=0,
    end=100,
    step=1,
    value=(pc_sf_data['ieo_percentile'].min(), pc_sf_data['ieo_percentile'].max()),
)
range_sliderph = RangeSlider(
    title='Private Health Cover Percentage per Postcode',
    start=0,
    end=100,
    step=1,
    value=(pc_sf_data['PrivateHealth_percentpp'].min(), pc_sf_data['PrivateHealth_percentpp'].max()),
)
ieofilter = BooleanFilter([True]*len(pc_sf_data))
phfilter = BooleanFilter([True]*len(pc_sf_data))
callback_ieo = CustomJS(args=dict(ieofilter=ieofilter, source=source), code="""
    // keep postcodes whose IEO percentile lies within the slider range
    const [start, end] = cb_obj.value;
    const values = source.data['ieo_percentile'];
    const bools = [];
    for (let i = 0; i < source.length; i++) {
        bools.push(values[i] >= start && values[i] <= end);
    }
    ieofilter.booleans = bools;
    source.change.emit();
""")
callback_ph = CustomJS(args=dict(phfilter=phfilter, source=source), code="""
    // keep postcodes whose private health cover percentage lies within the slider range
    const [start, end] = cb_obj.value;
    const values = source.data['PrivateHealth_percentpp'];
    const bools = [];
    for (let i = 0; i < source.length; i++) {
        bools.push(values[i] >= start && values[i] <= end);
    }
    phfilter.booleans = bools;
    source.change.emit();
""")
range_sliderieo.js_on_change('value', callback_ieo)
range_sliderph.js_on_change('value', callback_ph)
TOOLTIPS = [
    ("ieo", "@ieo_percentile"),
    ("ph", "@PrivateHealth_percentpp"),
    ("ti", "@TaxableIncome_dollarspr"),
    ("pc", "@Postcode"),
]
p = figure(
    title='Aus',
    height=600,
    width=800,
    tools=['pan', 'wheel_zoom', 'hover', 'reset'],
    tooltips=TOOLTIPS,
    x_axis_type='mercator',
    y_axis_type='mercator',
    lod_threshold=None,
    match_aspect=True,
)
# postcode boundaries drawn as transparent outlines
p.patches(
    'xs', 'ys',
    fill_alpha=0.0,
    fill_color='white',
    line_color='black',
    line_width=0.5,
    source=geo_source,
)
# background map tiles
p.add_tile('CartoDB Voyager', retina=True)
# one point per postcode centroid, shown only when both filters are True
p.scatter(
    x="long",
    y="lat",
    size=8,
    color="blue",
    hover_color="red",
    source=source,
    view=CDSView(filter=ieofilter & phfilter),
)
# use a distinct name so the imported layout() function is not overwritten
page_layout = layout(
    [
        [range_sliderieo, range_sliderph],
        [p],
    ]
)
show(page_layout)
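To make the slider logic easier to follow, the same filtering can be sketched in plain Python. This is a minimal illustration rather than part of the notebook: the sample values and the helper `range_mask` are hypothetical, and only the meaning of the two columns (IEO percentile and private health cover percentage) comes from the code above.

```python
# Hypothetical sample values standing in for the two columns used above.
ieo_percentile = [12, 45, 78, 91, 60]
private_health_pct = [20.5, 48.0, 71.2, 83.4, 39.9]


def range_mask(values, start, end):
    """Return one boolean per value: True when start <= value <= end."""
    return [start <= v <= end for v in values]


# Each slider produces its own mask, like ieofilter and phfilter.
ieo_mask = range_mask(ieo_percentile, 40, 100)
ph_mask = range_mask(private_health_pct, 45, 100)

# A point stays visible only when both masks are True,
# which is what combining the two filters with & expresses.
visible = [a and b for a, b in zip(ieo_mask, ph_mask)]
print(visible)
```

Because each slider only rewrites its own mask, moving one slider never discards the selection made with the other.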
Key Learning
Key Learning #6 - Datasets with common ‘keys’ can be combined to allow for more powerful analysis.
Key Learning #7 - Python with SQLite can run SQL-style queries against external SQL databases.
Key Learning #8 - Interactive visualisations are best: rather than being dictated to, users are guided and can choose their own analysis journey. Incorporating intuitive elements can surface even more insights.
Key Learning #9 - Web browsers are the ideal vehicle for sharing visualisations. Python has many tools that can present visualisations using only the web browser, with no need for other software or databases/data servers.
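Key Learnings #6 and #7 can be illustrated together with a small sketch that joins two tables on a shared postcode key using Python's built-in `sqlite3` module. The table names (`seifa`, `health`) and every row below are made up for illustration; only the column names echo those used in the visualisation code, and an in-memory database stands in for an external database file.

```python
import sqlite3

# In-memory database with two hypothetical tables that share a Postcode key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seifa (Postcode TEXT PRIMARY KEY, ieo_percentile INTEGER)")
conn.execute("CREATE TABLE health (Postcode TEXT PRIMARY KEY, PrivateHealth_percentpp REAL)")

# Made-up rows for illustration only.
conn.executemany(
    "INSERT INTO seifa VALUES (?, ?)",
    [("2000", 88), ("2770", 15), ("3000", 92)],
)
conn.executemany(
    "INSERT INTO health VALUES (?, ?)",
    [("2000", 64.0), ("2770", 31.5), ("4000", 58.2)],
)

# An inner join keeps only postcodes present in both tables.
rows = conn.execute(
    """
    SELECT s.Postcode, s.ieo_percentile, h.PrivateHealth_percentpp
    FROM seifa AS s
    JOIN health AS h ON h.Postcode = s.Postcode
    ORDER BY s.Postcode
    """
).fetchall()
conn.close()

for row in rows:
    print(row)
```

Postcodes that appear in only one table drop out of the result, so it is worth checking how many rows survive a join before drawing conclusions from the combined data.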