Interactive Visualisations

In the previous visualisations the user can’t change the core information shown; in effect, the visualisations are ‘static’.

The most useful visualisations don’t limit the user to a static analysis. They incorporate interactivity, letting the user change the visualisation by varying relevant parameters.

The user is not dictated to; instead they are guided and can choose their own analysis journey. Incorporating intuitive interactive elements can lead to even more insights.

One key issue always needs to be addressed with data analysis and interactive visualisations: how to share them with those who will use them?

Essentially the humble web browser is the one application we know will most likely be on any device capable of using the visualisations.

Assuming we have a method of sharing a webpage (e.g. GitHub Pages or a static website instance), let’s explore some visualisations which function completely within a web browser, without the need for a separate data server.

Bokeh and JS Callbacks

Bokeh is a powerful package for creating static and interactive visualisations. Whilst a little more complicated to set up than Matplotlib from earlier, Bokeh allows interactive visualisations which work in a web browser alone and are light on resource use. Interactivity is achieved using JavaScript callbacks: JavaScript ‘aware’ Bokeh functions and small amounts of JavaScript are included to update a visualisation when an interactive element, such as a slider or a button, is changed.
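As a rough illustration of the callback pattern described above, the sketch below wires a slider to a data source. The toy data and the names `source`, `slider` and `page` are assumptions made for this example only; they are not part of the demonstration that follows.

```python
from bokeh.layouts import column
from bokeh.models import ColumnDataSource, CustomJS, Slider
from bokeh.plotting import figure

# Toy data (hypothetical): y starts as x squared
source = ColumnDataSource(data=dict(x=[0, 1, 2, 3], y=[0, 1, 4, 9]))

p = figure(width=400, height=300)
p.line('x', 'y', source=source)

slider = Slider(start=1, end=3, value=2, step=0.5, title='power')

# The JavaScript below runs in the browser whenever the slider value changes.
# cb_obj is the model that triggered the callback (here, the slider).
callback = CustomJS(args=dict(source=source), code="""
    const power = cb_obj.value;
    const x = source.data.x;
    const y = Array.from(x, (v) => Math.pow(v, power));
    source.data = { x, y };
""")
slider.js_on_change('value', callback)

# Pass `page` to show() or save() to render it
page = column(slider, p)
```

Because the update logic is plain JavaScript embedded in the page, the saved HTML stays interactive without a Python process running behind it.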

Demonstration 2 - Bokeh Visualisation

Let’s create a visualisation which explores correlations between the two data sources used earlier. We will plot SEIFA percentile/rank versus Private Health participation for each Postcode, for a user-chosen range of Taxable Income.

This is achieved by:

  • Recreating the previous dataframe combining tax and SEIFA data, with an additional column, ‘Returns_scaled’, which limits the variation in ‘bubble size’ in the plot.
  • Defining a RangeSlider to allow the user to vary the range of Mean Taxable Income shown.
  • Plotting the ‘bubbles’ separately for each State, which is necessary in Bokeh 3.6.3 for the legend to function correctly.
  • Creating a ColumnDataSource for each State’s data.
  • Creating a BooleanFilter for each State, initialised to show all points, and using JS Callbacks (CustomJS) to update the view (CDSView) for each State when the slider range changes (js_on_change).
  • Adding Tooltips for the Hover function.
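The rescaling in the first step is a linear interpolation. A minimal plain-Python sketch of the idea is given here, assuming a hypothetical helper `rescale` and made-up return counts; the demonstration itself uses `np.interp` on the real data.

```python
def rescale(value, lo, hi, new_lo=0.5, new_hi=4.0):
    """Linearly map value from [lo, hi] onto [new_lo, new_hi]."""
    if hi == lo:
        # Degenerate range: an arbitrary choice made for this sketch
        return new_lo
    # Clamp to the input range, then interpolate
    value = min(max(value, lo), hi)
    return new_lo + (value - lo) / (hi - lo) * (new_hi - new_lo)

# Hypothetical numbers of returns for three postcodes
returns = [100, 2600, 5100]
scaled = [rescale(r, min(returns), max(returns)) for r in returns]
print(scaled)  # [0.5, 2.25, 4.0]
```

The smallest postcode is drawn with radius 0.5 and the largest with radius 4, so no bubble can swamp the plot.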

Can you draw any interesting conclusions from the visualisations?

import numpy as np
from bokeh.plotting import figure, show
from bokeh.layouts import layout
from bokeh.io import output_notebook
from bokeh.models import ColumnDataSource, CDSView, BooleanFilter, RangeSlider, CustomJS, Range1d
from bokeh.transform import factor_cmap
from bokeh.palettes import Paired

output_notebook()

# Recreate the combined tax and SEIFA dataframe from the earlier workflow.
# Map the number of Returns for each Postcode into the range 0.5 to 4, to limit the range of circle sizes in the plot.
tax_seifa = (tax2022_raw.query('~State.isin(["Unknown","Overseas"])')
            .assign(TaxableIncome_dollarspr = lambda x: round(x.TaxableIncome_dollars/x.Returns/1000,0))
            .assign(PrivateHealth_percentpp = lambda x: round(x.PrivateHealth_returns/x.Returns*100,0))
            # use x.Returns here: tax_seifa is not defined until this whole expression has finished
            .assign(Returns_scaled = lambda x: np.interp(x.Returns,[x.Returns.min(),x.Returns.max()],[0.5,4]))
            .merge(seifa2021_raw, how="inner", on="Postcode"))

# In Bokeh 3.6.3 the legend behaves incorrectly when a single circle plot command is used for all States along with JS Callbacks.
# The solution is to use a separate circle plot command per State, with the sources and filters kept in lists.
sources = []
tifilters = []
states = tax_seifa['State'].unique()
for state in states:
    state_df = tax_seifa[tax_seifa['State'] == state]
    sources.append(ColumnDataSource(state_df))
    # initialise BooleanFilter to all True, one entry per row of this State's source
    tifilters.append(BooleanFilter([True]*len(state_df)))

range_sliderti = RangeSlider(
    title='Mean Taxable Income per return ($K)',
    start=tax_seifa['TaxableIncome_dollarspr'].min(),
    end=tax_seifa['TaxableIncome_dollarspr'].max(),
    step=1,
    value=(tax_seifa['TaxableIncome_dollarspr'].min(),tax_seifa['TaxableIncome_dollarspr'].max())
)

callback = CustomJS(args=dict(tifilters=tifilters, sources=sources), code="""
    const start = cb_obj.value[0];
    const end = cb_obj.value[1];
    for (let sourceno = 0; sourceno < sources.length; sourceno++) {
        const income = sources[sourceno].data['TaxableIncome_dollarspr'];
        const bools = [];
        for (let i = 0; i < income.length; i++) {
            // keep a point only if its mean taxable income lies inside the slider range
            bools.push(income[i] >= start && income[i] <= end);
        }
        tifilters[sourceno].booleans = bools;
        // emit inside the loop so that every State's source is refreshed
        sources[sourceno].change.emit();
    }
    """)
range_sliderti.js_on_change('value', callback)

TOOLTIPS = [
    ("ieo", "@ieo_percentile"),
    ("ph", "@PrivateHealth_percentpp"),
    ("rt", "@Returns"),
    ("ti", "@TaxableIncome_dollarspr"),
    ("pc", "@Postcode")    
]

p = figure(title='Demonstration 2 - Bokeh Visualisation with interactive slider',
           x_axis_label='ieo_percentile',
           y_axis_label='PrivateHealth_percentpp',
           tooltips=TOOLTIPS,           
           lod_threshold=None,
           match_aspect=False,
           width=1000)

for idx, state in enumerate(states):
    p.circle(x='ieo_percentile',
             y='PrivateHealth_percentpp',
             radius='Returns_scaled',
             fill_color=factor_cmap('State', palette=Paired[8], factors=states),
             fill_alpha=0.5,
             source=sources[idx],
             legend_label=state,
             view=CDSView(filter=tifilters[idx]))

p.legend.location = "top_left"
p.legend.click_policy="hide"
p.x_range = Range1d(-10,110)
p.y_range = Range1d(-10,110)

# use a new name so the imported layout() function is not overwritten
page_layout = layout(
    [
        [range_sliderti],
        [p],
    ],
)

show(page_layout)
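To make the filtering step easier to follow, the range test that the JavaScript callback applies to each State can be sketched in plain Python. The helper `range_mask` and the income figures below are hypothetical and only illustrate the logic; in the browser the resulting lists are assigned to each BooleanFilter.

```python
def range_mask(values, start, end):
    """Return one boolean per value: True when start <= value <= end."""
    return [start <= v <= end for v in values]

# Hypothetical mean taxable incomes ($K) per postcode, grouped by State
income_by_state = {
    "NSW": [45, 72, 130],
    "VIC": [38, 95],
}

# Slider range chosen by the user (assumed values)
start, end = 50, 100

# One mask per State, mirroring the one-filter-per-source layout used above
masks = {state: range_mask(vals, start, end)
         for state, vals in income_by_state.items()}
print(masks)  # {'NSW': [False, True, False], 'VIC': [False, True]}
```

Each mask has exactly as many entries as its State has rows, which is why each BooleanFilter must match the length of its own source.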