Python and Pandas have some powerful packages to make use of map data. Map data can be imported for example as a shapefile and visualised along with data using GeoPandas.
These shapefiles are not images based on pixels, but use vectors to effectively redraw the postcode boundaries in whatever environment it is used. They provide much more definition than is needed to display on a screen on a web page, so there are tools in Python, namely topojson, which resample the vectors to suit a webpage, greatly simplifying the file and making it smaller and quicker to render.
Once again, the map/shapefile data for Postcodes in Australia is readily available from the ABS website, with a suitable Creative Commons Licence.
To reduce time and resources needed for this workflow, we have created a preprocessed shapefile named pc_sf_raw.shz, which is already a simplified version of the map file and can be loaded with the code below.
pc_sf_raw = geopandas.read_file('pc_sf_raw.shz')
pc_sf_raw.head(5)
POA_CODE21 | POA_NAME21 | AUS_CODE21 | AUS_NAME21 | AREASQKM21 | LOCI_URI21 | SHAPE_Leng | SHAPE_Area | geometry | |
---|---|---|---|---|---|---|---|---|---|
0 | 0800 | 0800 | AUS | Australia | 3.1731 | http://linked.data.gov.au/dataset/asgsed3/POA/... | 0.081893 | 0.000264 | POLYGON ((130.85017 -12.45301, 130.85192 -12.4... |
1 | 0810 | 0810 | AUS | Australia | 24.4283 | http://linked.data.gov.au/dataset/asgsed3/POA/... | 0.241859 | 0.002031 | POLYGON ((130.87154 -12.39532, 130.84584 -12.4... |
2 | 0812 | 0812 | AUS | Australia | 35.8899 | http://linked.data.gov.au/dataset/asgsed3/POA/... | 0.278788 | 0.002983 | POLYGON ((130.90841 -12.40479, 130.87154 -12.3... |
3 | 0820 | 0820 | AUS | Australia | 39.0642 | http://linked.data.gov.au/dataset/asgsed3/POA/... | 0.409134 | 0.003248 | POLYGON ((130.85017 -12.45301, 130.83331 -12.4... |
4 | 0822 | 0822 | AUS | Australia | 150775.8030 | http://linked.data.gov.au/dataset/asgsed3/POA/... | 90.601831 | 12.564238 | MULTIPOLYGON (((136.75021 -12.23023, 136.7697 ... |
See also the list of points to draw the postcode boundary/polygon for Bentley, Perth (6102).
pc_sf_raw[pc_sf_raw['POA_CODE21'] == '6102']['geometry'].get_coordinates()
x | y | |
---|---|---|
2216 | 115.899067 | -32.013453 |
2216 | 115.886680 | -32.012399 |
2216 | 115.886629 | -31.998110 |
2216 | 115.888878 | -31.991374 |
2216 | 115.902449 | -32.003307 |
2216 | 115.915000 | -31.993887 |
2216 | 115.932771 | -32.002420 |
2216 | 115.921517 | -32.012744 |
2216 | 115.899067 | -32.013453 |
Full citation for source of modified map/shapefile:
Australian Bureau of Statistics (2021) 'Non ABS Structures: Postal Areas - 2021 [https://www.abs.gov.au/statistics/standards/australian-statistical-geography-standard-asgs-edition-3/jul2021-jun2026/access-and-downloads/digital-boundary-files]' [Shapefile], Digital boundary files: Australian Statistical Geography Standard (ASGS) Edition 3, accessed 27th February 2024.
The file was created using the code below, there is no need to execute the code in this workflow, it can take a while and depending on the prequantize/epsilon values chosen (which here determine the level of simplification for each postcode’s boundaries), can cause ‘out of memory’ type errors.
import topojson as tp
pc_sf_url = 'https://www.abs.gov.au/statistics/standards/australian-statistical-geography-standard-asgs-edition-3/jul2021-jun2026/access-and-downloads/digital-boundary-files/POA_2021_AUST_GDA2020_SHP.zip'
pc_sf_in = geopandas.read_file(pc_sf_url).to_crs(epsg=4283)
pc_sf_topo = tp.Topology(pc_sf_in.dropna(), prequantize=False, topology=True)
pc_sf_raw = pc_sf_topo.toposimplify(epsilon=0.006).to_gdf()
pc_sf_raw.to_file('pc_sf_raw.shz')
Here we create a visualisation demonstration including a map of Australia with postcodes shown in colours reflecting the Index of Education and Occupation, and hovering over each postcode will display a label detailing the combined tax/seifa data from earlier in abbreviated form.
Though explaining the code for the visualisation is beyond the scope of this workflow, a powerful visualisation has been created with relatively little code. We can also see the effects of not ‘cleaning’ the data earlier. We can see areas missing data, where tax data was summarised into ‘Other’ categories, or there was a change in postcodes between the two data sources. The merge commands from earlier couldn’t find a match between the two datasets for these Postcodes and thus there is no corresponding data.
# Create a dataframe from earlier # Tax data combined workflow
tax_seifa = (tax2022_raw.query('~State.isin(["Unknown","Overseas"])')
.assign(TaxableIncome_dollarspr = lambda x: round(x.TaxableIncome_dollars/x.Returns/1000,0))
.assign(PrivateHealth_percentpp = lambda x: round(x.PrivateHealth_returns/x.Returns*100,0))
.merge(seifa2021_raw, how="inner", on="Postcode"))
# Combine map shapefile and tax data
pc_sf = pc_sf_raw.merge(tax_seifa, left_on='POA_CODE21', right_on='Postcode', how="inner")
# Add a label to data which combines all of the tax data into a single abbreviated field
pc_sf["Details"] = "PCode:" + pc_sf["POA_CODE21"] + " Income:$" + pc_sf["TaxableIncome_dollarspr"].astype('str') + "K PrivHlth:" + pc_sf["PrivateHealth_percentpp"].astype('str') + "% IEO:" + pc_sf["ieo_percentile"].astype('str')
# Demo Visualisation 1
pc_sf.explore("ieo_percentile",
cmap = "YlOrRd_r",
scheme='quantiles',
tiles = "cartodb voyager",
tooltip = ["Details"],
style_kwds = dict(color='black',weight=0.25),
highlight_kwds = dict(color="black",weight=0.75,fillOpacity=0.75),
legend_kwds = dict(caption="SEIFA Index of Education/Occupation percentile"))