Version 1.8 - July 2024
COPYRIGHT © Curtin University 2024
R has some powerful libraries for working with map data. Map data can be imported, for example as a shapefile, and visualised alongside other data using the interactive library Leaflet.
Shapefiles are not pixel-based images; they store vectors that redraw the postcode boundaries in whatever environment they are used. They carry far more detail than is needed for display on a web page, so R provides tools that resample the vectors to suit a web page, greatly simplifying the file and making it smaller and quicker to render.
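As a rough illustration of the simplification step, the sketch below uses the North Carolina shapefile that ships with the sf package (not the ABS postcode data) and the ms_simplify() function from rmapshaper. The keep = 0.05 and keep_shapes = TRUE settings are assumptions chosen for the example, not necessarily the values used to build the file in this workflow.

```r
library(sf)
library(rmapshaper)

# Example data bundled with sf; stands in for the much larger ABS shapefile
nc <- read_sf(system.file("shape/nc.shp", package = "sf"))

# Keep roughly 5% of the boundary points, without dropping any small shapes
nc_simple <- ms_simplify(nc, keep = 0.05, keep_shapes = TRUE)

# Compare the number of boundary points before and after simplification
points_before <- nrow(st_coordinates(nc))
points_after  <- nrow(st_coordinates(nc_simple))
c(before = points_before, after = points_after)
```

The simplified object keeps the same rows and attribute columns, so it can be joined and plotted in the same way as the original.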
Once again, the map/shapefile data for Postcodes in Australia is readily available from the ABS website, with a suitable Creative Commons Licence.
To reduce the time and resources needed for this workflow, we have created an R object file named pc_sf_raw.RData, which contains an already simplified version of the map file and can be loaded with the code below. Full citation for source of modified map/shapefile: Australian Bureau of Statistics (2021) ‘Non ABS Structures: Postal Areas - 2021 [https://www.abs.gov.au/statistics/standards/australian-statistical-geography-standard-asgs-edition-3/jul2021-jun2026/access-and-downloads/digital-boundary-files]’ [Shapefile], Digital boundary files: Australian Statistical Geography Standard (ASGS) Edition 3, accessed 27th February 2024.
load(file = "pc_sf_raw.RData")
head(pc_sf_raw)
## Simple feature collection with 6 features and 10 fields
## Geometry type: GEOMETRY
## Dimension: XY
## Bounding box: xmin: 129.3556 ymin: -14.89182 xmax: 136.982 ymax: -10.90649
## Geodetic CRS: GDA2020
## # A tibble: 6 × 11
## POA_CODE21 AUS_CODE21 POA_NAME21 AUS_NAME21 AREASQKM21 LOCI_URI21 SHAPE_Leng
## <chr> <chr> <chr> <chr> <dbl> <chr> <dbl>
## 1 0800 AUS 0800 Australia 3.17 http://link… 0.0819
## 2 0810 AUS 0810 Australia 24.4 http://link… 0.242
## 3 0812 AUS 0812 Australia 35.9 http://link… 0.279
## 4 0820 AUS 0820 Australia 39.1 http://link… 0.409
## 5 0822 AUS 0822 Australia 150776. http://link… 90.6
## 6 0828 AUS 0828 Australia 28.7 http://link… 0.246
## # ℹ 4 more variables: SHAPE_Area <dbl>, geometry <GEOMETRY [°]>, long <dbl>,
## # lat <dbl>
The file was created using the code below. There is no need to execute it in this workflow, but if you do, how long does it take to run on your machine?
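library(sf)          # read_sf(), st_centroid(), st_coordinates()
library(rmapshaper)  # ms_simplify()
library(dplyr)       # %>% pipe (skip these if already loaded earlier)
# Download and unzip the ABS Postal Areas shapefile
pc_sf_url <- 'https://www.abs.gov.au/statistics/standards/australian-statistical-geography-standard-asgs-edition-3/jul2021-jun2026/access-and-downloads/digital-boundary-files/POA_2021_AUST_GDA2020_SHP.zip'
download.file(pc_sf_url, 'POA_2021_AUST_GDA2020_SHP.zip', mode = 'wb')
unzip("POA_2021_AUST_GDA2020_SHP.zip")
# Read the shapefile and simplify the boundaries
pc_sf_raw <- read_sf("POA_2021_AUST_GDA2020.shp") %>%
  ms_simplify()
# Add the longitude and latitude of each postcode centroid
pc_centroids <- st_coordinates(st_centroid(pc_sf_raw$geometry))
pc_sf_raw$long <- pc_centroids[, "X"]
pc_sf_raw$lat <- pc_centroids[, "Y"]
save(pc_sf_raw, file = "pc_sf_raw.RData")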
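One way to answer the timing question is to wrap the code in system.time(), which reports the elapsed time in seconds. The sketch below times a stand-in task (centroids for the North Carolina shapefile bundled with sf) rather than the full ABS download, so the numbers only illustrate the pattern.

```r
library(sf)

# Stand-in data bundled with sf; the real workflow would time the chunk above
nc <- read_sf(system.file("shape/nc.shp", package = "sf"))

timing <- system.time({
  # Compute the centroids once and reuse them for both coordinates
  centroids <- st_coordinates(st_centroid(st_geometry(nc)))
  nc$long <- centroids[, "X"]
  nc$lat  <- centroids[, "Y"]
})

# "elapsed" is the wall-clock time in seconds
timing[["elapsed"]]
```

For the full postcode file, most of the time is likely to go on the download and the simplification, so results will vary with network speed as well as hardware.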
Here we create a demonstration visualisation: a map of Australia with postcodes coloured according to the Index of Education and Occupation. Hovering over a postcode displays a label summarising, in abbreviated form, the combined tax/SEIFA data from earlier.
Although explaining the visualisation code is beyond the scope of this workflow, it shows that a powerful visualisation can be created with relatively little code. We can also see the effects of not ‘cleaning’ the data earlier: some areas have no data, either because their tax data was summarised into ‘Other’ categories or because the older SEIFA data has no entry for new Postcodes. The join commands from earlier could not find a match between the two datasets for these Postcodes, so there is no corresponding data to display.
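The unmatched Postcodes can be listed explicitly with an anti-join, which returns the rows of one table that have no match in another. The sketch below uses small made-up data frames (tax_demo and seifa_demo are hypothetical, as are their values) rather than the real tax2020_raw and seifa2016_raw objects.

```r
library(dplyr)

# Hypothetical stand-ins for the tax and SEIFA data
tax_demo   <- tibble(Postcode = c("6000", "6102", "6999"), Returns = c(120, 340, 15))
seifa_demo <- tibble(Postcode = c("6000", "6102", "6150"), ieo_percentile = c(90, 75, 60))

# inner_join() keeps only Postcodes present in both tables
matched <- inner_join(tax_demo, seifa_demo, by = "Postcode")

# anti_join() lists the Postcodes that would be dropped from each side
tax_only   <- anti_join(tax_demo, seifa_demo, by = "Postcode")
seifa_only <- anti_join(seifa_demo, tax_demo, by = "Postcode")

tax_only$Postcode    # "6999"
seifa_only$Postcode  # "6150"
```

Running the same two anti-joins on the real datasets would show which Postcodes fall out of the combined data and therefore out of the map.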
# Recreate the combined tax/SEIFA data frame from the earlier workflow
tax_seifa <- tax2020_raw %>%
  filter(State != "Unknown" & State != "Overseas") %>%
  mutate(TaxableIncome_dollarspr = TaxableIncome_dollars / Returns) %>%
  mutate(PrivateHealth_percentpp = round(PrivateHealth_returns / Returns * 100, 0)) %>%
  inner_join(x = ., y = seifa2016_raw, by = "Postcode")
# Data cleaning - add a leading zero to three-digit postcodes from the tax data
tax_seifa$Postcode <- sprintf("%04d",as.numeric(tax_seifa$Postcode))
# Combine map shapefile and tax data into a new R object
pc_sf <- pc_sf_raw %>%
  inner_join(x = ., y = tax_seifa, by = c('POA_CODE21' = 'Postcode'))
# Add a label to data which combines all of the tax data into a single abbreviated field
pc_sf$data_label <- paste0("PCode:",pc_sf$POA_CODE21," Income:$",round(pc_sf$TaxableIncome_dollarspr/1000,0),"K PrivHlth:",pc_sf$PrivateHealth_percentpp,"% IEO:",pc_sf$ieo_percentile)
# Create a colour palette based on the Index of Education and Occupation (IEO) percentile
pc_v1_palette <- colorQuantile("YlOrRd", pc_sf$ieo_percentile, n = 9, reverse = TRUE)
# Leaflet Visualisation 1
pc_v1 <- leaflet(pc_sf) %>%
  addPolygons(
    color = "black", weight = 0.5, smoothFactor = 0.2, fillOpacity = 0.5,
    fillColor = ~pc_v1_palette(ieo_percentile),
    label = ~data_label,
    highlightOptions = highlightOptions(color = "white", weight = 1, bringToFront = TRUE)
  ) %>%
  addProviderTiles(providers$CartoDB.Voyager) %>%
  addLegend(
    pal = pc_v1_palette, values = ~ieo_percentile,
    title = "SEIFA<br> Index of<br>Education/<br>Occupation<br>percentile",
    position = "bottomleft"
  )
pc_v1
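The leading-zero step in the code above matters because Postcodes such as 0800 lose their zero when stored as numbers, and would then fail to match the four-character POA_CODE21 values in the map data. A small sketch with made-up values (postcodes and map_codes are hypothetical):

```r
# Hypothetical Postcodes as they might appear after losing their leading zero
postcodes <- c("800", "810", "6102")

# %04d pads whole numbers with leading zeros to a width of four characters
padded <- sprintf("%04d", as.numeric(postcodes))
padded  # "0800" "0810" "6102"

# Only the padded values all match map-style codes
map_codes <- c("0800", "0810", "6102")
postcodes %in% map_codes  # FALSE FALSE  TRUE
padded %in% map_codes     #  TRUE  TRUE  TRUE
```

Without the padding, the inner join with the map data would silently drop every Postcode that begins with a zero.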