Briefly, R is a programming language and environment for statistical computing and graphics.
R is used by executing statements of the R programming language (code), commonly at a command line or in a notebook in RStudio or Jupyter. RStudio is a dedicated Integrated Development Environment (IDE) with a Graphical User Interface (GUI); it also supports package installation, document and website generation, file management and many other aspects of working with R.
Whilst R is an ‘interpreted language’, many of the extension packages are ‘compiled’, which makes them seriously fast and powerful for processing data.
The next steps in this workflow need the following libraries/packages to extend the capabilities of base R. The code can be copied and pasted! R will return a number of messages when loading these libraries; they can be helpful when developing workflows and can be suppressed when no longer required.
options(repr.plot.width=12, repr.plot.height=5, repr.plot.res=200) # increase plot size and resolution in Jupyter notebooks
if (!require(tidyverse)) {
  install.packages("tidyverse")
  library(tidyverse)
}
if (!require(readxl)) {
  install.packages("readxl")
  library(readxl)
}
if (!require(plotly)) {
  install.packages("plotly")
  library(plotly)
}
if (!require(sf)) {
  install.packages("sf")
  library(sf)
}
if (!require(rmapshaper)) {
  install.packages("rmapshaper")
  library(rmapshaper)
}
if (!require(leaflet)) {
  install.packages("leaflet")
  library(leaflet)
}
if (!require(htmltools)) {
  install.packages("htmltools")
  library(htmltools)
}
if (!require(crosstalk)) {
  install.packages("crosstalk")
  library(crosstalk)
}
if (!require(RSQLite)) {
  install.packages("RSQLite")
  library(RSQLite)
}
if (!require(jsonlite)) {
  install.packages("jsonlite")
  library(jsonlite)
}
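Once the loading messages are no longer useful, they can be silenced. The sketch below is one possible way to do that, not part of the original workflow: load_quietly is a hypothetical helper name, and it is demonstrated with the tools package, which ships with R, so that running it never triggers an installation.

```r
# Hypothetical helper (an assumption, not part of the original workflow):
# install a package only if it is missing, then load it without start-up messages.
load_quietly <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  suppressPackageStartupMessages(
    library(pkg, character.only = TRUE)
  )
}

# "tools" is bundled with R, so nothing is downloaded here
load_quietly("tools")
```

The same call could be repeated for each package name listed above, for example load_quietly("readxl").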
Simple mathematics is possible with R, much like on a calculator.
Let’s start with the following code. Execute the code by using the keyboard shortcut Ctrl+Enter (Command+Enter on a Mac).
The output response from R will appear below the executed code.
1 + 1
Hopefully the answer 2 appeared below the code. The code can be typed over and re-executed. (Try changing the code above, perhaps to 2 + 3, and re-execute it to observe the changed output.)
All sorts of maths is possible, including a variety of functions similar to those available on a calculator.
Try the following:
( 5 * 6 ) + 12
Try also:
6^2 + 6
And finally, try:
sqrt(49) * mean(1:11) * sin(pi/2)
Note: trigonometric functions such as sin() take angles in radians. The colon operator : returns the sequence of numbers from the first number to the second, in steps of 1.
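As a small illustration of the note above, the sketch below converts an angle from degrees to radians before calling sin(), and shows the sequence produced by the colon operator. The variable names are made up for this example.

```r
# Trig functions expect radians, so convert degrees first
angle_degrees <- 90
angle_radians <- angle_degrees * pi / 180
sine_value <- sin(angle_radians)
sine_value

# The colon operator builds a sequence in steps of 1
one_to_eleven <- 1:11
one_to_eleven

# The mean of 1 to 11 is the middle value, 6
mean(one_to_eleven)
```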
Variables are at the heart of coding. A variable is a named placeholder for data, so short code statements can refer to that data by name and manipulate it as often as required.
In R, the symbol ‘<-’ is conventionally used to assign a value to a variable, rather than ‘=’. We’ll skip the discussion about why in this introduction. In RStudio the keyboard shortcut for ‘<-’ is Alt + - (Option + - on a Mac); this shortcut is not available in our Jupyter notebooks at this time.
To see the value of a variable, simply execute the variable name, or use print().
For example, let’s assign the value 42 to the variable integer1 and then print it.
integer1 <- 42
integer1
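The short sketch below shows print() giving the same result as executing the variable name, and uses typeof() to check how the value is stored. One detail worth knowing: a plain number such as 42 is stored by R as a double, and the L suffix is needed for a true integer. The name integer2 is invented for this example.

```r
integer1 <- 42
print(integer1)        # same effect as executing the name on its own

# A plain number is stored as a double; the L suffix requests an integer
typeof(integer1)       # "double"
integer2 <- 42L
typeof(integer2)       # "integer"

# Both variables hold the same numeric value
integer1 == integer2   # TRUE
```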
Try the following to explore some common variable data types and structures.
string1 <- "forty two"
string1
vector_string1 <- c("apples","oranges","lemons")
vector_string1
list_string1 <- list("apples","oranges","lemons")
list_string1
named_list_string1 <- list("fruit1"="apples","fruit2"="oranges","fruit3"="lemons")
named_list_string1
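To see how these structures differ, the sketch below checks their class and length, then pulls out individual items. It re-creates the variables so that it runs on its own; second_fruit, first_item and named_item are names made up for illustration.

```r
vector_string1 <- c("apples", "oranges", "lemons")
list_string1 <- list("apples", "oranges", "lemons")
named_list_string1 <- list("fruit1" = "apples", "fruit2" = "oranges", "fruit3" = "lemons")

class(vector_string1)    # "character"
class(list_string1)      # "list"
length(vector_string1)   # 3

# Single brackets pick elements of a vector by position
second_fruit <- vector_string1[2]
second_fruit             # "oranges"

# Double brackets (or $ with a name) pull one item out of a list
first_item <- list_string1[[1]]
named_item <- named_list_string1$fruit3
first_item               # "apples"
named_item               # "lemons"
```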
The data frame type/structure is very commonly used in data analysis.
Below is an example of creating a data frame manually, showing that it is assigned to a variable like any other value and how it is displayed.
data_frame1 <- data.frame(fruit=c("apples","oranges","lemons"),
quantity=c(7,14,21))
data_frame1
fruit | quantity |
---|---|
<chr> | <dbl> |
apples | 7 |
oranges | 14 |
lemons | 21 |
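A few base R functions are handy for inspecting a data frame like this one. The sketch below re-creates data_frame1 so that it runs on its own; total_quantity and large_rows are names invented for this example.

```r
data_frame1 <- data.frame(fruit = c("apples", "oranges", "lemons"),
                          quantity = c(7, 14, 21))

str(data_frame1)     # compact summary of the columns and their types
nrow(data_frame1)    # 3
ncol(data_frame1)    # 2

# $ extracts one column as a vector
total_quantity <- sum(data_frame1$quantity)
total_quantity       # 42

# Rows can be filtered with a logical condition
large_rows <- data_frame1[data_frame1$quantity > 10, ]
large_rows
```

Here large_rows keeps only the rows whose quantity is greater than 10, which are the oranges and lemons rows.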
Data in tabular format, or tables, is a very common starting point when working with data.
R includes some sample data sets to work with whilst exploring R.
One such data set is called mtcars, which has various features for 32 now ancient cars from a 1974 survey for a US car magazine.
Running the head() command below displays a stylised table with all columns for the first five rows of data.
head(mtcars, 5)
| | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> |
| Mazda RX4 | 21.0 | 6 | 160 | 110 | 3.90 | 2.620 | 16.46 | 0 | 1 | 4 | 4 |
| Mazda RX4 Wag | 21.0 | 6 | 160 | 110 | 3.90 | 2.875 | 17.02 | 0 | 1 | 4 | 4 |
| Datsun 710 | 22.8 | 4 | 108 | 93 | 3.85 | 2.320 | 18.61 | 1 | 1 | 4 | 1 |
| Hornet 4 Drive | 21.4 | 6 | 258 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 |
| Hornet Sportabout | 18.7 | 8 | 360 | 175 | 3.15 | 3.440 | 17.02 | 0 | 0 | 3 | 2 |
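Beyond head(), a few other base R functions give a quick overview of a table. The sketch below is one way to explore mtcars; n_cars, column_names and first_cars are names made up for this example.

```r
dim(mtcars)                    # number of rows, then number of columns
n_cars <- nrow(mtcars)
n_cars                         # 32

column_names <- names(mtcars)
column_names

# The car names are stored as row names rather than as a column
first_cars <- rownames(head(mtcars, 5))
first_cars

# Summary statistics for a single column
summary(mtcars$mpg)
```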