One of the reasons you may wish to do a hypothesis test is to determine whether there is a statistically significant relationship between two or more variables. Different tests are required for this based on the type of variables, and this page details the process for performing some of the most common tests for assessing relationships between two variables using SPSS.
A question you may wish to ask of the wider population is: Is there a statistically significant association between having children and owning a dishwasher?
This question can be answered by following the recommended steps:
The appropriate hypotheses for this question are:
\(\textrm{H}_\textrm{0}\): There is no significant association between having children and owning a dishwasher
\(\textrm{H}_\textrm{A}\): There is a significant association between having children and owning a dishwasher
The appropriate test to use is a chi-square test of independence, as we are testing for association between two categorical variables (having children and owning a dishwasher).
While the first four assumptions should be met during the design and data collection phases, the fifth assumption can be checked during the analysis stage. If this assumption is violated and your variables each have only two categories, you can use the results displayed in SPSS for Fisher’s exact test instead. If your variables have more categories, you may be able to exclude or combine some of them. For instructions on combining categories by recoding, see the Transformations page of this module.
The output should look like this:
The first table simply shows that \(80\) cases have been processed. The second table shows how the actual sample data compare with what would be expected if there were no association between having children and dishwasher ownership. The observed values differ somewhat from the expected values, which suggests an association in the sample: people with children are more likely to own a dishwasher.
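To see where the expected values in a table like this come from, the calculation can be sketched in a few lines of Python. This is only an illustration: the counts are hypothetical (they are not the data behind the SPSS output above), and the function name `expected_counts` is made up for this example.

```python
def expected_counts(observed):
    """Return the counts expected under independence for a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count for each cell = (row total * column total) / grand total
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical counts (assumption, not the real sample):
# rows = has children (yes, no), columns = owns a dishwasher (yes, no)
observed = [[30, 10], [20, 20]]
expected = expected_counts(observed)

# Print each row of observed counts next to its expected counts
for obs_row, exp_row in zip(observed, expected):
    print(obs_row, exp_row)
```

The larger the gaps between the observed and expected counts, the stronger the evidence of association in the sample.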
To find out whether the association is significant, we need to refer to the third table and to the ‘Asymptotic Significance (2-sided)’ value in the ‘Pearson Chi-Square’ row. Since \(p < .05\) (\(p = .032\)) we can reject the null hypothesis and conclude that there is a statistically significant association between having children and owning a dishwasher.
Finally, the fourth table provides the effect sizes, which can be used to assess practical significance. The ‘Phi’ of \(.239\) indicates a small to medium effect.
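For a table with two rows and two columns, the Pearson chi-square statistic, its \(p\) value and the size of phi can be reproduced by hand. The sketch below is only an illustration under stated assumptions: the counts are hypothetical (not the data behind the SPSS output above), the function name `chi_square_2x2` is made up for this example, no continuity correction is applied, and the `math.erfc` shortcut for the \(p\) value is valid only when there is one degree of freedom.

```python
import math


def chi_square_2x2(observed):
    """Pearson chi-square, p value (df = 1) and |phi| for a 2x2 table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            chi2 += (obs - exp) ** 2 / exp

    # With 1 degree of freedom, the upper tail of the chi-square
    # distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))

    # Magnitude of phi for a 2x2 table: sqrt(chi2 / n)
    phi = math.sqrt(chi2 / n)
    return chi2, p_value, phi


# Hypothetical counts (assumption, not the real sample):
# rows = has children (yes, no), columns = owns a dishwasher (yes, no)
chi2, p_value, phi = chi_square_2x2([[30, 10], [20, 20]])
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}, phi = {phi:.3f}")
```

Because phi is the chi-square statistic rescaled by the sample size, it indicates the strength of the association, whereas the \(p\) value alone does not.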
For more information on how to interpret these results see the Introduction to statistics module.
A question you may wish to ask of the wider population is: Is there a statistically significant linear correlation between summer daily energy consumption and winter daily energy consumption?
This question can be answered by following the recommended steps:
The appropriate hypotheses for this question are:
\(\textrm{H}_\textrm{0}\): There is no significant linear correlation between summer and winter daily energy consumption
\(\textrm{H}_\textrm{A}\): There is a significant linear correlation between summer and winter daily energy consumption
The appropriate test to use is Pearson’s correlation coefficient, as we are testing for linear correlation between two variables (summer daily energy consumption and winter daily energy consumption).
While the first three assumptions should be met during the design and data collection phases, the fourth, fifth and sixth assumptions should be checked at this stage (for instructions on checking the normality assumption in SPSS, see The normal distribution page of this module).
If the normality assumption is not met you can try transforming the data or using Spearman’s Rho or Kendall’s Tau-B instead. You can also use one of these tests if you have ordinal rather than continuous variables, or if there is non-linear correlation.
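To illustrate why a rank-based measure can cope with non-linear (but consistently increasing or decreasing) relationships, Spearman’s Rho can be sketched as Pearson’s correlation applied to the ranks of the data. The code below is a minimal sketch, not SPSS’s implementation: the helper names `average_ranks`, `pearson_r` and `spearman_rho` and the data are hypothetical, and tied values are given the average of their ranks.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value is tied with this one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's Rho: Pearson's r computed on the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data with a curved but always increasing relationship
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(pearson_r(x, y), spearman_rho(x, y))
```

In this made-up example the relationship is not a straight line, so Pearson’s coefficient falls short of \(1\), while Spearman’s Rho reflects that the ordering of the two variables agrees perfectly.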
To check for linearity and homoscedasticity, you can create a scatterplot with the independent variable on the \(x\)-axis and the dependent variable on the \(y\)-axis (for this example these are interchangeable; we will put summer consumption on the \(x\)-axis). For instructions on creating a scatterplot in SPSS, see the Charts page of this module.
The scatterplot, with the line of best fit included, should look as follows. This shows that the relationship is approximately linear as the points lie close to the line of best fit. It also shows that the relationship is homoscedastic, as the points are a similar distance from the line of best fit all the way along (they don’t create a ‘funnel’ shape in either direction). Hence the fifth and sixth assumptions have been met.
The output should look like this:
This table shows that Pearson’s correlation coefficient is \(.949\), indicating a strong positive linear correlation between summer and winter energy consumption (for more information on how to interpret this see the Descriptive statistics page of this module).
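For readers who want to see what the coefficient measures, the following sketch computes Pearson’s correlation coefficient directly from its definition. It is only an illustration: the consumption values and the function name `pearson_r` are hypothetical, and they are not the data behind the SPSS output above.

```python
import math


def pearson_r(x, y):
    """Pearson's r: co-variation of x and y scaled by their individual variation."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical daily energy consumption values (assumption, not the real sample)
summer = [10, 12, 15, 18, 20]
winter = [14, 17, 19, 25, 27]

r = pearson_r(summer, winter)
print(f"r = {r:.3f}")
```

Values close to \(1\) or \(-1\) indicate that the points lie close to a straight line, while values near \(0\) indicate little or no linear relationship.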
To test whether this linear correlation is statistically significant requires the \(p\) value (listed as ‘Sig. (2-tailed)’). Since \(p < .05\) (in fact \(p < .001\)) we can reject the null hypothesis and conclude that there is a statistically significant linear correlation between summer energy consumption and winter energy consumption.
Pearson’s correlation coefficient and its square (the coefficient of determination) are also measures of effect size, which can be used to assess practical significance. The correlation coefficient of \(.949\) indicates a large effect, and the coefficient of determination of \(90.06\%\) indicates that \(90.06\%\) of variation in winter energy consumption can be explained by variation in summer energy consumption.
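As a quick check of the arithmetic, the percentage quoted above follows from squaring the rounded correlation coefficient (the symbol \(r\) is used here for Pearson’s correlation coefficient):

\[ r^2 = (.949)^2 = .900601 \approx 90.06\% \]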
For more information on how to interpret these results see the Introduction to statistics module.