Transformations - Introduction to SPSS - UniSkills

Often when you are doing your analysis you will find that it is helpful to create new variables, or to make changes to existing variables. This page details three of the transformation facilities provided by SPSS which enable you to do this, all of which are found under the Transform menu.

In brief, this page covers the following:

How to compute a new variable using data from existing variables
How to recode a variable in order to create new categories for it
How to change a continuous variable into a categorical one using visual binning

Two examples that make use of some of these transformation types are also detailed in the Extras page of this module.

Note that the examples covered here make use of the data described in the Getting started page of this module. If you want to work through the examples provided and haven’t already downloaded this data, you can do so using the link below:

SPSS sample data [SAV, 2kB]

Before commencing the analysis, note that by default the dialogue boxes in SPSS display variable labels rather than variable names. You may find this helpful, but if you would prefer to view the variable names instead, choose the following from the menu:

Edit
Options…
Change the Variable Lists option to Display names

Computing a new variable

Sometimes you may wish to create a new variable or variables to add to your data file, either from scratch or using the data from an existing variable or variables. For example, in the sample data file you may wish to create a new variable which gives the difference between summer and winter household energy consumption for each survey participant. You can do this by choosing the following from the SPSS menu (either from the Data Editor or Output window):

Transform
Compute Variable…
specify the new variable name in the Target Variable box, for example ‘Consumption_difference’
enter the required formula in the Numeric Expression box, for example by moving the ‘Summer_consumption’ variable into the box, using the keypad provided or your keyboard to type the - (minus) sign, and then moving the ‘Winter_consumption’ variable into the box (spaces between items are optional)
click OK

If you then navigate to the Data View of the Data Editor window, you will see that a new ‘Consumption_difference’ variable has been added to the end of the data file, with the difference for each of the 10 cases determined using the numeric expression entered. You can then analyse this variable as you would any of the original variables.
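To make the row-by-row nature of the calculation concrete, the following is a minimal Python sketch of the same logic. It is not SPSS code, and the consumption values are hypothetical (they are not taken from the sample data file); only the variable names come from the example above.

```python
# Hypothetical cases: these values are made up for illustration and are
# not taken from the SPSS sample data file.
cases = [
    {"Summer_consumption": 18.5, "Winter_consumption": 24.0},
    {"Summer_consumption": 30.0, "Winter_consumption": 27.5},
    {"Summer_consumption": 12.0, "Winter_consumption": 12.0},
]


def compute_difference(case):
    """Mirror the numeric expression Summer_consumption - Winter_consumption."""
    return case["Summer_consumption"] - case["Winter_consumption"]


# Add the new variable to every case, one row at a time.
for case in cases:
    case["Consumption_difference"] = compute_difference(case)

for case in cases:
    print(case)
```

A negative difference here indicates a case where winter consumption was higher than summer consumption.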

Note that you can also move the new variable if you wish, either in the Data View or in the Variable View, by dragging and dropping. For example, you could move the new variable to sit after the ‘Winter_consumption’ variable by selecting the variable name in the Data View, then holding down the left mouse button and dragging it to the required spot.

Recoding an existing variable

Sometimes you may wish to recode an existing categorical variable, most likely to reduce the number of categories by combining existing ones together. For example, in the sample data file you may wish to recode the ‘Consumption_reduction’ variable to reduce the number of categories from five to three (particularly as there are so few people in each category, and no-one in the ‘Strongly disagree’ category). You can do this by choosing the following from the SPSS menu (either from the Data Editor or Output window):

Transform
Recode into Different Variables… (this will keep the existing variable and create a new one, which provides maximum flexibility; if you would prefer to overwrite the existing variable, you can select Recode into Same Variables… instead)
move the required variable into the Numeric Variable -> Output Variable box, for example ‘Consumption_reduction’
specify a name for the new variable in the Name field of the Output Variable box, for example ‘Consumption_reduction_recoded’
enter a label for the new variable in the Label field of the Output Variable box if desired
click Change

The second part of the process is to decide how the categories of the existing variable are going to map to categories of the new variable. Sometimes this can require quite a bit of thought and planning, but with so few categories in this example it is more straightforward. In particular, the existing categories lend themselves to being recoded into three new categories (‘Agree’, ‘Neutral’ and ‘Disagree’), as follows:

Existing category	New category
1 (Strongly disagree)	1 (Disagree)
2 (Disagree)	1 (Disagree)
3 (Neutral)	2 (Neutral)
4 (Agree)	3 (Agree)
5 (Strongly agree)	3 (Agree)
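The mapping in this table can be written down as a simple lookup, which may help when planning a recode with more categories. The following Python sketch is an illustration of the logic only, not SPSS code; the list of responses is hypothetical, and returning None for an unmapped code is an assumption made here to represent a value with no new category.

```python
# Mapping from existing category codes to new category codes,
# copied from the table above.
RECODE_MAP = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}

# Labels for the new categories, as named in the text.
NEW_LABELS = {1: "Disagree", 2: "Neutral", 3: "Agree"}


def recode(old_value):
    """Return the new category code, or None when the value is not mapped."""
    return RECODE_MAP.get(old_value)


# Hypothetical responses, made up for illustration.
responses = [2, 3, 5, 4, 2]
recoded = [recode(value) for value in responses]

print(recoded)
print([NEW_LABELS[code] for code in recoded])
```

Writing the mapping out in full like this makes it easy to check that every existing category has been assigned to exactly one new category.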

To specify this in SPSS, do the following in the Recode into Different Variables dialogue box:

select Old and New Values… to open the Recode into Different Variables: Old and New Values dialogue box
specify the existing category number(s) in the Old Value side of the dialogue box, and the new category number in the New Value side of the dialogue box, then press Add. You can map each category individually, or multiple categories can be mapped at once using the options available. For example, you could specify the required mappings as follows:
select Range, LOWEST through value: and specify 2 on the Old Value side of the dialogue box, and specify 1 on the New Value side of the dialogue box, then press Add
select Value: and specify 3 on the Old Value side of the dialogue box, and specify 2 on the New Value side of the dialogue box, then press Add
select Range, value through HIGHEST: and specify 4 on the Old Value side of the dialogue box, and specify 3 on the New Value side of the dialogue box, then press Add
click Continue
click OK

If you then navigate to the Data View of the Data Editor window, you will see that a new ‘Consumption_reduction_recoded’ variable has been added to the end of the data file (note that you can move it if you wish, either in the Data View or in the Variable View, by dragging and dropping). The category values do not currently have any labels (e.g. ‘Disagree’, ‘Neutral’ and ‘Agree’), and you may need to change the variable Measure (from Nominal to Ordinal), but you can do both of these things as described in the Getting started page of this module.
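The range-based rules used in the example (LOWEST through 2, the single value 3, and 4 through HIGHEST) should give the same result as mapping each of the five categories individually. The Python sketch below checks this; it illustrates the logic only and is not SPSS code, and the function name is hypothetical.

```python
def recode_by_range(old_value):
    """Apply the three rules from the example, in the order they were added."""
    if old_value <= 2:   # Range, LOWEST through value: 2 -> 1
        return 1
    if old_value == 3:   # Value: 3 -> 2
        return 2
    if old_value >= 4:   # Range, value through HIGHEST: 4 -> 3
        return 3
    return None          # no rule matched (e.g. 2.5), so no new category


# Mapping each category individually, as in the table of old and new categories.
individual = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}

# The range rules agree with the individual mapping for all five categories.
for code in range(1, 6):
    assert recode_by_range(code) == individual[code]

print({code: recode_by_range(code) for code in range(1, 6)})
```

One difference worth keeping in mind is that open-ended ranges also capture any unexpected codes: in this sketch a stray 0 would be recoded to 1 and a stray 9 to 3, whereas mapping each category individually would leave them unmapped.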

Once you have finished setting up the variable, you can analyse it in the usual way. For example, you could run the Frequencies procedure (as described in the Descriptive statistics page of this module) on the new variable to obtain a frequency table for the three new categories.

Visual binning

Sometimes it is helpful to transform a continuous variable into a categorical variable, as this provides additional analysis options. For example, in the sample data file you may wish to transform the continuous ‘Age’ variable into categories, perhaps in order to make some comparisons for different age groups.

While you can in fact do this using either of the procedures outlined above, the purpose-built procedure for this in SPSS is Visual Binning. You can make use of this by choosing the following from the SPSS menu (either from the Data Editor or Output window):

Transform
Visual Binning…
select the required variable, for example the ‘Age’ variable, and move it across to the Variables to Bin box
select Continue
specify a name for the new variable in the Binned Variable box, for example ‘Age_grouped’
click on the Make Cutpoints… button, to specify how you are going to ‘cut’ the data in order to make categories (sometimes you might use the histogram of the data to help you decide how to do this, while other times you might have set categories already in mind)
specify a value for the First Cutpoint Location; for example, if you want the first age category to include those up to and including the age of 19, you would enter 19
specify the Number of Cutpoints, which will be one less than the number of categories you want to have; for example, if you want to have four age categories you would enter 3
adjust the Width (the interval between cutpoints) if required; for example from 9.667 to 10
click Apply
click Make Labels to automatically create labels for each new category
click OK
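To see how the three settings in the example (first cutpoint 19, three cutpoints, width 10) translate into four categories, the following Python sketch works through the arithmetic. It is an illustration only, not SPSS code: the function names and the ages are hypothetical, and it assumes each cutpoint is the inclusive upper end of its category, in line with the "up to and including 19" example above.

```python
from bisect import bisect_left


def make_cutpoints(first, count, width):
    """Equal-width cutpoints: first, first + width, first + 2 * width, ..."""
    return [first + i * width for i in range(count)]


def bin_value(value, cutpoints):
    """Return a 1-based category number.

    Each cutpoint is treated as the inclusive upper end of its category,
    so a value equal to a cutpoint falls in the lower category.
    """
    return bisect_left(cutpoints, value) + 1


cutpoints = make_cutpoints(first=19, count=3, width=10)
print(cutpoints)

# Hypothetical ages, made up for illustration.
ages = [18, 19, 20, 29, 35, 39, 40, 62]
age_grouped = [bin_value(age, cutpoints) for age in ages]
print(age_grouped)
```

With these settings the cutpoints fall at 19, 29 and 39, giving four categories: 19 and under, 20 to 29, 30 to 39, and 40 and over.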

If you then navigate to the Data View of the Data Editor window, you will see that a new ‘Age_grouped’ variable has been added to the end of the data file (note that you can move it if you wish, either in the Data View or in the Variable View, by dragging and dropping). You can analyse it in the usual way; for example, you could run the Frequencies procedure (as described in the Descriptive statistics page of this module) on the new variable to obtain a frequency table for the four age categories.