SPSS Syntax
Overview
SPSS, like all general-purpose statistical software, is built around a programming language. Working directly with the programming language gives you access to many options that do not appear in the graphical user interface. Syntax also makes repetitive tasks quicker and less error-prone, ensures that you can repeat all the steps in your analysis, and makes troubleshooting quicker and more precise.
Syntax pasted from the graphical user interface is very useful as a memory aid - the dialog boxes include pointers to many more options than I can usually remember. You can also copy-and-paste syntax from the Output Viewer. However, automatically generated syntax is verbose, almost always including keywords for unnecessary default options. This can make it difficult to spot the crucial parts of a given command. If you use pasted syntax, I recommend editing it as you generate it, to ensure that you understand your code and to make it easier to read later.
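As an illustration of this kind of editing, here is a sketch of what the Descriptives dialog box typically pastes for two variables, followed by a trimmed equivalent. The variable names mpg and weight are assumptions borrowed from the cars data set used in the examples in this document, and the exact pasted text may differ between SPSS versions.

```spss
* Roughly what the dialog box pastes (keywords for the default statistics included).
DESCRIPTIVES VARIABLES=mpg weight
  /STATISTICS=MEAN STDDEV MIN MAX.

* Edited version, relying on the defaults to request the same statistics.
descriptives variables=mpg weight.
```

The shorter form makes it easier to see that the only substantive choice here is the list of variables.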
Commands
The main unit of work in SPSS is the command. Each command begins with an SPSS keyword (also referred to as the "command" or "command name") and ends with a period at the end of a line, the SPSS command terminator. For example, a GET command might look like this:
get file = "Y:\spss\data\cars.sav".
Most commands have additional options and subcommands. Subcommands begin with a forward slash and a keyword. For example, a GET command that opens a working data set with only two of its original variables uses a KEEP subcommand, and might look like this:
get file = "Y:\spss\data\cars.sav"
  /keep=mpg weight.
Help on Command Syntax
For details of what options and subcommands are available and how they work, you will want to look at the on-line Help files. There are two versions of these, in either HTML or PDF form. The easiest to access are the HTML help files: in any SPSS window click Help - Topics. You can also quickly get help on a specific command by typing or pasting the command name in a syntax window, and then pressing the F1 key - using the F1 key with any command in the syntax editor brings up the relevant Help page.
If you find it easier to use Help in book form, the entire SPSS Syntax Reference Manual is on-line as a PDF. Click Help - Command Syntax Reference. Each command is essentially a chapter in the printed Reference Manual.
Each chapter begins with a syntax diagram. For example, the chapter on the DESCRIPTIVES command begins with a diagram similar to this:
DESCRIPTIVES VARIABLES= varname [(zname)] [varname...]
 [/MISSING= {VARIABLE**} [INCLUDE]]
            {LISTWISE  }
 [/SAVE]
 [/STATISTICS= [DEFAULT**] [MEAN**] [MIN**] [SKEWNESS]]
               [STDDEV** ] [SEMEAN] [MAX**] [KURTOSIS]
               [VARIANCE ] [SUM   ] [RANGE] [ALL]
 [/SORT=[{MEAN    }] [{(A)}]]
         {SMEAN   }   {(D)}
         {STDDEV  }
         {VARIANCE}
         {KURTOSIS}
         {SKEWNESS}
         {RANGE   }
         {MIN     }
         {MAX     }
         {SUM     }
         {NAME    }
The square brackets, "[ ]", indicate optional specifications. This particular command requires at least the keywords DESCRIPTIVES and VARIABLES=, as well as the name of at least one variable to be analyzed; everything else is optional. For example, a minimal specification for DESCRIPTIVES might look like this:
descriptives variables=mpg.
Options that may be specified repeatedly are indicated with ellipses. For example, you may specify more than one variable name:
descriptives variables=mpg weight.
Keywords in bold in a syntax diagram represent two types of default options. Those marked with a double asterisk, "**", are the options in effect when the subcommand is omitted entirely. For example, these commands produce the same output:
descriptives variables=mpg
  /statistics=default.
descriptives variables=mpg.
Keywords in bold, but without asterisks, are the default only when the optional subcommand is specified. For example, the following two commands present your output in different order:
descriptives variables=all.
descriptives variables=all
  /sort.
(The syntax diagram for DESCRIPTIVES in the SPSS 23 documentation is wrong, although the explanation given further into that chapter is correct. The diagram from SPSS 22, shown here, is correct.)
Many commands have lists of alternative options. The alternatives are indicated by curly braces, "{ }". For example, when sorting DESCRIPTIVES output, you could sort by either mean or variance, among many other options.
descriptives variables=all
  /sort=mean.
descriptives variables=all
  /sort=variance.
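Putting several pieces of the diagram together, here is a sketch of a fuller DESCRIPTIVES command. It assumes a data set containing the variables mpg and weight, and zmpg is a hypothetical name chosen here for the new standardized variable.

```spss
* Save z-scores of mpg as zmpg, request two statistics, and sort by mean in descending order.
descriptives variables=mpg (zmpg) weight
  /statistics=mean stddev
  /sort=mean (d).
```

Each optional piece can be removed on its own. Dropping every subcommand and the (zmpg) specification returns you to the minimal form of the command.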
Keywords, Special Symbols, and Variable Names
Commands are composed of keywords, special symbols like the "equals" symbol or parentheses, and user-supplied "words". User-supplied parts of commands are typically things like data set names, variable names, or data values. In SPSS documentation (and in pasted syntax) it is conventional for SPSS keywords to be indicated in ALL CAPS, while user-supplied "words" are given in lower case. For example, in the DESCRIPTIVES syntax diagram above, the only specifications that are not SPSS keywords are variable names and standardized-variable names. The SPSS interpreter itself, however, ignores case.
Spacing in Syntax
The rules for how to put SPSS syntax on the page vary, depending on the source (run from a window or from a file), mode of execution (interactive or batch), and even the operating system of the computer you are using. The guidelines I will give you here should work in any context, but if you work with SPSS syntax long enough, you will eventually see exceptions to many of these recommendations!
Begin each command in column one.
End each command with a period at the end of a line.
Commands may be written on more than one line. Continuations should be indented at least one space.
You may use blank lines between commands, but not within commands.
Words and keywords must be separated by spaces or special symbols. Where one space is allowed, multiple spaces may be used.
For example, these work equally well:
compute kpl1=mpg/(0.621371*3.78541).
compute kpl2 = mpg / (0.621371 * 3.78541) .
execute.
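The layout guidelines above can also be combined in a short sketch like the following. The file path and variable names are reused from the earlier examples and are assumptions about your own data.

```spss
get file = "Y:\spss\data\cars.sav"
  /keep = mpg weight.

descriptives variables = mpg weight
  /statistics = mean stddev.
```

Each command starts in column one, continuation lines are indented, the period comes at the end of the last line, and the only blank line falls between the two commands.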
Capitalization
Capitalization does not matter in keywords or in variable names. It does matter in data values, and may matter in file names, depending on the operating system you are working on (case matters on Linux, but not on Windows).
SPSS will use the capitalization you supply for things like new variable names, variable labels, and value labels, which will affect how these are displayed in SPSS windows and in output. However, subsequent syntax use of the variable name can be in any case: to SPSS the variables "Salary" and "salary" are the same thing.
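As a small sketch of this behavior, assume the working data set already contains the variable mpg. KPL is a hypothetical name for a new variable, and the label text is likewise only an illustration.

```spss
* KPL is created in upper case, and is displayed that way.
compute KPL = mpg / (0.621371 * 3.78541).
* Later references to the same variable may use any case.
variable labels kpl "Kilometers per liter".
descriptives variables = Kpl.
```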
Further Reading
See the Universals chapter in the Command Syntax Reference, the sub-section "Commands" (page 37 in the SPSS 23 manual).
For a more accurate description of the default options for many commands, see the SPSS 22 Command Syntax Reference.
Last Revised: 7/13/2016
Comments in Syntax
A comment is any text that is to be ignored by the command processor. Comments are generally used in two ways: to add explanatory notes to a command file, and to disable parts of your code.
In SPSS, comments work at two levels: as a whole "command" (begins with a keyword/symbol, ends with a period), or within a line.
A command-level comment begins with the keyword COMMENT, or with an asterisk, "*". It ends with a period at the end of a line.
A within-line comment begins with a forward slash-asterisk, "/*", and ends at the end of the line or at an asterisk-slash, "*/", whichever comes first. These do not continue over more than one line. A line of syntax with only a within-line comment is treated as a blank line (so it is not allowed within a command).
Here are all three forms of comment: