In the example SAS program, these lines create the dataset CLASS from raw data input:
DATA CLASS;
INPUT NAME $ SEX $ AGE HEIGHT WEIGHT;
CARDS;
JOHN M 12 59.0 99.5
JAMES M 12 57.3 83.0
... (more data lines)
In the example SAS program, these lines call two SAS procedures to analyze the CLASS dataset:
PROC PRINT;
PROC MEANS;
VARIABLES HEIGHT WEIGHT;
PROC MEANS DATA=CLASS;
VAR HEIGHT WEIGHT;
The VAR or VARIABLES statement can be used with most procedures to indicate which variables are to be analyzed. If this statement is omitted, the default is to include all variables of the appropriate type (character or numeric) for the given analysis.
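As a minimal sketch of the default described above, assuming the CLASS dataset from the example program has already been created: the first step omits the VAR statement, so PROC MEANS summarizes every numeric variable (AGE, HEIGHT and WEIGHT); the second step restricts the analysis to two of them. The RUN statements are added here only to mark the end of each step.

```sas
/* No VAR statement: all numeric variables in CLASS are analyzed */
PROC MEANS DATA=CLASS;
RUN;

/* VAR statement: only HEIGHT and WEIGHT are analyzed */
PROC MEANS DATA=CLASS;
   VAR HEIGHT WEIGHT;
RUN;
```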
Some other statements that can be used with most SAS procedure steps are:
PROC SORT DATA=CLASS;
BY SEX;
PROC MEANS DATA=CLASS;
VAR HEIGHT WEIGHT;
BY SEX;
LABEL SEX='Gender';
If the DATA= option is not used, SAS procedures process the most recently created dataset. In the brief summaries below, the required portions of a PROC step are shown in bold. Only a few representative options are shown.
PROC CORR DATA=SASdataset options;
options: NOMISS ALPHA
VAR variable(s);
WITH variable(s);
PROC FREQ DATA=SASdataset;
TABLES variable(s) / options;
options: NOCOL NOROW NOPERCENT
OUTPUT OUT=SASdataset;
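For illustration, here is one possible PROC FREQ step on the CLASS dataset from the example program. The choice of SEX and AGE as table variables is an assumption made for this sketch, not part of the original example.

```sas
PROC FREQ DATA=CLASS;
   TABLES SEX;                               /* one-way frequency table */
   TABLES SEX*AGE / NOCOL NOROW NOPERCENT;   /* two-way table, cell counts only */
RUN;
```

The three options suppress the column, row and overall percentages that would otherwise be printed in each cell of the two-way table.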
PROC MEANS DATA=SASdataset options;
options: N MEAN STD MIN MAX SUM VAR CSS USS
VAR variable(s);
BY variable(s);
OUTPUT OUT=SASdataset keyword=variablename ... ;
Statistical options on the PROC MEANS statement determine
which statistics are printed. The (optional) OUTPUT statement
is used to create a SAS dataset containing the values of these
statistics.
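A short sketch of how the statistical options and the OUTPUT statement work together, assuming the CLASS dataset from the example program. The output dataset name STATS and the variable names MHT, MWT, SHT and SWT are hypothetical, chosen only for this illustration.

```sas
PROC MEANS DATA=CLASS N MEAN STD;       /* print only N, mean and std. deviation */
   VAR HEIGHT WEIGHT;
   OUTPUT OUT=STATS                     /* STATS is a hypothetical dataset name */
          MEAN=MHT MWT                  /* means of HEIGHT and WEIGHT, in VAR order */
          STD=SHT SWT;                  /* standard deviations, same order */
RUN;

PROC PRINT DATA=STATS;                  /* one observation holding the saved statistics */
RUN;
```

The names listed after each keyword are matched, in order, to the variables on the VAR statement.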
PROC UNIVARIATE DATA=SASdataset options;
options: PLOT
VAR variable(s);
BY variable(s);
OUTPUT OUT=SASdataset keyword=variablename ... ;
PROC ANOVA DATA=SASdataset options;
CLASS variable(s);
MODEL dependent(s) = effect(s);
PROC GLM DATA=SASdataset options;
CLASS variable(s);
MODEL dependent(s) = effect(s);
OUTPUT OUT=SASdataset keyword=variablename ... ;
PROC REG DATA=SASdataset options;
MODEL dependent(s) = regressors / options;
PLOT variable | keyword. * variable | keyword. = symbol ;
OUTPUT OUT=SASdataset P=name R=name ... ;
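As an example of the PROC REG outline, the step below regresses WEIGHT on HEIGHT and AGE in the CLASS dataset from the example program and saves the predicted values and residuals. The choice of model and the names RESULTS, PREDWT and RESID are assumptions made for this sketch.

```sas
PROC REG DATA=CLASS;
   MODEL WEIGHT = HEIGHT AGE;               /* dependent = regressors */
   OUTPUT OUT=RESULTS P=PREDWT R=RESID;     /* P= predicted values, R= residuals */
RUN;
QUIT;                                       /* PROC REG stays active until QUIT */

PROC PRINT DATA=RESULTS;
   VAR NAME WEIGHT PREDWT RESID;
RUN;
```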
PROC CHART DATA=SASdataset options;
VBAR variable / options;
HBAR variable / options;
options: MIDPOINTS= GROUP= SUMVAR=
PROC PLOT DATA=SASdataset options;
options: HPERCENT= VPERCENT=
PLOT yvariable * xvariable = symbol / options;
PLOT (yvariables) * (xvariables) = symbol / options ;
PLOT options: BOX OVERLAY VREF= HREF=
BY variable(s) ;
Note that the parenthesized form in the PLOT statement plots
each y-variable listed against each x-variable.
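A small sketch of both PLOT forms, assuming the CLASS dataset from the example program. Using SEX as the plotting symbol is an illustrative choice.

```sas
PROC PLOT DATA=CLASS;
   /* Parenthesized form: each listed y-variable against AGE.
      Equivalent to: PLOT HEIGHT*AGE WEIGHT*AGE;               */
   PLOT (HEIGHT WEIGHT) * AGE;

   /* Single plot; each point is drawn with the first character
      of that observation's value of SEX                        */
   PLOT WEIGHT * HEIGHT = SEX;
RUN;
QUIT;
```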
PROC PRINT DATA=SASdataset options;
options: UNIFORM LABEL SPLIT='char'
VAR variable(s);
BY variable(s);
SUM variable(s);
PROC SORT DATA=SASdataset options;
options: OUT=
BY variable(s);
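The PROC SORT and PROC PRINT outlines can be combined as in the sketch below, which assumes the CLASS dataset from the example program. SORTED is a hypothetical dataset name. With OUT=, the sorted copy is written to a new dataset and CLASS itself is left in its original order.

```sas
PROC SORT DATA=CLASS OUT=SORTED;     /* SORTED is a hypothetical dataset name */
   BY SEX AGE;
RUN;

PROC PRINT DATA=SORTED;
   VAR NAME AGE HEIGHT WEIGHT;
   BY SEX;                           /* separate listing for each value of SEX */
RUN;
```

A BY statement in a procedure such as PROC PRINT expects the data to be sorted by the BY variables already, which is why the PROC SORT step comes first.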
Some of the (many) statements that can be used in the DATA step are:
DATA SASdataset(s);
INPUT NAME $ SEX $ AGE HEIGHT WEIGHT;
Column input reads data in specified columns. Use column input when your data is not separated by blanks, to read character fields longer than 8 characters, or when you do not want to read all the information on each data line.
INPUT NAME $1-8 SEX $11 AGE 13-14 HEIGHT 16-19 WEIGHT 22-25;
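To make the column positions concrete, here is a sketch in which the two data lines from the example program have been re-spaced so that each field falls in the columns named on the INPUT statement. The dataset name CLASS2 and the exact spacing of the data lines are assumptions for this illustration.

```sas
DATA CLASS2;                          /* CLASS2 is a hypothetical dataset name */
   INPUT NAME $ 1-8 SEX $ 11 AGE 13-14 HEIGHT 16-19 WEIGHT 22-25;
CARDS;
JOHN      M 12 59.0  99.5
JAMES     M 12 57.3  83.0
;

PROC PRINT DATA=CLASS2;
RUN;
```

NAME occupies columns 1-8, SEX column 11, AGE columns 13-14, HEIGHT columns 16-19 and WEIGHT columns 22-25. The data lines must not be indented, because leading blanks would shift every field out of its columns.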
data newclass;
set class;
Symbol   Operation        Example
**       Exponentiation   Y = X**2;
*        Multiplication   AREA = LEN * WIDTH;
/        Division         DENSITY = MASS / VOLUME;
+        Addition         PRICE = COST + MARKUP;
-        Subtraction      COST = PRICE - MARKUP;
Use parentheses to indicate grouping in complex expressions:
AVG = (TEST1 + 2*TEST2 + 5*FINAL) / 8 + BONUS;
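A worked sketch of why the parentheses matter. The scores below are hypothetical values chosen only to make the arithmetic easy to follow. Without parentheses, the division applies only to 5*FINAL, because multiplication and division are evaluated before addition.

```sas
DATA _NULL_;
   /* Hypothetical scores, for illustration only */
   TEST1 = 80;  TEST2 = 90;  FINAL = 70;  BONUS = 2;

   AVG   = (TEST1 + 2*TEST2 + 5*FINAL) / 8 + BONUS;   /* (80+180+350)/8 + 2 = 78.25   */
   NOPAR =  TEST1 + 2*TEST2 + 5*FINAL  / 8 + BONUS;   /* 80 + 180 + 43.75 + 2 = 305.75 */

   PUT AVG= NOPAR=;       /* writes both values to the SAS log */
RUN;
```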
IF expression THEN statement;
ELSE statement;
The ELSE statement is optional. The IF and THEN parts together form a single statement. For example,
If age < 13 then group = 'preteen';
else group = 'teen';
If sex = 'F' then SX = 1; /* Dummy variable for sex */
else SX = 0;
SX = (sex='F'); /* same as above (if no missing) */
SAS comparison operators are shown below. You can use either
the symbol or the two-letter abbreviation.
Symbol    Abbrev    Meaning
<, <=     LT, LE    less than, less than or equal
>, >=     GT, GE    greater than, greater than or equal
=, ^=     EQ, NE    equal, not equal
A special form of the IF statement is used for subsetting a dataset. To extract the males from the CLASS dataset:
DATA MALES;
SET CLASS;
IF SEX='M';
The statement IF SEX='M'; is equivalent to each of the statements:
IF SEX='M' THEN OUTPUT;
IF SEX^='M' THEN DELETE;
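The two-letter abbreviations can be used anywhere the symbols can, including in subsetting. The sketch below assumes the CLASS dataset from the example program. OLDER is a hypothetical dataset name, and the age cutoff is taken from the preteen/teen example.

```sas
DATA OLDER;                         /* OLDER is a hypothetical dataset name */
   SET CLASS;
   IF AGE GE 13;                    /* subsetting IF; same as: IF AGE >= 13; */
RUN;

DATA MALES;
   SET CLASS;
   IF SEX NE 'M' THEN DELETE;       /* same as: IF SEX ^= 'M' THEN DELETE; */
RUN;
```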
data class;
* Read in the variables;
input name $ sex $ age height weight;
/* ignore next statement
age = age + 3;
*/
DATA CLASS;
INPUT NAME $ SEX $ AGE HEIGHT WEIGHT;
If age < 13 then group = 'preteen';
else group = 'teen';
logwt = log10(weight); /* transform variables */
rootht= sqrt(height);
CARDS;
JOHN M 12 59.0 99.5
JAMES M 12 57.3 83.0
... (more data lines)