SYMBOL statement

You can use a SYMBOL statement anywhere in your SAS program. SYMBOL statements give PROC GPLOT information about plot characters, plot lines, color, and interpolation (smoothing of lines). [SAS/GRAPH User's Guide: pg. 60-70] The information contained in a SYMBOL statement is used by PROC GPLOT in three different ways. The general form of a SYMBOL statement is:
 SYMBOLn options;
where
n
is a number ranging from 1 to 30. Each SYMBOL statement remains in effect until you specify another SYMBOL statement ending in the same number. If you do not specify a number following the keyword SYMBOL, SYMBOL1 is assumed.

options
allow you to specify the plot characters, plot lines, color, and interpolation. SYMBOL statements are additive; that is, if you specify a given option in a SYMBOL statement and then omit that option in a later SYMBOL statement ending in the same number, the option remains in effect. To turn off all options specified in previous SYMBOL statements, you can specify all options in a new SYMBOLn statement, use the keyword SYMBOLn followed by a semicolon, or specify a null value. A comma can be used (but is not required) to separate a null parameter from the next option.
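The additive behavior described above can be sketched as follows. This is a minimal illustration, not taken from the User's Guide; the data set WORK.DEMO and the variables X and Y are hypothetical names made up for the example.

```sas
/* Hypothetical data set and variables, used only for illustration */
data work.demo;
   input x y @@;
   datalines;
1 2  2 4  3 3  4 6  5 5
;
run;

symbol1 v=star c=red i=join;   /* set plot character, color, and interpolation */
symbol1 c=blue;                /* only the color changes; V=STAR and I=JOIN remain in effect */

proc gplot data=work.demo;
   plot y*x;
run;
quit;

symbol1;                       /* keyword plus semicolon: turns off all SYMBOL1 options */
```

Because the second SYMBOL1 statement omits V= and I=, the plot should show blue stars joined by straight lines.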
The options below can appear in the SYMBOL statement.

General Options

COLOR=color
C=color
specifies the color to use for the corresponding plot specification. Both the points and the line will have this color. If you omit the C= value, the first color in the COLORS= list is used.

V=symbol
gives the plot character for the corresponding plot specifications. Possible V= values are the letters A through W, the numbers 0 through 9, and the special symbols shown in Figure 5.2 in the SAS/GRAPH User's Guide. [SAS/GRAPH User's Guide: pg. 62] The special symbols include STAR, PLUS, SQUARE, DIAMOND, TRIANGLE, etc., and are designed so they are centered at the plotting point. Note: If you use the special symbol comma (,) with V=, you must enclose the comma in quotes. If you omit the V= value, V=NONE is used, so individual points are not marked. This is useful for drawing smooth curves, in conjunction with the I= options.

F=font
W=width
H=height
F= specifies the font from which the value specified with V= is to be drawn. W= specifies the thickness of any interpolated lines. H= specifies the height of the characters.

You can specify height as n PCT (where n is expressed as a percentage of the display area); n IN (where n is expressed in inches); n CM (where n is expressed in centimeters); or n CELL (where n is expressed in character cell units). If you do not specify a unit for height, the value specified with the GUNIT= option of the GOPTIONS statement is used. If you did not specify a GUNIT= value, the default unit, character cells, is used.
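As a hedged sketch of the general options, the example below assigns a different plot character, color, and height to each of two groups. The data set WORK.GROUPS and the variables X, Y, and GROUP are hypothetical; the option values are arbitrary choices for illustration.

```sas
/* Hypothetical data: two groups (A and B) measured at the same X values */
data work.groups;
   input group $ x y @@;
   datalines;
A 1 2  A 2 4  A 3 5  B 1 3  B 2 3  B 3 6
;
run;

symbol1 v=star c=red  h=3 pct  w=2 i=join;  /* height as a percentage of the display area */
symbol2 v=','  c=blue h=0.5 cm w=1 i=join;  /* the comma symbol must be enclosed in quotes */

proc gplot data=work.groups;
   plot y*x=group;   /* SYMBOL1 is used for the first GROUP value, SYMBOL2 for the second */
run;
quit;
```

Specifying C= in every SYMBOL statement keeps the pairing between SYMBOLn statements and group values easy to follow.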

REPEAT=number
R=number
specifies the number of times the SYMBOL statement is to be reused. See the description in the PATTERN Statement.

L=number
The L= option allows you to specify the kind of line that is drawn on the plot. [SAS/GRAPH User's Guide: pg. 63] The following are some possible L= values:
1 a solid line (the default value when L= is omitted)
2-32 various dashed lines
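A minimal sketch of L= in use, distinguishing two curves by line type rather than color. The data set WORK.LINES and its variables are hypothetical names chosen for the example.

```sas
/* Hypothetical data: two series identified by SERIES */
data work.lines;
   input series x y @@;
   datalines;
1 1 2  1 2 3  1 3 5  2 1 4  2 2 4  2 3 3
;
run;

symbol1 i=join v=none c=black l=1;   /* solid line */
symbol2 i=join v=none c=black l=2;   /* one of the dashed line types */

proc gplot data=work.lines;
   plot y*x=series;
run;
quit;
```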

Interpolation Options

I=interpolation
specifies whether to leave the plotted points unconnected; connect the plotted points with either straight lines or a smoothed line; use regression to fit a line to the points; or draw vertical lines connecting the points to a horizontal line at zero. [SAS/GRAPH User's Guide: pg. 64-70]
Possible values for the I= option are given below.

Miscellaneous Interpolation Options

I=NONE
requests that the points on the plot be left unconnected.

I=JOIN
requests that the points on the plot be connected by a straight line.

I=NEEDLE
draws a vertical line from each point on the plot to a horizontal line at zero on the Y axis.

I=STEPxx
requests that a step function be used to plot the data. When STEPL is used, the data point is on the left of the step; with STEPR, the point is on the right; with STEPC, the point is in the center of the step. If L, R, or C is not specified, L is assumed. Optionally, to connect the steps with a vertical line, follow the STEPx specification with a J.
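The step specification above might be used as in this sketch, which requests a step function with each data point at the left of its step and vertical joins between steps. WORK.STEPS, X, and Y are hypothetical names.

```sas
/* Hypothetical data for a step plot */
data work.steps;
   input x y @@;
   datalines;
1 1  2 3  3 2  4 5  5 4
;
run;

symbol1 i=steplj v=square c=blue;   /* STEP + L (point on left) + J (join the steps) */
/* Alternatives for comparison: i=none, i=join, or i=needle */

proc gplot data=work.steps;
   plot y*x;
run;
quit;
```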

I=Mxxxxx
fills a figure that your plotted points define. See the V= option in the PATTERN Statement for more information about the Mxxxxx value.

I=STDkxxx
is used when multiple Y values occur for each level of X and you want to join the mean Y value with (+ or -) 1, 2, or 3 standard deviations for each X. The value of k can be 1, 2, or 3 standard deviations. If you do not specify a value for k, the default value of two standard deviations is used. The xxx values can be M, P, J, B, and T. The sample variance is computed about each mean, and from it the standard deviation, s_y, is computed; or, if you specify I=STDM, the standard error of the mean is computed instead.

If you specify I=STDP, sample variances are computed using a pooled estimate, as in a one-way ANOVA model.

If you specify I=STDJ, the means are connected from bar to bar. Use B to request bars (rather than lines) to connect the points for each X. T specifies that tops and bottoms should be added to each line.

B and T should not be used together, but other combinations of M, P, J, and B or T are acceptable.

Note: If VAXIS= is not specified, the vertical axis ranges from the minimum to maximum Y value in the data. If the requested number of standard deviations from the mean covers a range of values that exceeds the maximum or is less than the minimum, the STD lines are cut off at the minimum and maximum Y values. When this cutoff occurs, you should rescale the axis using the VAXIS= specification.
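A hedged sketch of the I=STDkxxx form together with VAXIS=: the example requests bars of one standard error of the mean (M), joins the means (J), and adds tops and bottoms (T). The data set WORK.REPS, its values, and the axis range are assumptions made for illustration.

```sas
/* Hypothetical data: three Y values at each level of X */
data work.reps;
   input x y @@;
   datalines;
1 10  1 12  1 14
2 15  2 19  2 17
3 22  3 20  3 27
;
run;

symbol1 i=std1mjt v=none c=blue;   /* STD + 1 + M + J + T */

proc gplot data=work.reps;
   /* An explicit VAXIS= range leaves room so the STD lines are not cut off */
   plot y*x / vaxis=0 to 40 by 10;
run;
quit;
```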

I=HILOxxx
is used when multiple Y values occur for each level of X. The value of xxx can be T, B, C, or J, or combinations of these letters (except T with B). When you specify I=HILOxxx, the minimum and maximum Y values at each X level are connected by a solid line. For each X value, the mean Y value is marked with a tick.
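As a sketch only, the following uses I=HILO with T and J on data that has several Y values per X. The data set WORK.RANGE and its variables are hypothetical, and the reading of T and J (tops and bottoms, joined means) is assumed to match their meaning under I=STDkxxx.

```sas
/* Hypothetical data: several Y values at each level of X */
data work.range;
   input x y @@;
   datalines;
1 4  1 7  1 9
2 5  2 6  2 11
3 8  3 12  3 10
;
run;

symbol1 i=hilotj v=none c=black;   /* HILO + T + J */

proc gplot data=work.range;
   plot y*x;
run;
quit;
```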

R-series Interpolation Option

I=Rxxxxxxx
specifies the characteristics of the regression analysis to use for fitting a line to the plotted points. The regression equations can be linear, quadratic, or cubic; confidence limits can be drawn at one of three levels. The points on the plot do not necessarily fall on the regression line.

The form of the Rxxxxxxx value determines whether linear, quadratic, or cubic regression is used to fit a plot line; whether the intercept is set to zero to force the line through the origin; whether additional lines representing confidence limits should be drawn, and, if so, what confidence level should be used.

              
      R x x x x x x x
        L
        Q
        C
          0
            C L I
            C L M
                  9 0
                  9 5
                  9 9
As shown above, valid values for the first character after the R are L, Q, and C.

Confidence limits

When you request regression, you can ask that lines representing confidence limits for individual or mean predicted values be shown on the plot. Follow the L, Q, C, or zero in the Rxxxxxxx specification with the characters CLM to show confidence limits for mean predicted values; use CLI to show confidence limits for individual predicted values.

To specify the confidence level, include the numbers 90, 95, or 99 after CLI or CLM. If the confidence level is omitted, a value of 95 is used. The line style used for the confidence limit lines is determined by adding 1 to the L= value.
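The Rxxxxxxx pieces can be assembled as in this sketch, which fits a linear regression with 95% confidence limits for mean predicted values. WORK.SCATTER and its values are hypothetical; the commented alternative shows another combination built from the same pieces.

```sas
/* Hypothetical scatter data */
data work.scatter;
   input x y @@;
   datalines;
1 2.1  2 3.9  3 6.2  4 7.8  5 10.1  6 12.2  7 13.8  8 16.1
;
run;

symbol1 i=rlclm95 v=plus c=black l=1;   /* R + L (linear) + CLM + 95 */
/* Alternative: i=rq0cli99 = quadratic, through the origin, 99% limits for individual values */

proc gplot data=work.scatter;
   plot y*x;
run;
quit;
```

With L=1 for the fitted line, the confidence limit lines are drawn with line type 2, following the add-1 rule described above.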

Spline Interpolation Options

Several spline methods are available for smoothing points in a plot. SPLINE, which draws the smoothest line and is the least expensive of the non-trivial methods, is the method of choice for most plots. The Lagrange methods are useful chiefly when your data consist of tabulated, precise values. SM is the method to use for smoothing noisy data. Parametric forms of these methods are also available.

I=SPLINE
specifies that the plot line be smoothed using a spline routine. The points on the plot fall on the line. When you use I=SPLINE, the plot line is smoothed using a cubic spline method with continuous second derivatives. The polynomial passes through the plotted points and matches the first and second derivatives of neighboring segments at the points. If the values of the horizontal variable are not strictly increasing, the parametric interpolation method SPLINEP is used instead (Pizer, 1975).

I=SPLINEP
results in the use of a parametric spline method with continuous second derivatives. Using the method described above for the SPLINE option, a parametric spline is fitted to both the horizontal and vertical values. The parameter used is the distance between points, t = sqrt(x^2 + y^2). If two points are so close together that the computations overflow, the second point is not used.

I=Lx
specifies that the plot line be smoothed using a Lagrange interpolation. When you specify I=L1, I=L3, or I=L5, the plot is smoothed using a Lagrange interpolation of degree 1, 3, or 5. A polynomial of the specified degree (1, 3, or 5) is fitted through the nearest 2, 4, or 6 points. In general, the first derivative is not continuous. If the values of the horizontal variable are not strictly increasing, the corresponding parametric method (L1P, L3P, or L5P) is used.

Specifying I=L1P, I=L3P, or I=L5P results in a parametric Lagrange interpolation of degree 1, 3, or 5. The method described above for the I=Lx option is used, but a parametric interpolation of degree 1, 3, or 5 is done on both the horizontal and vertical variables, using the distance between points as a parameter.

I=SMxx
specifies that a smooth line be fit to noisy data using a spline routine. The points on the plot do not necessarily fall on the line. Specifying I=SMxx results in fitting a cubic spline that minimizes a linear combination of the sum of squares of the residuals of fit and the integral of the square of the second derivative (Reinsch, 1967). The value xx can range from 01 to 99 and determines the relative importance of the two components: the larger the value, the smoother the fitted curve.

I=SMxxP
results in the use of a parametric cubic spline as described in I=SMxx.

Sorting the x variable

You can add the letter S to the end of any of the spline interpolation methods above if you want the procedure to sort by the x-axis variable before plotting.
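To compare an interpolating spline with a smoothing spline, and to show the S suffix on unsorted data, a sketch might look like this. The data set WORK.NOISY, its values, and the smoothing value 60 are arbitrary assumptions for illustration.

```sas
/* Hypothetical noisy data, deliberately not sorted by X */
data work.noisy;
   input x y @@;
   datalines;
3 5.2  1 2.1  5 6.8  2 4.4  4 5.1  7 9.3  6 8.7
;
run;

symbol1 i=splines v=star c=black l=1;   /* SPLINE + S: sort by X, curve passes through the points */
symbol2 i=sm60s   v=none c=black l=2;   /* SM60 + S: smoother curve, need not pass through the points */

proc gplot data=work.noisy;
   plot y*x=1 y*x=2 / overlay;   /* y*x=n requests the options from SYMBOLn */
run;
quit;
```

Overlaying both requests on one set of axes makes the difference between the two methods visible directly.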

Prepared by Michael Friendly
Email<friendly@yorku.ca>
