\[ Y = \alpha + \beta X + e_y \]

However, if both X and Y are contaminated by error, and you want to estimate the relation between their true, error-free parts, the appropriate model consists of the structural equations,

\[
\begin{aligned}
Y &= \alpha + \beta F_X + E_Y \\
X &= F_X + E_X
\end{aligned}
\tag{1}
\]

\[ \mathrm{cov}(F_X, E_X) = \mathrm{cov}(F_X, E_Y) = \mathrm{cov}(E_X, E_Y) = 0 \]
proc calis;
   lineqs y = beta fx + ey,
          x =      fx + ex;
   std    fx = vfx,
          ey = vey,
          ex = vex;
The LINEQS statement specifies the structural equations for Y and
X. The STD statement specifies the variance of each latent
variable in the model. When a variance is given as a name, as in
fx = vfx, it is treated as a free parameter and is estimated along
with the other parameters, such as BETA.
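As a rough check on what this specification asks of the data, the moments implied by model (1) can be written out by hand. The sketch below assumes only the zero-covariance conditions stated with model (1); the right-hand sides use the parameter names from the STD statement (vfx, vey, vex) as stand-ins for the corresponding variances.

```latex
\begin{aligned}
\mathrm{var}(X)    &= \mathrm{var}(F_X) + \mathrm{var}(E_X)           &&= \mathit{vfx} + \mathit{vex} \\
\mathrm{cov}(X, Y) &= \beta\,\mathrm{var}(F_X)                        &&= \beta\,\mathit{vfx} \\
\mathrm{var}(Y)    &= \beta^{2}\,\mathrm{var}(F_X) + \mathrm{var}(E_Y) &&= \beta^{2}\,\mathit{vfx} + \mathit{vey}
\end{aligned}
```

Counting both sides: two observed variables supply three distinct variances and covariances, while the specification has four free parameters (BETA, vfx, vey, vex). Taken on its own, this suggests the covariance matrix of X and Y cannot pin down all four values without some further constraint or additional measured variables.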
Two tests, X1 and X2, had 15 items and were administered under liberal time limits; tests Y1 and Y2 had 75 items and were given under time pressure. The data (from Lord, 1957) consist of the covariance matrix of the scores of 649 examinees given these four tests. The covariance matrix is read in with the following DATA step:
data lord(type=cov);
   input _type_ $ _name_ $ x1 x2 y1 y2;
   datalines;
n   .   649       .       .       .
cov x1  86.3937   .       .       .
cov x2  57.7751 86.2632   .       .
cov y1  56.8651 59.3177 97.2850   .
cov y2  58.8986 59.6683 73.8201 97.8192
;

(Note that the first observation, with _TYPE_='N', is necessary to establish the sample size for a covariance matrix.)
The statistical question is to determine a model which accounts
for the variances and covariances among these tests. One model
states that tests X1 and X2 are determined by a single common
factor, F1, and that Y1 and Y2 are determined by a second common
factor, F2. The two common factors are assumed to be correlated,
and the goal is to estimate this correlation. This model can be
specified by the following structural equations:
\[
\begin{aligned}
X_1 &= \beta_1 F_1 + e_1 \\
X_2 &= \beta_2 F_1 + e_2 \\
Y_1 &= \beta_3 F_2 + e_3 \\
Y_2 &= \beta_4 F_2 + e_4
\end{aligned}
\tag{2}
\]
\[
\begin{bmatrix} X_1 \\ X_2 \\ Y_1 \\ Y_2 \end{bmatrix}
=
\begin{bmatrix}
\beta_1 & 0 \\
\beta_2 & 0 \\
0 & \beta_3 \\
0 & \beta_4
\end{bmatrix}
\begin{bmatrix} F_1 \\ F_2 \end{bmatrix}
+
\begin{bmatrix} e_1 \\ e_2 \\ e_3 \\ e_4 \end{bmatrix}
= \boldsymbol{\Lambda} F + e
\]
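The matrix form makes it straightforward to write down the covariance structure that model (2) implies. The following is a sketch under the usual factor-analytic assumptions, which are not spelled out above: the errors are mutually uncorrelated and uncorrelated with the factors, and each factor is scaled to unit variance so that the factor covariance is the correlation. The symbols Sigma, Phi, Psi, psi_i and rho are notation introduced here for illustration.

```latex
\Sigma = \Lambda \Phi \Lambda' + \Psi ,
\qquad
\Phi = \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix},
\qquad
\Psi = \mathrm{diag}(\psi_1, \psi_2, \psi_3, \psi_4)

% lower triangle of the implied covariance matrix of (X_1, X_2, Y_1, Y_2)
\Sigma =
\begin{bmatrix}
\beta_1^2 + \psi_1 \\
\beta_1 \beta_2      & \beta_2^2 + \psi_2 \\
\beta_1 \beta_3 \rho & \beta_2 \beta_3 \rho & \beta_3^2 + \psi_3 \\
\beta_1 \beta_4 \rho & \beta_2 \beta_4 \rho & \beta_3 \beta_4 & \beta_4^2 + \psi_4
\end{bmatrix}
```

Under these assumptions there are nine free parameters (four loadings, four error variances, and rho) against the ten distinct elements of a 4 x 4 covariance matrix, which leaves one degree of freedom for testing the fit.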
Model (2) can be estimated by PROC CALIS by transcribing the structural equations into the LINEQS statement shown below:
title "Lord's data: H4- unconstrained two-factor model";
proc calis data=lord cov;
   lineqs x1 = beta1 F1 + e1,
          x2 = beta2 F1 + e2,
          y1 = beta3 F2 + e3,
          y2 = beta4 F2 + e4;
   std    F1 F2 = 1,
          e1 e2 e3 e4 = ve1 ve2 ve3 ve4;
   cov    F1 F2 = rho;
run;
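Before looking at the fitted value of RHO, it can help to have a rough figure to compare it with. With both factor variances fixed at 1 and uncorrelated errors, the model gives cov(x1,x2) = beta1 beta2, cov(y1,y2) = beta3 beta4, and each cross-covariance such as cov(x1,y1) = beta1 beta3 rho, so certain ratios of observed covariances equal rho squared if the model holds exactly. The DATA step below is only a hand calculation of those ratios from the covariances in the LORD data set; it is not the estimate PROC CALIS computes, and the variable names (c_x1x2, rho_a, and so on) are made up for this illustration.

```sas
*-- Rough moment-based check on rho (illustration only);
data _null_;
   /* covariances copied from the LORD data set */
   c_x1x2 = 57.7751;   c_y1y2 = 73.8201;
   c_x1y1 = 56.8651;   c_x2y2 = 59.6683;
   c_x1y2 = 58.8986;   c_x2y1 = 59.3177;

   /* each ratio equals rho**2 if the two-factor model holds exactly */
   rho_a = sqrt( (c_x1y1 * c_x2y2) / (c_x1x2 * c_y1y2) );
   rho_b = sqrt( (c_x1y2 * c_x2y1) / (c_x1x2 * c_y1y2) );

   /* write both values to the SAS log */
   put rho_a= rho_b=;
run;
```

Both values come out near 0.9. The two ratios need not agree exactly with each other or with the fitted RHO, because sample covariances only approximately satisfy the model.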
Figure 6: Path diagram for Lord's vocabulary tests
In the CALIS procedure, a path diagram is described by a RAM statement. The model for the vocabulary tests shown in Figure 6 is specified by the RAM statement shown below. Each line in the statement describes one link or arrow between nodes in the path diagram. The last item on the line is either the name of a parameter to be estimated (e.g., BETA1) or a constant, indicating a fixed parameter (e.g., 1.0, for the variances of F1 and F2).
*-- Same model represented as a path (RAM) model;
proc calis data=lord cov;
   /*    number of heads on arrow in path diagram (matrix number)
         |  node pointed to (matrix row)
         |  |  node arrow leaves from (matrix col)
         |  |  |  parameter or value
         |  |  |  |
         v  v  v  v      */
   ram   1  1  5  beta1,
         1  2  5  beta2,
         1  3  6  beta3,
         1  4  6  beta4,
         2  1  1  ve1,
         2  2  2  ve2,
         2  3  3  ve3,
         2  4  4  ve4,
         2  5  5  1.0,
         2  6  6  1.0,
         2  5  6  rho;