Survival Analysis in SAS

In many medical studies, the main outcome variable is the time to the occurrence of a particular event. In a randomized controlled trial of treatment for cancer, for example, surgery, radiation, and chemotherapy might be compared with respect to the time from randomization and the start of therapy until death. In this case the event of interest is the death of a patient, but in other situations it might be remission from a disease, relief from symptoms, or the recurrence of a particular condition. Such observations are generally referred to by the generic term survival data, even when the endpoint or event considered is not death. Such data generally require special techniques for their analysis for two main reasons:

  • Survival data are generally right skewed rather than normally distributed.
  • At the completion of the study, some patients may not have reached the endpoint of interest (death, relapse, etc.). Consequently, their exact survival times are not known. All that is known is that the survival time is longer than the time the subject was in the study. Such observations are called censored (right censored in this case).

In this lecture, we will examine the WHAS500 data set, which contains data from the Worcester Heart Attack Study. This study examined several factors, such as age, gender, and BMI, that may influence survival time after a heart attack. Follow-up time for all patients begins at the time of hospital admission after a heart attack and ends with death or loss to follow-up (censoring). We will use the following variables:

  • lenfol: length of follow up, terminated either by death or censoring
  • fstat: the censoring variable. 0 = loss to follow up, 1 = death
  • age: age at hospitalization
  • bmi: body mass index
  • hr: initial heart rate
  • gender: 0 = male, 1 = female
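
Before any modeling, it can help to confirm how these variables are coded. The sketch below is only illustrative: it assumes the surv library has been assigned to the folder containing whas500 (the LIBNAME statement appears in the first code cell of this lecture) and that fstat and gender are stored as numeric 0/1 variables, as the list above describes. The choice of summary statistics is arbitrary.

```sas
/* Frequency tables for the two binary variables: event status and gender */
proc freq data = surv.whas500;
tables fstat gender;
run;

/* Basic summaries for follow-up time and the continuous covariates */
proc means data = surv.whas500 n nmiss min median max;
var lenfol age bmi hr;
run;
```

If the coding matches the list, fstat = 1 counts deaths and fstat = 0 counts censored patients. The censoring summary reported later in this lecture (215 deaths, 285 censored) gives a value to compare against.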

The Survival and Hazard Functions

Traditionally, two important functions in statistics are the density function and the cumulative distribution function. Let T represent the survival time for uncensored patients, f represent the density, and $F(t)=Pr(T\leq t)$ be the cumulative distribution function.

In [2]:
LIBNAME surv "H:\BiostatCourses\PublicHealthComputing\Lectures\Week11Survival\SAS";

ODS SELECT Histogram;
proc univariate data = surv.whas500(where=(fstat=1));
var lenfol;
histogram lenfol / kernel;
run;
Out[2]:
The UNIVARIATE Procedure

Figure: Histogram for LENFOL

The histogram with a kernel density estimate of the (uncensored) survival times shows that the risk of death is highest shortly after the heart attack and decreases as time passes. (Notice that the distribution is right skewed, which is very typical for survival data.)

In [3]:
ODS SELECT cdfplot;
proc univariate data = surv.whas500(where=(fstat=1));
var lenfol;
cdfplot lenfol;
run;
Out[3]:
The UNIVARIATE Procedure

Figure: Cumulative Distribution Plot for LENFOL

The estimated cdf for these uncensored survival times shows that by around 200 days a patient has accumulated quite a bit of the risk of death (around 50%); that is, the probability of dying before 200 days is about 0.5. After this point the cdf increases more slowly. The cdf increases faster over periods with more deaths (where death is more likely).

With censored data, we cannot estimate either of these functions directly. With (right) censored data, we can instead estimate the survival and hazard rate functions. The survival function is

$$S(t)=1-F(t)=Pr(T>t)$$

which describes the probability of living longer than time t. In the presence of censoring, the survival function is estimated using a nonparametric estimator known as the Kaplan-Meier estimator. To calculate the Kaplan-Meier estimator, let

  • $t_1<t_2<\cdots$ represent the ordered distinct times at which deaths are observed
  • $d_j$ represent the number of deaths at time $t_j$
  • $r_j$ represent the number of patients still at risk just before time $t_j$ (alive and not censored)

Then the Kaplan-Meier estimator is given by

$$\hat{S}(t)=\prod_{t_j\leq t}\left(1-\dfrac{d_j}{r_j}\right)$$

The Greenwood estimator of the variance of the Kaplan-Meier estimator is

$$V\left(\hat{S}(t)\right)=\hat{S}^2(t)\sum_{t_j\leq t}\dfrac{d_j}{r_j(r_j-d_j)}$$

To get the Kaplan-Meier estimator in SAS, you will use PROC LIFETEST.

In [13]:
ODS SELECT SURVIVALPLOT;
proc lifetest data=surv.whas500 plots=survival(atrisk cb);
ODS OUTPUT ProductLimitEstimates = ple;
time lenfol*fstat(0);
run; 

PROC PRINT DATA=ple(obs=25);
RUN;
Out[13]:
The LIFETEST Procedure

Figure: Product-Limit Survival Curve with Number of Subjects at Risk and 95% Hall-Wellner Band

The PRINT Procedure

Data Set WORK.PLE

Obs STRATUM LENFOL Censor Survival Failure StdErr Failed Left
1 1 0.00 . 1.0000 0 0 0 500
2 1 1.00 0 . . . 1 499
3 1 1.00 0 . . . 2 498
4 1 1.00 0 . . . 3 497
5 1 1.00 0 . . . 4 496
6 1 1.00 0 . . . 5 495
7 1 1.00 0 . . . 6 494
8 1 1.00 0 . . . 7 493
9 1 1.00 0 0.9840 0.0160 0.00561 8 492
10 1 2.00 0 . . . 9 491
11 1 2.00 0 . . . 10 490
12 1 2.00 0 . . . 11 489
13 1 2.00 0 . . . 12 488
14 1 2.00 0 . . . 13 487
15 1 2.00 0 . . . 14 486
16 1 2.00 0 . . . 15 485
17 1 2.00 0 0.9680 0.0320 0.00787 16 484
18 1 3.00 0 . . . 17 483
19 1 3.00 0 . . . 18 482
20 1 3.00 0 0.9620 0.0380 0.00855 19 481
21 1 4.00 0 . . . 20 480
22 1 4.00 0 0.9580 0.0420 0.00897 21 479
23 1 5.00 0 . . . 22 478
24 1 5.00 0 0.9540 0.0460 0.00937 23 477
25 1 6.00 0 . . . 24 476
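
To connect the formulas to this listing, the estimates on the rows where Survival is non-missing can be reproduced by hand. The sketch below is illustrative only: the data set name km_by_hand and the variables t, d, r, s, gsum, and se are made up for this example, and the three (d, r) pairs are read off the WORK.PLE listing (8 deaths among 500 at risk on day 1, 8 among 492 on day 2, and 3 among 484 on day 3).

```sas
/* Hand computation of the Kaplan-Meier estimate and Greenwood standard error.
   t = death time, d = deaths at t, r = number at risk just before t.
   The values are taken from the first three death times in WORK.PLE. */
data km_by_hand;
input t d r;
retain s 1 gsum 0;
s = s * (1 - d / r);              /* running product: Kaplan-Meier estimate */
gsum = gsum + d / (r * (r - d));  /* running sum in Greenwood's formula */
se = s * sqrt(gsum);              /* standard error of the estimate */
datalines;
1 8 500
2 8 492
3 3 484
;
run;

proc print data = km_by_hand noobs;
var t d r s se;
format s 6.4 se 7.5;
run;
```

The printed s and se values should match the Survival and StdErr columns at days 1, 2, and 3 of the listing (0.9840, 0.9680, 0.9620 and 0.00561, 0.00787, 0.00855).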

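The TIME statement is required with PROC LIFETEST. Here, time lenfol*fstat(0) is read as: lenfol is the follow-up time, fstat is the censoring variable, and the value in parentheses (0) identifies the censored observations.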

The hazard function is another important function in survival analysis. The hazard function is defined as

$$h(t)=\lim_{\Delta t\downarrow 0}\dfrac{Pr(t\leq T< t+\Delta t\mid T\geq t)}{\Delta t}=\dfrac{f(t)}{S(t)}.$$

This function (approximately) describes the instantaneous risk of dying at time t, given that the patient has lived up to time t.
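
The second equality ties the hazard back to the functions defined earlier. As a short worked derivation (assuming $F$ is differentiable, so that $f(t)=F'(t)=-S'(t)$, and that $S(0)=1$), the hazard and survival functions determine one another:

$$h(t)=\dfrac{f(t)}{S(t)}=-\dfrac{S'(t)}{S(t)}=-\dfrac{d}{dt}\log S(t)\quad\Longrightarrow\quad S(t)=\exp\left(-\int_0^t h(u)\,du\right)$$

For example, a constant hazard $h(t)=\lambda$ gives $S(t)=\exp(-\lambda t)$, the exponential survival model.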

In [9]:
ODS SELECT HAZARDPLOT;
proc lifetest data=surv.whas500 plots=hazard(bw=200);
time lenfol*fstat(0);
run;
Out[9]:
The LIFETEST Procedure

Figure: Estimated Smoothed Hazard Curves

This type of "bathtub" shape for the hazard function is common. The risk of death is typically highest immediately following hospitalization and then drops, until eventually other factors such as age lead to the risk of death increasing again.

Comparing Survival Functions

The survival and hazard rate functions provide useful summaries of survival times for a single group of patients, but usually the interest is in comparing groups by comparing their survival curves (for example, comparing survival times of males vs females, or of a new vs an old cancer treatment). In the following, we compare the survival times after a heart attack for men vs women.

In [19]:
/* Label the numeric gender codes so the output shows male/female */
PROC FORMAT;
VALUE gdr 0 = "male" 1 = "female";
RUN;

ODS SELECT SurvivalPlot HomTests;
proc lifetest data=surv.whas500 atrisk plots=survival(atrisk cb) outs=outwhas500;
strata gender / test=(all);
time lenfol*fstat(0);
FORMAT gender gdr.;
run;
Out[19]:
The LIFETEST Procedure

Stratum 1: GENDER = female

Product-Limit Survival Estimates
Columns: LENFOL (* marks a censored time), Number at Risk, Observed Events, Survival, Failure, Survival Standard Error, Number Failed, Number Left
0.00   200 0 1.0000 0 0 0 200
1.00   . . . . . 1 199
1.00   200 2 0.9900 0.0100 0.00704 2 198
2.00   . . . . . 3 197
2.00   . . . . . 4 196
2.00   . . . . . 5 195
2.00   . . . . . 6 194
2.00   198 5 0.9650 0.0350 0.0130 7 193
3.00   . . . . . 8 192
3.00   193 2 0.9550 0.0450 0.0147 9 191
4.00   191 1 0.9500 0.0500 0.0154 10 190
6.00   . . . . . 11 189
6.00   190 2 0.9400 0.0600 0.0168 12 188
7.00   . . . . . 13 187
7.00   188 2 0.9300 0.0700 0.0180 14 186
10.00   . . . . . 15 185
10.00   186 2 0.9200 0.0800 0.0192 16 184
11.00   . . . . . 17 183
11.00   . . . . . 18 182
11.00   . . . . . 19 181
11.00   184 4 0.9000 0.1000 0.0212 20 180
14.00   180 1 0.8950 0.1050 0.0217 21 179
16.00   179 1 0.8900 0.1100 0.0221 22 178
19.00   . . . . . 23 177
19.00   . . . . . 24 176
19.00   178 3 0.8750 0.1250 0.0234 25 175
22.00   . . . . . 26 174
22.00   175 2 0.8650 0.1350 0.0242 27 173
31.00   173 1 0.8600 0.1400 0.0245 28 172
32.00   172 1 0.8550 0.1450 0.0249 29 171
33.00   171 1 0.8500 0.1500 0.0252 30 170
34.00   170 1 0.8450 0.1550 0.0256 31 169
37.00   169 1 0.8400 0.1600 0.0259 32 168
42.00   168 1 0.8350 0.1650 0.0262 33 167
46.00   167 1 0.8300 0.1700 0.0266 34 166
49.00   166 1 0.8250 0.1750 0.0269 35 165
53.00   165 1 0.8200 0.1800 0.0272 36 164
57.00   . . . . . 37 163
57.00   164 2 0.8100 0.1900 0.0277 38 162
64.00   162 1 0.8050 0.1950 0.0280 39 161
83.00   161 1 0.8000 0.2000 0.0283 40 160
93.00   160 1 0.7950 0.2050 0.0285 41 159
95.00   159 1 0.7900 0.2100 0.0288 42 158
101.00   158 1 0.7850 0.2150 0.0290 43 157
132.00   157 1 0.7800 0.2200 0.0293 44 156
134.00   156 1 0.7750 0.2250 0.0295 45 155
135.00   155 1 0.7700 0.2300 0.0298 46 154
137.00   154 1 0.7650 0.2350 0.0300 47 153
145.00   153 1 0.7600 0.2400 0.0302 48 152
146.00   152 1 0.7550 0.2450 0.0304 49 151
151.00   151 1 0.7500 0.2500 0.0306 50 150
197.00   150 1 0.7450 0.2550 0.0308 51 149
200.00   149 1 0.7400 0.2600 0.0310 52 148
226.00   148 1 0.7350 0.2650 0.0312 53 147
235.00   147 1 0.7300 0.2700 0.0314 54 146
274.00   146 1 0.7250 0.2750 0.0316 55 145
287.00   145 1 0.7200 0.2800 0.0317 56 144
289.00   144 1 0.7150 0.2850 0.0319 57 143
321.00   143 1 0.7100 0.2900 0.0321 58 142
328.00   142 1 0.7050 0.2950 0.0322 59 141
358.00   141 1 0.7000 0.3000 0.0324 60 140
359.00   . . . . . 61 139
359.00   140 2 0.6900 0.3100 0.0327 62 138
363.00   138 1 0.6850 0.3150 0.0328 63 137
371.00 * . 0 . . . 63 136
371.00 * 137 0 . . . 63 135
373.00 * 135 0 . . . 63 134
385.00   134 1 0.6799 0.3201 0.0330 64 133
386.00 * 133 0 . . . 64 132
392.00   132 1 0.6747 0.3253 0.0331 65 131
397.00 * 131 0 . . . 65 130
400.00 * 130 0 . . . 65 129
412.00 * 129 0 . . . 65 128
422.00   128 1 0.6695 0.3305 0.0333 66 127
424.00 * 127 0 . . . 66 126
442.00   126 1 0.6642 0.3358 0.0335 67 125
445.00 * . 0 . . . 67 124
445.00 * 125 0 . . . 67 123
446.00   123 1 0.6588 0.3412 0.0336 68 122
449.00 * 122 0 . . . 68 121
451.00 * 121 0 . . . 68 120
458.00 * 120 0 . . . 68 119
465.00   119 1 0.6532 0.3468 0.0338 69 118
478.00 * 118 0 . . . 69 117
479.00   117 1 0.6476 0.3524 0.0340 70 116
497.00 * 116 0 . . . 70 115
506.00 * 115 0 . . . 70 114
516.00 * 114 0 . . . 70 113
521.00 * 113 0 . . . 70 112
522.00 * 112 0 . . . 70 111
524.00 * 111 0 . . . 70 110
535.00   110 1 0.6417 0.3583 0.0342 71 109
535.00 * . 0 . . . 71 108
537.00   108 1 0.6358 0.3642 0.0344 72 107
542.00   107 1 0.6299 0.3701 0.0345 73 106
542.00 * . 0 . . . 73 105
550.00 * 105 0 . . . 73 104
551.00 * 104 0 . . . 73 103
552.00   103 1 0.6237 0.3763 0.0347 74 102
578.00 * 102 0 . . . 74 101
589.00 * 101 0 . . . 74 100
606.00 * 100 0 . . . 74 99
631.00 * 99 0 . . . 74 98
632.00   98 1 0.6174 0.3826 0.0350 75 97
646.00   97 1 0.6110 0.3890 0.0352 76 96
649.00   96 1 0.6047 0.3953 0.0354 77 95
659.00 * 95 0 . . . 77 94
662.00 * 94 0 . . . 77 93
670.00   93 1 0.5982 0.4018 0.0356 78 92
673.00   92 1 0.5916 0.4084 0.0358 79 91
704.00   91 1 0.5851 0.4149 0.0360 80 90
714.00   90 1 0.5786 0.4214 0.0362 81 89
725.00 * 89 0 . . . 81 88
865.00   88 1 0.5721 0.4279 0.0364 82 87
905.00   87 1 0.5655 0.4345 0.0365 83 86
920.00   86 1 0.5589 0.4411 0.0367 84 85
1065.00   85 1 0.5523 0.4477 0.0368 85 84
1096.00   84 1 0.5458 0.4542 0.0370 86 83
1103.00 * 83 0 . . . 86 82
1117.00 * 82 0 . . . 86 81
1126.00 * 81 0 . . . 86 80
1140.00 * 80 0 . . . 86 79
1152.00   79 1 0.5389 0.4611 0.0372 87 78
1157.00 * 78 0 . . . 87 77
1160.00 * 77 0 . . . 87 76
1161.00 * 76 0 . . . 87 75
1162.00 * 75 0 . . . 87 74
1165.00   74 1 0.5316 0.4684 0.0374 88 73
1170.00 * 73 0 . . . 88 72
1174.00   72 1 0.5242 0.4758 0.0376 89 71
1174.00 * . 0 . . . 89 70
1187.00 * . 0 . . . 89 69
1187.00 * 70 0 . . . 89 68
1191.00 * 68 0 . . . 89 67
1199.00 * 67 0 . . . 89 66
1200.00   66 1 0.5163 0.4837 0.0378 90 65
1217.00   65 1 0.5083 0.4917 0.0381 91 64
1224.00 * 64 0 . . . 91 63
1232.00   63 1 0.5002 0.4998 0.0383 92 62
1244.00 * 62 0 . . . 92 61
1245.00 * 61 0 . . . 92 60
1251.00 * 60 0 . . . 92 59
1256.00 * 59 0 . . . 92 58
1257.00 * 58 0 . . . 92 57
1265.00 * 57 0 . . . 92 56
1274.00 * 56 0 . . . 92 55
1277.00 * 55 0 . . . 92 54
1279.00 * 54 0 . . . 92 53
1295.00 * 53 0 . . . 92 52
1302.00 * 52 0 . . . 92 51
1317.00   51 1 0.4904 0.5096 0.0388 93 50
1320.00 * . 0 . . . 93 49
1320.00 * 50 0 . . . 93 48
1325.00 * 48 0 . . . 93 47
1332.00 * 47 0 . . . 93 46
1333.00 * 46 0 . . . 93 45
1338.00 * 45 0 . . . 93 44
1346.00 * 44 0 . . . 93 43
1359.00   43 1 0.4790 0.5210 0.0395 94 42
1363.00 * 42 0 . . . 94 41
1374.00 * 41 0 . . . 94 40
1377.00   40 1 0.4671 0.5329 0.0403 95 39
1385.00 * 39 0 . . . 95 38
1420.00 * 38 0 . . . 95 37
1451.00 * 37 0 . . . 95 36
1454.00 * 36 0 . . . 95 35
1536.00   35 1 0.4537 0.5463 0.0413 96 34
1576.00   34 1 0.4404 0.5596 0.0422 97 33
1577.00   33 1 0.4270 0.5730 0.0430 98 32
1579.00   32 1 0.4137 0.5863 0.0437 99 31
1627.00   31 1 0.4003 0.5997 0.0442 100 30
1836.00 * 30 0 . . . 100 29
1858.00 * 29 0 . . . 100 28
1885.00 * 28 0 . . . 100 27
1887.00 * . 0 . . . 100 26
1887.00 * 27 0 . . . 100 25
1914.00 * 25 0 . . . 100 24
1919.00 * 24 0 . . . 100 23
1926.00   23 1 0.3829 0.6171 0.0456 101 22
1931.00 * 22 0 . . . 101 21
1933.00 * 21 0 . . . 101 20
1936.00 * 20 0 . . . 101 19
1941.00 * 19 0 . . . 101 18
1955.00 * 18 0 . . . 101 17
1964.00 * 17 0 . . . 101 16
1969.00 * 16 0 . . . 101 15
1979.00 * 15 0 . . . 101 14
2009.00 * 14 0 . . . 101 13
2057.00 * 13 0 . . . 101 12
2064.00 * 12 0 . . . 101 11
2108.00 * 11 0 . . . 101 10
2114.00 * 10 0 . . . 101 9
2123.00 * 9 0 . . . 101 8
2125.00 * 8 0 . . . 101 7
2132.00 * 7 0 . . . 101 6
2145.00 * 6 0 . . . 101 5
2156.00 * 5 0 . . . 101 4
2190.00 * 4 0 . . . 101 3
2350.00   3 1 0.2553 0.7447 0.1086 102 2
2353.00   2 1 0.1276 0.8724 0.1053 103 1
2358.00   1 1 0 1.0000 . 104 0

Note: The marked survival times are censored observations.

Summary Statistics for Time Variable LENFOL

Quartile Estimates
Columns: Percent, Point Estimate, Transform, 95% Confidence Interval [Lower, Upper)
75 2353.00 LOGLOG 2350.00 2358.00
50 1317.00 LOGLOG 865.00 1579.00
25 174.00 LOGLOG 57.00 359.00

Columns: Mean, Standard Error
1260.21 75.27

The LIFETEST Procedure

Stratum 2: GENDER = male

Product-Limit Survival Estimates
Columns: LENFOL (* marks a censored time), Number at Risk, Observed Events, Survival, Failure, Survival Standard Error, Number Failed, Number Left
0.00   300 0 1.0000 0 0 0 300
1.00   . . . . . 1 299
1.00   . . . . . 2 298
1.00   . . . . . 3 297
1.00   . . . . . 4 296
1.00   . . . . . 5 295
1.00   300 6 0.9800 0.0200 0.00808 6 294
2.00   . . . . . 7 293
2.00   . . . . . 8 292
2.00   294 3 0.9700 0.0300 0.00985 9 291
3.00   291 1 0.9667 0.0333 0.0104 10 290
4.00   290 1 0.9633 0.0367 0.0109 11 289
5.00   . . . . . 12 288
5.00   289 2 0.9567 0.0433 0.0118 13 287
6.00   . . . . . 14 286
6.00   . . . . . 15 285
6.00   287 3 0.9467 0.0533 0.0130 16 284
7.00   . . . . . 17 283
7.00   . . . . . 18 282
7.00   . . . . . 19 281
7.00   284 4 0.9333 0.0667 0.0144 20 280
10.00   280 1 0.9300 0.0700 0.0147 21 279
14.00   279 1 0.9267 0.0733 0.0151 22 278
17.00   . . . . . 23 277
17.00   278 2 0.9200 0.0800 0.0157 24 276
18.00   . . . . . 25 275
18.00   . . . . . 26 274
18.00   276 3 0.9100 0.0900 0.0165 27 273
20.00   . . . . . 28 272
20.00   273 2 0.9033 0.0967 0.0171 29 271
26.00   271 1 0.9000 0.1000 0.0173 30 270
32.00   270 1 0.8967 0.1033 0.0176 31 269
33.00   . . . . . 32 268
33.00   269 2 0.8900 0.1100 0.0181 33 267
52.00   267 1 0.8867 0.1133 0.0183 34 266
55.00   266 1 0.8833 0.1167 0.0185 35 265
60.00   265 1 0.8800 0.1200 0.0188 36 264
61.00   264 1 0.8767 0.1233 0.0190 37 263
62.00   263 1 0.8733 0.1267 0.0192 38 262
64.00   262 1 0.8700 0.1300 0.0194 39 261
69.00   . . . . . 40 260
69.00   261 2 0.8633 0.1367 0.0198 41 259
76.00   259 1 0.8600 0.1400 0.0200 42 258
81.00   258 1 0.8567 0.1433 0.0202 43 257
88.00   257 1 0.8533 0.1467 0.0204 44 256
91.00   256 1 0.8500 0.1500 0.0206 45 255
97.00   255 1 0.8467 0.1533 0.0208 46 254
100.00   254 1 0.8433 0.1567 0.0210 47 253
108.00   253 1 0.8400 0.1600 0.0212 48 252
109.00   252 1 0.8367 0.1633 0.0213 49 251
113.00   251 1 0.8333 0.1667 0.0215 50 250
116.00   250 1 0.8300 0.1700 0.0217 51 249
117.00   249 1 0.8267 0.1733 0.0219 52 248
118.00   248 1 0.8233 0.1767 0.0220 53 247
129.00   247 1 0.8200 0.1800 0.0222 54 246
140.00   . . . . . 55 245
140.00   246 2 0.8133 0.1867 0.0225 56 244
143.00   244 1 0.8100 0.1900 0.0226 57 243
166.00   243 1 0.8067 0.1933 0.0228 58 242
169.00   . . . . . 59 241
169.00   242 2 0.8000 0.2000 0.0231 60 240
187.00   . . . . . 61 239
187.00   240 2 0.7933 0.2067 0.0234 62 238
192.00   238 1 0.7900 0.2100 0.0235 63 237
233.00   237 1 0.7867 0.2133 0.0237 64 236
259.00   . . . . . 65 235
259.00   236 2 0.7800 0.2200 0.0239 66 234
269.00   234 1 0.7767 0.2233 0.0240 67 233
295.00   233 1 0.7733 0.2267 0.0242 68 232
297.00   . . . . . 69 231
297.00   232 2 0.7667 0.2333 0.0244 70 230
312.00   230 1 0.7633 0.2367 0.0245 71 229
313.00   229 1 0.7600 0.2400 0.0247 72 228
343.00   228 1 0.7567 0.2433 0.0248 73 227
345.00   227 1 0.7533 0.2467 0.0249 74 226
354.00   226 1 0.7500 0.2500 0.0250 75 225
368.00 * 225 0 . . . 75 224
371.00 * 224 0 . . . 75 223
376.00 * . 0 . . . 75 222
376.00 * 223 0 . . . 75 221
382.00   221 1 0.7466 0.2534 0.0251 76 220
386.00 * 220 0 . . . 76 219
390.00 * 219 0 . . . 76 218
397.00   218 1 0.7432 0.2568 0.0252 77 217
398.00 * 217 0 . . . 77 216
399.00 * 216 0 . . . 77 215
403.00 * . 0 . . . 77 214
403.00 * 215 0 . . . 77 213
405.00   213 1 0.7397 0.2603 0.0254 78 212
406.00   212 1 0.7362 0.2638 0.0255 79 211
407.00 * 211 0 . . . 79 210
408.00 * 210 0 . . . 79 209
411.00 * 209 0 . . . 79 208
412.00 * 208 0 . . . 79 207
416.00 * 207 0 . . . 79 206
418.00 * 206 0 . . . 79 205
419.00   205 1 0.7326 0.2674 0.0256 80 204
421.00 * 204 0 . . . 80 203
422.00 * 203 0 . . . 80 202
426.00 * 202 0 . . . 80 201
427.00 * 201 0 . . . 80 200
433.00 * 200 0 . . . 80 199
437.00 * 199 0 . . . 80 198
440.00 * 198 0 . . . 80 197
442.00 * 197 0 . . . 80 196
445.00 * . 0 . . . 80 195
445.00 * 196 0 . . . 80 194
446.00 * 194 0 . . . 80 193
450.00 * 193 0 . . . 80 192
452.00 * 192 0 . . . 80 191
457.00 * 191 0 . . . 80 190
458.00 * . 0 . . . 80 189
458.00 * 190 0 . . . 80 188
459.00 * . 0 . . . 80 187
459.00 * 188 0 . . . 80 186
466.00 * 186 0 . . . 80 185
467.00   185 1 0.7287 0.2713 0.0258 81 184
473.00   184 1 0.7247 0.2753 0.0259 82 183
475.00 * 183 0 . . . 82 182
480.00 * 182 0 . . . 82 181
486.00 * 181 0 . . . 82 180
497.00   180 1 0.7207 0.2793 0.0261 83 179
507.00 * 179 0 . . . 83 178
510.00 * 178 0 . . . 83 177
511.00 * 177 0 . . . 83 176
516.00 * 176 0 . . . 83 175
519.00 * 175 0 . . . 83 174
521.00 * 174 0 . . . 83 173
523.00 * 173 0 . . . 83 172
529.00 * 172 0 . . . 83 171
530.00   171 1 0.7165 0.2835 0.0263 84 170
532.00 * . 0 . . . 84 169
532.00 * 170 0 . . . 84 168
544.00 * 168 0 . . . 84 167
550.00 * . 0 . . . 84 166
550.00 * 167 0 . . . 84 165
554.00 * 165 0 . . . 84 164
559.00   164 1 0.7121 0.2879 0.0265 85 163
562.00   163 1 0.7077 0.2923 0.0267 86 162
568.00 * 162 0 . . . 86 161
570.00 * 161 0 . . . 86 160
573.00 * . 0 . . . 86 159
573.00 * 160 0 . . . 86 158
587.00 * 158 0 . . . 86 157
589.00 * 157 0 . . . 86 156
609.00 * 156 0 . . . 86 155
612.00   155 1 0.7031 0.2969 0.0269 87 154
614.00   154 1 0.6986 0.3014 0.0271 88 153
626.00 * 153 0 . . . 88 152
644.00   152 1 0.6940 0.3060 0.0273 89 151
654.00   151 1 0.6894 0.3106 0.0275 90 150
675.00 * 150 0 . . . 90 149
718.00   149 1 0.6848 0.3152 0.0277 91 148
849.00   148 1 0.6801 0.3199 0.0279 92 147
903.00   147 1 0.6755 0.3245 0.0281 93 146
936.00   146 1 0.6709 0.3291 0.0283 94 145
953.00   145 1 0.6663 0.3337 0.0285 95 144
1048.00   144 1 0.6616 0.3384 0.0286 96 143
1054.00   143 1 0.6570 0.3430 0.0288 97 142
1098.00 * 142 0 . . . 97 141
1102.00 * 141 0 . . . 97 140
1105.00 * 140 0 . . . 97 139
1106.00 * . 0 . . . 97 138
1106.00 * 139 0 . . . 97 137
1107.00 * 137 0 . . . 97 136
1108.00 * 136 0 . . . 97 135
1109.00 * 135 0 . . . 97 134
1114.00 * . 0 . . . 97 133
1114.00 * 134 0 . . . 97 132
1121.00 * 132 0 . . . 97 131
1123.00 * . 0 . . . 97 130
1123.00 * 131 0 . . . 97 129
1125.00 * 129 0 . . . 97 128
1136.00   128 1 0.6519 0.3481 0.0290 98 127
1136.00 * . 0 . . . 98 126
1140.00 * 126 0 . . . 98 125
1150.00 * 125 0 . . . 98 124
1151.00 * 124 0 . . . 98 123
1159.00   123 1 0.6466 0.3534 0.0293 99 122
1163.00 * 122 0 . . . 99 121
1169.00 * 121 0 . . . 99 120
1178.00 * 120 0 . . . 99 119
1182.00 * 119 0 . . . 99 118
1189.00 * 118 0 . . . 99 117
1190.00 * 117 0 . . . 99 116
1196.00 * 116 0 . . . 99 115
1203.00 * 115 0 . . . 99 114
1207.00 * 114 0 . . . 99 113
1211.00 * 113 0 . . . 99 112
1223.00 * 112 0 . . . 99 111
1231.00 * 111 0 . . . 99 110
1232.00 * . 0 . . . 99 109
1232.00 * 110 0 . . . 99 108
1233.00   108 1 0.6406 0.3594 0.0296 100 107
1234.00 * 107 0 . . . 100 106
1235.00 * 106 0 . . . 100 105
1248.00 * . 0 . . . 100 104
1248.00 * 105 0 . . . 100 103
1253.00 * 103 0 . . . 100 102
1262.00 * 102 0 . . . 100 101
1266.00 * . 0 . . . 100 100
1266.00 * 101 0 . . . 100 99
1272.00 * 99 0 . . . 100 98
1273.00 * 98 0 . . . 100 97
1279.00   97 1 0.6340 0.3660 0.0300 101 96
1280.00 * 96 0 . . . 101 95
1290.00 * 95 0 . . . 101 94
1298.00 * 94 0 . . . 101 93
1308.00 * 93 0 . . . 101 92
1314.00 * 92 0 . . . 101 91
1317.00 * 91 0 . . . 101 90
1319.00 * 90 0 . . . 101 89
1329.00 * 89 0 . . . 101 88
1336.00 * . 0 . . . 101 87
1336.00 * 88 0 . . . 101 86
1347.00 * 86 0 . . . 101 85
1353.00 * 85 0 . . . 101 84
1365.00 * 84 0 . . . 101 83
1366.00 * 83 0 . . . 101 82
1377.00   82 1 0.6262 0.3738 0.0307 102 81
1378.00 * 81 0 . . . 102 80
1381.00 * 80 0 . . . 102 79
1384.00 * 79 0 . . . 102 78
1388.00 * 78 0 . . . 102 77
1390.00 * 77 0 . . . 102 76
1400.00 * 76 0 . . . 102 75
1408.00 * 75 0 . . . 102 74
1409.00 * 74 0 . . . 102 73
1430.00 * 73 0 . . . 102 72
1433.00 * 72 0 . . . 102 71
1438.00 * 71 0 . . . 102 70
1444.00 * 70 0 . . . 102 69
1449.00 * 69 0 . . . 102 68
1454.00 * 68 0 . . . 102 67
1456.00 * 67 0 . . . 102 66
1458.00 * 66 0 . . . 102 65
1496.00   65 1 0.6166 0.3834 0.0317 103 64
1506.00   64 1 0.6070 0.3930 0.0326 104 63
1527.00   63 1 0.5973 0.4027 0.0335 105 62
1548.00   62 1 0.5877 0.4123 0.0343 106 61
1553.00   61 1 0.5781 0.4219 0.0351 107 60
1624.00   60 1 0.5684 0.4316 0.0358 108 59
1671.00   59 1 0.5588 0.4412 0.0364 109 58
1831.00 * 58 0 . . . 109 57
1847.00 * 57 0 . . . 109 56
1854.00 * 56 0 . . . 109 55
1858.00 * 55 0 . . . 109 54
1863.00 * 54 0 . . . 109 53
1880.00 * 53 0 . . . 109 52
1883.00 * . 0 . . . 109 51
1883.00 * 52 0 . . . 109 50
1889.00 * 50 0 . . . 109 49
1893.00 * 49 0 . . . 109 48
1899.00 * 48 0 . . . 109 47
1904.00 * 47 0 . . . 109 46
1920.00 * 46 0 . . . 109 45
1923.00 * 45 0 . . . 109 44
1934.00 * 44 0 . . . 109 43
1939.00 * . 0 . . . 109 42
1939.00 * 43 0 . . . 109 41
1940.00 * 41 0 . . . 109 40
1942.00 * 40 0 . . . 109 39
1954.00   39 1 0.5445 0.4555 0.0382 110 38
1954.00 * . 0 . . . 110 37
1976.00 * 37 0 . . . 110 36
1977.00 * 36 0 . . . 110 35
1993.00 * 35 0 . . . 110 34
1994.00 * 34 0 . . . 110 33
2006.00 * 33 0 . . . 110 32
2009.00 * 32 0 . . . 110 31
2025.00 * 31 0 . . . 110 30
2032.00 * 30 0 . . . 110 29
2048.00 * . 0 . . . 110 28
2048.00 * 29 0 . . . 110 27
2061.00 * 27 0 . . . 110 26
2065.00 * 26 0 . . . 110 25
2066.00 * 25 0 . . . 110 24
2083.00 * 24 0 . . . 110 23
2084.00 * 23 0 . . . 110 22
2086.00 * 22 0 . . . 110 21
2100.00 * 21 0 . . . 110 20
2113.00 * 20 0 . . . 110 19
2114.00 * 19 0 . . . 110 18
2118.00 * 18 0 . . . 110 17
2122.00 * 17 0 . . . 110 16
2123.00 * 16 0 . . . 110 15
2126.00 * . 0 . . . 110 14
2126.00 * 15 0 . . . 110 13
2131.00 * 13 0 . . . 110 12
2139.00 * 12 0 . . . 110 11
2146.00 * 11 0 . . . 110 10
2151.00 * 10 0 . . . 110 9
2152.00 * 9 0 . . . 110 8
2160.00   8 1 0.4764 0.5236 0.0719 111 7
2166.00 * 7 0 . . . 111 6
2168.00 * 6 0 . . . 111 5
2172.00 * 5 0 . . . 111 4
2173.00 * 4 0 . . . 111 3
2175.00 * 3 0 . . . 111 2
2178.00 * 2 0 . . . 111 1
2192.00 * 1 0 0.4764 . . 111 0

Note: The marked survival times are censored observations.

Summary Statistics for Time Variable LENFOL

Quartile Estimates
Columns: Percent, Point Estimate, Transform, 95% Confidence Interval [Lower, Upper)
75 . LOGLOG . .
50 2160.00 LOGLOG 1624.00 .
25 368.00 LOGLOG 187.00 614.00

Columns: Mean, Standard Error
1433.26 55.19

Note:The mean survival time and its standard error were underestimated because the largest observation was censored and the estimation was restricted to the largest event time.

Summary of the Number of Censored and Uncensored Values
Columns: Stratum, GENDER, Total, Failed, Censored, Percent Censored
1 female 200 104 96 48.00
2 male 300 111 189 63.00
Total   500 215 285 57.00

The LIFETEST Procedure

Testing Homogeneity of Survival Curves for LENFOL over Strata

Rank Statistics
GENDER Log-Rank Wilcoxon Tarone Peto ModifiedPeto Fleming
female 19.726 6271.0 345.2 14.516 14.474 14.570
male -19.726 -6271.0 -345.2 -14.516 -14.474 -14.570

Log-Rank Covariance

Covariance Matrix for the Log-Rank Statistics
GENDER female male
female 49.9453 -49.9453
male -49.9453 49.9453

Wilcoxon Covariance

Covariance Matrix for the Wilcoxon Statistics
GENDER female male
female 7102339 -7102339
male -7102339 7102339

Tarone Covariance

Covariance Matrix for the Tarone Statistics
GENDER female male
female 17878.3 -17878.3
male -17878.3 17878.3

Peto Covariance

Covariance Matrix for the Peto Statistics
GENDER female male
female 31.1709 -31.1709
male -31.1709 31.1709

Modified Peto Covariance

Covariance Matrix for the Modified Peto Statistics
GENDER female male
female 30.9855 -30.9855
male -30.9855 30.9855

Fleming Covariance

Covariance Matrix for the Fleming Statistics
GENDER female male
female 31.5401 -31.5401
male -31.5401 31.5401

Test of Equality over Strata
Columns: Test, Chi-Square, DF, Pr > Chi-Square
Log-Rank 7.7911 1 0.0053
Wilcoxon 5.5370 1 0.0186
Tarone 6.6664 1 0.0098
Peto 6.7602 1 0.0093
Modified Peto 6.7611 1 0.0093
Fleming(1) 6.7309 1 0.0095

Figure: Product-Limit Survival Curves with Number of Subjects at Risk and 95% Hall-Wellner Bands

These tests are of the hypotheses

$$H_0: S_1=S_2\;\text{ vs }\;H_1:S_1\neq S_2$$

The most common test is the log-rank test, but SAS provides other tests of these hypotheses as well. The log-rank test is popular because under certain conditions (such as proportional hazards) it is the most powerful test. In this case, all of the tests conclude that the survival functions are significantly different between males and females. Females generally have worse survival.
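
Each chi-square statistic in the homogeneity table can be recovered from the rank statistics and covariance matrices printed above it. As a worked check (the symbols $U$ and $V$ are introduced here only for illustration: $U$ is the rank statistic for the female stratum and $V$ is the matching variance from the covariance matrix), with two groups the test statistic is $U^2/V$ on 1 degree of freedom:

$$\chi^2_{\text{log-rank}}=\dfrac{U^2}{V}=\dfrac{(19.726)^2}{49.9453}\approx 7.79,\qquad \chi^2_{\text{Wilcoxon}}=\dfrac{(6271.0)^2}{7102339}\approx 5.54$$

These agree with the reported values of 7.7911 and 5.5370 up to rounding of the printed inputs. The positive sign of the female rank statistic suggests more deaths among females than expected under $H_0$, which is consistent with the conclusion above.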

Cox's Proportional Hazards Regression Model

In Cox's proportional hazards regression model, we model the hazard function with a baseline hazard function, $h_0(t)$, and a function of the covariates in the following way.

$$h(t|X)=h_0(t)\exp(\beta_1X_1+\cdots+\beta_pX_p)$$
  • $h_0(t)$ is modeled nonparametrically; that is, no assumptions are made about its functional form
  • the covariates enter through an exponential term, which keeps the hazard function positive

If we look at the ratio of the hazard functions at two different values of the covariates, then we have a quantity that is independent of time. This is the assumption that the hazards are proportional:

$$\dfrac{h(t|x_2)}{h(t|x_1)}=\dfrac{h_0(t)\exp(\beta_1x_2)}{h_0(t)\exp(\beta_1x_1)}=\exp(\beta_1(x_2-x_1))$$

This means that the model assumes that if a subject has a risk of death twice as high as another subject at some time point, then the risk of death is twice as high at every time point.

  • In this model, $\beta_1$ is interpreted as "the risk of dying for group 2 is $\exp(\beta_1)$ times the risk of dying for group 1, holding all other variables constant."
In [26]:
proc phreg data = surv.whas500 plots=survival;
class gender;
model lenfol*fstat(0) = gender age;
FORMAT gender gdr.;
run;
Out[26]:
The PHREG Procedure

Model Information
Data Set SURV.WHAS500
Dependent Variable LENFOL
Censoring Variable FSTAT
Censoring Value(s) 0
Ties Handling BRESLOW

Number of Observations Read 500
Number of Observations Used 500

Class Level Information
Class Value Design Variables
GENDER female 1
  male 0

Summary of the Number of Event and Censored Values
Columns: Total, Event, Censored, Percent Censored
500 215 285 57.00

Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics
Columns: Criterion, Without Covariates, With Covariates
-2 LOG L 2455.158 2313.140
AIC 2455.158 2317.140
SBC 2455.158 2323.882

Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 142.0177 2 <.0001
Score 126.6381 2 <.0001
Wald 119.3806 2 <.0001

Type 3 Tests
Effect DF Wald Chi-Square Pr > ChiSq
GENDER 1 0.2175 0.6410
AGE 1 116.3986 <.0001

Analysis of Maximum Likelihood Estimates
Columns: Parameter, Level, DF, Parameter Estimate, Standard Error, Chi-Square, Pr > ChiSq, Hazard Ratio, Label
GENDER female 1 -0.06556 0.14057 0.2175 0.6410 0.937 GENDER female
AGE   1 0.06683 0.00619 116.3986 <.0001 1.069  

Figure: Predicted Survivor Functions

Reference Set of Covariates for Plotting
AGE GENDER
69.845947 male

  • The resulting model is
$$h(t\mid \text{gender},\text{age})=h_0(t)\exp(-0.066\,x_{\text{gender}} + 0.067\,\text{age})$$
where $x_{\text{gender}}=1$ for females and 0 for males.
  • The global test of the null hypothesis BETA=0 tests whether all of the coefficients in the model are equal to zero. In this case, the model is significant, so at least one predictor is useful for predicting hazard rates.

  • The estimated hazard rate ratio between females and males is $\exp(-0.06556)=0.937$, which is roughly a 6% decrease in risk for females after adjusting for age. This coefficient is not significant, though.

  • The estimated hazard rate increases by about 7% (hazard rate ratio = $\exp(0.06683)=1.069$) for each additional year of age.
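
Per-year hazard ratios can look small for a variable like age. One way to report a more interpretable contrast is sketched below, under the assumption that the gdr format from the earlier cell is still defined in the session. RISKLIMITS and HAZARDRATIO are standard PHREG features, but the label text and the choice of a 10-year difference are arbitrary choices made for this illustration.

```sas
proc phreg data = surv.whas500;
class gender;
/* risklimits adds 95% confidence limits for each hazard ratio */
model lenfol*fstat(0) = gender age / risklimits;
/* hazard ratio for a 10-year increase in age instead of a 1-year increase */
hazardratio 'Age, per 10 years' age / units=10;
format gender gdr.;
run;
```

From the fitted coefficient, the 10-year hazard ratio should be about $\exp(10\times 0.06683)\approx 1.95$, so the hazard roughly doubles for every additional decade of age at hospitalization.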

Model Diagnostics

Cox's model is a semiparametric model, meaning that we have made some parametric assumptions (the functional form of the covariates and proportional hazards) and left other parts nonparametric (no assumptions on the form of the baseline hazard). Just as in multiple regression, we can use diagnostic plots to assess these assumptions:

  • proportional hazards assumption
  • functional form of the covariates

To assess whether or not the functional form of the covariates is correct, we can examine residual plots. For Cox's model, there are different types of residuals, such as Cox-Snell, deviance, martingale, and Schoenfeld residuals. How all these residuals are calculated and the theory behind them is beyond the scope of this course, but we will discuss how to use them in diagnostic plots. To assess the functional form, we can plot the martingale residuals against each variable. Similar to multiple regression, if the functional form is adequate, the relationship between the two should be a flat line through 0. We could, for example, use a loess curve to estimate this relationship.

In [30]:
/* full model: gender, age, their interaction, bmi (linear), and hr;
   martingale residuals are saved to residfull */
ODS SELECT NONE;
proc phreg data = surv.whas500;
class gender;
model lenfol*fstat(0) = gender|age bmi hr;
output out=residfull resmart=martingale;
run;

ODS SELECT ALL;
proc loess data = residfull plots=ResidualsBySmooth(smooth);
model martingale = bmi / smooth=0.2 0.4 0.6 0.8;
run;
Out[30]:
The LOESS Procedure

Independent Variable Scaling
Scaling applied: None
Statistic BMI
Minimum Value 13.04546
Maximum Value 44.83886

Smoothing Parameter: 0.2

Dependent Variable: martingale

Fit Summary
Fit Method kd Tree
Blending Linear
Number of Observations 500
Number of Fitting Points 33
kd Tree Bucket Size 20
Degree of Local Polynomials 1
Smoothing Parameter 0.20000
Points in Local Neighborhood 100
Residual Sum of Squares 205.38952

Fit Plot

Fit plot of martingale by BMI for smoothing parameter 0.2.

Residual Plots

BMI

Scatter plot of residuals by BMI for martingale with smoothing parameter 0.2.

Diagnostic Plots

Fit Diagnostics

Panel of fit diagnostics for martingale with smoothing parameter 0.2. The panel consists of scatter plots of residuals by predicted values, observed by predicted values, a Q-Q plot of residuals, a residual histogram, and a residual-fit spread plot.

The SAS System

The LOESS Procedure

Smoothing Parameter: 0.4

Dependent Variable: martingale

Smoothing Parameter: 0.4

Fit Summary

Fit Summary
Fit Method kd Tree
Blending Linear
Number of Observations 500
Number of Fitting Points 17
kd Tree Bucket Size 40
Degree of Local Polynomials 1
Smoothing Parameter 0.40000
Points in Local Neighborhood 200
Residual Sum of Squares 208.03774

Fit Plot

Fit plot of martingale by BMI for smoothing parameter 0.4.

Residual Plots

BMI

Scatter plot of residuals by BMI for martingale with smoothing parameter 0.4.

Diagnostic Plots

Fit Diagnostics

Panel of fit diagnostics for martingale with smoothing parameter 0.4. The panel consists of scatter plots of residuals by predicted values, observed by predicted values, a Q-Q plot of residuals, a residual histogram, and a residual-fit spread plot.

The SAS System

The LOESS Procedure

Smoothing Parameter: 0.6

Dependent Variable: martingale

Smoothing Parameter: 0.6

Fit Summary

Fit Summary
Fit Method kd Tree
Blending Linear
Number of Observations 500
Number of Fitting Points 17
kd Tree Bucket Size 60
Degree of Local Polynomials 1
Smoothing Parameter 0.60000
Points in Local Neighborhood 300
Residual Sum of Squares 209.02735

Fit Plot

Fit plot of martingale by BMI for smoothing parameter 0.6.

Residual Plots

BMI

Scatter plot of residuals by BMI for martingale with smoothing parameter 0.6.

Diagnostic Plots

Fit Diagnostics

Panel of fit diagnostics for martingale with smoothing parameter 0.6. The panel consists of scatter plots of residuals by predicted values, observed by predicted values, a Q-Q plot of residuals, a residual histogram, and a residual-fit spread plot.

The SAS System

The LOESS Procedure

Smoothing Parameter: 0.8

Dependent Variable: martingale

Smoothing Parameter: 0.8

Fit Summary

Fit Summary
Fit Method kd Tree
Blending Linear
Number of Observations 500
Number of Fitting Points 9
kd Tree Bucket Size 80
Degree of Local Polynomials 1
Smoothing Parameter 0.80000
Points in Local Neighborhood 400
Residual Sum of Squares 208.80100

Fit Plot

Fit plot of martingale by BMI for smoothing parameter 0.8.

Residual Plots

BMI

Scatter plot of residuals by BMI for martingale with smoothing parameter 0.8.

Diagnostic Plots

Fit Diagnostics

Panel of fit diagnostics for martingale with smoothing parameter 0.8. The panel consists of scatter plots of residuals by predicted values, observed by predicted values, a Q-Q plot of residuals, a residual histogram, and a residual-fit spread plot.

The SAS System

The LOESS Procedure

Dependent Variable: martingale

Fit Plots

Panel 1

Panel of fit plots for martingale as the smoothing parameter varies.

Residuals versus BMI

Panel 1

Panel of residual plots for martingale by BMI as the smoothing parameter varies.

In this case, the loess curves vary somewhat about the line y=0, but not by much. We may still want to consider a different functional form for a better fit. Another way to assess the functional form is with the assess statement in proc phreg.

In [31]:
proc phreg data = surv.whas500;
class gender;
model lenfol*fstat(0) = gender|age bmi hr;
assess var=(age bmi hr) / resample;
run;
Out[31]:
The PHREG Procedure

Model Information
Data Set              SURV.WHAS500
Dependent Variable    LENFOL
Censoring Variable    FSTAT
Censoring Value(s)    0
Ties Handling         BRESLOW

Number of Observations Read    500
Number of Observations Used    500

Class Level Information
Class     Value   Design Variables
GENDER    0       1
          1       0

Summary of the Number of Event and Censored Values
Total   Event   Censored   Percent Censored
500     215     285        57.00

Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics
Criterion   Without Covariates   With Covariates
-2 LOG L    2455.158             2280.523
AIC         2455.158             2290.523
SBC         2455.158             2307.376

Testing Global Null Hypothesis: BETA=0
Test               Chi-Square   DF   Pr > ChiSq
Likelihood Ratio   174.6350     5    <.0001
Score              156.5101     5    <.0001
Wald               142.7471     5    <.0001

Joint Tests
Effect       DF   Wald Chi-Square   Pr > ChiSq
GENDER       1    5.5911            0.0181
AGE          1    16.1283           <.0001
AGE*GENDER   1    6.3916            0.0115
BMI          1    7.7668            0.0053
HR           1    20.2952           <.0001

Note: Under full-rank parameterizations, Type 3 effect tests are replaced by joint tests. The joint test for an effect is a test that all of the parameters associated with that effect are zero. Such joint tests might not be equivalent to Type 3 effect tests under GLM parameterization.

Analysis of Maximum Likelihood Estimates
Parameter        DF   Parameter Estimate   Standard Error   Chi-Square   Pr > ChiSq   Hazard Ratio   Label
GENDER 0         1    -2.35441             0.99572          5.5911       0.0181       .              GENDER 0
AGE              1    0.04006              0.00997          16.1283      <.0001       .
AGE*GENDER 0     1    0.03180              0.01258          6.3916       0.0115       .              GENDER 0 * AGE
BMI              1    -0.04314             0.01548          7.7668       0.0053       0.958
HR               1    0.01231              0.00273          20.2952      <.0001       1.012

[Figures: Cumulative Martingale Residual Plots]

Supremum Test for Functional Form
Variable   Maximum Absolute Value   Replications   Seed        Pr > MaxAbsVal
AGE        11.2240                  1000           177039786   0.1480
BMI        11.0212                  1000           177039786   0.2460
HR         9.3459                   1000           177039786   0.3760

In these plots, the solid line is the observed cumulative martingale residual process, and the dashed lines are processes simulated under the assumption that the functional form is correct. If the form is correct, the solid line should stay well within the region spanned by the dashed lines. Each plot is accompanied by a simulated p-value; a small p-value indicates that the model should be modified. Most of these plots look fine, but the bmi plot does have slightly high residuals in the lower bmi range, around 20-27. We therefore refit the model with a quadratic term for bmi.

In [32]:
proc phreg data = surv.whas500;
class gender;
model lenfol*fstat(0) = gender|age bmi|bmi hr;
assess var=(age bmi bmi*bmi hr) / resample;
run;
Out[32]:
The PHREG Procedure

Model Information
Data Set              SURV.WHAS500
Dependent Variable    LENFOL
Censoring Variable    FSTAT
Censoring Value(s)    0
Ties Handling         BRESLOW

Number of Observations Read    500
Number of Observations Used    500

Class Level Information
Class     Value   Design Variables
GENDER    0       1
          1       0

Summary of the Number of Event and Censored Values
Total   Event   Censored   Percent Censored
500     215     285        57.00

Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics
Criterion   Without Covariates   With Covariates
-2 LOG L    2455.158             2276.150
AIC         2455.158             2288.150
SBC         2455.158             2308.374

Testing Global Null Hypothesis: BETA=0
Test               Chi-Square   DF   Pr > ChiSq
Likelihood Ratio   179.0077     6    <.0001
Score              174.7924     6    <.0001
Wald               154.9497     6    <.0001

Joint Tests
Effect       DF   Wald Chi-Square   Pr > ChiSq
GENDER       1    4.5117            0.0337
AGE          1    17.0581           <.0001
AGE*GENDER   1    5.4646            0.0194
BMI          1    7.0382            0.0080
BMI*BMI      1    4.8858            0.0271
HR           1    21.4528           <.0001

Note: Under full-rank parameterizations, Type 3 effect tests are replaced by joint tests. The joint test for an effect is a test that all of the parameters associated with that effect are zero. Such joint tests might not be equivalent to Type 3 effect tests under GLM parameterization.

Analysis of Maximum Likelihood Estimates
Parameter        DF   Parameter Estimate   Standard Error   Chi-Square   Pr > ChiSq   Hazard Ratio   Label
GENDER 0         1    -2.10986             0.99330          4.5117       0.0337       .              GENDER 0
AGE              1    0.04161              0.01007          17.0581      <.0001       .
AGE*GENDER 0     1    0.02925              0.01251          5.4646       0.0194       .              GENDER 0 * AGE
BMI              1    -0.23323             0.08791          7.0382       0.0080       .
BMI*BMI          1    0.00363              0.00164          4.8858       0.0271       .              BMI * BMI
HR               1    0.01277              0.00276          21.4528      <.0001       1.013

[Figures: Cumulative Martingale Residual Plots]

Supremum Test for Functional Form
Variable   Maximum Absolute Value   Replications   Seed         Pr > MaxAbsVal
AGE        9.7412                   1000           1018669205   0.2910
BMI        7.8329                   1000           1018669205   0.6470
BMIBMI     7.8329                   1000           1018669205   0.6470
HR         9.1548                   1000           1018669205   0.3810

With the quadratic term, the cumulative residuals for bmi are smaller at the lower end of bmi, and the simulated p-value for bmi rises from 0.246 to 0.647. All of these plots look reasonable.
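
The two fits can also be compared with a likelihood ratio test, since the model with only a linear bmi term is nested in the model that adds bmi*bmi. A minimal sketch, with the -2 LOG L values copied from the two Model Fit Statistics tables above; the data set name lrtest and its variable names are made up for illustration:

```sas
/* Likelihood ratio test: linear-bmi model vs. model adding bmi*bmi.        */
/* The -2 LOG L values are copied from the Model Fit Statistics tables.     */
data lrtest;
   neg2logl_linear    = 2280.523;   /* model with bmi only                 */
   neg2logl_quadratic = 2276.150;   /* model with bmi and bmi*bmi          */
   chisq = neg2logl_linear - neg2logl_quadratic;  /* LR test statistic     */
   df    = 1;                       /* one additional parameter            */
   p     = 1 - probchi(chisq, df);  /* upper-tail chi-square probability   */
run;

proc print data=lrtest noobs;
   var chisq df p;
   format p pvalue6.4;
run;
```

The statistic is 2280.523 - 2276.150 = 4.373 on 1 degree of freedom, which gives a p-value a little under 0.04. This is in line with the Wald test for BMI*BMI in the Joint Tests table (p = 0.0271).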

To assess the proportional hazards assumption, we can use several methods. If the predictor is categorical, we can plot the estimated Kaplan-Meier curves for each group to see whether the assumption is violated, as we did for males and females above. We can also use the Schoenfeld residuals. A plot of the Schoenfeld residuals for each predictor against time should show no trend (i.e., the smoothed mean should follow the line y=0). We can estimate this trend using a loess curve.

In [33]:
ODS SELECT NONE;
proc phreg data=surv.whas500;
class gender;
model lenfol*fstat(0) = gender|age bmi|bmi hr;
output out=schoen ressch=schgender schage schgenderage
   schbmi schbmibmi schhr;
run;

ODS SELECT ALL;
proc loess data = schoen;
model schage=lenfol / smooth=(0.2 0.4 0.6 0.8);
run;
Out[33]:
The LOESS Procedure

Independent Variable Scaling
Scaling applied: None
Statistic        LENFOL
Minimum Value    1.00000
Maximum Value    2358.00000

Dependent Variable: schage

Fit Summary (all fits: Fit Method kd Tree, Blending Linear, Number of Observations 215, Degree of Local Polynomials 1)
Smoothing   Number of        kd Tree       Points in Local   Residual Sum
Parameter   Fitting Points   Bucket Size   Neighborhood      of Squares
0.2         32               8             43                25947
0.4         17               17            86                26297
0.6         17               25            129               26381
0.8         9                34            172               26400

[Figures, one set per smoothing parameter (0.2, 0.4, 0.6, 0.8): fit plot of schage by LENFOL; scatter plot of residuals by LENFOL; panel of fit diagnostics (residuals by predicted values, observed by predicted values, Q-Q plot of residuals, residual histogram, residual-fit spread plot).]

[Figures: panel of fit plots for schage as the smoothing parameter varies; panel of residual plots for schage by LENFOL as the smoothing parameter varies.]

The smoothed loess curve appears roughly flat at 0, suggesting that the coefficient for age does not change over time and that the proportional hazards assumption holds for this covariate.
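
The same check can be repeated for the other covariates by swapping in another Schoenfeld residual variable. A minimal sketch for bmi, assuming the schoen data set created by the output statement above is still available; proc sgplot is used here only as a compact alternative to proc loess, and the smoothing value 0.6 is an arbitrary choice:

```sas
/* Schoenfeld residuals for bmi against follow-up time, with a loess smooth. */
/* Assumes the schoen data set from the earlier OUTPUT statement exists.     */
proc sgplot data=schoen;
   loess x=lenfol y=schbmi / smooth=0.6;  /* smoothed trend over time       */
   refline 0 / axis=y;                    /* reference line at y=0          */
run;
```

Schoenfeld residuals are defined only at event times, so only the 215 subjects who died contribute points to these plots.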

A third way we can check this assumption is by using the assess statement.

In [34]:
proc phreg data=surv.whas500;
class gender;
model lenfol*fstat(0) = gender|age bmi|bmi hr;
assess var=(age bmi bmi*bmi hr) ph / resample ;
run;
Out[34]:
The PHREG Procedure

Model Information
Data Set              SURV.WHAS500
Dependent Variable    LENFOL
Censoring Variable    FSTAT
Censoring Value(s)    0
Ties Handling         BRESLOW

Number of Observations Read    500
Number of Observations Used    500

Class Level Information
Class     Value   Design Variables
GENDER    0       1
          1       0

Summary of the Number of Event and Censored Values
Total   Event   Censored   Percent Censored
500     215     285        57.00

Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics
Criterion   Without Covariates   With Covariates
-2 LOG L    2455.158             2276.150
AIC         2455.158             2288.150
SBC         2455.158             2308.374

Testing Global Null Hypothesis: BETA=0
Test               Chi-Square   DF   Pr > ChiSq
Likelihood Ratio   179.0077     6    <.0001
Score              174.7924     6    <.0001
Wald               154.9497     6    <.0001

Joint Tests
Effect       DF   Wald Chi-Square   Pr > ChiSq
GENDER       1    4.5117            0.0337
AGE          1    17.0581           <.0001
AGE*GENDER   1    5.4646            0.0194
BMI          1    7.0382            0.0080
BMI*BMI      1    4.8858            0.0271
HR           1    21.4528           <.0001

Note: Under full-rank parameterizations, Type 3 effect tests are replaced by joint tests. The joint test for an effect is a test that all of the parameters associated with that effect are zero. Such joint tests might not be equivalent to Type 3 effect tests under GLM parameterization.

Analysis of Maximum Likelihood Estimates
Parameter        DF   Parameter Estimate   Standard Error   Chi-Square   Pr > ChiSq   Hazard Ratio   Label
GENDER 0         1    -2.10986             0.99330          4.5117       0.0337       .              GENDER 0
AGE              1    0.04161              0.01007          17.0581      <.0001       .
AGE*GENDER 0     1    0.02925              0.01251          5.4646       0.0194       .              GENDER 0 * AGE
BMI              1    -0.23323             0.08791          7.0382       0.0080       .
BMI*BMI          1    0.00363              0.00164          4.8858       0.0271       .              BMI * BMI
HR               1    0.01277              0.00276          21.4528      <.0001       1.013

[Figures: Cumulative Martingale Residual Plots]

Supremum Test for Functional Form
Variable   Maximum Absolute Value   Replications   Seed         Pr > MaxAbsVal
AGE        9.7412                   1000           1807356845   0.2760
BMI        7.8329                   1000           1807356845   0.6400
BMIBMI     7.8329                   1000           1807356845   0.6400
HR         9.1548                   1000           1807356845   0.4010

[Figures: Standardized Score Process Plots]

Supremum Test for Proportional Hazards Assumption
Variable     Maximum Absolute Value   Replications   Seed         Pr > MaxAbsVal
GENDER0      4.4713                   1000           1807356845   0.7640
AGE          0.7745                   1000           1807356845   0.9310
GENDER0AGE   5.9527                   1000           1807356845   0.4440
BMI          5.9498                   1000           1807356845   0.3310
BMIBMI       5.8837                   1000           1807356845   0.3540
HR           0.8979                   1000           1807356845   0.2950

Again, we are looking for solid lines that stay within the region defined by the dashed simulated lines. None of these lines looks particularly worrisome and all of the p-values are large suggesting that the proportional hazards assumption holds for each of these covariates.
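
The p-values from the assess statement come from simulation, so they change from run to run: the functional-form p-value for age is 0.2910 in Out[32] and 0.2760 in Out[34], although the fitted model is identical. A minimal sketch for making these results reproducible, assuming the SEED= option of the assess statement; the seed value itself is arbitrary:

```sas
/* Same model and checks as above, with a fixed seed for the simulated paths. */
proc phreg data=surv.whas500;
   class gender;
   model lenfol*fstat(0) = gender|age bmi|bmi hr;
   /* seed= fixes the random number stream used by resample */
   assess var=(age bmi bmi*bmi hr) ph / resample seed=20240101;
run;
```

With a fixed seed, rerunning the cell should reproduce the same simulated p-values, which makes it easier to compare models without simulation noise.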
