{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# The Output Delivery System and Graphics\n", "\n", "In this lesson, we'll learn how to use the SAS System's Output Delivery System (ODS) to create other forms of output, such as HTML output that can be viewed by your web browser, PDF files that are formatted for high-resolution printers, and RTF files that can be easily imported into Microsoft Word. Along the way, we'll also learn how to modify the appearance of the output that is obtained by default from any procedure. Finally, we'll learn how to use the Output Delivery System to create SAS data sets instead of the default output from various procedures.\n", "\n", "We will also return to the SG plotting procedures to investigate how to adjust graphical parameters such as line types, colors, axis labels, legends, etc.\n", "\n", "## The Output Delivery System\n", "\n", "You might have the impression that this lesson is the first time we use the Output Delivery System (ODS). In reality, SAS has been using it behind the scenes all along to create the output that the procedures we've used generate by default. All we want to do in this lesson is learn how ODS works and how to change the default settings so that we can get the output that we want rather than the output that SAS wants.\n", "\n", "### How ODS Works\n", "\n", "So how does ODS work? Whenever you submit a program that creates output, ODS does the following:\n", "\n", "1) ODS creates your output in the form of **output objects**. Each output object consists of two components. The **data component** contains the results (think numbers) of a procedure or a DATA step. The **table definition** tells SAS how to render the results (think structure). For example, suppose we executed the FREQ procedure so that it created the following output:\n", "\n", "
\n", "\n", "\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\n", "\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\n", "
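| A (rows) by B (columns): Frequency / Percent | B = 1 | B = 2 | Total |
|---|---|---|---|
| A = 1 | 60 / 30.00 | 40 / 20.00 | 100 / 50.00 |
| A = 2 | 40 / 20.00 | 60 / 30.00 | 100 / 50.00 |
| Total | 100 / 50.00 | 100 / 50.00 | 200 / 100.00 |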
\n", "
\n", "\n", "

SAS actually creates this piece of output from its two parts, the table definition:

\n", "\n", "\n", "
\n", "\n", "\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\n", "\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\n", "
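| A (rows) by B (columns): Frequency / Percent | B = 1 | B = 2 | Total |
|---|---|---|---|
| A = 1 | | | |
| A = 2 | | | |
| Total | | | |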
\n", "
\n", "\n", "

and the data component:

\n", "\n", "\n", "
\n", "\n", "\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\n", "\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\n", "
| A | B | COUNT | PERCENT |
|---|---|---|---|
| 1 | 1 | 60 | 30 |
| 1 | 2 | 40 | 20 |
| 2 | 1 | 40 | 20 |
| 2 | 2 | 60 | 30 |
\n", "
\n", "\n", "

2) Once SAS creates all of the output objects from an executed program, it just needs to figure out where to send them. That part is easy: SAS sends the output to whatever ODS destination(s) you tell it to, in the format specified by each destination. This is where ODS is really powerful and therefore really neat! Besides the default output generated by SAS procedures, you can send your output to the following destinations, among others:

\n", "\n", "\n", "\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\n", "
| This destination ... | ... produces |
|---|---|
| HTML | output that is formatted in HyperText Markup Language (HTML), and therefore viewable by web browsers |
| Output | SAS data sets |
| Printer Family | output that is formatted for a high-resolution printer, such as PostScript (PS), Portable Document Format (PDF), and Printer Control Language (PCL) files |
| RTF | rich text format output for use with Microsoft Word |
\n", "\n", "

In the next section, we'll learn how to tell SAS where to send the output it generates by \"opening\" and \"closing\" these various ODS destinations.
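To make the idea of a data component concrete, here is a minimal sketch that rebuilds the two-way table above and uses the Output destination to capture its results as a SAS data set. The data set names `work.ab` and `work.ab_freqs` are hypothetical, and the sketch assumes that the FREQ procedure names its two-way table output object `CrossTabFreqs` (you can confirm the name on your installation with the ODS TRACE statement covered later in this lesson).

```sas
* Hypothetical data set holding the four cell counts from the table above;
DATA work.ab;
    input A B count;
    DATALINES;
1 1 60
1 2 40
2 1 40
2 2 60
;
RUN;

* Ask ODS to save the crosstab output object as a data set;
* CrossTabFreqs is assumed to be the output object name;
ODS OUTPUT CrossTabFreqs = work.ab_freqs;

PROC FREQ data = work.ab;
    tables A*B / norow nocol;   * suppress row and column percents;
    weight count;               * each record stands for count observations;
RUN;

* Look at what was captured;
PROC PRINT data = work.ab_freqs NOOBS;
RUN;
```

The printed data set should contain the frequencies and percents for each combination of A and B, much like the data component shown above, usually along with some bookkeeping variables and rows for the marginal totals.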

" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Opening and Closing ODS Destinations\n", "\n", "If you are perfectly content with your output being sent to the output window in the default HTML format, then you don't have to tell SAS anything at all. That's because the HTML destination is open by default. On the other hand, if you want to tell SAS to send your output to another ODS destination, PDF say, then you have to open the destination before the SAS code that generates your output.\n", "\n", "To open a destination, you simply submit the following ODS statement:\n", "\n", "`ODS open-destination;`\n", "\n", "where open-destination is a keyword (as well as any required options for the destination) that tells SAS where you want to send your output. In this lesson, we'll focus only on the most commonly used keyword destinations: **Listing**, **HTML**, **RTF**, and **PDF**.\n", "\n", "After the SAS code that generates your output, you have to tell SAS to close the destination so that you can access your output. To close a destination, you simply submit the following ODS statement:\n", "\n", "`ODS close-destination CLOSE;`\n", "\n", "where close-destination is the same keyword as the open-destination.\n", "\n", "In theory, you can submit ODS statements in any order, depending on whether you need to open or close an ODS destination. In practice, however, most ODS destinations are closed by default, so you open them at the beginning of your program and close them at the end. The exception is the HTML destination, which is open by default. Let's take a look at an example.\n", "\n", "
\n", "

Example

\n", "

You might recall that the SAS data set called penngolf contains information, such as the total yardage and par, about eleven golf courses in Pennsylvania. The following program opens the HTML destination with a FILE= option so that a subset of the penngolf data set is saved to an HTML file in addition to being displayed in the default HTML format in the output window:

\n", "
" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "

Some of the penngolf data set variables

\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
NameYearTypeParYards
Toftrees1968Resort727018
Penn State Blue1921Public726525
Centre Hills1921Private716392
Lewistown CC.Private726779
State College Elks1973SemiPri716369
Park Hills CC1966SemiPri706004
Sinking Valley CC1967SemiPri726755
Williamsport CC1909Private716489
Standing Stone GC1973SemiPri706593
Bucknell GC1960SemiPri706253
Mount Airy Lodge1972Resort727123
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "LIBNAME phc6089 \"/folders/myfolders/SAS_Notes/data\";\n", " \n", "ODS HTML file = '/folders/myfolders/SAS_Notes/output/html/golf.html';\n", " \n", "PROC PRINT data = phc6089.penngolf NOOBS;\n", " title 'Some of the penngolf data set variables';\n", " ID name;\n", " var year type par yards;\n", "RUN;\n", " \n", "ODS HTML CLOSE;\n", "ODS HTML;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "

This code illustrates the standard ODS practice mentioned earlier: open your destinations at the top of your program, and close them at the bottom. Here, the first ODS statement tells SAS to open the HTML destination and to save the HTML output generated by the PRINT procedure that follows to the specified file name. The second ODS statement tells SAS to close the HTML destination so that we can access the created HTML file. The final ODS HTML statement re-opens the HTML destination with its default settings, so that later output is again displayed in the output window.

\n", "

Download and save the penngolf data set (see the data folder on the course website if you don't already have it) to a convenient location on your computer. Then, launch the SAS code, and edit the LIBNAME statement so that it reflects the location in which you saved the data set. Also, edit the first ODS statement's FILE= option so that it reflects the location and name of the file where you want the resulting HTML output to be sent. (Make sure that you give your filename the standard .html extension.) Finally, run the SAS program. In doing so, you should note that SAS generates two pieces of output. The default output is displayed, as always, in the output window (as shown above). A copy of this output is also saved as an HTML file at the path you specified in the FILE= option of the ODS HTML statement.

\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "ODS can create output in several formats simultaneously, so you can have multiple output destinations open at any given time. Each open output destination uses resources, so if you do not need to send output to a particular destination, it is a good idea to close it. When you have more than one open ODS destination, you can use the keyword shortcut _ALL_ to close all of the destinations at once. That is, the following statement:\n", "\n", "`ODS _ALL_ CLOSE;`\n", "\n", "closes all currently open destinations at once. Then you can reopen whichever output destination you would like to use.\n", "\n", "### Producing HTML Output\n", "\n", "We have been using HTML output as our default output to the Results Viewer, but if we would like to save the HTML output to an external file, then we can use the HTML keyword in the ODS statement with an option such as FILE=. In this section, we'll extend what we learned in the previous section by:\n", "\n", "* creating HTML output from multiple procedures at once;\n", "* creating HTML output with a table of contents; and\n", "* using options to specify links and paths.\n", "\n", "
\n", "
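As a sketch of the earlier point about having several destinations open at once, the following program sends the same PRINT output to both an HTML file and a PDF file, and then closes everything with `ODS _ALL_ CLOSE`. The file names `golf_multi.html` and `golf_multi.pdf` are hypothetical, and the sketch assumes the phc6089 libref has already been assigned and that the output folders exist.

```sas
* Open two destinations before the procedure (file names are hypothetical);
ODS HTML file = '/folders/myfolders/SAS_Notes/output/html/golf_multi.html';
ODS PDF file = '/folders/myfolders/SAS_Notes/output/pdf/golf_multi.pdf';

PROC PRINT data = phc6089.penngolf NOOBS;
    title 'Some of the penngolf data set variables';
    ID name;
    var year type par yards;
RUN;

* Close every open destination in one statement;
ODS _ALL_ CLOSE;

* Re-open the default HTML destination for later output;
ODS HTML;
```

Because `ODS _ALL_ CLOSE` also closes the default HTML destination, the final `ODS HTML` statement is what makes later output visible again.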

Example

\n", "

The following program uses the penngolf data set to simultaneously create HTML output from the PRINT and REPORT procedures:

\n", "
" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "

Some Par 72 Pennsylvania Golf Courses

\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
NameYearTypeYards
Toftrees1968Resort7018
Penn State Blue1921Public6525
Lewistown CC.Private6779
Sinking Valley CC1967SemiPri6755
Mount Airy Lodge1972Resort7123
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "

Average Size of Some PA Courses

\n", "
\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
TypeParYards
Private71.36553.3
Public72.06525.0
Resort72.07070.5
SemiPri70.66394.8
\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ODS HTML body = '/folders/myfolders/SAS_Notes/output/html/golf2.html';\n", " \n", "PROC PRINT data = phc6089.penngolf NOOBS;\n", " title 'Some Par 72 Pennsylvania Golf Courses';\n", " ID name;\n", " var year type yards;\n", " where par = 72;\n", "RUN;\n", " \n", "PROC REPORT data = phc6089.penngolf NOWINDOWS HEADLINE HEADSKIP;\n", " title 'Average Size of Some PA Courses';\n", " column type par yards;\n", " define type /group;\n", " define yards / analysis mean format = 6.1 width = 10;\n", " define par / analysis mean format = 4.1 width = 10;\n", "RUN;\n", " \n", "ODS HTML CLOSE;\n", "ODS HTML;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "

Before launching and running the program, let's take a quick look at the code to make sure we know what it's doing:

\n", "
    \n", "
  • The first ODS statement tells SAS to open the HTML destination and to save the HTML output generated by the code to the specified file name. Note that rather than using the ODS HTML statement's FILE= option, we used the BODY= option to tell SAS where to save the HTML output. The two options are interchangeable. That is, the BODY= option is an alias for the FILE= option.
  • \n", "
  • Then, we use the PRINT procedure to tell SAS to print some information about the par 72 golf courses.
  • \n", "
  • Then, we use the REPORT procedure to tell SAS to calculate the average yardage and average par for each of the four types of golf courses.
  • \n", "
  • The second ODS statement tells SAS to close the HTML destination so that we can access the created HTML file.
  • \n", "
  • And, the last ODS statement tells SAS to re-open the HTML destination.
  • \n", "
\n", "

Now, go ahead and launch the SAS program. Again, you'll have to edit the first ODS HTML statement to reflect where you would like your HTML file stored. Then, run the SAS program, and review the output as it appears in the SAS Results Viewer. You should first see the output from the PRINT procedure and then the output from the REPORT procedure as shown above.

\n", "

You should also note that SAS saves the generated HTML output in the file specified in the first ODS HTML statement. To see the file, go to the folder in which you told SAS to store the HTML file.

\n", "

When we run the above code, SAS creates the golf2.html file. You should see the same output that SAS displays in the SAS Results Viewer. It is this physical golf2.html file though that you could easily post to a public web site or e-mail to someone else.

\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "

Example: Creating HTML Output with a Table of Contents

\n", "

When you have a program that creates many pages of output, you might find it useful for SAS to create a table of contents for the output. The following program is identical to the previous program, except the first ODS HTML statement has been modified to tell SAS to create a table of contents for the output that SAS generates:

\n", "
" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "

Some Par 72 Pennsylvania Golf Courses

\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
NameYearTypeYards
Toftrees1968Resort7018
Penn State Blue1921Public6525
Lewistown CC.Private6779
Sinking Valley CC1967SemiPri6755
Mount Airy Lodge1972Resort7123
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "

Average Size of Some PA Courses

\n", "
\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
TypeParYards
Private71.36553.3
Public72.06525.0
Resort72.07070.5
SemiPri70.66394.8
\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ODS HTML path = '/folders/myfolders/SAS_Notes/output/html/' (url = none)\n", " body = 'golf3.html'\n", " contents = 'golf3toc.html'\n", " frame = 'golf3frame.html';\n", " \n", "PROC PRINT data = phc6089.penngolf NOOBS;\n", " title 'Some Par 72 Pennsylvania Golf Courses';\n", " ID name;\n", " var year type yards;\n", " where par = 72;\n", "RUN;\n", " \n", "PROC REPORT data = phc6089.penngolf NOWINDOWS HEADLINE HEADSKIP;\n", " title 'Average Size of Some PA Courses';\n", " column type par yards;\n", " define type /group;\n", " define yards / analysis mean format = 6.1 width = 10;\n", " define par / analysis mean format = 4.1 width = 10;\n", "RUN;\n", " \n", "ODS HTML CLOSE;\n", "ODS HTML;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "

Since the code is almost identical to the previous program, the only code that needs explanation this time around is that first ODS HTML statement. The PATH= option tells SAS the folder in which to store the files named in the options that follow. The BODY=, CONTENTS=, and FRAME= options specify the names of the HTML files that store, respectively, the body (the actual tables), the table of contents, and a combined web page that shows the table of contents alongside the body. The url=none suboption of PATH= tells SAS to build links between these files using relative paths rather than full path names. As a result, the body and contents files must be in the same folder as the frame file for the frame file to work.
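For contrast, here is a hedged sketch of the same statement using a URL in place of `none`. It assumes the files will later be uploaded to a web server; the address `https://www.example.com/golf/` and the `golf4*.html` file names are placeholders, not anything used in this course. With a URL given, the links that SAS writes into the contents and frame files should point to that address instead of being relative.

```sas
* Hypothetical web address and file names, for illustration only;
ODS HTML path = '/folders/myfolders/SAS_Notes/output/html/'
                (url = 'https://www.example.com/golf/')
         body = 'golf4.html'
         contents = 'golf4toc.html'
         frame = 'golf4frame.html';

PROC PRINT data = phc6089.penngolf NOOBS;
    title 'Some Par 72 Pennsylvania Golf Courses';
    ID name;
    var year type yards;
    where par = 72;
RUN;

ODS HTML CLOSE;
ODS HTML;
```

Files created this way are meant to be viewed from the web server, so the frame file may not display correctly when opened directly from the local folder. That is why url=none is the more convenient choice when you simply want to share a folder of files.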

\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you're not familiar with at least the concept of Hypertext Markup Language (HTML), then you would find this topic quite challenging. In short, HTML is the behind-the-scenes language that tells your web browser what to display. If you go to any web page, and view the page source, you'll see the HTML code that displays the web page that you are viewing. (Using Mozilla's Firefox browser, you can view the page source by selecting View and then Page Source. Using an Internet Explorer browser, you can view the page source by selecting Page and then View Source). This topic concerns the pathnames that SAS creates when it creates HTML output files for you. If the pathnames aren't well specified, then you would have trouble sharing your SAS-created HTML output files with others.\n", "\n", "### Producing Other Types of Output\n", "\n", "Thus far, we have used ODS statements to tell SAS to create HTML output. As mentioned earlier, we can also use ODS statements to tell SAS to create other kinds of output. In this section, we'll take a look at two examples in which we tell SAS to make different kinds of output. In the first example, we make RTF output that can be easily copied into Microsoft Word. In the second example, we make PDF output that can be then sent to a high-resolution printer.\n", "\n", "
\n", "

Example

\n", "

The following program tells SAS to print a subset of the penngolf data set and when doing so to send the output to an RTF destination:

\n", "
" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "

Some Par 72 Pennsylvania Golf Courses

\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
NameYearTypeYards
Toftrees1968Resort7018
Penn State Blue1921Public6525
Lewistown CC.Private6779
Sinking Valley CC1967SemiPri6755
Mount Airy Lodge1972Resort7123
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ODS HTML CLOSE;\n", "ODS RTF file = '/folders/myfolders/SAS_Notes/output/rtf/golf5.rtf'\n", " BODYTITLE;\n", " \n", "PROC PRINT data = phc6089.penngolf NOOBS;\n", " title 'Some Par 72 Pennsylvania Golf Courses';\n", " ID name;\n", " var year type yards;\n", " where par = 72;\n", "RUN;\n", " \n", "ODS RTF CLOSE;\n", "ODS HTML;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "

As you can see, to tell SAS to send output to the RTF destination, we simply use the RTF keyword in an ODS statement. By default, titles and footnotes are put into Word headers and footers. The BODYTITLE option in the ODS RTF statement tells SAS to instead put titles and footnotes in the main part of the RTF document. Note that the second-to-last ODS statement tells SAS to close the RTF destination, while the last ODS statement tells SAS again to re-open the HTML destination.

\n", "

If you are on Windows, then you may see a pop-up window asking whether to save or open the output RTF file. RTF files can be opened with Microsoft Word or Apple Pages.

\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "

Example

\n", "

The following program does exactly the same thing as the previous program, except the output here is sent to a PDF file:

\n", "
" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "

Some Par 72 Pennsylvania Golf Courses

\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
NameYearTypeYards
Toftrees1968Resort7018
Penn State Blue1921Public6525
Lewistown CC.Private6779
Sinking Valley CC1967SemiPri6755
Mount Airy Lodge1972Resort7123
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ODS HTML CLOSE;\n", "ODS PDF file = '/folders/myfolders/SAS_Notes/output/pdf/golf5.pdf';\n", " \n", "PROC PRINT data = phc6089.penngolf NOOBS;\n", " title 'Some Par 72 Pennsylvania Golf Courses';\n", " ID name;\n", " var year type yards;\n", " where par = 72;\n", "RUN;\n", " \n", "ODS PDF CLOSE;\n", "ODS HTML;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "

Pretty straightforward: as you can see, to tell SAS to send output to the PDF destination, we simply use the PDF keyword in an ODS statement. Note again that the second-to-last ODS statement tells SAS to close the PDF destination, while the last ODS statement tells SAS to re-open the HTML destination.

\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Tracing and Selecting Procedure Output\n", "\n", "As discussed earlier, when ODS receives data from a procedure, it combines the data component with a table definition to create an output object. For many procedures, ODS creates just one output object, while for others it produces several. Procedures involving a BY statement, for example, typically produce an output object for each BY group. When a procedure does create more than one output object, you might not want SAS to include all of them in your output. You might instead want to tell SAS to select just one or two of the output objects. In this section, we learn how to use the ODS TRACE and ODS SELECT statements to choose the specific output objects that you want SAS to display in your output.\n", "\n", "The ODS TRACE ON statement tells SAS to print information in the log about the output objects created by all of the code in your program between the ODS TRACE ON statement and a closing ODS TRACE OFF statement.\n", "\n", "
\n", "

Example

\n", "

The following program uses ODS TRACE statements to capture information about the output objects created by the MEANS procedure on a data set called golfbypar, which is just a sorted version of the penngolf data set:

\n", "
" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "

Pennsylvania Golf Courses by Par

\n", "
\n", "
\n", "

The MEANS Procedure

\n", "
\n", "
\n", "
\n", "

Par=70

\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
VariableNMeanStd DevMinimumMaximum
\n", "
\n", "
ID
\n", "
Year
\n", "
Yards
\n", "
Slope
\n", "
USGA
\n", "
\n", "
\n", "
\n", "
3
\n", "
3
\n", "
3
\n", "
3
\n", "
3
\n", "
\n", "
\n", "
\n", "
108.3333333
\n", "
1966.33
\n", "
6283.33
\n", "
126.0000000
\n", "
70.2333333
\n", "
\n", "
\n", "
\n", "
2.0816660
\n", "
6.5064071
\n", "
295.6692972
\n", "
6.0000000
\n", "
1.0692677
\n", "
\n", "
\n", "
\n", "
106.0000000
\n", "
1960.00
\n", "
6004.00
\n", "
120.0000000
\n", "
69.3000000
\n", "
\n", "
\n", "
\n", "
110.0000000
\n", "
1973.00
\n", "
6593.00
\n", "
132.0000000
\n", "
71.4000000
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "

Par=71

\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
VariableNMeanStd DevMinimumMaximum
\n", "
\n", "
ID
\n", "
Year
\n", "
Yards
\n", "
Slope
\n", "
USGA
\n", "
\n", "
\n", "
\n", "
3
\n", "
3
\n", "
3
\n", "
3
\n", "
3
\n", "
\n", "
\n", "
\n", "
105.3333333
\n", "
1934.33
\n", "
6416.67
\n", "
127.3333333
\n", "
71.3333333
\n", "
\n", "
\n", "
\n", "
2.5166115
\n", "
34.0196022
\n", "
63.6893502
\n", "
4.0414519
\n", "
0.5131601
\n", "
\n", "
\n", "
\n", "
103.0000000
\n", "
1909.00
\n", "
6369.00
\n", "
123.0000000
\n", "
70.9000000
\n", "
\n", "
\n", "
\n", "
108.0000000
\n", "
1973.00
\n", "
6489.00
\n", "
131.0000000
\n", "
71.9000000
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "

Par=72

\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
VariableNMeanStd DevMinimumMaximum
\n", "
\n", "
ID
\n", "
Year
\n", "
Yards
\n", "
Slope
\n", "
USGA
\n", "
\n", "
\n", "
\n", "
5
\n", "
4
\n", "
5
\n", "
5
\n", "
5
\n", "
\n", "
\n", "
\n", "
105.0000000
\n", "
1957.00
\n", "
6840.00
\n", "
131.8000000
\n", "
73.2600000
\n", "
\n", "
\n", "
\n", "
4.0620192
\n", "
24.0970261
\n", "
235.5546646
\n", "
5.7619441
\n", "
1.0830512
\n", "
\n", "
\n", "
\n", "
101.0000000
\n", "
1921.00
\n", "
6525.00
\n", "
125.0000000
\n", "
72.0000000
\n", "
\n", "
\n", "
\n", "
111.0000000
\n", "
1972.00
\n", "
7123.00
\n", "
140.0000000
\n", "
74.3000000
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "PROC SORT data = phc6089.penngolf out = golfbypar;\n", " by par;\n", "RUN;\n", " \n", "ODS TRACE ON;\n", "PROC MEANS data = golfbypar;\n", " by par;\n", " title 'Pennsylvania Golf Courses by Par';\n", "RUN;\n", "ODS TRACE OFF;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "

The SORT procedure, of course, just sorts the permanent data set phc6089.penngolf by par and stores the sorted result in a temporary data set called golfbypar. Then, the ODS TRACE ON statement tells SAS to start capturing information about any output objects that are created. The MEANS procedure tells SAS to summarize the golfbypar data set for each level of par, that is, when par equals 70, 71 and 72. Finally, the ODS TRACE OFF statement tells SAS to stop capturing information about any output objects that are created.

\n", "

Launch and run the SAS program. You can go ahead and review the output from the MEANS procedure, but what we're really interested in here is the information SAS displays about the output objects in the log window:

\n", " \"SAS\n", "

As the log suggests, the MEANS procedure creates one output object for each BY group (par = 70, par = 71, and par = 72). The three output objects share the same name, label, and template, but have different paths. The path for the par = 70 output object, for example, is Means.ByGroup1.Summary, while the path for the par = 71 output object is Means.ByGroup2.Summary. Once we know the paths of the output objects, we can use an ODS SELECT statement to tell SAS the specific output objects that we want displayed. To select specific output objects, simply place an ODS SELECT statement within the relevant procedure. By default, the ODS SELECT statement lasts only for the procedure in which it is contained.

\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "

Example

\n", "

The following program uses an ODS SELECT statement and what we learned from tracing our MEANS procedure to print just the portion of the output that pertains to the par 70 golf courses:

\n", "
" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "

Par 70 Golf Courses

\n", "
\n", "
\n", "

The MEANS Procedure

\n", "
\n", "
\n", "
\n", "

Par=70

\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
VariableNMeanStd DevMinimumMaximum
\n", "
\n", "
ID
\n", "
Year
\n", "
Yards
\n", "
Slope
\n", "
USGA
\n", "
\n", "
\n", "
\n", "
3
\n", "
3
\n", "
3
\n", "
3
\n", "
3
\n", "
\n", "
\n", "
\n", "
108.3333333
\n", "
1966.33
\n", "
6283.33
\n", "
126.0000000
\n", "
70.2333333
\n", "
\n", "
\n", "
\n", "
2.0816660
\n", "
6.5064071
\n", "
295.6692972
\n", "
6.0000000
\n", "
1.0692677
\n", "
\n", "
\n", "
\n", "
106.0000000
\n", "
1960.00
\n", "
6004.00
\n", "
120.0000000
\n", "
69.3000000
\n", "
\n", "
\n", "
\n", "
110.0000000
\n", "
1973.00
\n", "
6593.00
\n", "
132.0000000
\n", "
71.4000000
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "PROC MEANS data = golfbypar;\n", " by par;\n", " title 'Par 70 Golf Courses';\n", " ODS SELECT Means.ByGroup1.Summary;\n", "RUN;\n", "\n", "ODS SELECT ALL; *Reset selection to all output tables;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "

Launch and run the SAS program, and review the output to convince yourself that SAS displays only the portion of the MEANS procedure output that pertains to the par 70 golf courses, as shown above.

\n", "

ODS SELECT (and likewise ODS EXCLUDE) has two special keywords, ALL and NONE, that allow you to select or exclude all or none of the output. Here, we use ODS SELECT ALL to reset SAS from selecting only the table Means.ByGroup1.Summary, so that subsequent SAS procedure calls will produce output. Otherwise, only the currently selected table would be shown and all other output would be hidden.\n", "
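Since ODS EXCLUDE is mentioned only in passing, here is a minimal sketch of the reverse operation. It assumes the golfbypar data set created earlier still exists and reuses the path Means.ByGroup1.Summary reported by ODS TRACE. Instead of keeping only the par 70 table, it drops that table and keeps the others.

```sas
PROC MEANS data = golfbypar;
    by par;
    title 'Par 71 and Par 72 Golf Courses';
    * Hide only the first BY group (par = 70);
    ODS EXCLUDE Means.ByGroup1.Summary;
RUN;

ODS EXCLUDE NONE; *Reset so that nothing is excluded from later output;
```

Here `ODS EXCLUDE NONE` plays the same role that `ODS SELECT ALL` played in the previous example.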

" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Changing the Appearance of Your Output\n", "\n", "I have good news and bad news for you about changing the appearance of your output. The good news is that if you had enough time to learn all of the ways in which you could change the appearance of your SAS output, you could create just about anything you wanted. The bad news is that we don't have enough time in this course to explore all of the possibilities. In fact, we'll barely scratch the surface. In this section, we will only investigate how to use the ODS HTML statement's STYLE= option to change the appearance of the default HTML output by using one of the many predefined style templates built into SAS.\n", " \n", "
\n", "

Example

\n", "

The following program uses the ODS HTML statement's STYLE= option to tell SAS to use the meadow style when displaying the HTML output created by printing a subset of the phc6089.penngolf data set:

\n", "
" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "

Some of the penngolf data set variables

\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
NameYearTypeParYards
Toftrees1968Resort727018
Penn State Blue1921Public726525
Centre Hills1921Private716392
Lewistown CC.Private726779
State College Elks1973SemiPri716369
Park Hills CC1966SemiPri706004
Sinking Valley CC1967SemiPri726755
Williamsport CC1909Private716489
Standing Stone GC1973SemiPri706593
Bucknell GC1960SemiPri706253
Mount Airy Lodge1972Resort727123
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ODS HTML file = '/folders/myfolders/SAS_Notes/output/html/golf9.html'\n", " style = meadow;\n", " \n", "PROC PRINT data = phc6089.penngolf NOOBS;\n", " title 'Some of the penngolf data set variables';\n", " ID name;\n", " var year type par yards;\n", "RUN;\n", " \n", "ODS HTML CLOSE;\n", "ODS HTML;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "

As you can see, telling SAS what style to use is as simple as adding the STYLE= option to the ODS HTML statement. Launch and run the SAS program, and review the output to see the appearance of the HTML output when created using the meadow style template.
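
Switching to another style only requires changing the value of STYLE=. Here is a minimal sketch, assuming the Journal style is available on your system (the PROC TEMPLATE listing will confirm this) and using a hypothetical output file name, golf9_journal.html:

```sas
* Same report as above, rendered with a different predefined style;
* The file name golf9_journal.html is a placeholder chosen for this sketch;
ODS HTML file = '/folders/myfolders/SAS_Notes/output/html/golf9_journal.html'
         style = journal;

PROC PRINT data = phc6089.penngolf NOOBS;
    title 'Some of the penngolf data set variables';
    ID name;
    var year type par yards;
RUN;

ODS HTML CLOSE;
ODS HTML;
```

Only the style name and the file name differ from the meadow example, so the two HTML files can be opened side by side to compare the styles.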

\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Of course you are asking yourself \"how would I know that meadow is one of the available predefined styles?\" Fortunately, the answer is simple enough. The following TEMPLATE procedure produces a list of the predefined style templates that are available on your system:\n", "\n", "
\n",
    "PROC TEMPLATE;\n",
    "    LIST STYLES;\n",
    "RUN;\n",
    "
\n", "\n", "Launch and run the SAS code, and review the output to see the list of predefined styles that are shipped with SAS. You might want to try some of the styles out yourself. While I do find some of the styles rather nice, I personally find some of them rather hideous (and therefore useless to me)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Creating SAS DATA Sets from Procedure Output\n", "\n", "You may recall that earlier we used an OUTPUT statement in the MEANS procedure to create a data set containing summary statistics, such as means and standard deviations. We'll see in this section that we could have alternatively used ODS to first save the summary statistics and then send it to the OUTPUT destination. In fact, we can use ODS to save just about any part of any procedure's output!\n", "\n", "
\n", "

Example

\n", "

The following program uses an ODS OUTPUT statement to create a temporary SAS data set called summout from the Summary output object created by the MEANS procedure, and then prints the resulting summout data set:

\n", "
" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "

Pennsylvania Golf Courses by Par

\n", "
\n", "
\n", "

The MEANS Procedure

\n", "
\n", "
\n", "
\n", "

Par=70

\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
Analysis Variable : Yards
NMeanStd DevMinimumMaximum
36283.33295.66929726004.006593.00
\n", "
\n", "
\n", "
\n", "
\n", "

Par=71

\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
Analysis Variable : Yards
NMeanStd DevMinimumMaximum
36416.6763.68935026369.006489.00
\n", "
\n", "
\n", "
\n", "
\n", "

Par=72

\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
Analysis Variable : Yards
NMeanStd DevMinimumMaximum
56840.00235.55466466525.007123.00
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "

The summout data set

\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
ParYards_NYards_MeanYards_StdDevYards_MinYards_Max
7036283.3333333295.6692972460046593
7136416.666666763.68935023563696489
7256840235.5546645765257123
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "PROC MEANS data = golfbypar;\n", " by par;\n", " var yards;\n", " title 'Pennsylvania Golf Courses by Par';\n", " ODS OUTPUT Summary = summout;\n", "RUN;\n", " \n", "PROC PRINT data = summout NOOBS;\n", " title 'The summout data set';\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "

Now, something that might not be obvious from this code is that the name of the output object, Summary, was determined from first tracing the MEANS procedure. If you refer back to the information SAS displayed in the log for the previous example using ODS TRACE, you can see that, since we want to capture all of the output from the MEANS procedure, the desired output object is called Summary. The ODS OUTPUT statement tells SAS that we want to save the data contained in the Summary output object in a data set called summout. Of course, the PRINT procedure then tells SAS to print the summout data set. Launch and run the SAS program, and review the output to convince yourself that the summout data set does indeed contain the data summarized by the MEANS procedure.
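
If that log is no longer at hand, the trace is easy to regenerate. A short sketch, reusing the golfbypar data set from above:

```sas
ODS TRACE ON;   * Write the name, label, and path of each output object to the log;

PROC MEANS data = golfbypar;
    by par;
    var yards;
RUN;

ODS TRACE OFF;  * Stop tracing;
```

The log should then contain one trace record per BY group. The Name field of those records (Summary) is what goes on the left-hand side of the equals sign in the ODS OUTPUT statement.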

\n", "

You do need to be careful where you put ODS statements in your program. For example, if rather than putting the ODS OUTPUT statement just before the MEANS procedure's RUN statement, we had instead put it after that RUN statement and before the PROC PRINT statement, we would not have captured the Summary output object. Instead, we would get the following warning message:

\n", "
\n",
    "    WARNING: Output 'Summary' was not created. Make sure that the output\n",
    "    object name, label, or path is spelled correctly. Also,\n",
    "    verify that the appropriate procedure options are used to\n",
    "    produce the requested output object. For example, verify that\n",
    "    the NOPRINT option is not used.\n",
    "    
\n", "

You might want to move the ODS statement as described, and re-run the SAS program just to see this for yourself.
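
For reference, the misplaced version described above would look roughly like this. It is a sketch of the faulty ordering, not a recommended program:

```sas
PROC MEANS data = golfbypar;
    by par;
    var yards;
    title 'Pennsylvania Golf Courses by Par';
RUN;

* Too late: the MEANS step has already ended so no Summary object is captured;
ODS OUTPUT Summary = summout;

PROC PRINT data = summout NOOBS;
    title 'The summout data set';
RUN;
```

Note that if a summout data set is still in your WORK library from an earlier run, PROC PRINT will display that stale copy, so the warning in the log is the only sign that nothing new was captured. Placing the ODS OUTPUT statement before the PROC MEANS statement, rather than inside the step, also captures the object.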

\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using Plot Options to Control Graph Appearance\n", "\n", "The ODS Graphics procedures also enable you to control the appearance of particular graphics elements in a graph. Graphics elements include lines, bars, markers, text, and so on.\n", "\n", "Many ODS Graphics procedure statements have options and suboptions that control the appearance of different parts of a plot or graph. Default visual attributes of various graphics elements are derived from the specific style elements of the active style. By using appearance options in your procedure statements, you can change the appearance of one or more aspects of your graph without changing the overall style.\n", "\n", "For most of our examples, we will use the child mortality dataset, indicatordeadkids35.csv." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "

The SAS System

\n", "
\n", "
\n", "

The CONTENTS Procedure

\n", "
\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
Data Set NameWORK.LONGObservations50038
Member TypeDATAVariables3
EngineV9Indexes0
Created10/02/2020 15:23:12Observation Length2632
Last Modified10/02/2020 15:23:12Deleted Observations0
Protection CompressedNO
Data Set Type SortedNO
Label   
Data RepresentationSOLARIS_X86_64, LINUX_X86_64, ALPHA_TRU64, LINUX_IA64  
Encodingutf-8 Unicode (UTF-8)  
\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
Engine/Host Dependent Information
Data Set Page Size131072
Number of Data Set Pages1022
First Data Page1
Max Obs per Page49
Obs in First Data Page48
Number of Data Set Repairs0
Filename/tmp/SAS_work760E00000C05_localhost.localdomain/long.sas7bdat
Release Created9.0401M6
Host CreatedLinux
Inode Number671607
Access Permissionrw-r--r--
Owner Namesasdemo
File Size128MB
File Size (bytes)134086656
\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
Alphabetic List of Variables and Attributes
#VariableTypeLenFormatInformat
1CountryChar2614$2614.$2614.
3mortsNum8  
2yearNum8  
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "

The SAS System

\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
ObsCountryyearmorts
1Afghanistan1760.
2Afghanistan1761.
3Afghanistan1762.
4Afghanistan1763.
5Afghanistan1764.
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "FILENAME mortcsv '/folders/myfolders/SAS_Notes/data/indicatordeadkids35.csv';\n", "\n", "PROC IMPORT datafile = mortcsv \n", " out = mort(RENAME=(VAR1=Country)) \n", " dbms = CSV \n", " replace;\n", " getnames = yes;\n", " guessingrows = max;\n", "RUN;\n", "\n", "DATA long;\n", " SET mort;\n", " ARRAY years{*} '1760'n -- '2009'n '2010'n '2030'n '2050'n '2099'n;\n", " DO i = 1 to dim(years);\n", " year = INPUT(vname(years{i}), 4.);\n", " morts = years{i};\n", " OUTPUT;\n", " END;\n", " DROP i '1760'n -- '2009'n '2010'n '2030'n '2050'n '2099'n;\n", "RUN;\n", "\n", "PROC CONTENTS data = long;\n", "RUN;\n", "\n", "PROC PRINT data = long (obs = 5);\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The SG Plotting Procedures\n", "\n", "Let's quickly review how to make some basic plots using PROC SGPLOT before we move on to how to customize their appearance. PROC SGPLOT can be used to make many common plots by using the corresponding statement:\n", "\n", "* SCATTER - produces a scatterplot\n", "* VBOX/HBOX - produces a vertical/horizontal boxplot\n", "* SERIES - produces a line (series) plot\n", "* DENSITY - produces a density plot\n", "* HISTOGRAM - produces a histogram\n", "* REG - adds a least squares regression line fit to the plot\n", "* LOESS - adds a nonparametric smoothing curve to the plot\n", "\n", "For example, we could make a scatterplot of mortality rate versus year. Let's do this for the data from Sweden." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "DATA sweden_long;\n", " SET long;\n", " WHERE country = \"Sweden\";\n", "RUN;\n", "\n", "PROC SGPLOT data = sweden_long;\n", " SCATTER Y = morts X = year;\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "I could have also made a line plot by simply changing SCATTER to SERIES." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "PROC SGPLOT data = sweden_long;\n", " SERIES Y = morts X = year;\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By using the LOESS statement, I can also fit a nonparametric smoothing curve to these data and plot it on top of the series plot." ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "PROC SGPLOT data = sweden_long;\n", " SERIES Y = morts X = year;\n", " LOESS Y = morts X = year / smooth = 0.45 CLM;\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let’s look at the mortality rates over time using line plots for each of the countries: United States, United Kingdom, Sweden, Afghanistan, Rwanda. To get a line for each country individually, we need to specify the GROUP= option and assign it to the country variable." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "DATA sub;\n", " SET long;\n", " WHERE country in (\"United States\" \"United Kingdom\" \n", " \"Sweden\" \"Afghanistan\" \"Rwanda\");\n", "RUN;\n", "\n", "PROC SGPLOT data = sub;\n", " SERIES Y = morts X = year / group = country;\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that we get a single plot with trajectory over time of the mortality rates for each of these five countries. PROC SGPLOT will automatically assign default colors to differentiate each group and provide a legend. We will learn how to modify these later.\n", "\n", "We could also make side by side boxplots of mortality rates by country by using the category= option in vbox/hbox." ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "PROC SGPLOT data = sub;\n", " VBOX morts / category = country;\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Axes and Titles\n", "\n", "The first adjustment we might want to make to a plot is adding descriptive axis labels and a title. We can set the axis labels using the LABEL= option of the XAXIS and YAXIS statements (or we can assign a LABEL to the variable in a DATA step). To set different levels of titles, we use the TITLE statements. SAS allows TITLE1 through TITLE10, and as the TITLE number increases the size of the title decreases, so we can make subtitles. Note that the TITLE statements are global and not part of SGPLOT, so we will need to reset these or change them before making another plot in order to prevent the same titles from carrying over." ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "PROC SGPLOT data = sub;\n", " SERIES Y = morts X = year / group = country;\n", " XAXIS LABEL = \"Year\";\n", " YAXIS LABEL = \"Mortality Rate\";\n", " TITLE \"Child Mortality Rates\";\n", " TITLE2 \"Stratified by Country\";\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The x and y axis limits can be adjusted by using the MIN= and MAX= options in the XAXIS and YAXIS statements. For example, we can zoom in on the years 1900-2000 for the bottom three lines where the mortality rates range from 0 to 1.5." ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "PROC SGPLOT data = sub;\n", " SERIES Y = morts X = year / group = country;\n", " XAXIS LABEL = \"Year\" MIN = 1900 MAX = 2000;\n", " YAXIS LABEL = \"Mortality Rate\" MIN = 0 MAX = 1.5;\n", " TITLE \"Child Mortality Rates\";\n", " TITLE2 \"Stratified by Country\";\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We may also want to change the default tick marks on the x and/or y axes. This can be done using the VALUES= option in the XAXIS and YAXIS statements." ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "PROC SGPLOT data = sub;\n", " SERIES Y = morts X = year / group = country;\n", " XAXIS LABEL = \"Year\" VALUES = (1750 TO 2100 BY 50);\n", " YAXIS LABEL = \"Mortality Rate\" VALUES = (0 TO 5 BY 0.5);\n", " TITLE \"Child Mortality Rates\";\n", " TITLE2 \"Stratified by Country\";\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you want the text displayed at each of these tick marks to differ from the actual numbers, use the VALUESDISPLAY= option." ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "PROC SGPLOT data = sub;\n", " SERIES Y = morts X = year / group = country;\n", " XAXIS LABEL = \"Year\" VALUES = (1750 TO 2100 BY 50)\n", " VALUESDISPLAY = ('a' 'b' 'c' 'd' 'e' 'f' 'g' 'h');\n", " YAXIS LABEL = \"Mortality Rate\" VALUES = (0 TO 5 BY 0.5);\n", " TITLE \"Child Mortality Rates\";\n", " TITLE2 \"Stratified by Country\";\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The XAXIS and YAXIS statements also provide other options to modify the axis and label text such as color, font, position, and rotation:\n", "\n", "* FITPOLICY= - specifies the method that is used to fit tick mark values on a horizontal axis when there is not enough room to draw them normally.\n", "* LABELATTRS= - specifies the appearance of the axis labels.\n", "* LABELPOS= - specifies the position of the axis label.\n", "* VALUEATTRS= - specifies the appearance of the axis tick value labels.\n", "* VALUESROTATE= specifies how the tick values are rotated on the axis with the possible options of DIAGONAL | DIAGONAL2 | VERTICAL (This only applies when axis text is overlapping. To force rotation, you must set FITPOLICY to ROTATEALWAYS but this only works in SAS 9.4M7 and on. Otherwise you must use an ANNOTATION data set.)\n", "\n", "We can specify similar text options in a TITLE statement to adjust the text appearance in a title." ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "PROC SGPLOT data = sub;\n", " SERIES Y = morts X = year / group = country;\n", " * Note: The VALUESROTATE option only works here on SAS version 9.4M7 and later;\n", " XAXIS LABEL = \"Year\" FITPOLICY=ROTATEALWAYS VALUESROTATE = DIAGONAL2\n", " VALUEATTRS = (SIZE = 8) LABELATTRS = (SIZE = 15);\n", " YAXIS LABEL = \"Mortality Rate \" VALUEATTRS = (SIZE = 8)\n", " LABELATTRS = (SIZE = 15);\n", " TITLE HEIGHT = 1cm JUSTIFY = C BOLD COLOR = 'red' \"Child Mortality Rates\";\n", " TITLE2 HEIGHT = 0.5cm BOLD JUSTIFY = C \"Stratified by Country\";\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plotting Characters, Line Types and Their Colors\n", "\n", "For basic plots without a grouping variable, we can use the corresponding ATTRS option to modify these visual characteristics:\n", "\n", "* MARKERATTRS= - alters the visual appearance of plotting characters, such as their color, size, and symbol (see Marker Attributes and Symbols for more details)\n", "* LINEATTRS= - alters the visual appearance of a line, such as its color, line type, and thickness (see Line Attributes and Patterns for more details)\n", "\n", "For example, let's adjust the plotting character in a scatterplot of mortality rate vs year for Sweden to be a red triangle." ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "TITLE ; *Reset the titles;\n", "TITLE2 ;\n", "PROC SGPLOT data = sweden_long;\n", " SCATTER Y = morts X = year \n", " / markerattrs=(color = 'red' symbol = Triangle);\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or we can change the straight blue line to a dashed orange line in a SERIES plot." ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "PROC SGPLOT data = sweden_long;\n", " SERIES Y = morts X = year \n", " / lineattrs=(color = 'orange' pattern = dash);\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When a plot has a grouping variable, setting any of these attributes manually requires a discrete attribute map dataset, which we pass to the dattrmap= option of PROC SGPLOT and reference by ID with the attrid= option of the plot statement. This discrete attribute map dataset defines the mapping of our grouping variable to the different attributes such as color, line type or plotting character.\n", "\n", "To create a discrete attribute map dataset, we must build a dataset using a DATA step with the following (character) variables:\n", "\n", "* ID - this required string variable identifies all the rows that correspond to a single attribute map.\n", "* VALUE - this required string variable identifies the grouping variable's value that is being mapped in the current row.\n", "* Other attribute variables - these are reserved keywords such as linecolor and markercolor (see Data Attribute Map Datasets for a full list of reserved attribute keywords that can be used in a data map).\n", "\n", "Note that the dataset requires the two variables ID and VALUE. You must include and use these names in your attribute map for the mapping dataset to work with PROC SGPLOT.\n", "\n", "For example, the following data attribute map will change the plotting characters and colors assigned to each country in the scatterplot of mortality rate versus year." ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "DATA marker_map;\n", " INPUT id $2. +1 VALUE $15. MARKERSYMBOL $8. MARKERCOLOR $10.;\n", " VALUE = strip(value);\n", " DATALINES;\n", "ms Afghanistan X blue\n", "ms Rwanda Diamond light blue\n", "ms Sweden plus green\n", "ms United Kingdom triangle purple\n", "ms United States circle red\n", ";\n", "RUN;\n", "\n", "PROC SGPLOT data = sub dattrmap = marker_map;\n", " SCATTER Y = morts X = year / group = country attrid = ms;\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For more information on SAS colors, see Color-Naming Schemes.\n", "\n", "Similarly, we can create an attribute map to alter line attributes." ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "DATA line_map;\n", " INPUT id $2. +1 VALUE $15. LINEPATTERN $9. +1 LINECOLOR $10.;\n", " VALUE = strip(value);\n", " DATALINES;\n", "lp Afghanistan Solid blue\n", "lp Rwanda Dash light blue\n", "lp Sweden Dot green\n", "lp United Kingdom LongDash purple\n", "lp United States ShortDash red\n", ";\n", "RUN;\n", "\n", "PROC SGPLOT data = sub dattrmap = line_map;\n", " SERIES Y = morts X = year / group = country attrid = lp;\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also combine multiple attribute maps into a single dataset to adjust different visual attributes such as both plotting characters and line types." ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "DATA my_map;\n", " INPUT id $2. +1 VALUE $15. LINEPATTERN $9. +1 LINECOLOR $10. +1 \n", " MARKERSYMBOL $8. MARKERCOLOR $10.;\n", " VALUE = strip(value);\n", " DATALINES;\n", "am Afghanistan Solid blue X blue\n", "am Rwanda Dash light blue Diamond light blue \n", "am Sweden Dot green plus green\n", "am United Kingdom LongDash purple triangle purple\n", "am United States ShortDash red circle red\n", ";\n", "RUN;\n", "\n", "PROC SGPLOT data = sub dattrmap = my_map;\n", " SCATTER Y = morts X = year / group = country attrid = am;\n", " SERIES Y = morts X = year / group = country attrid = am;\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Modifying a Legend\n", "\n", "To modify the legend in PROC SGPLOT, we can use the KEYLEGEND statement. For example, to change the title of a legend, we use the TITLE= option in KEYLEGEND. To modify the labels made by a grouping variable, we need to apply a FORMAT." ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "PROC FORMAT;\n", " VALUE $countryfmt\n", " \"Afghanistan\" = \"Country 1\"\n", " \"Rwanda\" = \"Country 2\"\n", " \"Sweden\" = \"Country 3\"\n", " \"United Kingdom\" = \"Country 4\"\n", " \"United States\" = \"Country 5\";\n", "RUN;\n", "\n", "PROC SGPLOT data = sub;\n", " SERIES Y = morts X = year / group = country;\n", " KEYLEGEND / TITLE = \"COUNTRY\";\n", " FORMAT country $countryfmt.;\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To hide the legend use the NOAUTOLEGEND option in PROC SGPLOT. Note if you manually set any legend options in the KEYLEGEND statement, then it will ignore the NOAUTOLEGEND option, so we need to remove the KEYLEGEND statement to suppress the legend." ] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 63, "metadata": {}, "output_type": "execute_result" } ], "source": [ "PROC SGPLOT data = sub NOAUTOLEGEND;\n", " SERIES Y = morts X = year / group = country;\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can alter the position of the legend by using the POSITION option in KEYLEGEND. POSITION can take the values bottom, bottomleft, bottomright, left, right, top, topleft, and topright. We can also move the legend inside the plotting region by specifying LOCATION=INSIDE. To set the number of rows and columns formed in the legend use the ACROSS= and/or DOWN= options." ] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 64, "metadata": {}, "output_type": "execute_result" } ], "source": [ "PROC SGPLOT data = sub NOAUTOLEGEND;\n", " SERIES Y = morts X = year / group = country;\n", " KEYLEGEND / POSITION = RIGHT;\n", "RUN;" ] }, { "cell_type": "code", "execution_count": 67, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 67, "metadata": {}, "output_type": "execute_result" } ], "source": [ "PROC SGPLOT data = sub NOAUTOLEGEND;\n", " SERIES Y = morts X = year / group = country;\n", " KEYLEGEND / POSITION = TOPLEFT LOCATION=INSIDE ACROSS=2;\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can modify the appearance of the text in the legend by using the TITLEATTRS= and VALUEATTRS= options. These options take the same text properties, such as color, size and font, that we saw earlier." ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 68, "metadata": {}, "output_type": "execute_result" } ], "source": [ "PROC SGPLOT data = sub NOAUTOLEGEND;\n", " SERIES Y = morts X = year / group = country;\n", " KEYLEGEND / POSITION = RIGHT ACROSS=2\n", " TITLEATTRS=(color='red' size=2cm)\n", " VALUEATTRS=(color='blue' style=ITALIC);\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Adding Text Annotations\n", "\n", "There are three ways to add text to a plot from PROC SGPLOT:\n", "\n", "* INSET\n", "* TEXT\n", "* SGANNO\n", "\n", "The first example uses the INSET statement to add the correlation coefficient value to a scatterplot." ] }, { "cell_type": "code", "execution_count": 92, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 92, "metadata": {}, "output_type": "execute_result" } ], "source": [ "PROC SGPLOT DATA=mort;\n", " SCATTER Y = '2000'n X = '1980'n;\n", " INSET (\"r: 0.89\" = \"\") / Position = TOPLEFT TITLE = \"Pearson Correlation\" \n", " LABELALIGN = CENTER;\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The second way to add text is with the TEXT statement. This displays text at an associated (X,Y) location in the graph. " ] }, { "cell_type": "code", "execution_count": 120, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 120, "metadata": {}, "output_type": "execute_result" } ], "source": [ "DATA mort;\n", " set mort (drop=text);\n", " LENGTH text $17;\n", " IF '2000'n > 2.5 THEN text = (\"2000 Rate = \" || '2000'n);\n", "RUN;\n", "\n", "PROC SGPLOT DATA=mort;\n", " SCATTER Y = '2000'n X = '1980'n;\n", " TEXT Y = '2000'n X = '1980'n TEXT=text / Position = bottom TEXTATTRS=(SIZE=10);\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The third way to add text is by using SGANNO. To use this method, we create a separate annotation data set containing the text label(s) to be added to the plot." ] }, { "cell_type": "code", "execution_count": 147, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 147, "metadata": {}, "output_type": "execute_result" } ], "source": [ "DATA mort_anno;\n", " LENGTH label $50 x1space $20 y1space $20;\n", " INFILE DATALINES DSD;\n", " INPUT function $ label $ x1 y1 width x1space $ y1space $\n", " textsize;\n", " DATALINES;\n", "text,Pearson's correlation coefficient is 0.89,35,80,500,graphpercent,graphpercent,11\n", ";\n", "RUN;\n", "\n", "PROC SGPLOT DATA=mort sganno=mort_anno;\n", " SCATTER Y = '2000'n X = '1980'n;\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See Yaqi Jia's paper Three Ways to Add Text to Graphics in PROC SGPLOT for more details." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Drawing Multiple Plots in a Single Figure\n", "\n", "If we want to break down a dense plot into smaller separate plots by some grouping, we can use PROC SGPANEL. For example, if we want five separate scatterplots with loess lines for the mortality data subset sub, we can use PROC SGPANEL to draw one panel per country in a single figure instead of overlaying all five groups in one plot." ] }, { "cell_type": "code", "execution_count": 158, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 158, "metadata": {}, "output_type": "execute_result" } ], "source": [ "PROC SGPANEL data = sub NOAUTOLEGEND;\n", " panelby country; \n", " SCATTER Y = morts X = year / group = country;\n", " LOESS Y = morts X = year / group = country;\n", " format country $20.;\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can adjust the layout and the number of rows and columns by using the ROWS= and COLUMNS= options of the PANELBY statement." ] }, { "cell_type": "code", "execution_count": 160, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 160, "metadata": {}, "output_type": "execute_result" } ], "source": [ "PROC SGPANEL data = sub NOAUTOLEGEND;\n", " panelby country / rows=1 columns=5; \n", " SCATTER Y = morts X = year / group = country;\n", " LOESS Y = morts X = year / group = country;\n", " format country $20.;\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To include several different plots in a single figure, we can use the ODS LAYOUT GRIDDED feature. To create the layout, we first enable the gridded layout and define its shape by specifying the number of rows and columns. We then use the ODS REGION statement to mark where the output for each cell of the grid begins." ] }, { "cell_type": "code", "execution_count": 164, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
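| Obs | 1980 | 2000 | resid | fitted |
|---|---|---|---|---|
| 1 | 3.490352139 | 2.811858292 | 0.50536 | 2.30650 |
| 2 | 0.334426215 | 0.126746215 | 0.00002 | 0.12673 |
| 3 | 1.216621171 | 0.216880367 | -0.51917 | 0.73605 |
| 4 | 3.139291928 | 2.484475049 | 0.42045 | 2.06403 |
| 5 | 0.301924994 | 0.140326167 | 0.03605 | 0.10428 |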
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 164, "metadata": {}, "output_type": "execute_result" } ], "source": [ "PROC GLM data = mort(keep='2000'n '1980'n) NOPRINT;\n", " MODEL '2000'n = '1980'n;\n", " OUTPUT out=fitstat residuals = resid predicted = fitted;\n", "RUN;\n", "\n", "PROC PRINT data = fitstat(obs=5);\n", "RUN;" ] }, { "cell_type": "code", "execution_count": 165, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "\"The\n", "
\n", "
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 165, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ODS LAYOUT GRIDDED columns = 2 rows = 2;\n", "ODS REGION;\n", "PROC SGPLOT data = fitstat;\n", " HISTOGRAM resid;\n", "RUN;\n", "ODS REGION;\n", "PROC SGPLOT data = fitstat;\n", " SCATTER Y = resid X = fitted;\n", " REFLINE 0 / AXIS = Y;\n", "RUN;\n", "ODS REGION;\n", "PROC SGPLOT data = fitstat;\n", " REG Y = '2000'n X = '1980'n;\n", "RUN;\n", "ODS REGION;\n", "PROC SGPLOT data = fitstat;\n", " SCATTER Y = resid X = '1980'n;\n", " REFLINE 0 / AXIS = Y;\n", "RUN;\n", "ODS LAYOUT END;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To save these plots, we can use the ODS destinations we learned earlier, such as PDF or RTF, to write them to an external file." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercises\n", "\n", "For these exercises, we will use the Charm City Circulator bus ridership dataset, Charm_City_Circulator_Ridership.csv. After changing the path to point to the dataset on your computer, use the following code to read in the dataset and transform it for plotting." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "SAS Output\n", "\n", "\n", "\n", "
\n", "
\n", "

The SAS System

\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
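| Obs | day | date | daily | number | route | type |
|---|---|---|---|---|---|---|
| 1 | Monday | 01/11/2010 | 952 | 877 | orange | Boardings |
| 2 | Monday | 01/11/2010 | 952 | 1027 | orange | Alightings |
| 3 | Monday | 01/11/2010 | 952 | 952 | orange | Average |
| 4 | Monday | 01/11/2010 | 952 | . | purple | Boardings |
| 5 | Monday | 01/11/2010 | 952 | . | purple | Alightings |
| 6 | Monday | 01/11/2010 | 952 | . | purple | Average |
| 7 | Monday | 01/11/2010 | 952 | . | green | Boardings |
| 8 | Monday | 01/11/2010 | 952 | . | green | Alightings |
| 9 | Monday | 01/11/2010 | 952 | . | green | Average |
| 10 | Monday | 01/11/2010 | 952 | . | banner | Boardings |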
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "

The SAS System

\n", "
\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
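| Obs | day | date | daily | number | route | type |
|---|---|---|---|---|---|---|
| 1 | Monday | 01/11/2010 | 952 | 952.0 | orange | Average |
| 2 | Tuesday | 01/12/2010 | 796 | 796.0 | orange | Average |
| 3 | Wednesday | 01/13/2010 | 1211.5 | 1211.5 | orange | Average |
| 4 | Thursday | 01/14/2010 | 1213.5 | 1213.5 | orange | Average |
| 5 | Friday | 01/15/2010 | 1644 | 1644.0 | orange | Average |
| 6 | Saturday | 01/16/2010 | 1490.5 | 1490.5 | orange | Average |
| 7 | Sunday | 01/17/2010 | 888.5 | 888.5 | orange | Average |
| 8 | Monday | 01/18/2010 | 999.5 | 999.5 | orange | Average |
| 9 | Tuesday | 01/19/2010 | 1035 | 1035.0 | orange | Average |
| 10 | Wednesday | 01/20/2010 | 1395.5 | 1395.5 | orange | Average |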
\n", "
\n", "
\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "PROC IMPORT datafile = \"/folders/myfolders/SAS_Notes/data/Charm_City_Circulator_Ridership.csv\"\n", " out = circ dbms = csv replace;\n", " getnames = yes;\n", " guessingrows = max;\n", "RUN;\n", "\n", "DATA long;\n", " SET circ;\n", " ARRAY larray(*) orangeBoardings -- bannerAverage;\n", " DO i = 1 TO dim(larray);\n", " var = vname(larray(i));\n", " number = larray(i);\n", " var = tranwrd(var, 'Board', ' Board');\n", " var = tranwrd(var, 'Alight', ' Alight');\n", " var = tranwrd(var, 'Average', ' Average');\n", " route = scan(var, 1);\n", " type = scan(var, 2);\n", " OUTPUT;\n", " END;\n", " \n", " DROP i var orangeBoardings -- bannerAverage;\n", "RUN;\n", "\n", "DATA avg;\n", " SET long;\n", " WHERE type = 'Average' and number ne .;\n", "RUN;\n", "\n", "PROC PRINT data = long(obs = 10);\n", "RUN;\n", "\n", "PROC PRINT data = avg(obs=10);\n", "RUN;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "1. Plot average ridership (avg data set) by date using a scatterplot.\n", " a. Color the points by route (orange, purple, green, banner). Default colors are fine here.\n", " b. Add black smoothed curves (LOESS) for each route.\n", " c. Color the points by day of the week.\n", "2. Replot 1a so that the color of each point matches the name of its route (using blue for banner). Note: you will need to make a data attribute map.\n", "3. Plot a scatterplot of average ridership by date with one panel per route.\n", "4. Plot a scatterplot of average ridership by date with separate panels by day of the week, colored by route.\n", "5. Plot a scatterplot of average ridership (avg) by date, colored by route (same as 1a). Do not compute a new average; use the average values already provided for each route. Make the x-label \"Year\". Make the y-label \"Number of People\".\n", "6. 
Plot average ridership on the orange route versus date as a solid line, and add dashed “error” lines based on the boardings and alightings. The line colors should be orange." ] } ], "metadata": { "kernelspec": { "display_name": "SAS", "language": "sas", "name": "sas" }, "language_info": { "codemirror_mode": "sas", "file_extension": ".sas", "mimetype": "text/x-sas", "name": "sas" } }, "nbformat": 4, "nbformat_minor": 2 }