SAS and CMS Basic Operations

ECO 499 – Reed Olsen


Arithmetic Operators in SAS

  1. Addition: +
  2. Subtraction: -
  3. Multiplication: *
  4. Division: /
  5. Raise to a power: **

Comparison Operators in SAS

  1. equal to: = or EQ
  2. not equal to: ^= or NE
  3. greater than: > or GT
  4. not greater than: NG
  5. less than: < or LT
  6. not less than: NL
  7. greater than or equal to: >= or GE
  8. less than or equal to: <= or LE

Logical Operators in SAS

  1. AND (or the symbol &)
  2. OR (or the symbol |)
  3. NOT

Comparison and logical operators are most commonly used in if-then statements to create new variables. For example:

If race = 2 then ethnic = 1; else ethnic = 0;

creates a dummy variable (a 0/1 variable) which equals 1 if the variable "race" has the value of 2 and equals 0 otherwise. If-then statements are extremely useful in SAS for creating variables from pre-existing variables. The general form of an if-then statement is as follows:

If (argument) then (operation);

If the argument is true then the statement carries out the operation. If-then statements may be followed by an "else" statement in the following general format (similar to that given above):

If (argument) then (operation1); else (operation2);

Here if the argument is true then operation1 is carried out, otherwise operation2 is carried out.

Here are some examples of using the logical operators with if-then statements:

If x <= 25 and y > 32 then z = 1; else z = 0;

Here both of the arguments must be met before z is set equal to 1, otherwise z will be set equal to 0. However, the statement

If x <= 25 or y > 32 then z = 1; else z = 0;

has z being set equal to 1 if either argument is met, rather than requiring that both be met.
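The difference between AND and OR can be checked on a small made-up data set. The sketch below is an illustration only: the data set name demo, the variable names z_and and z_or, and the three data lines are all hypothetical. The data are entered within the program (after the DATALINES statement), so no CMS statement is needed.

```sas
*Hypothetical example comparing AND with OR on three made-up observations;
data demo;
  input x y;
  if x <= 25 and y > 32 then z_and = 1; else z_and = 0;
  if x <= 25 or  y > 32 then z_or  = 1; else z_or  = 0;
  datalines;
10 40
10 20
30 20
;
run;

proc print data=demo;
run;
```

With these values the first observation gets z_and = 1 and z_or = 1, the second gets z_and = 0 and z_or = 1, and the third gets 0 for both.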

Functions in SAS

  1. ABS returns the absolute value
  2. MAX returns the largest value
  3. MIN returns the smallest value
  4. SQRT calculates the square root
  5. EXP raises e (2.71828) to a specified power.
  6. LOG calculates the natural logarithm (base e)
  7. LOG2 calculates the logarithm to the base 2
  8. LOG10 calculates the logarithm to the base 10 (decimal)
  9. SUM calculates the sum of the arguments

There exist more functions of course, but these are the basic ones and should get you going. Functions are used in the following standard way. Suppose that you want to create a variable, x, which is the natural log of a second variable y. You would do this in the following way:

X = log(y);

Hence, all functions are used in that general format: Y = function(argument).

The statement:

Z = X + Y**2 - EXP(A);

adds variable X to the square of variable Y and subtracts the exponential of variable A (i.e., e raised to the power given by variable A).
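As a quick check on how operators and functions combine, here is a minimal sketch using made-up values (a = 0, x = 3, y = 4). The data set name funcs and the result variable names are hypothetical, chosen only for illustration; the comment after each statement gives the value worked out by hand.

```sas
*Hypothetical example of arithmetic operators and functions;
data funcs;
  a = 0; x = 3; y = 4;
  z     = x + y**2 - exp(a);   * 3 + 16 - 1 = 18;
  r     = sqrt(x**2 + y**2);   * square root of 25 = 5;
  big   = max(x, y, a);        * largest of 3, 4, 0 = 4;
  small = min(x, y, a);        * smallest of 3, 4, 0 = 0;
  total = sum(x, y, a);        * 3 + 4 + 0 = 7;
  d     = abs(x - y);          * absolute value of -1 = 1;
run;

proc print data=funcs;
run;
```

Note that ** is evaluated before + and -, so Y**2 is computed first and no parentheses are needed around it.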

CMS Operations in SAS

CMS is the operating system on the VMA computer (similar to Windows 98 on an IBM PC). Look at the other handout for some of the rules about using the CMS operating system. Additional rules are found here.

Each SAS program that loads data from outside the SAS program (as opposed to inputting it within the program) must have a CMS statement to tell the program where to find the file. For example:

CMS filedef sasdata disk class data a;
Data temp;
infile sasdata;
input X1 X2 X3 X4;

identifies a CMS data file called "class data a" and assigns it a SAS file identifier (sasdata). Thereafter, whenever it uses this data file, the SAS program must refer to it as sasdata. The next three lines do the following, in order: (1) create a temporary SAS data set called temp (any name can be used here), (2) tell SAS to read data from the CMS data file into the temporary SAS data set, and (3) read four variables, X1, X2, X3, and X4, from the file into the temporary SAS data set. This is the procedure that you should use to input data that exists in a text file.

Here is an example that inputs the same data but also outputs the data to a permanent SAS data set (as opposed to the temporary one used in the above example). The advantage of the permanent SAS data set is that you can save all the calculations and then use it later without redoing the same calculations.

CMS filedef sasdata disk class data a;
CMS filedef sasdat1 disk dummy dummy a;
Data sasdat1.name;
infile sasdata;
input X1 X2 X3 X4;
various calculations;

The above program is the same except that now we have a permanent data set that is called "name sasdat1 a" on your permanent disk. The four variables that you read, along with any created in the calculations, will be in this data set. If you wish to access the data set in future operations you need only write:

CMS filedef sasdat1 disk dummy dummy a; * This line tells SAS where to find the data set;
Data temp; set sasdat1.name; * This line tells SAS to put the permanent data set into a temporary one called "temp";

Disk Space on CMS – How to submit SAS jobs and have them run.

There are two ways to submit SAS jobs to run in CMS. The first is discussed in the first handout but let's just go over the basics.

  1. Create a sas program file on CMS. This file must have a filetype of "sas" (recall that CMS files have three identifiers: filename, filetype, and filemode). In the example from the previous section the data is input from a file called "class data a". The filemode for your programs will be "a". (The filemode only changes if you have multiple disks on the mainframe, which students never get.) The filetype is "data", which tells CMS that this file holds data (you can also use "input" as a filetype to hold data). The filename can be whatever name you wish to give to the file.

SAS programs must have a filetype of "sas", which identifies the file to CMS as a sas program. Hence, all of your sas programs will be called "name sas a" where only the filename can vary.

To create a sas program type "x filename sas a" at the ready prompt and hit the enter key. (If you wish to input data into a data file you would type "x filename data a" at the ready prompt and hit the enter key.) Follow the procedures on the first handout to input the program.

  2. Once the sas program is written, you can submit it one of two ways.

CMS filedef sasdata disk class data a;
CMS filedef sasdat1 disk dummy dummy t;
Data sasdat1.name;
infile sasdata;
input X1 X2 X3 X4;
various calculations;

The above creates a permanent sas data set on the temporary disk (whose filemode is "t") that you just created. The advantage of this is that it saves space on your own permanent disk. It also gives you more space to run your SAS programs (sas programs use your disk space if you don't submit them as a batch job). The disadvantage is that you lose anything saved on this disk once you log off your vma account. Hence, before logging off, you must copy any files created on the temporary disk that you wish to keep to your permanent disk.

How do I do various useful CMS operations?

  1. Copying files. The general form of the copy command is to type the following at the ready prompt:

copy filename filetype filemode filename filetype filemode

where the first fn ft fm identifies the file to be copied and the second gives the name and disk of the new copy. For example,

copy work sasdata t work sasdata a

copies the file "work sasdata t" from the temporary disk to the permanent disk.

  2. The filelist command

The filelist command allows you to see the files on any of your disks, permanent or temporary. The general form of the command is:

filelist filename filetype filemode

You can abbreviate the filelist command to "fl". You can use it to look at all files or only some files on a disk. Typing just "fl" without the rest of the command lists all files on the "a" disk. To look at all files on your "t" (temporary) disk, type the following at the ready prompt and hit enter:

fl * * t

If you wish to look at only some files on your disk (this is useful if you have a large number of files on your disk) then you can type:

fl * sas* a

The above will list all files on disk a whose filetype begins with "sas", regardless of their filename. The same can be done with the filename (i.e., "fl tc* * a" lists all files on disk a whose filename begins with "tc").

  3. What can one do in a filelist?

From the filelist screen one can issue any command regarding files (copying, printing, deleting, etc.) that one can issue at the ready prompt, without having to type the entire command. The files will be listed in a table with the following format:

Cmd Filename Filetype Filemode Format Lrecl Records Blocks Date Time

Each file has an entry under each of these columns except the first (Cmd), which is where CMS commands for that file, and that file only, can be issued. The other columns give information about the file: its filename, filetype, and filemode; its format; the length of each line (Lrecl); how many records or lines it contains (Records); the blocks of disk space it takes up (Blocks); and the date and time it was last written.

  4. You can issue any CMS command regarding the file in the Cmd column next to each file. To do so, tab down to the Cmd column next to the file you are interested in and issue the command in the following format:

command / options

Here are some common commands:

netprt / (dest glass ← prints the file at the glass hall printer

copy / filename filetype filemode ← copies the file to the name/location listed after the /

delete ← deletes the file (be careful with this one)

rename / filename filetype filemode ← renames the file to the name listed after the /

x ← edits the file

Basically, any command that can be issued at the ready prompt regarding a file can also be issued in the command column next to that file in the filelist screen, with the exception of running programs (i.e., telling CMS to run the SAS program). The advantage of doing this is that it saves time, because you don't have to type in the filename, filetype, and filemode of the file you wish to change, edit, or print. Another advantage is that you can carry out the same operation on multiple files within the filelist at once: type the command as described above in the command column of the first file that you wish the command to apply to, and place an = sign in the command column next to each later file that you want the same command to apply to.

Some Useful SAS Commands

  1. Titles

You can issue titles for your SAS Output, which is often useful in keeping track of what exactly you are doing. The form of the command is as follows:

Title 'Content of Title goes here';

Every sas procedure run thereafter will print this title on its output until another title command is issued.

  2. Comments and spaces

Sas allows you to put comments within the program by leading with an asterisk and ending with a semicolon. For example:

*This file runs various regressions on the class data set for assignment 2;

Be careful, because if you forget the semi-colon then SAS will consider everything until the next semi-colon to be a comment. It is extremely useful to insert comments to remind yourself what you're doing.

Spaces are useful just to keep programs ordered and to divide parts of the program from others. A spacer line is made by entering a line that contains only a semi-colon.
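The short program below is a sketch that puts titles, a comment, and spacer lines together. The data set name demo, the variable x, and the data values are made up for illustration, and the data are entered within the program so that no CMS statement is needed.

```sas
*This program prints a made-up data set twice under two different titles;
;
data demo;
  input x;
  datalines;
1
2
3
;
run;
;
title 'First listing';
proc print data=demo;
run;
;
title 'Second listing';
proc print data=demo;
run;
```

The first listing is printed under 'First listing'. Because a second title command is issued before the second PROC PRINT, that listing is printed under 'Second listing' instead.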

  3. Data Step

Each SAS program must have a data step, as in the examples above. In general, SAS programs include three types of commands. I've discussed two of these above: the CMS commands that identify input and output files, and the Data command that creates either permanent or temporary sas data sets. You must have a Data step in each program or else you can do nothing else. You need the CMS command only if you have data in a separate file (you don't need it if you're inputting data within the SAS program). Calculations on the data set are performed during the Data step. All manipulation of variables must occur after a Data statement and before any SAS Proc statements.

Each SAS program should have some PROC statements. PROC statements (sas procedures) are things that you tell SAS to do with the data set that you just created. Here are some common PROC statements:

PROC Means ← gives summary statistics for variables in the current data set
PROC Reg ← runs an OLS regression (must be accompanied by a model statement.)
PROC Sort ← sorts the current data set (must be accompanied by a "by" statement.)
PROC Corr ← yields correlation coefficients for variables in the current data set
PROC Plot ← plots the variables from the current data set

We'll mostly be using PROC Means and PROC Reg.
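Here is a minimal sketch of how several of these procedures are called, including the "by" and "model" statements mentioned above. The data set demo, the variables group, y, and x, and all of the data values are hypothetical, invented only so that the example can run on its own.

```sas
*Hypothetical data set with a grouping variable and two numeric variables;
data demo;
  input group y x;
  datalines;
2 5.1 1
1 6.9 2
2 9.2 3
1 10.8 4
2 13.1 5
1 15.0 6
;
run;

*Sort by group, then summarize, correlate, and regress y on x;
proc sort data=demo;
  by group;
run;

proc means data=demo n mean min max;
  var y x;
run;

proc corr data=demo;
  var y x;
run;

proc reg data=demo;
  model y = x;
run;
quit;
```

The data= option names the data set explicitly. If it is left off, as in the examples in this handout, the procedure uses the most recently created data set.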

  4. Labels

You can label your variables in a label statement. Label statements allow you to keep track of the definition of each variable that you are using. The format of the label statement is as follows:

Label Variable = label;

Any number of labels can be included in the same statement. For example:

Label X = sex
      Y = race
      Z = income;

The label can be up to 40 characters long, including blanks, and if it includes either a semicolon or an equal sign, the label must be enclosed in either single or double quotes. For example:

Label X = sex
      Y = "0 if Z=2; 1 otherwise"
      Z = income;
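To see where labels show up, the sketch below attaches the labels from the example above to a small made-up data set and then lists them. The data set name demo and the two data lines are hypothetical. Every label is quoted here, which is always permitted even when the label contains no semicolon or equal sign.

```sas
*Hypothetical data set carrying the three labels from the example above;
data demo;
  input x y z;
  label x = "sex"
        y = "0 if Z=2; 1 otherwise"
        z = "income";
  datalines;
1 0 2
0 1 5
;
run;

*PROC CONTENTS lists each variable together with its label;
proc contents data=demo;
run;

*The LABEL option makes PROC PRINT use labels as column headings;
proc print data=demo label;
run;
```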

  5. A Sample Program with all the steps. This is one of my actual programs.

*This file creates some variables from the Trinidad and Tobago Labor Force data set and runs various income regressions on subsets of the data;

;
CMS FILEDEF SASDATA DISK DUMMY DUMMY T;
DATA TEMP; SET SASDATA.CHGD;
;
TITLE 'FULL DATA SET - WORKERS ONLY';
IF WORKING2 = 0 THEN DELETE; * deletes non-workers from the sample;
IF EXPED < 0 THEN DELETE; * deletes workers with negative experience;
EDIN3 = EDUCNEW * GOVT2; EDIN4 = EDUCNEW * GOVT3;
Label WORKING2 = "1 IF WORKING; 0 OTHERWISE"
         EXPED = NUMBER OF MONTHS WORKED IN LAST YEAR
         SEX = "1 IF MALE; 0 OTHERWISE";
PROC MEANS N MEAN MIN MAX;
;
DATA TEMP1; SET TEMP; * defines new data set (temp1), sets old data set (temp);
IF SEX=1; * deletes females from the sample;
TITLE 'MALES - WORKERS ONLY';
PROC REG;
MODEL LINCTOT = EDUCNEW EDIN1 EDIN2 OJT INSTIT2 EXPED EXPEDSQ EXP1
LHRS DAYSHIFT URBAN MARRIED COMMNLAW HEAD GOVTT
OIL DISTRIB MANUFACT FINSERV AGRIC ELECTRIC
CONSTRUC COMMUNIC TECHNICN PROFESSN AGWORKER SERVICE
TRADE OPERATOR ELEMNTRY SENIOR;
;
DATA TEMP1; SET TEMP;
IF SEX=0; * deletes males from the sample;
TITLE 'FEMALES - WORKERS ONLY';
PROC REG;
MODEL LINCTOT = EDUCNEW EDIN1 EDIN2 EDIN3 EDIN4 OJT INSTIT2 EXPED
EXPEDSQ EXP1 LHRS DAYSHIFT URBAN MARRIED COMMNLAW
HEAD GOVT2 GOVT3
OIL DISTRIB MANUFACT FINSERV AGRIC ELECTRIC
CONSTRUC COMMUNIC TECHNICN PROFESSN AGWORKER SERVICE
TRADE OPERATOR ELEMNTRY SENIOR;
;
DATA TEMP1; SET TEMP;
IF SEX=1; IF AFRICAN = 1; * deletes females and non-Africans from the sample;
TITLE 'MALE AFRICANS - WORKERS ONLY';
PROC REG;
MODEL LINCTOT = EDUCNEW EDIN1 EDIN2 EDIN3 EDIN4 OJT INSTIT2 EXPED
EXPEDSQ EXP1 LHRS DAYSHIFT URBAN MARRIED COMMNLAW
HEAD GOVT2 GOVT3
OIL DISTRIB MANUFACT FINSERV AGRIC ELECTRIC
CONSTRUC COMMUNIC TECHNICN PROFESSN AGWORKER SERVICE
TRADE OPERATOR ELEMNTRY SENIOR;
;
DATA TEMP1; SET TEMP;
IF SEX=1; IF INDIAN = 1; * deletes females and non-Indians from the sample;
TITLE 'MALE INDIANS - WORKERS ONLY';
PROC REG;
MODEL LINCTOT = EDUCNEW EDIN1 EDIN2 EDIN3 EDIN4 OJT INSTIT2 EXPED
EXPEDSQ EXP1 LHRS DAYSHIFT URBAN MARRIED COMMNLAW
HEAD GOVT2 GOVT3
OIL DISTRIB MANUFACT FINSERV AGRIC ELECTRIC
CONSTRUC COMMUNIC TECHNICN PROFESSN AGWORKER SERVICE
TRADE OPERATOR ELEMNTRY SENIOR;
;
DATA TEMP1; SET TEMP; IF SEX=1; * deletes females from the sample;
IF INDIAN = 1 OR AFRICAN = 1 THEN DELETE; * deletes Indians and Africans from the sample;
TITLE 'MALE OTHERS - WORKERS ONLY';
PROC REG;
MODEL LINCTOT = EDUCNEW EDIN1 EDIN2 EDIN3 EDIN4 OJT INSTIT2 EXPED
EXPEDSQ EXP1 LHRS DAYSHIFT URBAN MARRIED COMMNLAW
HEAD GOVT2 GOVT3
OIL DISTRIB MANUFACT FINSERV AGRIC ELECTRIC
CONSTRUC COMMUNIC TECHNICN PROFESSN AGWORKER SERVICE
TRADE OPERATOR ELEMNTRY SENIOR;
