Cis230

Submitted By 200122865
Chapter 1
1. Identify which of the following variable names are valid SAS names:
Height (valid)
HeightInCentimeters (valid)
Height_in_centimeters (valid)
Wt-Kg (invalid)
x123y456 (valid)
76Trombones (invalid)
MiXeDCasE (valid)
2. In the following list, classify each data set name as valid or invalid:
Clinic (valid)
Clinic (valid)
Work (valid)
hyphens-in-the-name (invalid)
123GO (invalid)
Demographics_2006(valid)
3. You have a data set consisting of Student ID, English, History, Math, and Science test scores on 10 students.
a. The number of variables is 5.
b. The number of observations is 10.
4. True or false:
a. You can place more than one SAS statement on a single line. (true)
b. You can use several lines for a single SAS statement. (true)
c. SAS has three data types: character, numeric, and integer. (false; SAS has only two data types, character and numeric)
d. OPTIONS and TITLE statements are considered global statements. (true)
5. What is the default storage length for SAS numeric variables (in bytes)?
8 bytes
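The answers to questions 4c and 5 can be checked with a short program. This is only a minimal sketch: the data set name types_demo and the variables Name and Score are made up for illustration. PROC CONTENTS lists each variable's type and length.

```sas
data types_demo;        /* hypothetical data set name */
   Name  = 'Ann';       /* character variable */
   Score = 92;          /* numeric variable, stored in 8 bytes by default */
run;

proc contents data=types_demo;   /* reports Type and Len for each variable */
run;
```

In the PROC CONTENTS listing, Score should appear as a numeric variable of length 8 and Name as a character variable.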
Chapter 2
1. You have a text file called stocks.txt containing a stock symbol, a price, and the number of shares. Here are some sample lines of data:
File stocks.txt
AMGN 67.66 100
DELL 24.60 200
GE 34.50 100
HPQ 32.32 120
IBM 82.25 50
MOT 30.24 100
a. Using this raw data file, create a temporary SAS data set (Portfolio). Choose your own variable names for the stock symbol, price, and number of shares. In addition, create a new variable (call it Value) equal to the stock price times the number of shares. Include a comment in your program describing the purpose of the program, your name, and the date the program was written.
*PURPOSE: READ STOCKS.TXT AND COMPUTE THE VALUE OF EACH HOLDING;
*PROGRAMMER: XIAOYING XU;
*DATE WRITTEN: SEPTEMBER 8, 2014;
DATA Portfolio;
INFILE "R:\STOCKS.TXT";
INPUT SYMBOL $ PRICE NUMBER;
VALUE=NUMBER*PRICE;
RUN;
b. Write the appropriate statements to compute the average price and the average number of shares of your stocks.
TITLE "LISTING OF PORTFOLIO";
PROC PRINT DATA=PORTFOLIO;
RUN;
TITLE "AVERAGE PRICE AND AVERAGE NUMBER OF SHARES";
PROC MEANS DATA=PORTFOLIO MEAN;
VAR PRICE NUMBER;
RUN;

2. Given the program here, add the necessary statements to compute four new variables:
a. Weight in kilograms (1 kg = 2.2 pounds). Name this variable WtKg.
b. Height in centimeters (1 inch = 2.54 cm). Name this variable HtCm.
c. Average blood pressure (call it AveBP) equal to the diastolic blood pressure plus one-third the difference of the systolic blood pressure minus the diastolic blood pressure.
d. A variable (call it HtPolynomial) equal to 2 times the height squared plus 1.5 times the height cubed.
Here is the program for you to modify:
data prob2;
input ID $
Height /* in inches */
Weight /* in pounds */
SBP /* systolic BP */
DBP /* diastolic BP */;
WtKg=Weight/2.2;
HtCm=Height*2.54;
AveBP=DBP+(SBP-DBP)/3;
HtPolynomial=2*Height**2+1.5*Height**3;
datalines;
001 68 150 110 70
002 73 240 150 90
003 62 101 120 80
;
title "Listing of PROB2";
proc print data=prob2;
run;
Note: This program uses a DATALINES statement, which enables you to include the input data directly in the program. You can read more about this statement in the next chapter.
3. You are given an equation to predict electromagnetic field (EMF) strength, as follows: EMF = 1.45 x V + (R/E) x V^3 - 125.
If your SAS data set contains variables called V, R, and E, write a SAS assignment statement to compute the EMF strength.
EMF=1.45*V+(R/E)*V**3-125;
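To try the assignment statement out, it can be placed in a small DATA step. The sketch below is only an illustration: the data set name emf_demo and the single line of input values are invented, not taken from the problem.

```sas
data emf_demo;                        /* hypothetical data set name */
   input V R E;                       /* illustrative values only */
   EMF = 1.45*V + (R/E)*V**3 - 125;   /* ** is evaluated before * and / */
datalines;
2 10 5
;

proc print data=emf_demo;
run;
```

For V=2, R=10, and E=5 the statement works out to 2.9 + 2*8 - 125 = -106.1.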
4. What is wrong with this program?
001 data new-data;
002 infile "prob4data.txt";
003 input x1 x2;
004 y1 = 3*(x1) + 2*(x2);
005 y2 = x1 / (x2);
006 new_variable_from_x1_and_x2 = x1 + x2 - 37;
007 run;
Note: Line numbers are for reference only; they are not part of the program.
Chapter 3
Solutions to odd-numbered problems are located at the back of this book and on the CD that accompanies this book. Solutions to all problems are available to professors. If you are a professor, visit the book’s companion Web site at http://support.sas.com/cody for information about how to obtain the solutions to all problems.
1. You have a text file called scores.txt containing information on gender (M or F) and four test scores (English, history, math, and science). Each data value is separated from the others by one or more blanks. Here is a listing of the data file:
File scores.txt
M 80 82 85 88
F 94 92 88 96
M 96 88 89 92
F 95 . 92 92
a. Write a DATA step to read in these values. Choose your own variable names. Be sure that the value for Gender is stored in 1 byte and that the four test scores are numeric.
b. Include an assignment statement computing the average of the four test scores.
c. Write the appropriate PROC PRINT statements to list the contents of this data set.
Data marks;
Infile "R:\scores.txt";
Input Gender : $1.
English
History
Math
Science;
Average=(English + History + Math + Science)/4;
Run;
Title "Listing of scores";
Proc print data=marks;
Run;
2. You are given a CSV (comma-separated values) file called political.csv containing state, political party, and age. A listing of this file is shown here:
File political.csv
"NJ",Ind,55
"CO",Dem,45
"NY",Rep,23
"FL",Dem,66
"NJ",Rep,34
a. Write a SAS program to create a temporary SAS data set called Vote. Use the variable names State, Party, and Age. Age should be stored as a numeric variable; State and Party should be stored as character variables.
b. Include a procedure to list the observations in this data set.
c. Include a procedure to compute frequencies for Party.
data vote;
infile "R:\POLITICAL.CSV" dsd;
input State : $2.
Party : $3.
Age;
run;
title "Listing of Vote";
proc print data=vote;
run;
title "Frequencies of Party";
proc freq data=vote;
tables Party / nocum;
run;
3. You are given a text file where dollar signs were used as delimiters. To indicate missing values, two dollar signs were entered. Values in this file represent last name, employee number, and annual salary.
Here is a listing of this file:
File company.txt
Roberts$M234$45000
Chien$M74777$$
Walters$$75000
Rogers$F7272$78131
Using this data file as input, create a temporary SAS data set called Company with the variables LastName (character), EmpNo (character), and Salary (numeric).
data company;
infile 'R:\272assig\company.txt' dsd dlm='$';
input LastName $ EmpNo $ Salary;
format Salary dollar10.;
run;
title 'Listing of company';
proc print data=company noobs;
run;
4. Repeat Problem 2 using a FILENAME statement to create a fileref instead of using the file name on the INFILE statements.
filename learn 'R:\272assig\company.txt';
data company;
infile learn dsd dlm='$';
input LastName $ EmpNo $ Salary;
format Salary dollar10.;
run;
title 'Listing of company';
proc print data=company noobs;
run;
5. You want to create a test data set that uses a DATALINES statement to read in values for X and Y. In the DATA step, you want to create a new variable, Z, equal to 100 + 50X + 2X^2 - 25Y + Y^2. Use the following (X,Y) data pairs: (1,2), (3,6), (5,9), and (9,11).
data Zvalue;
input X Y;
Z=100+50*X+2*X**2-25*Y+Y**2;
datalines;
1 2
3 6
5 9
9 11
;
title 'Listing of TESTDATA';
proc print data=Zvalue;
run;
6. You have a text file called bankdata.txt with data values arranged as follows:
Variable  Description     Starting Column  Ending Column  Data Type
Name      Name            1                15             Char
Acct      Account number  16               20             Char
Balance   Acct balance    21               26             Num
Rate      Interest rate   27               30             Num
Create a temporary SAS data set called Bank using this data file. Use column input to specify the location of each value. Include in this data set a variable called Interest computed by multiplying Balance by Rate. List the contents of this data set using PROC PRINT. Here is a listing of the text file:
File bankdata.txt
Philip Jones V1234 4322.32
Nathan Philips V1399 15202.45
Shu Lu W8892 451233.45
Betty Boop V7677 50002.78
data bank;
infile 'R:\272assig\BANKDATA.TXT' pad;
input NAME $ 1-15
ACCOUNT $ 16-20
BALANCE 21-26
RATE 27-30;
INTEREST=BALANCE*RATE;
format Balance Interest dollar10.2;
run;
proc print data=bank noobs;
run;
8. Repeat Problem 6 using formatted input to read the data values instead of column input.
data bank;
infile 'R:\272assig\BANKDATA.TXT' pad;
input @1 NAME $15.
@16 ACCOUNT $5.
@21 BALANCE 6.
@27 RATE 4.;
INTEREST=BALANCE*RATE;
format Balance Interest dollar10.2;
run;
proc print data=bank noobs;
run;
10. You are given a text file called stockprices.txt containing information on the purchase and sale of stocks. The data layout is as follows:
Variable   Description       Starting Column  Length  Type
Stock      Stock symbol      1                4       Char
PurDate    Purchase date     5                10      mm/dd/yyyy
PurPrice   Purchase price    15               6       Dollar signs and commas
Number     Number of shares  21               4       Num
SellDate   Selling date      25               10      mm/dd/yyyy
SellPrice  Selling price     35               6       Dollar signs and commas
A listing of the data file is:
File stockprices.txt
IBM 5/21/2006 $80.0 10007/20/2006 $88.5
CSCO04/05/2005 $17.5 20009/21/2005 $23.6
MOT 03/01/2004 $14.7 50010/10/2006 $19.9
XMSR04/15/2006 $28.4 20004/15/2007 $12.7
BBY 02/15/2005 $45.2 10009/09/2006 $56.8
Create a SAS data set (call it Stocks) by reading the data from this file. Use formatted input.
Compute several new variables as follows:
Variable   Description           Computation
TotalPur   Total purchase price  Number times PurPrice
TotalSell  Total selling price   Number times SellPrice
Profit     Profit                TotalSell minus TotalPur
Print out the contents of this data set using PROC PRINT.
data stocks;
infile 'R:\272assig\stockprices.txt' pad;
input @1 Stock $4.
@5 PurDate mmddyy10.
@15 PurPrice dollar6.
@21 Number 4.
@25 SellDate mmddyy10.
@35 SellPrice dollar6.;
TotalPur=Number*PurPrice;
TotalSell=Number*SellPrice;
Profit=TotalSell-TotalPur;
format PurPrice SellPrice TotalPur TotalSell Profit dollar10. PurDate SellDate mmddyy10.;
run;
title 'Listing of stocks';
proc print data=stocks noobs;
run;
11. You have a CSV file called employee.csv. This file contains the following information:
Variable  Description    Desired Informat
ID        Employee ID    $3.
Name      Employee name  $20.
Depart    Department     $8.
DateHire  Hire date      MMDDYY10.
Salary    Yearly salary  DOLLAR8.
Use list input to read data from this file. You will need an informat to read most of these values correctly (i.e., DateHire needs a date informat). You can do this in either of two ways. First is to include an INFORMAT statement to associate each variable with the appropriate informat. The other is to use the colon modifier and supply the informats directly in the INPUT statement. Create a temporary SAS data set (Employ) from this data file. Use PROC PRINT to list the observations in your data set and the appropriate procedure to compute frequencies for the variable Depart.
A listing of the raw data file is:
File employee.csv
123,"Harold Wilson",Acct,01/15/1989,$78,123.
128,"Julia Child",Food,08/29/1988,$89,123
007,"James Bond",Security,02/01/2000,$82,100
828,"Roger Doger",Acct,08/15/1999,$39,100
900,"Earl Davenport",Food,09/09/1989,$45,399
906,"James Swindler",Acct,12/21/1978,$78,200
data employ;
informat ID $3.
Name $20.
Depart $8.
DateHire mmddyy10.
Salary dollar8.;
infile 'R:\272assig\employee.csv' dsd;
input ID
Name
Depart
DateHire
Salary;
format DateHire date9. Salary dollar8.;
run;
title 'Listing of EMPLOYEES';
proc print data=employ noobs;
run;
title 'Frequencies for Depart';
proc freq data=employ;
tables Depart;
run;
Chapter 4
1. Run the program here to create a permanent SAS data set called Perm. You will need to modify the program to specify a folder where you want to place this data set. Run PROC CONTENTS on this data set and then use the SAS Explorer to investigate the properties of this data set as well.
libname learn 'c:\your-folder-name';
data learn.perm;
input ID : $3. Gender : $1. DOB : mmddyy10.
Height Weight;
label DOB = 'Date of Birth'
Height = 'Height in inches'
Weight = 'Weight in pounds';
format DOB date9.;
datalines;
001 M 10/21/1946 68 150
002 F 5/26/1950 63 122
003 M 5/11/1981 72 175
004 M 7/4/1983 70 128
005 F 12/25/2005 30 40
;
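The problem also asks for PROC CONTENTS to be run on the permanent data set. A minimal sketch of that step follows; it assumes the libref learn is still assigned by the LIBNAME statement in the program above.

```sas
proc contents data=learn.perm;   /* lists variables, types, lengths, labels, and formats */
run;
```

The output should show the labels and the DATE9. format attached to DOB, which are the same properties visible through the SAS Explorer.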

STAT272 Assignment 1 XuXiaoying 200122865

⍟Chapter 2
4. What is wrong with this program?
001 data new-data; (invalid data set name: it contains a dash)
002 infile "prob4data.txt"; (the quotation marks were missing)
003 input x1 x2; (the semicolon was missing)
004 y1 = 3*(x1) + 2*(x2); (the * operators were missing)
005 y2 = x1 / (x2);
006 new_variable_from_x1_and_x2 = x1 + x2 - 37;
007 run;
Note: Line numbers are for reference only; they are not part of the program.
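With the points noted above applied, the program might look like the sketch below. The data set name new_data is an assumed replacement for the invalid name new-data; any valid SAS name would do.

```sas
data new_data;                 /* assumed name: underscore instead of the dash */
   infile "prob4data.txt";     /* file name enclosed in quotation marks */
   input x1 x2;                /* statement ends with a semicolon */
   y1 = 3*(x1) + 2*(x2);       /* explicit multiplication operators */
   y2 = x1 / (x2);
   new_variable_from_x1_and_x2 = x1 + x2 - 37;
run;
```

The long variable name on the last assignment is 27 characters, which is within the 32-character limit for SAS names, so it needs no change.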
