Q)Name some options used with INFILE and INPUT
statement.
INFILE
Options
DSD DLM
Missover/Truncover/PAD
INPUT Options
@ and @@
Q)What
are the special input delimiters?
The input
delimiters are DLM and DSD.
Q)Reading Data from External file. Explain List Vs
column Input?
List Input
·
Reading values separated by blanks
Infile ‘C:\...’;
Input Gender $ Age Height Weight;
·
Reading values separated by commas
Infile ‘C:\...’ dsd;
Input Gender $ Age Height Weight;
DSD:
First, it changes the default delimiter from a blank to a comma. Next, if
there are two delimiters in a row, it assumes there is a missing value between.
Finally, if character values are placed in quotes (single or double quotes),
the quotes are stripped from the value.
·
Reading values separated by delimiter other
than commas/blanks
Infile ‘C:\...’ dsd dlm=‘:’;
Column Input
·
infile
'c:\books\learning\bank.txt';
input Subj $ 1-3 DOB $ 4-13 Gender $ 14 Balance 15-21;
·
infile
'c:\books\learning\bank.txt';
input @1 Subj $3. @4 DOB
mmddyy10. @14 Gender $1. @15 Balance 7.;
Using Informats with List Input
·
data list_example;
infile 'c:\books\learning\list.csv' dsd;
input
Subj : $3.
Name : $20.
DOB : mmddyy10.
Salary : dollar8.;
run;
·
colon (called an informat modifier) tells SAS to use the informat supplied but
to stop reading the value for this variable when a delimiter is encountered.
&(Informat modifier) can be used like colon but the delimiter is now 2 or
more blanks instead of just one
Q)Explain Flowover In Case of Column Input
Q)Explain flaws while reading a file with Flowover(default option) in case of
Column Input.
When the INPUT stmt reaches the end of non blank chars
without having completely filled all variables,
a new line is read into the Input buffer for each unfilled variable and
INPUT attempts to fill the unfilled/partially filled variable starting from
col one(in case of patially filled variable-new value will overwrite it)
In the below example
1)Record2-JobDesc=Driver..It is partially filled.So
next line is read into Input buffer and jobDesc is filled with new value
starting from col 1.
2)Record5-Empid and JobDesc ,both are empty.So for Empid a new line is
read into Inputbuffer which sets EMPID=WILLY.
Then a newline is read for empty jobdesc which sets JOBDESC=RAYAT
INPUT
Lavender |
Debby |
ABB11 |
Technician |
WOODS |
GUNILA |
ABB15 |
Driver |
LEHMAN |
DERRYL |
ACC21 |
|
PLANT |
THOMAS |
ACC25 |
Technician |
GUPTA |
RAMAN |
|
|
WILLY |
TORY |
ASS21 |
Plumber |
RAYAT |
SATNAM |
ASS25 |
Pilot |
PROGRAM
DATA test;
INFILE "c:\a.txt";
INPUT Surname $1-22 Firstn $ 23-32
Empid $33-37 Jobdesc $38-47;
RUN;
proc print data = test;
run;
OUTPUT
Lavender |
Debby |
ABB11 |
Technician |
WOODS |
GUNILA |
ABB15 |
Lehman |
PLANT |
THOMAS |
ACC25 |
Technician |
GUPTA |
RAMAN |
WILLY |
RAYAT |
Q)Explain
about INFILE options Flowover/Missover/Truncover
FLOWOVER
FLOWOVER is the
default option on INFILE statement. Here, when the INPUT statement reaches the
end of non-blank characters without having filled all variables, a new line is
read into the Input Buffer and INPUT attempts to fill the rest of the variables
starting from column one. The next time an INPUT statement is executed, a new
line is brought into the Input Buffer.
MISSOVER
When INPUT
reads a short line, MISSOVER option on INFILE statement does not allow it to move
to the next line. MISSOVER option sets all the variables without values to
missing.
TRUNCOVER
Causes the
INPUT statement to read variable-length records where some records are shorter
than the INPUT statement expects. Variables which are not assigned values are
set to missing.
Difference
between TRUNCOVER and MISSOVER
Both will assign missing values to variables if the data line ends before the
variable’s field starts. But when the data line ends in the middle of a
variable field, TRUNCOVER will take as much as is there, whereas MISSOVER will
assign the variable a missing value.
If reading a
variable length file with fixed input, how would you prevent SAS from reading
the next record if the last variable didn't have a value?
By using the option MISSOVER in the infile statement.If the input of some data
lines are shorter than others then we use TRUNCOVER option in the infile
statement
List Input |
Column Input |
|
Flowover |
Will not work
properly in some cases |
Will not work
properly in some cases |
Missover |
ok |
Will not work
properly in some cases |
Truncover |
ok |
ok |
PAD |
Will not work
properly in some cases |
ok |
Q)What will be
the output when flowover,missover and truncover is used with below dataset
For example, an external file with
variable-length records contains these records:
1
22
333
4444
55555
The following DATA step reads this data
to create a SAS data set. Only one of the input records is as long as the
informatted length of the variable TESTNUM.
infile 'external-file';
input testnum 5.;
run;
This DATA step creates the three observations from the
five input records because by default the FLOWOVER option is used to read the
input records.
If you use the MISSOVER option in the INFILE statement, then the DATA step
creates five observations.All the values that were read from
records that were too short are set to missing. Use the TRUNCOVER option
in the INFILE statement if you prefer to see what values were present in
records that were too short to satisfy the current INPUT statement.
infile 'external-file' truncover;
OBS |
FLOWOVER |
MISSOVER |
TRUNCOVER |
|
|
|
|
1 |
22 |
. |
1 |
2 |
4444 |
. |
22 |
3 |
55555 |
. |
333 |
4 |
|
. |
4444 |
5 |
55555 |
55555 |
Q)What SAS statements would you code to
read an external raw data file to a DATA step?
INFILE and INPUT statement.
Q)What statement you code to tell SAS that
it is to write to an external file?What statement do you code to write the
record to the file?
PUT and FILE statements.
Q)If reading an external file to produce an
external file, what shortcut to write record without coding every single
variable on the record?
Ans: Put _all _
Q)If you're not wanting any SAS output
from a data step, how would you code the data statement to prevent SAS from
producing a set?
Data _Null_
Q)How do you read in the variables that
you need?
Using Input statement with the column pointers like @5/12-17 etc.
Q)Comparison between trailing @ and
subsetting IF
When All variables are different then @ and IF may produce
same result
Subsetting
IF |
Trailing
@ |
Output |
data temp; input
Gender$ Name$ Age; if Gender
= 'F'; datalines; M
RAM 23 F
MEENA 30 M
SHAM 42 F SHEENA
48 F
LEENA 55; proc print
data=temp; run; |
data temp; input
Gender$ @; if Gender
= 'F'; input
Name$ Age; datalines; M RAM 23 F MEENA 30 M SHAM 42 F SHEENA 48 F LEENA 55; proc print
data=temp; run; |
Obs
Gender Name Age 1 F MEENA 30 2 F SHEENA
48 3 F LEENA 55 |
In case IF stmt total all 15 variables(5
obs * 3 variables)are read to produce output. |
In case of @ 1)First All gender variable are read=5 variable 2)Gender=’F’ in
3 obs. So 3*2=6 variable are read 3)Total 5+6=11 variable are read to produce the
output |
|
When All variables are not different then @ and IF produce
different result
Without
Trailing @ |
Output |
data temp; input @1
Gender $1. ; if Gender = 'F'
then input @3 Name $6. @10
Age 2.; else if Gender =
'M' then input @3 Name $6. @13
Age 2.; datalines; M RAM
23 F MEENA
30 M SHAM
42 F SHEENA 48
F LEENA 55; proc print
data=temp; run; |
Obs
Gender Name Age 1 M MEENA . 2 M SHEENA . This is
not expected result!!! |
With
Trailing @ |
Output |
data temp; input @1
Gender $1. @; if Gender = 'F'
then input @3 Name $6. @10
Age 2.; else if Gender =
'M' then input @3 Name $6. @13
Age 2.; datalines; M
RAM 23 F
MEENA 30 M
SHAM 42 F SHEENA
48 F
LEENA 55; proc print
data=temp; run; |
Obs
Gender Name Age 1 M RAM
23 2 F MEENA
30 3 M SHAM
42 4 F SHEENA
48 5 F LEENA
55 |
Q)How would you
create multiple observations from a single observation?
Q)What is the difference between @
and @@?
Trailing
@: Testing a Condition
before Creating an Observation
· By using @ without specifying a column, SAS will hold the line of data until it reaches either the end of the data step or an INPUT statement that does not end with the trailing@.
·
The trailing @ is also known as column pointer
By using the trailing@, in the INPUT statement
gives you ability to read a part of your raw data line and hold it in
the input buffer,while you t test for some specified condition, and then decide
how to read additional data from the
same record.
Double Trailing @@:Creating Multiple Observations from a Single Record
·
The double trailing @ not only prevents SAS from
reading a new record into the input buffer when a new INPUT statement is
encountered, but it also prevents the record from being released when the
program returns to the top of the DATA step.
The trailing @ does not hold a record in the input buffer across iterations of
the DATA step. @@ hold a record across iterations of data step.
SAS will hold the line of data until it reaches either the end of theline or an
INPUT statement that does not end with the trailing@@.