Tuesday, 5 August 2014

SAS_ProcSort

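Q) What is a method for assigning FIRST.VAR and LAST.VAR to the BY group variable on unsorted data?
A) FIRST. and LAST. variables exist only when a BY statement is present, and a plain BY statement requires the data to be sorted by the BY variables. On unsorted data, either run PROC SORT first or, if the observations are already grouped (all rows for a value are adjacent), add the NOTSORTED option to the BY statement.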
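
When the data is grouped but not in sorted order, the NOTSORTED option of the BY statement makes FIRST. and LAST. available without a sort. Below is a minimal sketch; the data set names visits and flagged and the variables dept and amount are made up for illustration.

```sas
/* Hypothetical data: rows for each dept are adjacent, but the groups are not in sorted order */
data visits;
  input dept $ amount;
datalines;
B 10
B 20
A 5
A 7
C 1
;
run;

data flagged;
  set visits;
  by dept notsorted;            /* groups must be contiguous, not sorted */
  first_in_group = first.dept;  /* 1 on the first row of each group */
  last_in_group  = last.dept;   /* 1 on the last row of each group */
run;

proc print data=flagged;
run;
```

Without NOTSORTED, the same DATA step would stop with an error as soon as it reads A after B, because the BY variable is not in ascending order.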

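Q) How does SAS handle missing values in: assignment statements, functions, a merge, an update, sort order, formats, PROCs?
A) In an assignment statement, a missing value is assigned as missing, and an arithmetic expression with a missing operand returns missing. Statistical functions such as SUM and MEAN ignore missing arguments. In a MERGE, a missing value from a later data set overwrites the value from an earlier one; in an UPDATE, a missing value in the transaction data set leaves the master value unchanged by default. In sort order, the ordinary numeric missing value (.) is the second smallest value: it sorts after ._ (underscore) and before .A through .Z and all numbers.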
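
The assignment and sort-order behaviour can be checked with a short sketch. The data set names miss and order and the variable names are invented for this example.

```sas
/* Missing values in an assignment: operator versus function */
data miss;
  a = 5;
  b = .;
  total_op = a + b;      /* arithmetic operator: result is missing */
  total_fn = sum(a, b);  /* SUM function ignores missing: result is 5 */
run;

proc print data=miss;
run;

/* Sort order of numeric missing values */
data order;
  v = 3;  output;
  v = .;  output;
  v = .A; output;
  v = ._; output;
  v = -1; output;
run;

proc sort data=order;
  by v;
run;

/* Ascending order after the sort: ._  .  .A  -1  3 */
proc print data=order;
run;
```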

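Q) What is the difference between the PROC SORT NODUPKEY and NODUP options?
A) NODUP (alias NODUPRECS): removes duplicate records (all fields are the same) with the same sort key.
NODUPKEY: removes records with the same sort key even if the other fields are not the same, keeping the first record for each key.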

NODUPRECS
Checks for and eliminates duplicate observations. If you specify this option, then PROC SORT compares all variable values for each observation to those for the previous observation that was written to the output data set. If an exact match is found, then the observation is not written to the output data set.

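NODUP/NODUPRECS

Case: duplicate records are not adjacent. No records are removed, because each observation is compared only with the one written just before it.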

 

data one;
  input X $ Y Z;
datalines;
R 3 9
R 3 3
R 3 9
R 3 1
R 3 9
;
run;

/* No OUT= option, so the sorted result replaces data set one */
proc sort nodup data=one;
  by X Y;
run;

proc print data=one;
run;

Output (no records are removed):

Obs    X    Y    Z
 1     R     3       9
 2     R     3       3
 3     R     3       9
 4     R     3       1
 5     R     3       9

 

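NODUP/NODUPRECS

Duplicates are removed in two cases:
Case 1: duplicate records are adjacent.
Case 2: duplicate records are not adjacent, but the data set is sorted on all variables.

The program for each case follows, then the output.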

/* Case 1: duplicate records are adjacent */
data one;
  input X $ Y Z;
datalines;
R 3 9
R 3 9
R 3 9
R 3 3
R 3 1
;
run;

proc sort nodup data=one;
  by X Y;
run;

proc print data=one;
run;

/* Case 2: duplicate records are not adjacent, but the sort uses all variables */
data one;
  input X $ Y Z;
datalines;
R 3 9
R 3 3
R 3 9
R 3 1
R 3 9
;
run;

proc sort nodup data=one;
  by X Y Z; /* same as BY _ALL_ */
run;

proc print data=one;
run;

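Output of the first program (BY X Y). The second program (BY X Y Z) keeps the same three records but orders them by Z: 1, 3, 9.

Obs    X    Y    Z
 1     R     3       9
 2     R     3       3
 3     R     3       1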


 

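NODUPKEY

Only the first record for each BY value is kept, whether the duplicate records are adjacent (first program) or not adjacent (second program). Both programs give the same output, shown after the code.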

/* Duplicate records are adjacent */
data one;
  input X $ Y Z;
datalines;
R 3 9
R 3 9
R 3 9
R 3 3
R 3 1
;
run;

proc sort nodupkey data=one;
  by X Y;
run;

proc print data=one;
run;

/* Duplicate records are not adjacent */
data one;
  input X $ Y Z;
datalines;
R 3 9
R 3 3
R 3 9
R 3 1
R 3 9
;
run;

proc sort nodupkey data=one;
  by X Y;
run;

proc print data=one;
run;

Obs    X    Y    Z
 1     R     3       9
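
The NODUPKEY result can also be reproduced with a plain sort followed by a DATA step that keeps the first observation of each BY group. This is a sketch of an equivalent approach, not part of the examples above; the data set names sorted and firstkey are made up.

```sas
data one;
  input X $ Y Z;
datalines;
R 3 9
R 3 3
R 3 9
R 3 1
R 3 9
;
run;

proc sort data=one out=sorted;  /* OUT= leaves data set one unchanged */
  by X Y;
run;

data firstkey;
  set sorted;
  by X Y;
  if first.Y;                   /* keep only the first observation in each X, Y group */
run;

proc print data=firstkey;
run;
```

Because PROC SORT by default keeps observations with equal BY values in their original relative order, the observation kept here is the same one NODUPKEY keeps.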
 

 

 
