Monday, 2 June 2014

SAS_MergeVsModifyVsUpdate

Q) Describe the UPDATE and MODIFY statements.
Q) Compare MERGE vs UPDATE vs MODIFY.

Definition of Updating
Updating a SAS data set replaces the values of variables in a master data set with values from a transaction data set.
If the UPDATEMODE= option in the UPDATE statement is set to MISSINGCHECK (the default), missing values in the transaction data set do not replace existing values in the master data set.

We update a data set by using the UPDATE statement along with a BY statement. Both input data sets must be sorted by the variable used in the BY statement.
DATA master;
  UPDATE master transaction;
  BY fieldname;
RUN;
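
To make the MISSINGCHECK behaviour concrete, here is a minimal sketch. The data set contents and the variable names (id, name, salary) are invented for illustration and are not from any real system.

```sas
/* Hypothetical master data set: one observation per id */
data master;
  input id name $ salary;
  datalines;
1 Ann 50000
2 Bob 42000
3 Cy 61000
;
run;

/* Hypothetical transactions: a raise, a missing salary (.), and a new id */
data transaction;
  input id name $ salary;
  datalines;
2 Bob 45000
3 Cyrus .
4 Dee 38000
;
run;

/* Both inputs must be sorted by the BY variable */
proc sort data=master; by id; run;
proc sort data=transaction; by id; run;

data master;
  update master transaction;  /* UPDATEMODE=MISSINGCHECK is the default */
  by id;
run;

proc print data=master; run;
```

With the default setting, id 3 should get the new name but keep its salary of 61000, because the missing value in the transaction is ignored; id 4 is appended as a new observation. Adding UPDATEMODE=NOMISSINGCHECK to the UPDATE statement would instead set the salary for id 3 to missing.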

Definition of Modifying

Modifying a SAS data set replaces, deletes, or appends observations in an existing data set. Modifying is similar to updating, but some differences exist.
The MODIFY statement controls the update process through the REPLACE, REMOVE, and OUTPUT statements.
When you use the MODIFY statement, there is an implied REPLACE statement at the bottom of the DATA step instead of an implied OUTPUT statement.
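
As a rough sketch of how explicit REPLACE and REMOVE statements steer the process, consider the step below. The data set work.staff and its values are hypothetical.

```sas
/* Hypothetical data set to modify in place */
data work.staff;
  input id salary;
  datalines;
1 50000
2 0
3 61000
;
run;

data work.staff;
  modify work.staff;
  if salary = 0 then remove;      /* delete the current observation */
  else do;
    salary = salary * 1.05;
    replace;                      /* rewrite the current observation in place */
  end;
run;
```

Once a DATA step contains an explicit REPLACE, REMOVE, or OUTPUT statement, the implied REPLACE at the bottom of the step no longer happens, so each branch has to state what should be done with the observation.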

Using the MODIFY statement, we can update

every observation in a data set

DATA SAS-data-set;
MODIFY SAS-data-set;
existing-variable = expression;
RUN;

observations using a transaction data set and a BY statement

DATA SAS-data-set;
MODIFY SAS-data-set transaction-data-set;
BY key-variable;
RUN;

MODIFY master-data-set transaction-data-set
UPDATEMODE=MISSINGCHECK|NOMISSINGCHECK;

observations located using an index.

MODIFY SAS-data-set KEY=index-name;
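
The index-based form can be sketched as follows. The data sets work.stock and work.received, the variables partno, qty, and added, and the choice of a unique simple index are all assumptions made for this example.

```sas
/* Hypothetical master data set */
data work.stock;
  input partno qty;
  datalines;
101 10
102 5
103 0
;
run;

/* Create a simple index on the key variable */
proc datasets library=work nolist;
  modify stock;
  index create partno / unique;
quit;

/* Hypothetical transactions; they do not need to be sorted */
data work.received;
  input partno added;
  datalines;
103 20
101 5
999 1
;
run;

data work.stock;
  set work.received;               /* supplies the partno value to look up */
  modify work.stock key=partno;    /* direct access through the index */
  if _iorc_ = 0 then do;           /* 0 means a matching observation was found */
    qty = qty + added;
    replace;
  end;
  else _error_ = 0;                /* no match: clear the error flag and skip */
run;
```

The variable added exists only in the program data vector during the step; it is not written to work.stock, whose set of variables stays the same. Checking _IORC_ after each lookup matters because an unmatched key (999 here) otherwise sets _ERROR_ and writes a message to the log.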



Merge Vs Update Vs Modify

Criterion: Can create a new data set
MERGE: When we submit a DATA step to create a SAS data set that is also named in a MERGE, UPDATE, or SET statement, SAS creates a second copy of the input data set. Once execution is done, SAS deletes the original copy of the data set. As a result, the original data set is replaced by the new data set.
UPDATE: When we submit a DATA step to create a SAS data set that is also named in a MERGE, UPDATE, or SET statement, SAS creates a second copy of the input data set. Once execution is done, SAS deletes the original copy of the data set. As a result, the original data set is replaced by the new data set.
MODIFY: When we submit a DATA step to create a SAS data set that is also named in the MODIFY statement, SAS does not create a second copy of the data but instead updates the data set in place.

Criterion: Can create or delete variables
MERGE: Yes. The new data set can contain a different set of variables than the original data set, and the attributes of the variables in the new data set can be different from those of the original data set.
UPDATE: Yes. The new data set can contain a different set of variables than the original data set, and the attributes of the variables in the new data set can be different from those of the original data set.
MODIFY: No. Any variables can be added to the PDV (program data vector), but they are not written to the data set. So, the set of variables in the data set does not change when the data is modified.

Criterion: Data sets must be sorted or indexed
MERGE: Match-merge: Yes. One-to-one merge: No.
UPDATE: Yes
MODIFY: No

Criterion: BY values must be unique
MERGE: No
UPDATE: Master data set: Yes. Transaction data set: No.
MODIFY: No

Criterion: Number of data sets combined
MERGE: Any number
UPDATE: 2
MODIFY: 2

Criterion: Processing missing values
MERGE: Overwrites nonmissing values from the first data set with missing values from the second data set
UPDATE: Depends on the value of the UPDATEMODE= option
MODIFY: Depends on the value of the UPDATEMODE= option
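
The difference in missing-value handling between MERGE and UPDATE can be seen with a small sketch. The data sets and values here are hypothetical and are entered already in id order, so no sorting step is shown.

```sas
data master;
  input id salary;
  datalines;
1 50000
2 42000
;
run;

/* Transaction with a missing salary for id 2 */
data transaction;
  input id salary;
  datalines;
2 .
;
run;

/* MERGE: the missing value from the second data set overwrites 42000 */
data merged;
  merge master transaction;
  by id;
run;

/* UPDATE (default UPDATEMODE=MISSINGCHECK): 42000 is kept */
data updated;
  update master transaction;
  by id;
run;
```

In merged, the salary for id 2 should be missing, while in updated it should remain 42000. This is the main practical reason to prefer UPDATE when a transaction file carries only the fields that changed.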

 
