Statistical Analysis System (SAS) Programming Certification Practice Exam

Disable ads (and more) with a membership for a one time $2.99 payment

Prepare for the SAS Programming Certification. Study with our mock exam featuring multiple choice questions and detailed explanations. Ace your test!

Each practice test/flash card set has 50 randomly selected questions from a bank of over 500. You'll get a new set of questions each time!

Practice this question and more.


Which statement is true when using the BY statement with the SET statement?

  1. The data sets listed in the SET statement must be indexed or sorted.

  2. The DATA step automatically creates two variables, FIRST. and LAST., for each variable in the BY statement.

  3. FIRST. and LAST. identify the first and last observation in each BY group.

  4. FIRST. and LAST. are stored in the data set.

The correct answer is: The data sets listed in the SET statement must be indexed or sorted.

The correct answer highlights a fundamental aspect of how the BY statement functions in conjunction with the SET statement within a DATA step in SAS. When using a BY statement, it is necessary for the datasets referred to in the SET statement to be sorted or indexed by the same variables that are being used in the BY statement. This sorting or indexing ensures that SAS processes observations in the correct order, allowing it to correctly identify and manage the grouping of data within the specified BY groups. If the datasets are not sorted or indexed properly, the results can be inaccurate or unexpected since SAS may not correctly recognize which records belong to the same group. This accurate identification of groups enables the effective processing of data, analysis, and reporting. In contrast, while FIRST. and LAST. are indeed created to signify the first and last observations within each BY group, they are not automatically generated for every variable listed in the BY statement; rather, they pertain specifically to the groups defined by the BY variables. It is also true that FIRST. and LAST. identify the boundaries of these groups, but they are temporary while the data step is running and do not persist in the output dataset unless explicitly retained or saved, thus making the other statements less accurate in the context of the question