SAS Made Simple: How to prevent rounding errors when using SAS’s Proc SQL

Join us as we look at how to prevent rounding errors when using SAS’s Proc SQL; leveraging the default formatting of the SELECT INTO statement.

I recently encountered  some peculiar behaviour when using the “select into” function in SAS’s proc sql. It was rounding very large numeric values in macro variables to the nearest 5 significant figures.

On the project I was on, we had a “batches” table that kept track of datasets passing through our pipeline. In this batches table we kept record of the size of these datasets, however when we were processing very large datasets, the value was being rounded, e.g. a dataset with 123,456,789 rows was being rounded to 123,460,000.

After some extensive googling and digging into an 18 year old forum post (shout out to whoever Howard Schreier is), I discovered the reason behind this bug. So I thought it may be useful putting out something for anyone else who finds themself in the same boat.

Problem

In my pipeline there was a simple step which put the count of rows into a macro variable, and then added that macro variable to the dataset keeping track of processing:

/* Count number of rows into macro variable "nrows" */ 

proc sql noprint;

select count(RecordID) into :nRows

from work.dataset;

quit;

/* Add nRows to batches table */

data work.batches(drop=distinctFields );

set work.batches;

numberOfRows = &nRows.;

run;



I discovered that proc sql uses “best8” format by default when handling numeric values, which means it uses a maximum of 8 charcters. So when there are numbers longer than 8 digits, it converts the number into scientific notation to fit into the format, e.g. 1.2346E8.

So to expand our previous example:

  • There are 123,456,789 rows in our dataset.

  • When proc sql assigns this row count to a macro variable, nRows, it applies the BEST8. format by default.

  • Because the number has more than 8 digits, proc sql formats it as 1.2346E8 (scientific notation) to fit it within the 8-character limit.

  • The macro variable “nRows” is put into a field in the “batches” table as a numeric value, so it takes 1.2346E8 and converts into 123,460,000, resulting in our bug!

Solution

To get round this, you can explicitly specify the format when using the select into statement in proc sql simply as described below. So instead of the format being “best8”, I’ve specified the format to a larger width of “15.0”, meaning the proc step no longer converts the number to scientific notation.

/* Count number of rows into macro variable "nrows" */ 

proc sql noprint;

select count(RecordID) format 15.0 into :nRows

from work.dataset;

quit;

/* Add nRows to batches table */

data work.batches(drop=distinctFields );

set work.batches;

numberOfRows = &nRows.;

run;

I hope one day this helps someone as befuddled with this odd SAS bug as me, if it does, let me know!

Previous
Previous

Data Privacy Day 2025: Building Trust Through a Societal Compact for Health Data

Next
Next

SAS Fans 2024 - Understanding Analogies in Data - Part 2