Topic 1D – Importing Data Manually
This skill may be particularly useful if you are using UFApps and having difficulty importing data via the import wizard.
SAS Code Needed
Here is an image of the basic code we will use:
- We are constructing a DATA step, so we begin with DATA and the name of the dataset to be created. Here I am storing the resulting SAS dataset in my “bio” library under the name “testcsv” (written bio.testcsv in the code).
- Then we have an INFILE statement. Writing INFILE DATALINES tells SAS to read the data from the DATALINES area we will give below instead of from an external file. The DSD and DLM="," options tell SAS that the values are separated by commas; DSD also treats two consecutive commas as a missing value and removes quotation marks around quoted values.
- Then we have an INPUT statement, which lists the names of the variables to be read from the datalines area, in the order the values appear on each line.
- Here we have AGE and SMOKING (SAS is not case sensitive here; I have them in lower case in the SAS code image above).
- Since smoking is stored in the dataset as actual text instead of a coded number, we need to put the $ after the variable name SMOKING to indicate that SAS should expect character data.
- The $ will not be needed when categorical variables are provided in the raw data as coded numbers just as it is not used here for the variable AGE.
- Then we have the DATALINES statement on a line by itself, ending with a semicolon.
- Followed by our full dataset – here you would copy the data only from the CSV file you wish to import
- The variable names should not appear here, only in the INPUT statement above.
- Your code will not contain the empty lines with periods; in the image they only stand in for the rest of the data so that the image stays short. See the full SAS code: Topic1D.sas
- Notice that there are no semicolons at the end of the lines in the data area. The semicolon that ends the data area should be on its own line AFTER the last line of data. If you put the semicolon on the last line of data, that line will not be read (and you may even get an error).
- Finally we close the DATA step with a RUN statement.
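The steps above can be put together as a short sketch. The library name bio, the dataset name testcsv, and the variables age and smoking come from the description; the three data rows are made-up placeholders, not values from the course file, and the sketch assumes the bio library has already been assigned. See Topic1D.sas for the actual code.

```sas
/* Minimal sketch of the DATA step described above.                    */
/* Assumes the bio library has already been assigned with LIBNAME.     */
/* The three data rows are hypothetical placeholders, not course data. */
data bio.testcsv;
   infile datalines dsd dlm=",";   /* read comma-separated values from DATALINES */
   input age smoking $;            /* age is numeric; $ marks smoking as character */
   datalines;
34,Never
51,Current
47,Former
;
run;
```

Comments are kept outside the data area because anything between DATALINES and the closing semicolon is read as data. With a plain $ in list input, a character variable gets a default length of 8, so longer text values would be truncated unless a length or informat is specified.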
When you submit the code your log file should report something like:
Here is a print of the first 5 observations:
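One way to produce such a print is PROC PRINT with the OBS= dataset option. This is a minimal sketch, assuming the dataset bio.testcsv was created by the DATA step in this topic; it is not necessarily the exact code used for the output here.

```sas
/* Print only the first 5 observations of the imported dataset. */
proc print data=bio.testcsv(obs=5);
run;
```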