I have a dataset in SPSS, see example dataset below. This is just an example, the real one is provided by a separate external process and has more columns and rows. The empty values are set as " "
in the example but this is also how empty values are provided in SPSS, it's treated internally as null/empty/missing values.
data list list/FieldNam(a20) FormName(a20) FieldType(a20) Choices(a50) Required(F1) Identifier(a1) Minimum(f8) Maximum(f8).
begin data
"Field 1" "Form abc" "text" " " 1 "y" " " " "
"Field 2" "Form abc" "datetime" " " 1 "y" " " " "
"Field 3" "Form xyz" "radio" "0=never | 1=sometimes | 2=often | 3=always" " " " " " " " "
"Field 4" "Form xyz" "text" " " " " " " "1" "100"
"Field 5" "Form xyz" "radio" "0=no | 1=yes" " " " " " " " "
end data.
Then I use the following syntax to save it as a CSV text file.
SAVE TRANSLATE
/TYPE = CSV
/FIELDNAMES
/TEXTOPTIONS DELIMITER=',' QUALIFIER='"'
/OUTFILE = 'C:\Temp\my_csv_file.csv'
/ENCODING='Windows-1252'
/REPLACE.
And the resulting CSV file contains the following, with single spaces for the empty values
FieldNam,FormName,FieldType,Choices,Required,Identifier,Minimum,Maximum
Field 1,Form abc,text, ,1,y, ,
Field 2,Form abc,datetime, ,1,y, ,
Field 3,Form xyz,radio,0=never | 1=sometimes | 2=often | 3=always, , , ,
Field 4,Form xyz,text, , , ,1,100
Field 5,Form xyz,radio,0=no | 1=yes, , , ,
However, I would like the empty values to just be empty, like so:
FieldNam,FormName,FieldType,Choices,Required,Identifier,Minimum,Maximum
Field 1,Form abc,text,,1,y,,
Field 2,Form abc,datetime,,1,y,,
Field 3,Form xyz,radio,0=never | 1=sometimes | 2=often | 3=always,,,,
Field 4,Form xyz,text,,,,1,100
Field 5,Form xyz,radio,0=no | 1=yes,,,,
So my question is, is it possible to export the SPSS dataset like this?
The exported csv file will be used as input for another system, and it cannot handle the , ,
empty values. I know I can open it in Notepad and just do search-and-replace after the fact. But I want to automate it as much as possible because the export will be used more often, so this would save a lot of work.
Information from this page suggests one can invoke a script: https://www.ibm.com/docs/en/spss-statistics/23.0.0?topic=reference-script
SCRIPT
SCRIPT
runs a script to customize the program or automate regularly performed tasks. You can run a Basic script or a Python script.
SCRIPT 'filename' [(quoted string)]
This command takes effect immediately. It does not read the active dataset or execute pending transformations. See the topic Command Order for more information.
Release History
Release 16.0
Scripts run from the
SCRIPT
command now run synchronously with the command syntax stream.Release 17.0
Ability to run Python scripts introduced.
Example Python script to invoke after each export for release 17.0 or higher:
import fileinput
import os
filename = 'C:\Temp\my_csv_file.csv'
postfix = '.bak'
with fileinput.FileInput(filename, inplace=True, backup=postfix) as file:
for line in file:
print(line.replace(', ', ',').replace(' ,', ','), end='')
try:
os.remove(filename + postfix)
except FileNotFoundError as e:
pass
The script performs a simple search and replace. I've included code to automatically remove the temporary backup file even though the Python manual states it automatically removes the file. For me it consistently does not at the moment (thus the manual removing of the file). But you may remove that specific code if it works without it for you.
Of course you could also use Python's csv
module and iterate the rows and write it back to another csv, etc. See the documentation for that one here: https://docs.python.org/3/library/csv.html