Search code examples
csvexportspss

Export to CSV using SAVE TRANSLATE but empty values are exported as a single space


I have a dataset in SPSS, see example dataset below. This is just an example, the real one is provided by a separate external process and has more columns and rows. The empty values are set as " " in the example but this is also how empty values are provided in SPSS, it's treated internally as null/empty/missing values.

data list list/FieldNam(a20) FormName(a20) FieldType(a20) Choices(a50) Required(F1) Identifier(a1) Minimum(f8) Maximum(f8).
begin data
"Field 1" "Form abc" "text" " " 1 "y" " " " "
"Field 2" "Form abc" "datetime" " " 1 "y" " " " "
"Field 3" "Form xyz" "radio" "0=never | 1=sometimes | 2=often | 3=always" " " " " " " " "
"Field 4" "Form xyz" "text" " " " " " " "1" "100"
"Field 5" "Form xyz" "radio" "0=no | 1=yes" " " " " " " " "
end data.

Then I use the following syntax to save it as a CSV text file.

SAVE TRANSLATE
/TYPE = CSV
/FIELDNAMES
/TEXTOPTIONS DELIMITER=',' QUALIFIER='"'
/OUTFILE = 'C:\Temp\my_csv_file.csv'
/ENCODING='Windows-1252'
/REPLACE.

And the resulting CSV file contains the following, with single spaces for the empty values

FieldNam,FormName,FieldType,Choices,Required,Identifier,Minimum,Maximum
Field 1,Form abc,text, ,1,y, , 
Field 2,Form abc,datetime, ,1,y, , 
Field 3,Form xyz,radio,0=never | 1=sometimes | 2=often | 3=always, , , , 
Field 4,Form xyz,text, , , ,1,100
Field 5,Form xyz,radio,0=no | 1=yes, , , , 

However, I would like the empty values to just be empty, like so:

FieldNam,FormName,FieldType,Choices,Required,Identifier,Minimum,Maximum
Field 1,Form abc,text,,1,y,,
Field 2,Form abc,datetime,,1,y,,
Field 3,Form xyz,radio,0=never | 1=sometimes | 2=often | 3=always,,,,
Field 4,Form xyz,text,,,,1,100
Field 5,Form xyz,radio,0=no | 1=yes,,,,

So my question is, is it possible to export the SPSS dataset like this?

The exported csv file will be used as input for another system, and it cannot handle the , , empty values. I know I can open it in Notepad and just do search-and-replace after the fact. But I want to automate it as much as possible because the export will be used more often, so this would save a lot of work.


Solution

  • Information from this page suggests one can invoke a script: https://www.ibm.com/docs/en/spss-statistics/23.0.0?topic=reference-script

    SCRIPT

    SCRIPT runs a script to customize the program or automate regularly performed tasks. You can run a Basic script or a Python script.

    SCRIPT 'filename' [(quoted string)]

    This command takes effect immediately. It does not read the active dataset or execute pending transformations. See the topic Command Order for more information.

    Release History

    Release 16.0

    Scripts run from the SCRIPT command now run synchronously with the command syntax stream.

    Release 17.0

    Ability to run Python scripts introduced.

    Example Python script to invoke after each export for release 17.0 or higher:

    import fileinput
    import os
    filename = 'C:\Temp\my_csv_file.csv'
    postfix = '.bak'
    with fileinput.FileInput(filename, inplace=True, backup=postfix) as file:
        for line in file:
            print(line.replace(', ', ',').replace(' ,', ','), end='') 
    try:
        os.remove(filename + postfix)
    except FileNotFoundError as e:
        pass
    

    The script performs a simple search and replace. I've included code to automatically remove the temporary backup file even though the Python manual states it automatically removes the file. For me it consistently does not at the moment (thus the manual removing of the file). But you may remove that specific code if it works without it for you.

    Of course you could also use Python's csv module and iterate the rows and write it back to another csv, etc. See the documentation for that one here: https://docs.python.org/3/library/csv.html