cluster-analysis  k-means  spss  statistics

Save Cluster Variables / Variable PSPP


I am using PSPP (not SPSS, since I can't get that running on my Ubuntu machine) to run a k-means cluster analysis on my set of ~100k records. What I really need is more detailed output than just how many records are in each cluster: I need the cluster variable saved, i.e.

row 1 => cluster 1

row 2 => cluster 4

row 3 => cluster 1

etc...

Essentially, I need an extra field that stores the cluster each record was assigned to. My current syntax is:

QUICK CLUSTER  cat1 cat2 cat3 cat4 cat5 cat6 cat7 cat8 cat9 cat10 cat11 cat12
/CRITERIA=CLUSTERS(12) MXITER(100000000).

SPSS and PSPP share a lot of the same syntax, so if there is an option for this in SPSS, it might work here too.


Solution

  • SPSS Statistics should run on Ubuntu. Either way, the Statistics QUICK CLUSTER command has a subcommand

    /SAVE CLUSTER

    that should do what you want: it saves each record's cluster membership as a new variable. You can optionally specify a name for that variable in parentheses after CLUSTER.
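
    Applied to the syntax from the question, this might look like the sketch below. The variable name cluster_id is only an illustrative choice, and it is an assumption that your PSPP version accepts the same /SAVE subcommand as SPSS Statistics; check the QUICK CLUSTER section of the manual for your installed version.

    ```
    * Sketch: k-means with 12 clusters, saving each record's cluster number.
    * cluster_id is a hypothetical name for the new membership variable.
    QUICK CLUSTER cat1 cat2 cat3 cat4 cat5 cat6 cat7 cat8 cat9 cat10 cat11 cat12
      /CRITERIA=CLUSTERS(12) MXITER(100000000)
      /SAVE CLUSTER (cluster_id).
    ```

    If the subcommand is accepted, the dataset should gain one extra column holding a cluster number for every row, which is the row-to-cluster mapping described in the question.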