SPSS: How to show custom data in an organized way


I imported data into IBM's SPSS related to the creation of obstacles in a video game, which adapt to the player's performance. I tested 30 players.

First, the Non Adaptive method creates 30 maps, each with a different obstacle configuration. Each user plays the maps, and the results regarding their performance and the obstacle configuration are stored.

Then, the Adaptive method asks the user to choose between "easy", "normal" or "hard" maps and creates 10 maps with that difficulty, which the user plays while his/her performance is evaluated. The game then asks 2 more times and repeats the process in order to complete the 30 maps of the Adaptive method.

Here's an example of the data imported into SPSS that is pertinent to my current problem. The example shows 1 player:

[Image: sample of the imported data for one player]

This is what the example shows:

Player 0 on the first 30 maps had poor performance so for the following 10 maps he chooses the easy difficulty. He performs better on those 10 maps so for the next 10 he chooses the hard difficulty. He performs average on those so for the final 10 he chooses normal difficulty.


The problem

I don't know exactly how to analyze the data, but I want to create a table, or a graph if possible, that shows whether there is a relationship between a user's previous average performance and their current chosen difficulty.

The table above doesn't work because, even though there are only 30 players, each one plays 60 maps. I also need to show the average performance or any additional info I can. Showing the individual performance for each map doesn't work.


What I've managed to do so far

This isn't even close to the solution I want. It's just me messing with the data trying to show something similar to what I want.

  • First I added a new variable called CreationMethod2, which transforms the Adaptive method into "Adaptive 1", "Adaptive 2" or "Adaptive 3" considering which map is being played.
  • Then I modified the data deleting the map results for every player except 1.
  • And lastly I created a Crosstabulation Table with ChosenDifficulty on rows, Performance on columns and CreationMethod2 on layer.

This is the table:

[Image: the resulting crosstabulation table]

As you can see, I now have an organized table that shows me the method on the left, and next to it, the difficulty that was selected for those 10 maps. And on the top I have the performance of the maps.

However, I would like to show only the average performance for each method. And obviously I want the analysis to work without me deleting any data.

I have no problem showing the results separately for each user, since 30 is a relatively low number. However, if there's an optimal way of showing the analysis with a graph that also includes all the users, then I'm all in.

Thanks for hearing me out.


Solution

  • It sounds like you want to aggregate your data to the PlayerId level. For each PlayerId we want to know the average Performance associated with the four methods (NonAdaptive, Adaptive1, Adaptive2, Adaptive3).

    To do this we can use AGGREGATE, but first let's create the variables we want to use for our aggregation.

    * Copy Performance into one variable per method (other rows stay missing).
    IF (CreationMethod2 = 'Non Adaptive') Performance_NA = Performance .
    IF (CreationMethod2 = 'Adaptive 1') Performance_A1 = Performance .
    IF (CreationMethod2 = 'Adaptive 2') Performance_A2 = Performance .
    IF (CreationMethod2 = 'Adaptive 3') Performance_A3 = Performance .
    
    * Do the same for the difficulty the player chose.
    IF (CreationMethod2 = 'Non Adaptive') ChosenDiff_NA = ChosenDifficulty .
    IF (CreationMethod2 = 'Adaptive 1') ChosenDiff_A1 = ChosenDifficulty .
    IF (CreationMethod2 = 'Adaptive 2') ChosenDiff_A2 = ChosenDifficulty .
    IF (CreationMethod2 = 'Adaptive 3') ChosenDiff_A3 = ChosenDifficulty .
    EXECUTE .
    

    We create these variables before aggregating because the Aggregate subcommands don't offer a conditional option for us to use directly on the Performance and ChosenDifficulty variables.
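
    One caveat, sketched under an assumption the question doesn't confirm: if ChosenDifficulty is stored as a string rather than as a labeled numeric code, IF cannot create the ChosenDiff_* targets on the fly the way it does for numeric ones, so they would need to be declared before the IF commands above. Blank strings also count as valid values, so marking them as missing should keep FIRST from picking up a blank that came from another method's rows. The width A8 is a guess; match it to ChosenDifficulty.

    * Assumption: ChosenDifficulty is a string of at most 8 characters.
    STRING ChosenDiff_NA ChosenDiff_A1 ChosenDiff_A2 ChosenDiff_A3 (A8) .
    * Treat blanks as missing so FIRST in AGGREGATE skips rows from other methods.
    MISSING VALUES ChosenDiff_NA ChosenDiff_A1 ChosenDiff_A2 ChosenDiff_A3 (' ') .

    If ChosenDifficulty is numeric with value labels, none of this should be needed.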

    AGGREGATE
     /OUTFILE = 'data by PlayerId.sav'
     /BREAK = PlayerId
     /Performance_NA = MEAN(Performance_NA)
     /Performance_A1 = MEAN(Performance_A1)
     /Performance_A2 = MEAN(Performance_A2)
     /Performance_A3 = MEAN(Performance_A3)
     /ChosenDiff_NA = FIRST(ChosenDiff_NA)
     /ChosenDiff_A1 = FIRST(ChosenDiff_A1)
     /ChosenDiff_A2 = FIRST(ChosenDiff_A2)
     /ChosenDiff_A3 = FIRST(ChosenDiff_A3) .
    

    From here, grab your new dataset and run MEANS or CROSSTABS to summarize the data.

    DATASET CLOSE * .
    GET FILE = 'data by PlayerId.sav' .
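
    As a rough sketch of the MEANS step, the commands below compare each player's average performance in one block with the difficulty chosen for the next block, which seems to be the relationship the question is after. Pairing the blocks this way is my reading of the study design, not something the data enforces.

    * Mean performance in the previous block, split by the difficulty chosen next.
    MEANS TABLES = Performance_NA BY ChosenDiff_A1
      /CELLS = MEAN COUNT STDDEV .
    MEANS TABLES = Performance_A1 BY ChosenDiff_A2
      /CELLS = MEAN COUNT STDDEV .
    MEANS TABLES = Performance_A2 BY ChosenDiff_A3
      /CELLS = MEAN COUNT STDDEV .

    With 30 players, some difficulty levels may end up with very few cases in a given table, so the COUNT column is worth checking before reading much into the means.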
    

    You'll still have to decide how to approach data summary for players that chose different difficulty settings (including the sequence of choices, if that's important). That said, if you plan to profile each player individually, you shouldn't run into any issues.
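
    If the sequence matters less than the overall pattern, one possible approach is to stack the three (previous performance, next choice) pairs into separate rows and chart them together. This is only a sketch: it assumes the aggregated file is the active dataset, and PrevPerformance, NextDifficulty and ChoiceNum are names made up for illustration.

    * One row per player per choice, pairing the previous block mean with the next pick.
    VARSTOCASES
      /MAKE PrevPerformance FROM Performance_NA Performance_A1 Performance_A2
      /MAKE NextDifficulty FROM ChosenDiff_A1 ChosenDiff_A2 ChosenDiff_A3
      /INDEX = ChoiceNum(3)
      /KEEP = PlayerId .

    * Bar chart of mean previous performance for each chosen difficulty.
    GRAPH
      /BAR(SIMPLE) = MEAN(PrevPerformance) BY NextDifficulty .

    Each player contributes up to three rows here, so the rows are not independent; the chart is fine as a description but any formal test would need to account for that.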