Search code examples
splitspotfire

split column values in the plot setting - spotfire


I am fairly new to spotfire and I have a basic question.

I have a table like this (2 columns, in the price column the values are separated by comma and I don't have for each supermarket the same number of prices):

supermarket  price
A            10,20,5,40,50
B            2,40,15
C            30,70,3,80,32,54,89,110
D            16,89,3,23,37,19

I would like to create a scatterplot, where on the x-axis there are the different supermarket and on the y-axis the different prices. I have no idea on how split these value (and convert them to numerical values) just in the plot (so using 'Edit Expression' feature). I looked at the spotfire documentation but I found only this information (https://docs.tibco.com/pub/sfire-cloud/latest/doc/html/en-US/TIB_sfire_bauthor-consumer_usersguide/bauthcons/topics/en-US/splitting_a_column.html) which is not useful for my task.

Thank you in advance for your help


Solution

  • From a similar question, creating a transformed table using a TERR script might help you:

    Prep data (separate/split cells into rows)

    If you don't want to use a TERR data function, there could be an alternative via an Iron Python script. I wanted to create the calculated columns then apply an unpivot transformation, unfortunately unpivot does not work with calculated columns. So my solution would be to create a set of calculated columns via the script, then also create a document property that contains the column names to use in a scatter plot (so you don't need the unpivot).

    The script (it looks complicated but it is simple really):

    -find the maximum possible number of prices hidden in the [price] column;

    -create as many calculated columns as the max number of prices (some will be empty);

    -fill a document property to contain the list of these new columns to use in the plot

    from Spotfire.Dxp.Data import CalculatedColumn, IndexSet, DataValueCursor
    
    #list of existing columns in data table called "data"
    mytablename='data'
    mytable = Document.Data.Tables[mytablename]
    cols = mytable.Columns
    
    #------------------------------
    #max_n_prices is calculated from all rows
    rowsToInclude = IndexSet(mytable.RowCount,True)
    n_values = list()
    cursor = DataValueCursor.CreateFormatted(mytable.Columns['price'])
    for row in mytable.GetRows(rowsToInclude,cursor):
        price = cursor.CurrentValue
        n_values.append (len(price.split(','))) #count the number of elements after splitting by comma
    
    max_n_prices = max(n_values)+1
    
    #------------------------------
    #add calculated columns returning different split positions
    #if we run out of commas on a specific row, it will simply return an empty cell
    new_cols=[]
    for i in range(1,max_n_prices):
        new_col_name = "price"+str(i)
        new_cols.append(new_col_name)   
        try:
            cols.AddCalculatedColumn(new_col_name,"Integer(split([price],',',"+str(i)+"))")     
        except:
            pass #if the column exists, do nothing
    
    #----------------------------
    #now create a document property with all the column names to plot
    new_cols1 = ["["+mytablename+"].["+x+"]" for x in new_cols]
    new_cols1=', '.join(new_cols1)
    
    #property must already exist
    Document.Properties['plotColumns']=new_cols1
    

    Run it, then use the document property 'plotColumns' as Y axis in the plot. The X axis needs to be <[supermarket] NEST [Axis.Default.Names]>