
Python array of strings in a ROOT TTree


I'm working with CERN's PyROOT module, and I'm trying to store an array of strings as a leaf in a ROOT tree (TTree). To do so I apparently have to pass it an array built with the array module, not a list or a dictionary. The module supports standard C arrays of characters, integers, and so forth, but does anyone know of a way to nest them so that I get an array of strings or, effectively, an array of character arrays? Or have I gone too far and need to take a step back from the keyboard for a while :)?

Code:

import ROOT

rowtree = ROOT.TTree("rowstor", "rowtree")

ROOT.gROOT.ProcessLine(
    "struct runLine {\
    Char_t test[20];\
    Char_t test2[20];\
    };" );
from ROOT import runLine
newline = runLine()
rowtree.Branch("test1", newline, "test/C:test2")

newline.test = ["AbcDefgHijkLmnOp","aaaaaaaaaaaaaaaaaaa"]

rowtree.Fill()

Error:

python branchtest
Traceback (most recent call last):
  File "branchtest", line 14, in <module>
    newline.test = ["AbcDefgHijkLmnOp","aaaaaaaaaaaaaaaaaaa"]
TypeError: expected string or Unicode object, list found

I'm wondering if it's possible to turn the list shown in this example into an array of strings.


Solution

  • A char array and a Python list of Python strings are two very different things.

    If you want a branch containing a char array (one string) then I suggest using Python's built-in bytearray type:

    import ROOT
    # create an array of bytes (chars) and reserve the last byte for null
    # termination (last byte remains zero)
    char_array = bytearray(21)
    # all bytes of char_array are zeroed by default here (all b'\x00')
    
    # create the tree
    tree = ROOT.TTree('tree', 'tree')
    # add a branch for char_array
    tree.Branch('char_array', char_array, 'char_array[21]/C')
    # set the first 20 bytes to the characters of a string of length 20;
    # the slice must cover exactly 20 bytes so the bytearray keeps its size
    char_array[:20] = b'a' * 20
    # important to keep the last byte zeroed for null termination!
    tree.Fill()
    tree.Scan('', '', 'colsize=21')
    

    The output of tree.Scan('', '', 'colsize=21') is:

    ************************************
    *    Row   *            char_array *
    ************************************
    *        0 *  aaaaaaaaaaaaaaaaaaaa *
    ************************************
    

    So we know the tree is accepting the bytes correctly.
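
The rule about keeping the last byte zeroed is easy to break by hand, so the copy can be wrapped in a small helper. The following is only a sketch in plain Python: write_c_string is a hypothetical name (it is not part of ROOT or rootpy), and it assumes the text is ASCII. It overwrites the buffer in place without changing its length, so a branch bound to the bytearray keeps pointing at the same memory.

```python
def write_c_string(buf, text, encoding="ascii"):
    """Copy text into the fixed-size bytearray buf, in place.

    Hypothetical helper, not part of ROOT. The last byte of buf is
    reserved for the terminating null, and unused bytes are zeroed.
    """
    data = text.encode(encoding)
    capacity = len(buf) - 1  # room left once the null byte is reserved
    if len(data) > capacity:
        raise ValueError(
            "string needs %d bytes but the buffer holds at most %d"
            % (len(data), capacity)
        )
    # The right-hand side has exactly len(buf) bytes, so this slice
    # assignment overwrites the contents without resizing the bytearray.
    buf[:] = data + b"\x00" * (len(buf) - len(data))


char_array = bytearray(21)
write_c_string(char_array, "a" * 20)
```

Calling write_c_string(char_array, "abc") afterwards would replace the old contents and zero the remaining 18 bytes, while a string of 21 or more characters raises a ValueError instead of silently dropping the terminator.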

    If you want to store a list of strings, then I suggest using a std::vector<std::string>:

    import ROOT
    
    strings = ROOT.vector('string')()
    
    tree = ROOT.TTree('tree', 'tree')
    tree.Branch('strings', strings)
    strings.push_back('Hello')
    strings.push_back('world!')
    tree.Fill()
    tree.Scan()
    

    The output of tree.Scan() is:

    ***********************************
    *    Row   * Instance *   strings *
    ***********************************
    *        0 *        0 *     Hello *
    *        0 *        1 *    world! *
    ***********************************
    

    In a loop, call strings.clear() before filling the vector with the list of strings for the next entry; otherwise each entry would also contain the strings from all previous entries.
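
A rough sketch of such a loop follows. fill_string_lists and demo are hypothetical names introduced only for illustration; the helper relies on nothing but the clear(), push_back() and Fill() calls already used above, and demo() assumes a working PyROOT installation.

```python
def fill_string_lists(tree, strings, rows):
    """Fill one tree entry per list of strings in rows.

    tree is expected to offer Fill(), and strings is the vector bound
    to the branch, offering clear() and push_back().
    Returns the number of entries filled.
    """
    count = 0
    for row in rows:
        strings.clear()  # drop the strings left over from the previous entry
        for s in row:
            strings.push_back(s)
        tree.Fill()
        count += 1
    return count


def demo():
    # Only runs where PyROOT is installed; not called automatically.
    import ROOT

    strings = ROOT.vector('string')()
    tree = ROOT.TTree('tree', 'tree')
    tree.Branch('strings', strings)
    fill_string_lists(tree, strings, [['Hello', 'world!'], ['a', 'b', 'c']])
    tree.Scan()
```

Passing the tree and the vector in as arguments keeps the loop independent of how the branch was created, so the same helper should work for any branch bound to a vector of strings.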

    Now, the rootpy package (also see the repository on GitHub) provides a more convenient way of creating trees in Python. Here is an example of how you can use char arrays in a "friendlier" way with rootpy:

    from rootpy import stl
    from rootpy.io import TemporaryFile
    from rootpy.tree import Tree, TreeModel, CharArrayCol
    
    class Model(TreeModel):
        # define the branches you want here
        # with branchname = branchvalue
        char_array = CharArrayCol(21)
        # the dictionary is compiled and cached for later
        # if not already available
        strings = stl.vector('string')
    
    # create the tree inside a temporary file
    with TemporaryFile():
        # all branches are created automatically according to your model above
        tree = Tree('tree', model=Model)
    
        tree.char_array = 'a' * 20
        # attempting to set char_array to a string of length 21 or longer will
        # result in a ValueError being raised.
        tree.strings.push_back('Hello')
        tree.strings.push_back('world!')
        tree.Fill()
        tree.Scan('', '', 'colsize=21')
    

    The output of tree.Scan('', '', 'colsize=21') is:

    ***********************************************************************
    *    Row   * Instance *            char_array *               strings *
    ***********************************************************************
    *        0 *        0 *  aaaaaaaaaaaaaaaaaaaa *                 Hello *
    *        0 *        1 *  aaaaaaaaaaaaaaaaaaaa *                world! *
    ***********************************************************************
    

    See another example of using TreeModels with rootpy here:

    https://github.com/rootpy/rootpy/blob/master/examples/tree/model_simple.py