regex matlab strsplit

How to write REGEXP for multiple groups? (MATLAB)


I am trying to parse a text document in MATLAB and split it into sections using strsplit. Each section is demarcated by a string that looks like this:

483.3731    EXP     New trial (rep=0, index=1): {u'selectPlayer': 1, u'neutralStim4': u'PSYCHOLOGIST', u'neutralStim2': u'PONG', u'neutralStim': u'YEN', u'positiveStim': u'PEACEFUL', u'neutralStim3': u'DASH', u'positiveStim3': u'HAPPY', u'positiveStim2': u'HONORABLE', u'negativeStim': u'BETRAYAL', u'negativeStim4': u'GUNPOINT', u'negativeStim3': u'JEALOUS', u'negativeStim2': u'MISTRUST', u'positiveStim4': u'AWESOME'}

Each of these strings is some variation of a number followed by 'EXP', then 'New trial', then a variable string, e.g. (rep=1, index=1: {u'SelectPlayer': 2, ..).

I currently have the following code to try to parse this document but I can't get it to work!

expr = '\n\d+\s*EXP\s*New Trial\s*\w+\n';
filecontents = fileread('LAILA_exp1_noCBB_fMRIsync_2014_Oct_28_1239.log');
filecontents = strsplit(filecontents,expr,'DelimiterType','RegularExpression');

I've tried multiple variations of this regular expression but I just keep getting a single cell array containing the entire file as a string. Could anyone lend a pointer on how to write a regular expression for a string containing multiple groups such as this one?

Thanks, Shady


Solution

  • \d+(?:\.\d+)?\s+EXP\s+New\strial\s\([^)]*\):\s+{[^}]+}
    

    You can use this pattern to match delimiter lines of this form. See the demo:

    http://regex101.com/r/hQ9xT1/33
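
For reference, the attempt in the question fails for three reasons: `\d+` stops at the decimal point in `483.3731`, `New Trial` is capitalized differently from the `New trial` in the log (matching is case-sensitive by default), and `\w+` cannot match the parentheses, colon, and braces that follow. Below is a minimal sketch of plugging the pattern above into the same `strsplit` call. The sample text is made up (shortened delimiter lines modeled on the one in the question), and the braces are escaped here purely as a precaution, since `{` is also quantifier syntax; that escaping is my assumption, not part of the original pattern.

```matlab
% Hypothetical sample standing in for the contents of the log file.
% The delimiter lines are shortened versions of the one in the question.
sample = sprintf([ ...
    'header line\n' ...
    '483.3731    EXP     New trial (rep=0, index=1): {u''selectPlayer'': 1}\n' ...
    'first section body\n' ...
    '490.1200    EXP     New trial (rep=0, index=2): {u''selectPlayer'': 2}\n' ...
    'second section body\n']);

% Pattern from the answer, with the literal braces escaped (assumption).
expr = '\d+(?:\.\d+)?\s+EXP\s+New\strial\s\([^)]*\):\s+\{[^}]+\}';

% Split on every delimiter line; the second output holds the matched
% delimiter strings themselves.
[sections, headers] = strsplit(sample, expr, ...
    'DelimiterType', 'RegularExpression');

% Drop the newlines left over at the start and end of each section.
sections = strtrim(sections);

fprintf('%d sections, %d delimiter lines\n', numel(sections), numel(headers));
```

With the real file, `sample` would be replaced by the output of `fileread`, as in the question. Keeping the second output of `strsplit` is useful if the trial parameters in each delimiter line are needed later.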