python regex nuke

Regex for string before curly brackets with specific white-spacing


I very much suck at regex and have been banging my head against a wall for a few days. I'm trying to pull info out of a Nuke script (.nk file) with Python. The formatting is as follows:

BackdropNode {
 inputs 0
 name BackdropNode6
 tile_color 0x555555ff
 label RDN
 note_font_size 42
 xpos -1136
 ypos -4272
 bdwidth 451
 bdheight 529
}
Write {
 file "\[value project_directory]/_output_/\[string range \[file tail \[value root.name]] 0 20].mov"
 colorspace sRGB
 raw true
 file_type mov\{(.*?)\}
 mov64_format "mov (QuickTime / MOV)"
 mov64_codec AVdh
 mov64_dnxhd_codec_profile "DNxHD 422 8-bit 145Mbit"
 mov64_dnxhr_codec_profile "SQ 4:2:2 8-bit"
 mov64_pixel_format {{0} "yuv420p\tYCbCr 4:2:0 8-bit"}
 mov64_quality High
 mov64_advanced 1
 mov64_write_timecode true
 mov64_gop_size 12
 mov64_b_frames 0
 mov64_bitrate 20000
 mov64_bitrate_tolerance 40000000
 mov64_quality_min 2
 mov64_quality_max 31
 render_order 3
 checkHashOnRead false
 version 132
 in_colorspace scene_linear
 out_colorspace scene_linear
 name Write2
 xpos -1025
 ypos 2230
}

Each block is a one-word string, a space, an opening curly bracket, then everything up to the closing curly bracket that sits on its own line. I want to match the first string and then everything inside the curly brackets.

I've gotten here:

^([a-zA-Z]+\s)+(\{[^}]*.*\})$

which only gets me to the first closing curly bracket. I'd like it to go all the way to the last closing curly bracket of each block.

So match 1 would be BackdropNode {...} with group 1 of BackdropNode and group 2 of {...}. Then match 2 would be Write {...} with group 1 of Write and group 2 of {...}.


Solution

  • You can use

    (?sm)^([a-zA-Z]+)\s*\{(.*?)}(?=\n[a-zA-Z]+\s*{\n|\Z)
    

    See the regex demo.

    Details:

    • (?sm) - inline modifiers equivalent to re.S / re.DOTALL (. also matches newlines) plus re.M / re.MULTILINE (^ matches at the start of every line)
    • ^ - start of a line
    • ([a-zA-Z]+) - Group 1: one or more letters
    • \s* - zero or more whitespaces
    • \{ - a { char
    • (.*?) - Group 2: zero or more of any chars (including newlines, because of the s flag), as few as possible
    • } - a } char
    • (?=\n[a-zA-Z]+\s*{\n|\Z) - a positive lookahead that requires either a newline and then one or more letters, zero or more whitespaces, { and a newline, or end of the whole string.
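
    For reference, here is a minimal sketch of how the pattern could be used from Python with re.finditer. The sample text is a shortened version of the excerpt from the question, and the names NK_TEXT, NODE_RE and find_nodes are made up for this illustration. It assumes every top-level block starts at the beginning of a line and that the knob lines inside a block are indented, as in the question.

```python
import re

# Shortened sample in the same layout as the question's excerpt
# (knob lists trimmed for illustration).
NK_TEXT = """BackdropNode {
 inputs 0
 name BackdropNode6
 label RDN
}
Write {
 colorspace sRGB
 mov64_pixel_format {{0} "yuv420p\\tYCbCr 4:2:0 8-bit"}
 name Write2
}"""

# The pattern from the answer, unchanged.
NODE_RE = re.compile(r"(?sm)^([a-zA-Z]+)\s*\{(.*?)}(?=\n[a-zA-Z]+\s*{\n|\Z)")


def find_nodes(text):
    """Return a list of (node_class, body) tuples, one per top-level block."""
    return [(m.group(1), m.group(2)) for m in NODE_RE.finditer(text)]


nodes = find_nodes(NK_TEXT)
for node_class, body in nodes:
    # body is the text between the outer braces, without the braces.
    print(node_class, len(body.strip().splitlines()), "knob lines")
```

    Note that group 2 holds the text between the outer curly brackets, not the brackets themselves. The nested braces on the mov64_pixel_format line do not end the match early, because the lookahead only accepts a } that is followed by a new block header or by the end of the string. Since Group 1 only allows letters, a node whose class name contains digits would probably need a wider character class there.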