Sed command with regex: "unterminated regular expression". Max length on regex?


I am trying to run a sed command with a very long regex, but I get this error:

sed: 1: "/LOCK TABLES .(long_tab ...": unterminated regular expression

long_regex='/LOCK TABLES .(long_table_name_here_01|long_table_name_here_02|long_table_name_here_03|long_table_name_here_04|long_table_name_here_05|long_table_name_here_06|long_table_name_here_07|long_table_name_here_08|long_table_name_here_09|long_table_name_here_10|long_table_name_here_11|long_table_name_here_12|long_table_name_here_13|long_table_name_here_14|long_table_name_here_15|long_table_name_here_16|long_table_name_here_17|long_table_name_here_18|long_table_name_here_19|long_table_name_here_20|long_table_name_here_21|long_table_name_here_22|long_table_name_here_23|long_table_name_here_24|long_table_name_here_25|long_table_name_here_26|long_table_name_here_27|long_table_name_here_28|long_table_name_here_29|long_table_name_here_30|long_table_name_here_31|long_table_name_here_32|long_table_name_here_33|long_table_name_here_34|long_table_name_here_35|long_table_name_here_36|long_table_name_here_37|long_table_name_here_38|long_table_name_here_39|long_table_name_here_40|long_table_name_here_41|long_table_name_here_42|long_table_name_here_43|long_table_name_here_44|long_table_name_here_45|long_table_name_here_46|long_table_name_here_47|long_table_name_here_48|long_table_name_here_49|long_table_name_here_50|long_table_name_here_51|long_table_name_here_52|long_table_name_here_53|long_table_name_here_54|long_table_name_here_55|long_table_name_here_56|long_table_name_here_57|long_table_name_here_58|long_table_name_here_59|long_table_name_here_60|long_table_name_here_61|long_table_name_here_62|long_table_name_here_63|long_table_name_here_64|long_table_name_here_65|long_table_name_here_66|long_table_name_here_67|long_table_name_here_68|long_table_name_here_69|long_table_name_here_70|long_table_name_here_71|long_table_name_here_72|long_table_name_here_73|long_table_name_here_74|long_table_name_here_75|long_table_name_here_76|long_table_name_here_77|long_table_name_here_78|long_table_name_here_79|long_table_name_here_80|long_table_name_here_81|long_table_name_here_82|long_table_name_here_83|long_table_name_here_84|long_table_name_here_85|long_table_name_here_86|long_table_name_here_87|long_table_name_here_88|long_table_name_here_89|long_table_name_here_90|long_table_name_here_91|long_table_name_here_92|long_table_name_here_93|long_table_name_here_94|long_table_name_here_95|long_table_name_here_96|long_table_name_here_97|long_table_name_here_98|long_table_name_here_99|long_table_name_here_100). WRITE;/,/^UNLOCK TABLES;/d'

echo 'teststring' | sed "$long_regex"

If I shorten the regex, the command runs without error, so I assume there is a maximum length somewhere. Any ideas?

I'm on a Mac and my sed version is 13.4.1


Solution

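  • It looks like sed has limitations on my Mac. The GNU sed manual describes limits of this kind (note that macOS ships a BSD-derived sed rather than GNU sed, so the exact limits may differ):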

    https://www.gnu.org/software/sed/manual/sed.html#Limitations

    8 GNU sed’s Limitations and Non-limitations

    For those who want to write portable sed scripts, be aware that some implementations have been known to limit line lengths (for the pattern and hold spaces) to be no more than 4000 bytes. The POSIX standard specifies that conforming sed implementations shall support at least 8192 byte line lengths. GNU sed has no built-in limit on line length; as long as it can malloc() more (virtual) memory, you can feed or construct lines as long as you like.

    However, recursion is used to handle subpatterns and indefinite repetition. This means that the available stack space may limit the size of the buffer that can be processed by certain patterns.

    I solved my problem by using a different command: awk with gsub.

    # All the tables I want removed from my SQL
    all_tables='table_name_here_01|table_name_here_02|much_more_tables'
    
    # Put those in the full regex (no surrounding slashes: awk takes a regex
    # passed in as a string literally, so a "/" would have to match in the data)
    long_regex="LOCK TABLES .($all_tables). WRITE;(.|\\\n)*?UNLOCK TABLES;"
    
    # Use that regex for a find and replace with awk
    echo 'the_sql_data_string' | awk -v regex="$long_regex" '{gsub(regex, "");}1'
    

    It was not easy to use a variable in the awk command, but the example works!
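
Since awk reads its input one line at a time by default, a gsub call only sees a single line per record. For a real dump where the LOCK TABLES and UNLOCK TABLES statements sit on separate lines, a range-style deletion that mirrors the original sed `/start/,/end/d` may be closer to what is needed. The following is only a minimal sketch under assumptions: the function name `drop_locked_blocks` and the table names `table_a` and `table_b` are made up for illustration, and the dump is assumed to put each statement on its own line.

```shell
#!/bin/sh
# Hypothetical table names, for illustration only.
all_tables='table_a|table_b'

# Pattern for the first line of a block to drop. It is passed to awk as a
# plain string, so it has no surrounding slashes.
start_re="LOCK TABLES .($all_tables). WRITE;"

# Hypothetical helper: reads SQL on stdin and writes it to stdout without
# the LOCK TABLES ... UNLOCK TABLES blocks of the listed tables.
drop_locked_blocks() {
    awk -v start="$start_re" '
        $0 ~ start        { skip = 1 }   # block of a listed table begins here
        !skip             { print }      # keep every line outside such a block
        /^UNLOCK TABLES;/ { skip = 0 }   # block ends after this line
    '
}
```

It could be used as `drop_locked_blocks < dump.sql > filtered.sql`, where both file names are placeholders. Because the pattern travels in an awk variable instead of the script text, the length of the table list is handled by awk's regex engine rather than by sed's script parser.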