Regex to remove ANSI escape sequences from script output


I'm trying to write a regex that removes everything that starts with '[' and ends with 'm' or ';', so that the following script output is purged of all its special characters:

Script started on 2023-09-16 10:06:45-04:00 [TERM="xterm-256color" TTY="/dev/pts/1" COLUMNS="204" LINES="55"]
^[[?2004h^[]0;matt@SERENITY: ~^G^[[01;32mmatt@SERENITY^[[00m:^[~^[[00m$ ls^M
^[[?2004l^M^[[0m^[Desktop^[[0m  ^[[01;34mDocuments^[[0m  ^[[01;34mDownloads^[[0m  ^[[01;34mmarkdown^[[0m  ^[[01;34mMusic^[[0m  ^[[01;34mnotes^[[0m  ^[[01;34mPictures^[[0m  ^[[01;34mPublic^[[0m  ^[[01;34mscripts^[[0m  ^[[01;34msnap^[[0m  ^[[01;34mTemplates^[[0m  typescript  ^[[01;34mVideos^[[0m^M
^[[?2004h^[]0;matt@SERENITY: ~^G^[[01;32mmatt@SERENITY^[[00m:^[~^[[00m$ ls^M
^[[?2004l^M^[[0m^[Desktop^[[0m  ^[[01;34mDocuments^[[0m  ^[[01;34mDownloads^[[0m  ^[[01;34mmarkdown^[[0m  ^[[01;34mMusic^[[0m  ^[[01;34mnotes^[[0m  ^[[01;34mPictures^[[0m  ^[[01;34mPublic^[[0m  ^[[01;34mscripts^[[0m  ^[[01;34msnap^[[0m  ^[[01;34mTemplates^[[0m  typescript  ^[[01;34mVideos^[[0m^M
^[[?2004h^[]0;matt@SERENITY: ~^G^[[01;32mmatt@SERENITY^[[00m:^[~^[[00m$ ls^M
^[[?2004l^M^[[0m^[Desktop^[[0m  ^[[01;34mDocuments^[[0m  ^[[01;34mDownloads^[[0m  ^[[01;34mmarkdown^[[0m  ^[[01;34mMusic^[[0m  ^[[01;34mnotes^[[0m  ^[[01;34mPictures^[[0m  ^[[01;34mPublic^[[0m  ^[[01;34mscripts^[[0m  ^[[01;34msnap^[[0m  ^[[01;34mTemplates^[[0m  typescript  ^[[01;34mVideos^[[0m^M
^[[?2004h^[]0;matt@SERENITY: ~^G^[[01;32mmatt@SERENITY^[[00m:^[~^[[00m$ exit^M
^[[?2004l^Mexit^M

Script done on 2023-09-16 10:06:49-04:00 [COMMAND_EXIT_CODE="0"]

For now my command looks something like:

%s/^M\|^G\|^[//g | %s/\[01;34m//g | %s/\[01;32m//g

I'm still working on it. It removes a good chunk, but it's quite repetitive, which defeats the purpose of using a regex.

Anyone know a quicker way to remove every substring that starts with '[' and ends with 'm' or ';'?

Output should look like:

Script started on 2023-09-16 10:06:45-04:00 [TERM="xterm-256color" TTY="/dev/pts/1" COLUMNS="204" LINES="55"]
matt@SERENITY: ~matt@SERENITY:~$ ls
Desktop  Documents  Downloads  markdown  Music  notes  Pictures  Public  scripts  snap  Templates  typescript  Videos 
matt@SERENITY: ~matt@SERENITY:~$ ls
Desktop  Documents  Downloads  markdown  Music  notes  Pictures  Public  scripts  snap  Templates  typescript  Videos 
matt@SERENITY: ~matt@SERENITY:~$ ls
Desktop  Documents  Downloads  markdown  Music  notes  Pictures  Public  scripts  snap  Templates  typescript  Videos 
matt@SERENITY: ~matt@SERENITY:~$ exit
exit

Script done on 2023-09-16 10:06:49-04:00 [COMMAND_EXIT_CODE="0"]

Solution

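  • This should remove all ANSI codes of the form ESC [ ... letter (which covers the color codes and ^[[?2004h above, but not ^M, ^G, or the ^[]0; title prefix), assuming the codes are properly constructed (the escape character is a literal ESC byte, not its printable representation ^[):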

    %s/\e\[[0-9;?]*[a-zA-Z]//g
    

    The same can be done with sed (the \x1B escape is a GNU sed extension):

    sed 's,\x1B\[[0-9;?]*[a-zA-Z],,g'
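
For anyone post-processing the typescript outside vim or sed, here is a minimal Python sketch of the same idea. It is not part of the original answer, and the names ANSI_NOISE and strip_ansi are made up for illustration. It assumes the file contains real ESC bytes (0x1B). It extends the pattern above to also drop the ^[]0; title prefix, stray ^[, ^G and ^M, while keeping the title text, since that is what the expected output in the question shows. Anchoring on ESC rather than on a bare '[' matters here: a pattern that only looks for '[' ... 'm' would also eat part of the [TERM="xterm-256color" ...] header line.

```python
import re

# Hypothetical helper, not from the original answer.
# Alternatives are tried left to right, so complete sequences are matched
# before the catch-all that removes leftover single control characters.
ANSI_NOISE = re.compile(
    r"\x1b\[[0-9;?]*[a-zA-Z]"  # ESC [ ... letter, e.g. ESC[01;34m or ESC[?2004h
    r"|\x1b\][0-9]*;"          # title prefix such as ESC]0; (the title text is kept)
    r"|[\x1b\x07\r]"           # stray ESC, BEL (^G) and carriage return (^M)
)


def strip_ansi(text: str) -> str:
    """Return text with the escape sequences and control characters removed."""
    return ANSI_NOISE.sub("", text)


# First prompt line from the question, with ^[ written as \x1b, ^G as \x07, ^M as \r.
sample = (
    "\x1b[?2004h\x1b]0;matt@SERENITY: ~\x07"
    "\x1b[01;32mmatt@SERENITY\x1b[00m:\x1b~\x1b[00m$ ls\r"
)
print(strip_ansi(sample))  # matt@SERENITY: ~matt@SERENITY:~$ ls
```

If the title text should go as well, the second alternative could be widened to consume everything up to the terminating ^G, at the cost of no longer matching the output asked for in the question.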