I'm trying to extract some values from a Discord message in Zapier. The content of the message should look something like this (almost like YAML):
channel: <#1234567890123456789>
as: Bot nickname
image: http://example.com/
content:
Hello, world!
Here the "image" and "as" fields are optional.
I have created 2 regular expressions to fulfill this task:
Python:
import re
r = re.compile(r"(?:channel:)? ?<#(?P<channel>\d+)>\n+(?:as: ?(?P<as>.+)\n+)?(:?image: ?(?P<image>.+)\n+)?content:\n*(?P<content>[\s\S]+)")
JS:
let r = /(?:channel:)? ?<#(?<channel>\d+)>\n+(?:as: ?(?<as>.+)\n+)?(?:image: ?(?<image>.+)\n+)?content:\n*(?<content>[\s\S]+)/;
I tested both regexes in regexr and pythex, and both work fine for me.
Then I entered them into Zapier:
_matched: false
Output from the Python code:
groups: null
id: <ID>
runtime_meta:
memory_used_mb: 57
duration_ms: 3
logs:
1. re.compile('(?:channel:)? ?<#(?P<channel>\\d+)>\\n+(?:as: ?(?P<nick>.+)\\n+)?(:?image: ?(?P<image>.+)\\n+)?content:\\n*(?P<content>[\\s\\S]+)')
2. 'channel: <#1234567890123456> \nas: bot nickname\ncontent:\nHello, world!'
3. None
(and later in the Run JavaScript step, with a similar result)
While debugging, I removed the "image" part of the regexp (in the Text->Extract expression):
(?:channel:)? ?<#(?P<channel>\d+)>\n+(?:as: ?(?P<as>.+)\n+)?content:\n*(?P<content>[\s\S]+)
With the input:
channel: <#1234567890123456>
as: INFO
content:
Hello, world!
And the result was as expected:
output:
0: 1234567890123456
1: INFO
2: Hello, world!
_end: 68
_matched: true
_start: 0
as: INFO
channel: 1234567890123456
content: Hello, world!
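In the logged input there is a space between > and the newline, which >\n+ cannot match. You can match 0 or more whitespace chars except newlines using [^\S\r\n]*. Use \s* to match 0+ times any whitespace char, including a newline. Using " ?" (a space followed by ?) optionally matches a single space only.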
Depending on the string that you want to match, you can decide whether to match spaces only on the same line or to also allow the match to cross newlines.
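As a rough illustration of the difference between these constructs, here is a small Python sketch. The sample string is the start of the logged input from the question; the variable names are only for illustration.

```python
import re

# Start of the logged input: note the space between ">" and the newline.
text = "channel: <#1234567890123456> \nas: bot nickname\n"

# ">\n" fails because a space sits between ">" and the line break.
blocked = re.search(r">\n", text)

# [^\S\r\n]* consumes spaces and tabs on the same line, then "\n" can match.
allowed = re.search(r">[^\S\r\n]*\n", text)

# With blank lines in the input, [^\S\r\n]* stops at the first line break,
# while \s* keeps going across all of them.
same_line = re.search(r">[^\S\r\n]*", "> \n\nas").group()
any_ws = re.search(r">\s*", "> \n\nas").group()

print(blocked)
print(repr(allowed.group()))
print(repr(same_line), repr(any_ws))
```

Which of the two to use depends on whether a field value is allowed to start on the next line.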
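You might update the pattern to the following (this also changes (:?image: to the non-capturing (?:image: that was probably intended):
(?:channel:)?[^\S\r\n]*<#(?P<channel>\d+)>\s*\n(?:as:[^\S\r\n]*(?P<as>.+)\n+)?(?:image:[^\S\r\n]*(?P<image>.+)\n+)?content:\n*(?P<content>[\s\S]+)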
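To check the updated pattern outside Zapier, a minimal Python sketch could look like the one below. It assumes the image part is written as a non-capturing group, (?:image: ...), and parse_message is a hypothetical helper name, not something Zapier provides. The first test string is the one shown in the logs of the question.

```python
import re

# Pattern from the question: requires ">" to be followed directly by a newline.
OLD = re.compile(
    r"(?:channel:)? ?<#(?P<channel>\d+)>\n+(?:as: ?(?P<as>.+)\n+)?"
    r"(:?image: ?(?P<image>.+)\n+)?content:\n*(?P<content>[\s\S]+)"
)

# Updated pattern: tolerates trailing whitespace after ">".
NEW = re.compile(
    r"(?:channel:)?[^\S\r\n]*<#(?P<channel>\d+)>\s*\n"
    r"(?:as:[^\S\r\n]*(?P<as>.+)\n+)?"
    r"(?:image:[^\S\r\n]*(?P<image>.+)\n+)?"
    r"content:\n*(?P<content>[\s\S]+)"
)


def parse_message(text):
    """Hypothetical helper: return the named groups as a dict, or None."""
    match = NEW.search(text)
    return match.groupdict() if match else None


# Input copied from the Zapier log (space before the first newline).
logged = "channel: <#1234567890123456> \nas: bot nickname\ncontent:\nHello, world!"
# Made-up input that also contains the optional image field.
with_image = "channel: <#123>\nas: Bot\nimage: http://example.com/\ncontent:\nHi"

print(OLD.search(logged))
print(parse_message(logged))
print(parse_message(with_image))
```

Optional fields that are missing from the message come back as None in the resulting dict.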