Extract text starting after a negated set, up to (but not including) the first occurrence of @


Good day, community.

Say I have the following line:

    [ ] This is a sentence about apples. @fruit @tag

I wish to create a regex that can generically extract the portion: "This is a sentence about apples." only.

That is, ignore the [ ] before the sentence, and ignore @fruit @tag after.

What I have so far is: ([^\s*\[\s\]\s])(.*@)

Which is creating the following match: This is a sentence about apples. @fruit @

How would I match up to, but not including, the first occurrence of the @ symbol, while still excluding the [ ] pattern with the ([^\s*\[\s\]\s]) group?

EDIT: Thanks to Wiktor Stribiżew for the critical piece to help:

RegExMatch(str, "O)\[\s*]\s*([^@]*[^@\s])", output)

Final code:

; Zim Inbox txt file
FileEncoding, UTF-8
File := "C:\Users\dragoon\Desktop\anki_cards.txt"

; Win+V hotkey, active only while Zim has focus
#IfWinActive ahk_exe zim.exe
#v::
    ; Highlight the question line and copy it (the sleep is necessary)
    clipboard=
    sleep, 500
    Send ^+c
    ClipWait
    Send ^{Down}
    clipboardQuestion := clipboard
    FoundQuestion := RegExMatch(clipboardQuestion,"O)\[\s*]\s*([^@]*[^@\s])",outputquestion)

    ; Highlight the answer line and copy it
    clipboard=
    sleep, 500
    Send ^+c
    ClipWait
    clipboardAnswer := clipboard
    FoundAnswer := RegExMatch(clipboardAnswer,"O)\[\s*]\s*([^@]*[^@\s])",outputanswer)

    ; """" is a one-character string holding a double quote
    quotedQuestionAnswer := outputquestion[1] """" outputanswer[1] """"

    Fileappend, %quotedQuestionAnswer%, %File%
return
#IfWinActive

What it does: in a Zim Wiki notebook on Windows, press the Win+V hotkey with the cursor on the "Question?" line in the following structure:

[ ] Question Header
    [ ] Question?
        [ ] Answer about dogs @cat @dog

This writes the text to an external file in the following format:

Question?"Answer about dogs"

This is an acceptable format for Anki card importing, and can be used to quickly make cards from a review structure. Thanks again for all the help on my first SO question.


Solution

  • You can use

    \[\s*]\s*\K[^@]*[^@\s]
    

    See the regex demo. Details:

    • \[\s*]\s* - [, zero or more whitespaces, ], zero or more whitespaces
    • \K - "forget" what has just been matched
    • [^@]* - zero or more chars other than @
    • [^@\s] - a char other than @ and whitespace.

    Note that in AutoHotkey, you can also capture part of a match if you use Object mode:

    RegExMatch(str, "O)\[\s*]\s*([^@]*[^@\s])", output)
    

    The string you want is captured by Group 1 (defined by the pair of unescaped parentheses), and you can access it via output[1]. See the documentation:

    Object mode. [v1.1.05+]: This causes RegExMatch() to yield all information of the match and its subpatterns to a match object in OutputVar. For details, see OutputVar.
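
    To compare the two variants side by side, here is a minimal sketch for AutoHotkey v1.1. It assumes the sample line from the question; the variable names str, sentence and output are only illustrative. Without the O) option, RegExMatch stores the overall match in the output variable, so the \K variant needs no capturing group.

```autohotkey
; AutoHotkey v1.1 sketch; the sample line is the one from the question.
str := "[ ] This is a sentence about apples. @fruit @tag"

; Variant 1: \K drops the "[ ] " prefix from the reported match,
; so the overall match is already the sentence.
if (RegExMatch(str, "\[\s*]\s*\K[^@]*[^@\s]", sentence))
    MsgBox, % "Whole match: " sentence

; Variant 2: Object mode (the "O)" option) with a capturing group;
; the sentence is read from group 1 of the match object.
if (RegExMatch(str, "O)\[\s*]\s*([^@]*[^@\s])", output))
    MsgBox, % "Group 1: " output[1]
```

    Both message boxes should show "This is a sentence about apples." with no trailing space, because the final [^@\s] forces the match to end on a non-whitespace character. Checking the return value of RegExMatch before using the result avoids working with an empty string when a line has no [ ] prefix.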