I'm trying to write a script that detects and removes an #ifdef BUILD_FLAG ... #endif block from files, together with an optional comment block if one occurs right before it. For example, all of the following should be removed:
//this is a comment block
/**
*and a nested comment block
*/
//this is another comment block
#ifdef BUILD_FLAG
...
#endif
I'm trying to do it with this code:
block_comment_pattern = r'/\*[\s\S]*?\*/|\/\/.*?$|\/\/.*?(\n|$)'
conditional_block_pattern = rf'#ifn?def\s+{re.escape(build_flag)}([\s\S]*?)\s*#endif'
pattern = rf'({block_comment_pattern})?{conditional_block_pattern}'
MATCHER = regex.compile(pattern, re.M)
However, it only detects the last part of the comment block together with the conditional block:
//this is another comment block
#ifdef BUILD_FLAG
...
#endif
When I tested the comment block pattern separately, it captured the whole comment block, but it does not when combined with the conditional pattern. What is a better way or pattern to capture the example above?
This is a demo of what I want to capture: https://regex101.com/r/LLJV5i/1. In this demo, the whole comment block occurs right before the conditional block, and both should be captured. (The comment block is optional; the conditional block is required.)
Note:
Example 1:
//-------------------------------------------------------------------
// this whole comment block and the conditional block below should be captured
//-------------------------------------------------------------------
/**
* @brief Some comment
*
*
* @return
*/
//-------------------------------------------------------------------
#ifdef BUILD_FLAG
...
#endif
Example 2:
//---------------------------------------------------------
// this whole comment block and the conditional block below should be captured
// --------------------------------------------------------
#ifdef BUILD_FLAG
...
#endif
Example 3:
//----------------------------------------------------------------------
/**
* @comment: this whole block comment should be captured.
* @{
*/
#ifdef BUILD_FLAG
...
#endif
You could replace matches of the following regular expression (with the g and m flags set) with empty strings:
(?:^ *(?:/\*\*[^/\n]*\r?\n(?:[^/\n]*\r?\n)*[^/\n]*\*/ *|//.*)\r?\n)+(?: *\r?\n)*#ifdef BUILD_FLAG\r?\n[\s\S]*?^#endif\r?\n
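As a minimal sketch of how this could be applied from Python: re.sub replaces every match by default, so there is no g flag, and the m flag corresponds to re.MULTILINE. The function name strip_flag_blocks and the sample text are only illustrative, and the blank-line part is written here as (?: *\r?\n)*.

```python
import re

# The expression from this answer; blank lines are matched as ` *\r?\n`.
PATTERN = re.compile(
    r"(?:^ *(?:/\*\*[^/\n]*\r?\n(?:[^/\n]*\r?\n)*[^/\n]*\*/ *|//.*)\r?\n)+"
    r"(?: *\r?\n)*#ifdef BUILD_FLAG\r?\n[\s\S]*?^#endif\r?\n",
    re.MULTILINE,  # the "m" flag: ^ matches at the start of every line
)


def strip_flag_blocks(source: str) -> str:
    """Remove each commented '#ifdef BUILD_FLAG ... #endif' block (hypothetical helper)."""
    return PATTERN.sub("", source)


# Illustrative input modelled on the first example in the question.
sample = (
    "int before;\n"
    "//this is a comment block\n"
    "/**\n"
    "*and a nested comment block\n"
    "*/\n"
    "//this is another comment block\n"
    "#ifdef BUILD_FLAG\n"
    "int guarded;\n"
    "#endif\n"
    "int after;\n"
)

print(strip_flag_blocks(sample))  # prints the two lines "int before;" and "int after;"
```

Note that the comment group is quantified with +, so as written this leaves a conditional block with no comment in front of it untouched.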
The expression can be broken down as follows (you can also hover the cursor over each part of the expression at the link to see an explanation of its function).
(?:                      # begin non-capture group
  ^                      # match beginning of line
  [ ]*                   # match >= 0 spaces
  (?:                    # begin non-capture group
    /\*\*[^/\n]*\r?\n    # match '/**' followed by >= 0 chars other than '/' and
                         # newlines, followed by the line terminator
    (?:                  # begin non-capture group
      [^/\n]*\r?\n       # match >= 0 chars other than '/' and newline, then line terminator
    )                    # end non-capture group
    *                    # match the preceding non-capture group >= 0 times
    [^/\n]*              # match >= 0 chars other than '/' and newline
    \*/[ ]*              # match '*/' followed by >= 0 spaces
    |                    # or
    //.*                 # match '//' followed by >= 0 chars other than line terminators
  )                      # end non-capture group
  \r?\n                  # match line terminator
)                        # end non-capture group
+                        # match preceding non-capture group >= 1 times
(?:[ ]*\r?\n)*           # match >= 0 lines containing only zero or more spaces
#ifdef BUILD_FLAG\r?\n   # match literal line
[\s\S]*                  # match >= 0 characters of any kind
?                        # match as few of the preceding token as possible
^#endif\r?\n             # match '#endif' at the start of a line, then the line terminator
Note: I've represented spaces as character classes containing a space ([ ]) to make them visible.
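The [ ] notation is also what Python's re.VERBOSE mode needs, since bare spaces are ignored there and # starts a comment (so the literal # is escaped). The following is only a sketch of how the commented expression could be compiled that way. It makes three assumptions that go beyond the expression above: the comment group uses * instead of + so the comments are optional (as the question asks), #ifndef is accepted as well, and the flag name is passed in through a hypothetical build_matcher function. A leading ^ is added so that, with no comments present, the match still starts at the beginning of a line.

```python
import re


def build_matcher(build_flag: str) -> re.Pattern:
    """Compile a verbose pattern for one build flag (hypothetical helper)."""
    flag = re.escape(build_flag)  # treat the flag name as literal text
    return re.compile(
        r"""
        ^                          # start at the beginning of a line
        (?:                        # one comment plus its line terminator
            [ ]*
            (?:
                /\*\*[^/\n]*\r?\n  # opening line of a block comment
                (?:[^/\n]*\r?\n)*  # inner lines without a slash
                [^/\n]*\*/[ ]*     # closing line of the block comment
            |
                //.*               # line comment
            )
            \r?\n
        )*                         # assumption: '*' makes the comments optional
        (?:[ ]*\r?\n)*             # blank lines before the directive
        \#ifn?def[ ]+""" + flag + r"""\r?\n
        [\s\S]*?                   # block body, as short as possible
        ^\#endif\r?\n              # first '#endif' at the start of a line
        """,
        re.MULTILINE | re.VERBOSE,
    )


matcher = build_matcher("BUILD_FLAG")
cleaned = matcher.sub("", "int x;\n// note\n#ifdef BUILD_FLAG\nint y;\n#endif\nint z;\n")
print(cleaned)
```

Because the body is matched lazily up to the first #endif, this sketch (like the original expression) is not expected to handle a nested #if ... #endif inside the block correctly.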