Tags: regex, bash, grep, gnu

grep regex: match a line only if it doesn't start with @


I put a test file at the bottom of this post, along with the expected result.

What I want


I have some files in a folder, for example:

src
├── app
│   ├── app.controller.ts
│   ├── app.module.ts
│   ├── app.service.ts
│   └── interceptor
│       └── json-api.interceptor.ts
├── auth
│   ├── auth.controller.ts
│   ├── auth.module.ts
│   ├── auth.service.ts
│   ├── decorator
│   │   ├── auth-user.decorator.ts
│   │   ├── is-secret.decorator.ts

I want a script that extracts the JSDoc blocks from these files, plus the name of the function each block documents, and writes the result into a .md file.

A JSDoc block always starts on a new line with /** and always ends on a new line with */.

Every line between the start and the end begins with a *.

The line after the end of the JSDoc block starts with either @ or [A-Za-z]. I want to match that line only if it does not start with @.

Example:

/** << start
 *  << the line between
 */ << end
@ xxxxxx << possible negative value
function xxxxx << possible positive value
const xxxx << possible positive value
xxxxx << possible positive value

But I don't want to retrieve the following pattern:

/** something */
@ or xxxxxx

My research


I started with the JSDoc block itself:

grep -Pro "(\/\*\*$)|(^\s+\*\s.*)|(^\s+\*\/$)" test.txt

The result is OK:

➜ grep -Pro "(\/\*\*$)|(^\s+\*\s.*)|(^\s+\*\/$)" test.txt
/**
   * check if the user level is super admin
   * @returns {boolean} true if the user has the right to access super admin endpoints
   */
/**
   * check if the user level is super admin
   * @returns {boolean} true if the user has the right to access super admin endpoints
   */
/**
   * check if the user level is super admin
   * @returns {boolean} true if the user has the right to access super admin endpoints
   */

Now I also want the line that follows, but only if it does not start with @. For that I wrote this regex:

((\s*)([^@]|\w)(.*))

But this does not work at all.

If I use a negative lookahead, (\s*)(?![@])(.+), the console tells me event not found: [@])(.+).

I'm pretty lost. If you have any idea how to do this, thanks. If you need more information, tell me.

Test file and expected result


➜ cat test.txt 
// case 1
  /**
   * check if the user level is super admin
   * @returns {boolean} true if the user has the right to access super admin endpoints
   */
  @UseGuards(AuthGuard('jwt'), LevelsGuard)
  @Levels(LevelEnum.superadmin)
  @Get('check/superadmin')
  @ApiBearerAuth()
  checkSuperAdminLevel(): boolean {
    return true;
  }

// case 2
  /**
   * check if the user level is super admin
   * @returns {boolean} true if the user has the right to access super admin endpoints
   */
  @Get('check/superadmin')
  @ApiBearerAuth()
  checkSuperAdminLevel(): boolean {
    return true;
  }

// case 3
  /**
   * check if the user level is super admin
   * @returns {boolean} true if the user has the right to access super admin endpoints
   */
  checkSuperAdminLevel(): boolean {
    return true;
  }

// case 4
  /** lorem ipsum */

// case 5
  lorem ipsum

The expected result:

// case 1
/**
 * check if the user level is super admin
 * @returns {boolean} true if the user has the right to access super admin endpoints
 */
checkSuperAdminLevel(): boolean {

// case 2
/**
 * check if the user level is super admin
 * @returns {boolean} true if the user has the right to access super admin endpoints
 */
checkSuperAdminLevel(): boolean {

// case 3
/**
 * check if the user level is super admin
 * @returns {boolean} true if the user has the right to access super admin endpoints
 */
checkSuperAdminLevel(): boolean {

// case 4
nothing

// case 5
nothing

Solution

  • Since you are using GNU grep, you can achieve what you want by extending your regex a bit.

    First, add the -z option. It slurps the file into a single input string, so the grep pattern can "see" line breaks.

    Second, you need to make sure the $ anchors match at the end of each line, not just at the end of the whole string, so you need the multiline modifier, (?m).

    Third, line breaks need to be matched too, so that they appear in the output. To do that, put \n?, an optional newline, right at the end of each alternative.

    Fourth, as this is a PCRE pattern, it supports the \h construct, which matches any horizontal whitespace. This is handy when your regex can match across lines: \s also matches line breaks, and this might result in unwelcome matches. Hence, all \s are replaced with \h.
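To illustrate the difference under -z, here is a small sketch with a made-up input (a blank line followed by one JSDoc body line). The \s version pulls the preceding line break into the match, while the \h version stays on one line. The tr -d '\0' step is only there because -z makes grep end each printed match with a NUL byte.

```shell
# Made-up input: an empty line, then a JSDoc body line
with_s=$(printf '\n * x\n' | grep -zoP '(?m)^\s+\*\s.*' | tr -d '\0')
with_h=$(printf '\n * x\n' | grep -zoP '(?m)^\h+\*\h.*' | tr -d '\0')

# Brackets make the leading line break in the \s result visible
printf '[%s]\n[%s]\n' "$with_s" "$with_h"
```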

    Fifth, the pattern consumes the */ line, and you want to start looking for a line not starting with @ only right below that line. So you need a positive lookbehind, which checks the preceding text without consuming it.

    So, the grep command will look like

    grep -zroP '(?m)/\*\*$\n?|^\h+\*\h.*$\n?|^\h+\*/$\n?|(?<=\*/\n)(?:\h+@.*\n)*\K.+\n?' test.txt
    

    The (?<=\*/\n)(?:\h+@.*\n)*\K.+\n? alternative does the following:

    • (?<=\*/\n) - finds a position that is immediately preceded by */ and a newline char
    • (?:\h+@.*\n)* - matches and consumes any zero or more repetitions of
      • \h+ - one or more horizontal whitespaces
      • @ - a @ char
      • .*\n - the rest of the line with a newline (LF) char
    • \K - match reset operator that discards text matched so far from the overall match memory buffer
    • .+ - the non-empty line
    • \n? - an optional LF (line feed) char.
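A minimal sketch of running this command end to end and saving the result to a Markdown file. The sample file, the temporary directory, and the output name jsdoc.md are assumptions for illustration. Because -z makes grep terminate each printed match with a NUL byte instead of a newline, the sketch pipes the output through tr -d '\0' before writing it.

```shell
# Sketch only: the sample file and the jsdoc.md output name are assumptions.
tmp=$(mktemp -d)
cat > "$tmp/sample.ts" <<'EOF'
  /**
   * check if the user level is super admin
   */
  @Get('check/superadmin')
  @ApiBearerAuth()
  checkSuperAdminLevel(): boolean {
    return true;
  }

  /** lorem ipsum */

  lorem ipsum
EOF

# Same pattern as in the command above, kept in single quotes.
pattern='(?m)/\*\*$\n?|^\h+\*\h.*$\n?|^\h+\*/$\n?|(?<=\*/\n)(?:\h+@.*\n)*\K.+\n?'

# With -z, every match printed by -o ends with a NUL byte; tr strips them.
grep -zoP "$pattern" "$tmp/sample.ts" | tr -d '\0' > "$tmp/jsdoc.md"

result=$(cat "$tmp/jsdoc.md")
printf '%s\n' "$result"
```

When the command is pointed at a directory such as src instead of a single file, grep prefixes each match with the file name unless -h is added.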

    See the regex demo.
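On the event not found error from the question: in an interactive bash or zsh session, a ! inside double quotes triggers history expansion, so the shell tries to expand ![@])(.+) before grep ever sees the pattern. Single quotes, as used in the command above, avoid this. Separately, (\s*)(?![@])(.+) would still accept lines starting with @, because the pattern is unanchored and \s* can give back whitespace, letting the lookahead succeed at another position. The sketch below, with made-up input lines, anchors the match and uses a possessive \h*+ so the lookahead is only tested right after the indentation. It is one possible formulation, not the only one.

```shell
# Made-up input: one decorator line and one method line.
# Single quotes pass "!" to grep untouched; ^ anchors at the line start and
# the possessive \h*+ never gives back the indentation it matched.
kept=$(printf '  @Get()\n  checkLevel(): boolean {\n' | grep -P '^\h*+(?!@).+')

printf '%s\n' "$kept"
```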