regex, awk, sed

Matching words that are not inside any comment with regular expressions and Unix tools


How can I use sed, awk, or another tool to match occurrences of the word type on lines where no ! appears before it, and then replace type with something else?

In other words, I want to match occurrences of the word type that are not inside Fortran 90 comments.

EDIT

  • Multiple occurrences of the word on the same line before the ! should also be replaced.
  • ! does not act as a comment character inside single or double quotes, so occurrences following "!", '!', or "'!'" should also be replaced. I think this makes the task quite complicated.
  • Words that contain type, such as footype, should not be changed.

Possible solution:

awk -F '!' -v OFS='!' '{ gsub("\\<type\\>", "replacement", $1) } 1' file

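This seems to solve the issue (note that the \< and \> word-boundary operators are a GNU awk extension and are not supported by every awk), but it still cannot handle a ! inside quotes.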

Minimal example

  type = 2 +type
  type type 
  lel = "!" type
  lel = '!' type
  lel = "'!'" type
  ! type=2
type
footype

Should turn into:

  replacement = 2 +replacement
  replacement replacement 
  lel = "!" replacement
  lel = '!' replacement
  lel = "'!'" replacement
  ! type=2
replacement
footype

Solution

  • EDIT: Given the new constraints, and assuming that every comment-starting ! is surrounded by spaces while the quoted !'s are not, a minor change to tripleee's answer works: include the surrounding spaces in the field separator.

    Test case (with assorted occurrences of the conditions thrown in; here the word being replaced is write and the input file is named type):

     odd= (/ '!'1,3,5,7,9 /)  ! array assignment
     even=(/ 2,4,6,8,10 /) ! array assignment
     a=1"'!'"write         ! testing: write
     b=2
     c=a+b+e      ! element by element assignment
     c(odd)=c(even)-1  ! can use arrays of indices on both sides
     d=sin(c)     ! element by element application of intrinsics
     write(*,*)d
     write(*,*)abs(d)  ! many intrinsic functions are generic 
     write(*,*)abs(d)write  ! many intrinsic functions are generic
     write(c=a+b+e)      ! element by write element assignment
     write(*,*)abs("!"d)write  ! many intrinsic functions are generic
    

    Command and output:

    $ awk -F ' ! ' -v OFS=' ! ' '{ gsub("write", "replacement", $1) } 1' type
    
     odd= (/ '!'1,3,5,7,9 /)  ! array assignment
     even=(/ 2,4,6,8,10 /) ! array assignment
     a=1"'!'"replacement         ! testing: write
     b=2
     c=a+b+e      ! element by element assignment
     c(odd)=c(even)-1  ! can use arrays of indices on both sides
     d=sin(c)     ! element by element application of intrinsics
     replacement(*,*)d
     replacement(*,*)abs(d)  ! many intrinsic functions are generic 
     replacement(*,*)abs(d)replacement  ! many intrinsic functions are generic
     replacement(c=a+b+e)      ! element by write element assignment
     replacement(*,*)abs("!"d)replacement  ! many intrinsic functions are generic
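If the assumption about spaces around the comment ! does not hold, one possible alternative is to scan each line character by character and track whether the scan is currently inside a quoted string. The following is only a sketch, not part of the original answer: the function name replace_outside_comments is hypothetical, the match is case-sensitive, and it assumes that text inside string literals should be left untouched (unlike the gsub on $1 above, which would also rewrite a word inside quotes). It uses only POSIX awk features, so it does not rely on \< and \>.

```sh
#!/bin/sh
# Hypothetical helper (not from the original answer).
# Usage: replace_outside_comments WORD REPLACEMENT < input
replace_outside_comments() {
    awk -v word="$1" -v repl="$2" '
    {
        out = ""; tok = ""; quote = ""; n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            # Outside quotes, collect identifier characters into a token.
            if (quote == "" && c ~ /[A-Za-z0-9_]/) { tok = tok c; continue }
            # Any other character ends the token: emit it, swapped if it equals the word.
            out = out (tok == word ? repl : tok); tok = ""
            if (quote != "") {
                if (c == quote) quote = ""           # closing quote
            } else if (c == "\"" || c == "\047") {   # \047 is the single-quote character
                quote = c                            # opening quote
            } else if (c == "!") {
                break                                # unquoted ! starts the comment
            }
            out = out c
        }
        # Flush a token that ran to the end of the line, then append the comment unchanged.
        out = out (tok == word ? repl : tok)
        print out substr($0, i)
    }'
}
```

It could be invoked as replace_outside_comments type replacement < file. Because whole tokens are compared rather than substrings, footype is left alone, and a doubled quote inside a Fortran string simply closes and reopens the quoted state.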