lua, lua-patterns, world-of-warcraft

How to find a duplicate string with Pattern Matching?


I have a string similar to this:

[13:41:25] [100:Devnull]: 01:41:20, 13:41:21> |Hunit:Player-3693-07420299:DevnullYour [Chimaera Shot] hit |Hunit:Creature-0-3693-1116-3-87318-0000881AC4:Dungeoneer's Training DummyDungeoneer's Training Dummy 33265 Nature. 

In case you wonder, it's from World of Warcraft.

I'd like to end with something like this:

[13:41:25] [100:Devnull]: 01:41:20, 13:41:21> Your [Chimaera Shot] hit Dungeoneer's Training Dummy 33265 Nature. 

Note that "Dungeoneer's Training Dummy" appears twice. I've managed to remove the first "|Hunit" portion with this:

str = "[13:41:25] [100:Devnull]: 01:41:20, 13:41:21> |Hunit:Player-3693-07420299:DevnullYour [Chimaera Shot] hit |Hunit:Creature-0-3693-1116-3-87318-0000881AC4:Dungeoneer's Training DummyDungeoneer's Training Dummy 33265 Nature."
str = string.gsub(str, "|Hunit:.*:.*Your", "Your")

Which returns this:

print(str)    -- => [13:41:25] [100:Devnull]: 01:41:20, 13:41:21> Your [Chimaera Shot] hit |Hunit:Creature-0-3693-1116-3-87318-0000881AC4:Dungeoneer's Training DummyDungeoneer's Training Dummy 33265 Nature.

I then add a second gsub:

str = string.gsub(str, "|Hunit:.*:", "")
print(str) -- => [13:41:25] [100:Devnull]: 01:41:20, 13:41:21> Your [Chimaera Shot] hit Dungeoneer's Training DummyDungeoneer's Training Dummy 33265 Nature.

But "Dungeoneer's Training Dummy" still appears twice in the result.

How can I get rid of the duplicated string? Here it is "Dungeoneer's Training Dummy", but it can be the name of any other target.


Solution

  • You can try something like this:

    str = "[13:41:25] [100:Devnull]: 01:41:20, 13:41:21> Your [Chimaera Shot] hit Dungeoneer's Training DummyDungeoneer's Training Dummy 33265 Nature."
    -- Find the text that starts with 'hit', continues with one or more
    -- non-digit characters, and ends with one or more digits followed by
    -- the rest of the line. These three parts are captured and passed to
    -- the replacement function; whatever the function returns replaces
    -- the matched text in the string.
    str = str:gsub("(hit%s+)([^%d]+)(%d+.+)", function(s1, s2, s3)
        local s = s2:gsub("%s+$", "") -- drop trailing spaces
        local half = math.floor(#s / 2)
        if #s % 2 == 0                        -- even number of characters
        and s:sub(1, half) == s:sub(half + 1) -- first half equals second half
        then
          return s1..s:sub(half + 1)..' '..s3 -- keep a single copy of the name
        else
          return s1..s2..s3                   -- not a duplicate: leave unchanged
        end
      end)
    print(str)
    

    This prints: [13:41:25] [100:Devnull]: 01:41:20, 13:41:21> Your [Chimaera Shot] hit Dungeoneer's Training Dummy 33265 Nature.

    This code extracts the name of the target and checks whether the name is the same text written twice in a row. If it is not, the matched text is returned unchanged.
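The duplicate check can also be pulled out into a small helper and combined with the two gsub calls from the question. The following is a minimal sketch, not a definitive implementation: the names collapse_doubled and clean_line are hypothetical, and it assumes the log line always has the shape shown in the question ("hit", then the target name, then a number). Because the pattern uses [^%d]+ for the name, a target whose name contains a digit would not be deduplicated.

```lua
-- Hypothetical helper: if `name` is some text written twice back to back,
-- return a single copy; otherwise return `name` unchanged.
local function collapse_doubled(name)
  local half = math.floor(#name / 2)
  if #name > 0 and #name % 2 == 0
     and name:sub(1, half) == name:sub(half + 1) then
    return name:sub(1, half)
  end
  return name
end

-- Hypothetical wrapper: applies the two gsub calls from the question,
-- then collapses a doubled target name that follows "hit".
local function clean_line(line)
  line = line:gsub("|Hunit:.*:.*Your", "Your")
  line = line:gsub("|Hunit:.*:", "")
  line = line:gsub("(hit%s+)([^%d]+)(%d+.+)", function(prefix, name, rest)
    -- Split off trailing whitespace so it can be restored as-is.
    local trimmed, trailing = name:match("^(.-)(%s*)$")
    return prefix .. collapse_doubled(trimmed) .. trailing .. rest
  end)
  return line
end

local raw = "[13:41:25] [100:Devnull]: 01:41:20, 13:41:21> |Hunit:Player-3693-07420299:DevnullYour [Chimaera Shot] hit |Hunit:Creature-0-3693-1116-3-87318-0000881AC4:Dungeoneer's Training DummyDungeoneer's Training Dummy 33265 Nature."
print(clean_line(raw))
```

Keeping the check in its own function makes it easy to test on names alone, and a name that is not an exact doubling passes through untouched, including its original spacing.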