
TypeScript: how to extract multiple literals from a string


type X<T> = T extends `${string}${'*('}${infer A}${')+'}${string}${'*('}${infer A}${')+'}${string}` ? A : never

type Y = X<'g*(a12)+gggggg*(h23)+'> // 'a12' | 'h23'

type Z = X<'g*(a12)+gggggg*(h23)+gggggg*(5hgf)+'> // 'a12' | 'h23' but without '5hgf'

My objective is to extract all literals from a string that match a certain pattern. In the code above, I want to extract the literals that sit between the prefix '*(' and the postfix ')+'. Hence, type Y = 'a12' | 'h23'.

The problem is that a string can contain any number of pattern matches, and I do not know how to make TypeScript extract all of them.

In the code example, TypeScript extracts only 2 literals because I wrote ${string}${'*('}${infer A}${')+'} twice. If I write type Z = X<'g*(a12)+gggggg*(h23)+gggggg*(5hgf)+'>, I still get type Z = 'a12' | 'h23'. Ideally, I should get type Z = 'a12' | 'h23' | '5hgf'.

How can I make TypeScript 'iterate' through the string to get all the desired literals?

Thank you!


Solution

  • There's no iterative approach to this, but you can use an equivalent recursive approach. If you write it in tail-recursive form, TypeScript 4.5 and later can optimize the recursion, so it also works for strings with many matches. Here's one way to do it:

    type X<T extends string, A extends string = never> =
        T extends `${string}${'*('}${infer U}${')+'}${infer R}` ?
        X<R, A | U> : A;
    

    We're still writing X<T> to parse the string via template literal types, but now there's an accumulator type parameter A to keep track of the union of matches we've already collected. If T matches your pattern, the type U will be the first matching portion, and the type R will be the rest of the string after that match. Then we recurse with X<R, A | U>, which adds U to the existing A accumulator. Otherwise there are no more matches, and we return A.

    Let's test it out:

    type Y = X<'g*(a12)+gggggg*(h23)+'>
    //   ^? type Y = "a12" | "h23"
    type Z = X<'g*(a12)+gggggg*(h23)+gggggg*(5hgf)+'>
    //   ^? type Z = "a12" | "h23" | "5hgf"
    

    Looks good.
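
To see what U and R are at each step of the recursion described above, here is a small sketch that exposes a single step. FirstStep is a hypothetical helper introduced only for illustration; it is not part of the solution itself.

```typescript
// Hypothetical helper: performs one step of the match and returns
// [first match, rest of the string], or 'no match' when the pattern is absent.
type FirstStep<T extends string> =
    T extends `${string}${'*('}${infer U}${')+'}${infer R}` ?
    [U, R] : 'no match';

// Step 1: the first match is "a12"; the rest still contains "h23".
type S1 = FirstStep<'g*(a12)+gggggg*(h23)+'>;
// Step 2: run the same step on the rest from step 1.
type S2 = FirstStep<'gggggg*(h23)+'>;
// Step 3: the rest is now empty, so the recursion would stop and return A.
type S3 = FirstStep<''>;

// These assignments compile only if each step has the type described above.
const s1: S1 = ['a12', 'gggggg*(h23)+'];
const s2: S2 = ['h23', ''];
const s3: S3 = 'no match';
```

Each recursive call of X feeds the rest from the previous step back in and adds the match to the accumulator, which is how the union grows to "a12" | "h23".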

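
If you also need the matched values at runtime, one possible pairing is a small function whose return type is derived from X. This is only a sketch: extractAll and its regular expression are assumptions added for illustration, and the cast relies on the regular expression finding the same matches as the template literal type, which the compiler does not verify.

```typescript
// Type-level extractor from the solution above.
type X<T extends string, A extends string = never> =
    T extends `${string}${'*('}${infer U}${')+'}${infer R}` ?
    X<R, A | U> : A;

// Hypothetical runtime counterpart (not part of the original answer).
// It collects every substring found between '*(' and the next ')+'.
function extractAll<T extends string>(input: T): X<T>[] {
    const matches: string[] = [];
    // Lazy match so each capture stops at the first ')+' after '*('.
    const pattern = /\*\(([\s\S]*?)\)\+/g;
    let m: RegExpExecArray | null;
    while ((m = pattern.exec(input)) !== null) {
        matches.push(m[1]);
    }
    // Assumption: the regex and the type agree on what counts as a match.
    return matches as unknown as X<T>[];
}

const found = extractAll('g*(a12)+gggggg*(h23)+gggggg*(5hgf)+');
//    ^? const found: ("a12" | "h23" | "5hgf")[]
console.log(found); // [ 'a12', 'h23', '5hgf' ]
```

The double cast through unknown is a deliberate shortcut here, since TypeScript cannot relate a string[] produced by a regular expression to the deferred conditional type X<T>.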