
Separate multiple subdomains into all possible subdomain combinations using bash and awk


I'm trying to separate multiple subdomains into all possible subdomain combinations using bash.

For example, if subdomains.txt contains:

www.ir.example.com
www.it.api4.qa.example.com
www.api.example2.com

The expected output has to be:

example.com
ir.example.com
www.ir.example.com
qa.example.com
api4.qa.example.com
it.api4.qa.example.com
example2.com
api.example2.com
www.api.example2.com

I think the best idea is to use the . to separate the subdomains without breaking the original domain, but I'm not sure how to achieve this. Any help would be great.


Solution

  • Using awk:

    awk 'BEGIN{FS=OFS="."}           # Split and join fields on a dot
         {
            for(i=1;i<NF;i++) {      # i = leftmost label to keep (never the last label alone)
              for(j=i;j<NF;j++)      # Append labels i..NF-1, each followed by a dot
                d=d $j OFS           # d is the domain built so far
              a[d $NF]               # Referencing a[...] creates the key, so duplicates collapse
              d=""                   # Reset the domain for the next pass
            }
         }
         END{
           for(i in a)               # Loop over the unique names (order is unspecified)
             print i                 # and print each one
         }' subdomains.txt
    

    Note that the array a is used to keep each domain name unique (so example.com is not printed twice).

    Note also that the output is not sorted; pipe the command through sort if needed.
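If the order in the question matters (shortest name first, in input order), a variant along these lines may help. It is only a sketch: the function name `subdomain_combinations` and the `seen` array are made up for illustration, and it prints each name the first time it appears instead of collecting everything for an END block. It also prints the full name www.it.api4.qa.example.com, which the expected output in the question leaves out.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print every dot-separated suffix with at least two
# labels, shortest first, skipping names that were already printed.
subdomain_combinations() {
  awk -F. '
    {
      for (i = NF - 1; i >= 1; i--) {      # i = leftmost label of this suffix
        d = $NF                            # start from the last label
        for (j = NF - 1; j >= i; j--)      # prepend labels moving leftwards
          d = $j "." d
        if (!seen[d]++)                    # true only the first time d is seen
          print d
      }
    }' "$@"
}

# Demo with the sample data from the question
printf "%s\n" www.ir.example.com www.it.api4.qa.example.com www.api.example2.com |
  subdomain_combinations
```

With a file, it would be called as `subdomain_combinations subdomains.txt`. Lines without a dot produce no output, since the outer loop never runs when NF is less than 2.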