I need to remove redundant www.-prefixed lines from an ever-growing, HUGE list of domains: a www.X line is redundant when X itself is also in the list. Here's a sample:
# Type 1
domain1.tld
# Type 2
domain2.tld
www.domain2.tld
# Type 3
www.domain3.tld
sub.domain3.tld
foo.domain3.tld
www.sub.domain3.tld
# Expected
domain1.tld
domain2.tld
www.domain3.tld
sub.domain3.tld
foo.domain3.tld
The only thing that worked takes forever, since the list already contains more than 2 million lines:
cp 1.txt 2.txt
while read -r line; do
    # delete any line that is exactly "www." followed by $line
    # (anchored; dots inside $line are still treated as regex dots)
    sed -i "/^www\.${line}\$/d" 2.txt
done < 1.txt
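For scale: the loop above rewrites 2.txt once per input line, so it makes roughly N passes over an N-line file; with 2 million lines that is quadratic work. As a point of comparison (a sketch using GNU grep, not one of the answers below), the same exact-line filter can be done in a single fixed-string pass:

```shell
# Prefix every line with "www." to build the deletion set, then drop
# exact (-x) fixed-string (-F) matches of that set from the original file.
sed 's/^/www./' 1.txt | grep -Fxv -f - 1.txt > 2.txt
```

This loads the 2-million-entry pattern set into memory, but one pass over the file beats millions of sed passes.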
I'm using GNU utils and already fooled around with sed, awk, comm to no avail.
How can this be done?
#! /bin/bash
awk -F. '{
    if ($1 != "www") {
        arr[$0] = 1                        # remember every line not starting with "www"
    } else if (arr[substr($0, 5)] == 1) {
        next                               # stripped of "www.", the line is already known: skip it
    }
    print
}' file
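A condensed run of the same logic on the question's sample (comment lines removed) produces the expected output; note that it relies on each bare domain appearing before its www. twin, which happens to hold in the sample:

```shell
printf '%s\n' domain1.tld domain2.tld www.domain2.tld \
    www.domain3.tld sub.domain3.tld foo.domain3.tld www.sub.domain3.tld |
awk -F. '{
    if ($1 != "www") { arr[$0] = 1 }
    else if (arr[substr($0, 5)] == 1) { next }
    print
}'
# domain1.tld
# domain2.tld
# www.domain3.tld
# sub.domain3.tld
# foo.domain3.tld
```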
Check this out, although I am not sure how it will hold up against 2 million records.
UPDATE:
Explanation: The awk expression uses . as the field separator, so if the line is www.sub.domain3.tld, then $1=www, $2=sub, and so on.
It flags every line which doesn't start with www by making it an index into the array arr. Suppose the line is sub.domain3.tld: it creates the index arr["sub.domain3.tld"] and stores 1 in it. Now for every line starting with www., it strips the www. prefix and checks whether the remaining line is stored in the array; if it is, the line is not printed. (Note this only works when the bare domain appears before its www. counterpart in the input; the versions below remove that requirement.)
UPDATE:
This version produces the same result regardless of the order in which the input lines appear, although the output comes out in arbitrary order:
#! /bin/bash
awk -F. '{
    if ($1 != "www") {
        domains["www." $0] = 0   # pre-mark this line'\''s www. twin as redundant
        domains[$0] = 1          # keep the bare domain
    } else if (domains[$0] == "") {
        domains[$0] = 1          # www. line with no bare twin seen (yet): keep it
    }
}
END {
    for (domain in domains) {
        if (domains[domain]) { print domain }
    }
}' file
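For example, with the www. line first (an ordering where the first script would fail), a condensed form of this program still drops the duplicate:

```shell
printf '%s\n' www.domain2.tld domain2.tld domain1.tld |
awk -F. '{
    if ($1 != "www") { domains["www." $0] = 0; domains[$0] = 1 }
    else if (domains[$0] == "") { domains[$0] = 1 }
}
END { for (d in domains) if (domains[d]) print d }'
# prints domain1.tld and domain2.tld, in arbitrary order
```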
This version produces the result in the original input sequence, again independent of the order in which the input is supplied:
#! /bin/bash
awk -F. '{
    if ($1 != "www") {
        redundant_domains["www." $0] = 1   # this line'\''s www. twin is redundant
    }
    domains[NR] = $0                       # remember every line in input order
}
END {
    for (i = 1; i <= NR; ++i) {            # i <= NR, or the last line is dropped
        if (!redundant_domains[domains[i]]) { print domains[i] }
    }
}' file
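Run on input where the www. twin comes first, a condensed form keeps input order while still dropping the duplicate (note the i <= NR loop bound, so the final line is not lost):

```shell
printf '%s\n' www.domain2.tld domain1.tld domain2.tld |
awk -F. '{
    if ($1 != "www") { redundant["www." $0] = 1 }
    domains[NR] = $0
}
END { for (i = 1; i <= NR; ++i) if (!redundant[domains[i]]) print domains[i] }'
# domain1.tld
# domain2.tld
```

Since every line is held in memory until END, expect the awk process to hold all 2 million lines at once; that is still far cheaper than the per-line sed rewrites.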