
How can I filter the domains served by a CDN from a list of domain names?


I have a list of domains and I need to pick out the ones served by a CDN (Content Delivery Network). I plan to do this with a Python script. At first I thought I could identify them from the domain name itself, but not all CDN-served domain names contain the keyword "cdn".

Is there any property or feature of CDN-served domains that I can use to identify them?


Solution

  • First of all, you can't do it with 100% accuracy.

    But in many cases you can identify domains that use popular cloud providers by following their CNAME records, which lead to the respective provider's servers. For example, here is the Amazon CloudFront documentation on CNAMEs: http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/CNAMEs.html

    In CloudFront, an alternate domain name, also known as a CNAME, lets you use your own domain name (for example, www.example.com) for links to your objects instead of using the domain name that CloudFront assigns to your distribution.

    Example:

    dig -t CNAME c.amazon-adsystem.com
    c.amazon-adsystem.com.  896     IN      CNAME   d1ykf07e75w7ss.cloudfront.net.
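
The same check can be sketched in Python. This is a minimal sketch, not a complete detector: the `CDN_SUFFIXES` table and the function names are hypothetical, the table lists only a few well-known provider suffixes, and it assumes that `socket.gethostbyname_ex` returns the canonical (CNAME-resolved) name as its first element, which depends on the system resolver. Domains that sit behind a CDN without a CNAME (for example, ones pointed at the provider by A records) will be missed.

```python
import socket

# Hypothetical, deliberately incomplete table: hostname suffix -> provider.
# Extend it with the providers you care about.
CDN_SUFFIXES = {
    ".cloudfront.net": "Amazon CloudFront",
    ".fastly.net": "Fastly",
    ".akamaiedge.net": "Akamai",
    ".edgekey.net": "Akamai",
}


def cdn_for_hostname(name):
    """Return the provider whose suffix matches `name`, or None."""
    name = name.rstrip(".").lower()  # dig prints a trailing dot
    for suffix, provider in CDN_SUFFIXES.items():
        if name.endswith(suffix):
            return provider
    return None


def canonical_name(domain):
    """Ask the system resolver for the canonical name of `domain`.

    Returns None if the lookup fails. Whether the CNAME target is
    reported here depends on the platform's resolver (assumption).
    """
    try:
        canonical, _aliases, _addresses = socket.gethostbyname_ex(domain)
    except OSError:  # covers socket.gaierror and socket.herror
        return None
    return canonical


def filter_cdn_domains(domains, resolve=canonical_name):
    """Return {domain: provider} for domains whose canonical name matches."""
    found = {}
    for domain in domains:
        target = resolve(domain)
        provider = cdn_for_hostname(target) if target else None
        if provider:
            found[domain] = provider
    return found
```

Calling `filter_cdn_domains(["c.amazon-adsystem.com", "example.com"])` performs live DNS lookups, so the result depends on the current DNS records. The `resolve` parameter is there so that a different lookup (for instance one that queries CNAME records directly with a DNS library) can be swapped in.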