amazon-web-services amazon-s3 amazon-cloudfront cdn

Is there any way to know more about "Not specified" CloudFront referrers? Are they a bad sign?


When looking at CloudFront logs, the most common referrer for my photo service CDN is Not Specified. I'm trying to understand more about why this is and possibly where the requests are coming from.

We are storing CloudFront logs for a photo service CDN. A client has asked exactly where their data charges are coming from. I was able to run a report through Athena that lists each unique referrer and the amount of data delivered to it in the last month.

The #2 most costly referrer in terms of bytes is "-". I believe CloudFront calls this referrer "Not specified", and it is our #1 referrer in terms of request count when checking CloudFront.

I am assuming users are not accessing these URLs directly at this rate.

I have a few questions about this.

  1. Is there any way to get more information about where these particular requests originate from?

  2. Is there a common known source that would not provide a referrer? Am I misunderstanding what the referrer means here?

  3. Is this a bad sign?


Solution

  • Is there a common known source that would not provide a referrer? Am I misunderstanding what the referrer means here?

    I'll start here because this is important...

    The Referer header (the RFC uses this misspelling) identifies the page from which a request was initiated, for example by following an anchor or loading an img tag. In many cases no Referer header is sent at all, including when a website sets a Referrer-Policy that instructs the browser not to send it when retrieving resources or opening links.

    You have no control over whether browsers visiting other websites send the Referer header. The header may also be omitted when the resource is accessed directly, for example from a URL typed into the address bar or from a script or native app. If you allow content to be accessed by POST requests, that may also cause the header to be missing.

    Some browsers (or browser extensions) will omit or obscure this header for privacy reasons.

    See also: Referer header: privacy and security concerns (Mozilla)

    Is this a bad sign?

    If by "bad sign" you mean an indication of malicious actors, no, not necessarily. If by "bad sign" you mean you can't provide robust and complete analytics for your site, yes.

    Is there any way to get more information about where these particular requests originate from?

    If your logs include the source IP address, you can run a reverse DNS lookup on it or geolocate it. See also: Headers for determining the viewer's location (CloudFront documentation).
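
As a rough illustration of that last point, here is a minimal Python sketch that groups the requests with no referrer by client IP and user agent, so you can see which sources account for the most bytes. It assumes CloudFront standard logs (tab-separated rows preceded by a `#Fields:` header line, with `-` for a missing value and a URL-encoded user agent). The function name `top_sources_without_referer` and the sample rows are hypothetical, and real logs contain more columns than shown here.

```python
import io
from collections import Counter
from urllib.parse import unquote


def top_sources_without_referer(log_text, limit=10):
    """Sum sc-bytes per (client IP, user agent) for rows whose referer is '-'."""
    fields = None
    totals = Counter()
    for line in io.StringIO(log_text):
        line = line.rstrip("\n")
        if line.startswith("#Fields:"):
            # Column names come from the log's own header, not fixed positions.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or fields is None:
            continue
        row = dict(zip(fields, line.split("\t")))
        if row.get("cs(Referer)") != "-":
            continue  # only keep requests that carried no referer
        key = (row["c-ip"], unquote(row["cs(User-Agent)"]))
        totals[key] += int(row["sc-bytes"])
    return totals.most_common(limit)


# Invented sample data (documentation IP ranges, hypothetical host and clients).
_FIELDS = ("date time x-edge-location sc-bytes c-ip cs-method cs(Host) "
           "cs-uri-stem sc-status cs(Referer) cs(User-Agent)")
_ROWS = [
    ("2024-01-01", "00:00:01", "IAD89-C1", "5000", "203.0.113.10", "GET",
     "cdn.example.com", "/a.jpg", "200", "-", "Example%20App/1.0"),
    ("2024-01-01", "00:00:02", "IAD89-C1", "3000", "203.0.113.10", "GET",
     "cdn.example.com", "/b.jpg", "200", "-", "Example%20App/1.0"),
    ("2024-01-01", "00:00:03", "IAD89-C1", "7000", "198.51.100.7", "GET",
     "cdn.example.com", "/a.jpg", "200", "https://example.com/gallery",
     "Mozilla/5.0"),
    ("2024-01-01", "00:00:04", "IAD89-C1", "1000", "198.51.100.7", "GET",
     "cdn.example.com", "/c.jpg", "200", "-", "curl/8.0"),
]
SAMPLE_LOG = "\n".join(
    ["#Version: 1.0", "#Fields: " + _FIELDS] + ["\t".join(r) for r in _ROWS]
)

if __name__ == "__main__":
    for (ip, agent), total_bytes in top_sources_without_referer(SAMPLE_LOG):
        print(ip, agent, total_bytes)
```

The user agent is often the most telling column: a native app, a crawler, or a command-line client will usually identify itself there even though it sends no referrer. For the heaviest IP addresses, a reverse DNS lookup (for example with `socket.gethostbyaddr`) may hint at the hosting provider or network behind them.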