Nginx - Hide/mask/change value in logs based on Regex match


My API requires an email address in the request path as the identifier of the resource, for example:

/api/users/[email protected]

Due to concerns with Personally Identifiable Information (PII) being stored in the logs, I'm looking for a method to turn the following log:

{ip_address} - - [04/Mar/2021:11:22:22 +0000] "GET /api/users/[email protected] HTTP/1.1" 200 961 "{request_from}" etc.

Into something like the following:

{ip_address} - - [04/Mar/2021:11:22:22 +0000] "GET /api/users/*email_redacted* HTTP/1.1" 200 961 "{request_from}" etc.

I mention a regex match in the title because it seems like the obvious way to detect that an email address is present in the log line.

I am very new to Nginx, so straightforward, concise responses would be really appreciated. Many thanks in advance!


Solution

  • The access log is controlled by the access_log and log_format directives (see this document for details).

    By default, the access log records the value of the $request variable, which contains the string you wish to change.
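
    For reference, this is how the nginx documentation lists the predefined combined format, which is used when access_log names no format. It is shown only to point out where $request sits in the line; the format is built in and should not be redeclared in your own configuration.

    ```nginx
    # Built-in format, shown for illustration only (do not redeclare it).
    log_format combined '$remote_addr - $remote_user [$time_local] '
                        '"$request" $status $body_bytes_sent '
                        '"$http_referer" "$http_user_agent"';
    ```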

    You cannot edit $request itself, but you can use a map directive to derive a redacted copy of it in a new variable, and a log_format directive to define a new log format that uses the redacted value. See this document for details.

    For example:

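    # map and log_format both belong in the http context.
    map $request $redacted {
        # No match: log the request line unchanged.
        default $request;
        # Match: keep the text on either side and replace the matched part.
        ~^(?<prefix>.*)pattern(?<suffix>.*)$ $prefix*email_redacted*$suffix;
    }
    
    # Custom format that logs $redacted where $request would normally appear.
    log_format redacted '$remote_addr - $remote_user [$time_local] '
        '"$redacted" $status $bytes_sent '
        '"$http_referer" "$http_user_agent" "$gzip_ratio"';
    
    access_log /var/log/nginx/access.log redacted;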
    

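    Replace "pattern" above with a regular expression that matches an email address. If the expression contains curly braces or a semicolon, enclose the whole map key in quotes so that nginx can parse it.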
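
    As a rough sketch, the map below swaps "pattern" for a deliberately simplified email expression. The expression is an assumption for illustration, not a complete matcher for every legal address: it only looks at a single path segment, and it also accepts %40 in case the client percent-encodes the @ sign. The log_format here mirrors nginx's predefined combined format rather than the format above, since the log line in the question looks like combined.

    ```nginx
    # Goes in the http context.
    map $request $redacted {
        default $request;
        # Simplified email expression (an assumption, not a full address grammar).
        # Quoted because it contains curly braces.
        # prefix: everything up to the slash before the address; suffix: the rest.
        "~^(?<prefix>.*/)[^/@ ]+(?:@|%40)[^/@ ]+\.[A-Za-z]{2,}(?<suffix>.*)$" "$prefix*email_redacted*$suffix";
    }

    # Same fields as the predefined combined format, with $redacted for $request.
    log_format redacted '$remote_addr - $remote_user [$time_local] '
                        '"$redacted" $status $body_bytes_sent '
                        '"$http_referer" "$http_user_agent"';

    access_log /var/log/nginx/access.log redacted;
    ```

    Under those assumptions, a hypothetical request line such as GET /api/users/jane@example.com HTTP/1.1 should be logged as GET /api/users/*email_redacted* HTTP/1.1. Running nginx -t after editing checks the configuration syntax before you reload. Note that this only redacts the request line; an address that appears in another logged field, such as the referer, would need its own map.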