My API requires an email address in the request path as the identifier of the resource, for example:
/api/users/[email protected]
Due to concerns with Personally Identifiable Information (PII) being stored in the logs, I'm looking for a method to turn the following log:
{ip_address} - - [04/Mar/2021:11:22:22 +0000] "GET /api/users/[email protected] HTTP/1.1" 200 961 "{request_from}" etc.
Into something like the following:
{ip_address} - - [04/Mar/2021:11:22:22 +0000] "GET /api/users/*email_redacted* HTTP/1.1" 200 961 "{request_from}" etc.
I mentioned in the title doing this based on a Regex match as it seems like the obvious way to detect that an email is in the initial log.
I am very new to Nginx, so straightforward, concise responses would be really appreciated. Many thanks in advance!
The access log is controlled by the access_log and log_format directives (see this document for details).
By default, the access log records the value of the $request variable, which contains the string you wish to change.
You can use a map block to derive a redacted copy of the $request variable (the original variable itself is not modified), and a log_format directive to define a new format for the logfile which uses the redacted value. See this document for details.
For example:
map $request $redacted {
    default $request;
    # Lazy prefix (.*?) so that the email pattern, not the prefix, consumes the whole address.
    ~^(?<prefix>.*?)pattern(?<suffix>.*)$ $prefix*email_redacted*$suffix;
}
log_format redacted '$remote_addr - $remote_user [$time_local] '
                    '"$redacted" $status $bytes_sent '
                    '"$http_referer" "$http_user_agent" "$gzip_ratio"';
access_log /var/log/nginx/access.log redacted;
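Replace "pattern" above with a regular expression that matches any legal email address. If the expression contains "{", "}" or ";" characters, enclose the whole regular expression (including the leading ~) in quotes. The map block must be placed in the http context.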
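As a rough illustration, here is a sketch of the map block with a concrete pattern filled in. The email expression is my own simplification and an assumption, not a validator for every legal address: it treats any run of characters without "/", whitespace or "@", followed by "@" (or its URL-encoded form "%40"), followed by a run without "/", whitespace or "?", as an email. Only the first match in the request line is redacted.

```nginx
# Goes in the http context. The email pattern is a simplified assumption,
# not a full RFC 5322 matcher; adjust it to the addresses your API accepts.
map $request $redacted {
    default $request;

    # prefix is lazy (.*?) so the match starts at the beginning of the address.
    # (?:@|%40) also covers clients that send the "@" URL-encoded.
    ~*^(?<prefix>.*?)[^/\s@]+(?:@|%40)[^/\s?]+(?<suffix>.*)$ $prefix*email_redacted*$suffix;
}
```

With this sketch, a request line with an address after /api/users/ should be logged as "GET /api/users/*email_redacted* HTTP/1.1", while request lines without an "@" or "%40" fall through to the default and are logged unchanged.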