When CopyLeaks sends the "completed" webhook to my server, I send an export request like the following (note that I'm specifying "headers"):
{
  "completionWebhook": "***REDACTED***/copyleaks/webhook/exported/56",
  "crawledVersion": {
    "endpoint": "***REDACTED***/copyleaks/webhook/crawled/56",
    "headers": [
      [
        "Authorization",
        "PRECI ***REDACTED***"
      ]
    ],
    "verb": "POST"
  },
  "results": [
    {
      "endpoint": "***REDACTED***/copyleaks/webhook/result/56/2a1b402420",
      "headers": [
        [
          "Authorization",
          "PRECI ***REDACTED***"
        ]
      ],
      "id": "2a1b402420",
      "verb": "POST"
    }
  ]
}
However, I don't see that "Authorization" header when CopyLeaks submits the results to the indicated URLs. The requests are received by my Django website. This is what Django shows for the headers of one such request:
{'UNIQUE_ID': 'YO5jaCH4wN9w5X3ozii3nQAAAAY', 'SSL_TLS_SNI': 'dev.earlychildhoodeducator.com', 'GATEWAY_INTERFACE': 'CGI/1.1', 'SERVER_PROTOCOL': 'HTTP/1.1', 'REQUEST_METHOD': 'POST', 'QUERY_STRING': '', 'REQUEST_URI': '/copyleaks/webhook/crawled/56', 'SCRIPT_NAME': '', 'PATH_INFO': '/copyleaks/webhook/crawled/56', 'PATH_TRANSLATED': '***REDACTED***/wsgi.py/copyleaks/webhook/crawled/56', 'HTTP_CONNECTION': 'keep-alive', 'HTTP_KEEP_ALIVE': '600', 'HTTP_ACCEPT_ENCODING': 'gzip', 'CONTENT_LENGTH': '179', 'HTTP_HOST': 'dev.earlychildhoodeducator.com', 'SERVER_SIGNATURE': '', 'SERVER_SOFTWARE': 'Apache/2.4.6 (CentOS) OpenSSL/1.0.2k-fips mod_wsgi/4.6.2 Python/3.6', 'SERVER_NAME': 'dev.earlychildhoodeducator.com', 'SERVER_ADDR': '10.68.195.10', 'SERVER_PORT': '443', 'REMOTE_ADDR': '35.239.30.65', 'DOCUMENT_ROOT': '/var/data/websites/dev.earlychildhoodeducator.com/static', 'REQUEST_SCHEME': 'https', 'CONTEXT_PREFIX': '', 'CONTEXT_DOCUMENT_ROOT': '/var/data/websites/dev.earlychildhoodeducator.com/static', 'SERVER_ADMIN': 'tech@earlychildhoodeducator.com', 'SCRIPT_FILENAME': '/var/data/websites/dev.earlychildhoodeducator.com/main/wsgi.py', 'REMOTE_PORT': '39827', 'mod_wsgi.script_name': '', 'mod_wsgi.path_info': '/copyleaks/webhook/crawled/56', 'mod_wsgi.process_group': 'pacrimdev', 'mod_wsgi.application_group': 'dev.earlychildhoodeducator.com|', 'mod_wsgi.callable_object': 'application', 'mod_wsgi.request_handler': 'wsgi-script', 'mod_wsgi.handler_script': '', 'mod_wsgi.script_reloading': '1', 'mod_wsgi.listener_host': '', 'mod_wsgi.listener_port': '443', 'mod_wsgi.enable_sendfile': '0', 'mod_wsgi.ignore_activity': '0', 'mod_wsgi.request_start': '1626235752847504', 'mod_wsgi.request_id': 'YO5jaCH4wN9w5X3ozii3nQAAAAY', 'mod_wsgi.queue_start': '1626235752847648', 'mod_wsgi.daemon_connects': '1', 'mod_wsgi.daemon_restarts': '0', 'mod_wsgi.daemon_start': '1626235752847734', 'mod_wsgi.script_start': '1626235752847820', 'wsgi.version': (1, 0), 'wsgi.multithread': True, 
'wsgi.multiprocess': False, 'wsgi.run_once': False, 'wsgi.url_scheme': 'https', 'wsgi.errors': <_io.TextIOWrapper name='<wsgi.errors>' encoding='utf-8'>, 'wsgi.input': <mod_wsgi.Input object at 0x7f7e765b3dc0>, 'wsgi.input_terminated': True, 'wsgi.file_wrapper': <class 'mod_wsgi.FileWrapper'>, 'apache.version': (2, 4, 6), 'mod_wsgi.version': (4, 6, 2), 'mod_wsgi.total_requests': 4, 'mod_wsgi.thread_id': 3, 'mod_wsgi.thread_requests': 0}
Is there a problem with my export request to CopyLeaks? Or does CopyLeaks perhaps not send the requested HTTP headers when using the sandbox?
It turned out to be an issue on my Django side, not with CopyLeaks. Headers with an underscore in their name are stripped before they reach the Django app (see https://code.djangoproject.com/ticket/25048), and my HTTP headers had one. The "Authorization" header is dropped as well, although I believe that one is removed by the web server rather than by Django itself: it is assumed to be used only by the web server (e.g. Apache), and mod_wsgi does not pass it on to the application unless WSGIPassAuthorization On is set. So I changed my HTTP header name to "PRECI", and I can now see it arriving in my Django app with the CopyLeaks webhook request.
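To illustrate why header names with underscores tend to get stripped, here is a minimal sketch in plain Python (no Django required) of the usual WSGI/CGI naming convention for request headers in the environ dict. The helper names `wsgi_environ_key` and `find_collisions` are hypothetical and only for illustration, and the example header names other than "PRECI" are assumptions, not the ones I actually used.

```python
def wsgi_environ_key(header_name):
    """Return the WSGI/CGI environ key for an HTTP request header name.

    The convention is: upper-case the name, turn hyphens into underscores,
    and add an HTTP_ prefix. (Content-Type and Content-Length are the
    exceptions: they appear without the HTTP_ prefix.)
    """
    return "HTTP_" + header_name.upper().replace("-", "_")


def find_collisions(header_names):
    """Group header names that end up under the same environ key."""
    by_key = {}
    for name in header_names:
        by_key.setdefault(wsgi_environ_key(name), []).append(name)
    # Keep only the keys that more than one header name maps to.
    return {key: names for key, names in by_key.items() if len(names) > 1}


if __name__ == "__main__":
    # A header without hyphens or underscores maps cleanly.
    print(wsgi_environ_key("PRECI"))  # HTTP_PRECI

    # A hyphenated and an underscored spelling are indistinguishable once
    # converted, which is why underscored names are commonly dropped.
    print(find_collisions(["X-Auth-Token", "X_Auth_Token", "PRECI"]))
```

With this convention, the renamed header should show up in a Django view as `request.META["HTTP_PRECI"]` (or `request.headers["PRECI"]` on Django 2.2 and later).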