php wordpress .htaccess

How to set a header field when the response from another site is empty?


Currently we are using GTranslate to translate our WordPress site.

Its plugin adds the following rules to .htaccess:

### BEGIN GTranslate config ###
RewriteRule ^(af|sq|am|ar|hy|az|eu|be|bn|bs|bg|ca|ceb|ny|zh-CN|zh-TW|co|hr|cs|da|nl|en|eo|et|tl|fi|fr|fy|gl|ka|de|el|gu|ht|ha|haw|iw|hi|hmn|hu|is|ig|id|ga|it|ja|jw|kn|kk|km|ko|ku|ky|lo|la|lv|lt|lb|mk|mg|ms|ml|mt|mi|mr|mn|my|ne|no|ps|fa|pl|pt|pa|ro|ru|sm|gd|sr|st|sn|sd|si|sk|sl|so|es|su|sw|sv|tg|ta|te|th|tr|uk|ur|uz|vi|cy|xh|yi|yo|zu)/(af|sq|am|ar|hy|az|eu|be|bn|bs|bg|ca|ceb|ny|zh-CN|zh-TW|co|hr|cs|da|nl|en|eo|et|tl|fi|fr|fy|gl|ka|de|el|gu|ht|ha|haw|iw|hi|hmn|hu|is|ig|id|ga|it|ja|jw|kn|kk|km|ko|ku|ky|lo|la|lv|lt|lb|mk|mg|ms|ml|mt|mi|mr|mn|my|ne|no|ps|fa|pl|pt|pa|ro|ru|sm|gd|sr|st|sn|sd|si|sk|sl|so|es|su|sw|sv|tg|ta|te|th|tr|uk|ur|uz|vi|cy|xh|yi|yo|zu)/(.*)$ /$1/$3 [R=301,L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(af|sq|am|ar|hy|az|eu|be|bn|bs|bg|ca|ceb|ny|zh-CN|zh-TW|co|hr|cs|da|nl|en|eo|et|tl|fi|fr|fy|gl|ka|de|el|gu|ht|ha|haw|iw|hi|hmn|hu|is|ig|id|ga|it|ja|jw|kn|kk|km|ko|ku|ky|lo|la|lv|lt|lb|mk|mg|ms|ml|mt|mi|mr|mn|my|ne|no|ps|fa|pl|pt|pa|ro|ru|sm|gd|sr|st|sn|sd|si|sk|sl|so|es|su|sw|sv|tg|ta|te|th|tr|uk|ur|uz|vi|cy|xh|yi|yo|zu)/(.*)$ /wp-content/plugins/gtranslate/url_addon/gtranslate.php?glang=$1&gurl=$2 [L,QSA]
RewriteRule ^(af|sq|am|ar|hy|az|eu|be|bn|bs|bg|ca|ceb|ny|zh-CN|zh-TW|co|hr|cs|da|nl|en|eo|et|tl|fi|fr|fy|gl|ka|de|el|gu|ht|ha|haw|iw|hi|hmn|hu|is|ig|id|ga|it|ja|jw|kn|kk|km|ko|ku|ky|lo|la|lv|lt|lb|mk|mg|ms|ml|mt|mi|mr|mn|my|ne|no|ps|fa|pl|pt|pa|ro|ru|sm|gd|sr|st|sn|sd|si|sk|sl|so|es|su|sw|sv|tg|ta|te|th|tr|uk|ur|uz|vi|cy|xh|yi|yo|zu)$ /$1/ [R=301,L]
### END GTranslate config ###

I checked, and it seems the rules just rewrite a multilanguage URL like https://www.example.com/de/ to https://www.example.com/wp-content/plugins/gtranslate/url_addon/gtranslate.php?glang=de&gurl=https://www.example.com

Then gtranslate.php contacts the GTranslate server to do the translation and returns the translated page. Below are the response headers:

[Image: response headers including the x-gt fields]

However, sometimes the GTranslate server returns a blank page. In that case, the x-gt fields are all missing from the response headers, as below:

[Image: response headers without the x-gt fields]

Since we use Cloudflare for our website, we want to add a rule in .htaccess so that the translated page is not cached when the response headers do not contain the x-gt fields, like below:

Header set CDN-Cache-Control "no-cache, must-revalidate, max-age=0" "xxx"
Header set Cloudflare-CDN-Cache-Control "no-cache, must-revalidate, max-age=0" "xxx"

But I don't know how to write the condition for the case when the x-gt fields are missing.

Update:

Following the instructions, I modified the code as below:

Original code from gtranslate.php:

$response_headers = explode(PHP_EOL, $header);
//print_r($response_headers);
$headers_sent = '';
foreach($response_headers as $header) {
    if(!empty(trim($header)) and !preg_match('/Content\-Length:|Transfer\-Encoding:|Content\-Encoding:|Link:/i', $header)) {

        if(preg_match('/^(Location|Refresh):/i', $header)) {
            $header = str_ireplace($host, $_SERVER['HTTP_HOST'] . '/' . $glang, $header);
            $header = str_ireplace('Location: /', 'Location: /' . $glang . '/', $header);
            $header = str_replace('/' . $glang . '/' . $glang . '/', '/' . $glang . '/', $header);
        }

        // woocommerce cookie path fix
        if(preg_match('/^Set-Cookie:/i', $header) and strpos($header, 'woocommerce') !== false) {
            $header = preg_replace('/path=\/.*\/;/', 'path=/;', $header);
        }

        $headers_sent .= $header;
        header($header, false);
    }
}
//echo $headers_sent;

Modified code from gtranslate.php (changed parts are NOT indented):

//  2023-11-08: Check if both the header & html are valid
//
//  For the header, check whether the data was retrieved from the GTranslate server or not.
//  Based on our tests, X-GT-OrigURL or x-gt-delivered-by may NOT always appear.
//  But there will always be one field containing 'x-gt-', so we use this as the criterion
//
//  For html content, we will check if it contains <body> tag
//
$has_valid_contents = (stripos($header, 'x-gt-') !== false) and (stripos($html, '<body>') !== false);
                
        $response_headers = explode(PHP_EOL, $header);
        //print_r($response_headers);
        $headers_sent = '';
        foreach($response_headers as $header) {
//  2023-11-08: If there are no valid contents, skip the Cache-Control, CDN-Cache-Control and Cloudflare-CDN-Cache-Control header fields
//  2023-11-09: Based on https://stackoverflow.com/questions/77447697/how-header-in-php-replace-a-header-that-has-been-sent, it is better
//  NOT to rely on replace=true to replace the header fields; instead, we use logic to avoid sending the similar header fields
if(!$has_valid_contents and !empty(trim($header)) and (preg_match('/Cache\-Control:|CDN\-Cache\-Control:|Cloudflare\-CDN\-Cache\-Control:/i', $header) === 1)) {
    continue;
}
        
                    //  2023-11-08: Processing the other headers as before
                if(!empty(trim($header)) and !preg_match('/Content\-Length:|Transfer\-Encoding:|Content\-Encoding:|Link:/i', $header)) {

                    if(preg_match('/^(Location|Refresh):/i', $header)) {
                        $header = str_ireplace($host, $_SERVER['HTTP_HOST'] . '/' . $glang, $header);
                        $header = str_ireplace('Location: /', 'Location: /' . $glang . '/', $header);
                        $header = str_replace('/' . $glang . '/' . $glang . '/', '/' . $glang . '/', $header);
                    }

                    // woocommerce cookie path fix
                    if(preg_match('/^Set-Cookie:/i', $header) and strpos($header, 'woocommerce') !== false) {
                        $header = preg_replace('/path=\/.*\/;/', 'path=/;', $header);
                    }

                    $headers_sent .= $header;
                    header($header, false);
                }
            }

//  2023-11-08: If the contents are not valid, add the following 3 header fields to prevent the page from being cached
if(!$has_valid_contents) {
    //  Set the optional replace parameter to true to replace the previous Cache-Control, CDN-Cache-Control and Cloudflare-CDN-Cache-Control.
    //  As replace may not work if the header has been sent, we also use logic to avoid sending similar header fields.
    header('Cache-Control: no-cache, must-revalidate, max-age=0', true);
    header('CDN-Cache-Control: no-cache, must-revalidate, max-age=0', true);
    header('Cloudflare-CDN-Cache-Control: no-cache, must-revalidate, max-age=0', true);
}

            //echo $headers_sent;

//  2023-11-08: Dump headers_sent to logs.txt
$dump_logs = 1;

if (!$has_valid_contents and $dump_logs)    {
    $fh_logs = fopen(dirname(__FILE__).'/logs.txt', 'a');
    fwrite($fh_logs, 'has_valid_contents:'.var_export($has_valid_contents, true).PHP_EOL);
    fwrite($fh_logs, 'headers_list:'.print_r(headers_list(), true).PHP_EOL);
    fwrite($fh_logs, 'html:'.(empty($html) ? 'empty' : print_r($html, true)).PHP_EOL.PHP_EOL);
    fclose($fh_logs);
}

Based on my tests with logs.txt, there are many cases where has_valid_contents is false.


Solution

  • I do not think the .htaccess file could directly interact with outgoing HTTP responses, especially those generated by an external server like GTranslate.

    That means you cannot conditionally set a header based on the response header from GTranslate within .htaccess. You would need server-side code.

    You can use WordPress hooks to examine the HTTP response from GTranslate and conditionally set the CDN-Cache-Control and Cloudflare-CDN-Cache-Control headers based on whether the x-gt fields are missing or not. You could use wp_remote_get or wp_remote_post to call the GTranslate service. Examine the headers of the returned HTTP response, and conditionally set the cache control headers using header().

    // Make a request to GTranslate service
    $response = wp_remote_get('GTranslate API URL');
    
    // Check if the request was successful
    if (is_wp_error($response)) {
        // Handle error
        return;
    }
    
    // Retrieve the headers from the response
    $headers = wp_remote_retrieve_headers($response);
    
    // Check if the x-gt fields are missing
    if (!isset($headers['x-gt-somefield'])) {
        // Conditionally set the headers for Cloudflare not to cache this response
        header('CDN-Cache-Control: no-cache, must-revalidate, max-age=0');
        header('Cloudflare-CDN-Cache-Control: no-cache, must-revalidate, max-age=0');
    }
    
    // Output the response body
    $body = wp_remote_retrieve_body($response);
    echo $body;
    

    Since you would be doing this from your PHP code, you would get full control over the HTTP response, including the ability to conditionally set headers based on the content of the response from GTranslate.
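
Because the question notes that no single x-gt field is guaranteed to appear, only that at least one field containing 'x-gt-' does, the isset() check on one fixed name could be swapped for a prefix scan. The following is a minimal sketch; has_gt_header is a hypothetical helper name, and it assumes the headers can be iterated as name => value pairs:

```php
<?php
// Hypothetical helper: true if any header name starts with "x-gt-"
// (case-insensitive). Accepts an array or any iterable of name => value.
function has_gt_header(iterable $headers): bool
{
    foreach ($headers as $name => $_value) {
        if (stripos((string) $name, 'x-gt-') === 0) {
            return true;
        }
    }
    return false;
}

// Illustrative header sets; the values are made up.
$translated = ['Content-Type' => 'text/html', 'X-GT-OrigURL' => '/page/'];
$blank      = ['Content-Type' => 'text/html'];

var_dump(has_gt_header($translated)); // bool(true)
var_dump(has_gt_header($blank));      // bool(false)
```

The no-cache headers would then be sent when has_gt_header() returns false.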

    See "WordPress Filter Hooks" to include this into your WordPress setup.

    That is, this solution is intended to be used in a custom WordPress hook or in your theme's functions.php file, separate from the GTranslate plugin code.

    For example, you could use the wp_loaded action hook to run your custom code after WordPress itself and all plugins are fully loaded. In your functions.php file, you could add something like:

    add_action('wp_loaded', 'my_custom_function');
    
    function my_custom_function() {
        // Your custom logic here, including making HTTP requests, inspecting the responses, 
        // and conditionally setting headers
    }
    

    In my_custom_function, you could use the wp_remote_get or wp_remote_post functions to communicate with the GTranslate API and then examine the response headers as outlined above.


    1. The original post, such as datanumen.com/outlook-repair, does not require contacting GTranslate, but your hook function will always contact the GTranslate server, which may slow down page loading.
    2. It is possible that your hook function gets a valid page (with x-gt fields), but a later call to gtranslate.php gets a blank page. That occurs randomly.

    Given your constraints and the possibility of inconsistent behavior from the GTranslate service, a solution that directly modifies the plugin's gtranslate.php is increasingly likely to be the best approach.

    If you are concerned about slowing down the page load and potential inconsistencies between different calls to the GTranslate server, you could use a two-step approach:

    1. Use a WordPress hook that is only triggered on posts that you know will involve translation (if such information is available before contacting GTranslate). In this initial phase, do not contact the GTranslate server. Instead, use it to flag the request for further processing.

    2. Modify gtranslate.php to look for this flag. If the flag is set, proceed with contacting the GTranslate service and conditionally set the CDN-Cache-Control and Cloudflare-CDN-Cache-Control headers based on the presence or absence of the x-gt fields.

    The key is to set some sort of flag or condition that can be checked without making an HTTP request to GTranslate. That flag would be checked in your modified gtranslate.php file to decide whether or not to make the HTTP request to GTranslate and set the cache control headers accordingly.
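
One flag that needs no HTTP request is the language prefix of the URL, since the rewrite rules in the question only route paths starting with a language code to gtranslate.php. The sketch below is only an illustration under that assumption; translated_lang is a hypothetical helper, the language list is shortened, and the path is assumed to carry no query string:

```php
<?php
// Hypothetical helper: returns the language code when the request path
// starts with a known language prefix (e.g. "/de/page"), otherwise null.
function translated_lang(string $path, array $langs): ?string
{
    // First path segment, e.g. "de" for "/de/outlook-repair/".
    $segments = explode('/', ltrim($path, '/'), 2);
    $first = $segments[0];

    return in_array($first, $langs, true) ? $first : null;
}

// Shortened, illustrative language list.
$langs = ['de', 'fr', 'es'];

var_dump(translated_lang('/de/outlook-repair/', $langs)); // string(2) "de"
var_dump(translated_lang('/outlook-repair/', $langs));    // NULL
```

A null result means the request is for an original page, so no call to GTranslate is needed.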


    I finally got time to modify the code in gtranslate.php. As I am not familiar with PHP, I posted the original code and the modified version in the update to the original post, to see if there are any errors. Also, if the modified version is correct, then others using GTranslate can benefit from it as well.

    A few comments on the new code:

    • The variable $header is used to represent both the entire header string and each header line in the loop.

      $response_headers = explode(PHP_EOL, $header);
      foreach($response_headers as $header) {
           // logic using $header ...
      }
      
    • The regular expression used could be optimized to avoid repetition.

      if(!$has_valid_contents and !empty(trim($header)) and (preg_match('/Cache\-Control:|CDN\-Cache\-Control:|Cloudflare\-CDN\-Cache\-Control:/i', $header) === 1)) {
           continue;
      }
      
    • The check for 'x-gt-' should be done before splitting headers or on individual headers.

      $has_valid_contents = (stripos($header, 'x-gt-') !== false) and (stripos($html, '<body>') !== false);
      
    • The method for validating the response from GTranslate is not entirely reliable.

      $has_valid_contents = (stripos($header, 'x-gt-') !== false) and (stripos($html, '<body>') !== false);
      
    • The code writes to logs.txt without checking if it is a debugging mode.

      if (!$has_valid_contents and $dump_logs) {
           // logic to write to logs.txt
      }
      
    • Headers are being replaced based on the $has_valid_contents variable, but the previous sending of headers is not checked properly.

      if(!$has_valid_contents) {
           header('Cache-Control: no-cache, must-revalidate, max-age=0', true);
           header('CDN-Cache-Control: no-cache, must-revalidate, max-age=0', true);
           header('Cloudflare-CDN-Cache-Control: no-cache, must-revalidate, max-age=0', true);
      }
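
One further detail about the $has_valid_contents line quoted above: in PHP, the assignment operator binds more tightly than `and`, so that statement assigns only the result of the 'x-gt-' check and the `<body>` check never reaches the variable. A small demonstration with made-up variable names:

```php
<?php
$headers_ok = true;   // stands in for the 'x-gt-' check
$body_ok    = false;  // stands in for the '<body>' check

// "=" binds tighter than "and": parsed as ($with_and = $headers_ok) and $body_ok;
$with_and = $headers_ok and $body_ok;

// "&&" binds tighter than "=": parsed as $with_amp = ($headers_ok && $body_ok);
$with_amp = $headers_ok && $body_ok;

var_dump($with_and); // bool(true)
var_dump($with_amp); // bool(false)
```

Using `&&`, or wrapping the whole right-hand side in parentheses, gives the intended result.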
      

    Taking those points into account, your new code would be:

    // Determine whether the response from GTranslate is valid.
    // $response_headers is the raw header string (named $header in the original gtranslate.php).
    $has_valid_contents = (stripos($response_headers, 'x-gt-') !== false) && (stripos($html, '<body>') !== false);
    
    $response_headers_array = explode(PHP_EOL, $response_headers);
    $headers_sent = '';
    
    foreach($response_headers_array as $response_header) {
        // Skip specific cache-related headers if the content is invalid
        if(!$has_valid_contents && !empty(trim($response_header)) && preg_match('/(Cache\-Control|CDN\-Cache\-Control|Cloudflare\-CDN\-Cache\-Control):/i', $response_header)) {
            continue;
        }
    
        // Processing other headers
        if(!empty(trim($response_header)) && !preg_match('/Content\-Length:|Transfer\-Encoding:|Content\-Encoding:|Link:/i', $response_header)) {
            // existing code for other headers
        }
    }
    
    // If the content is invalid, send headers to prevent caching
    if(!$has_valid_contents) {
        header('Cache-Control: no-cache, must-revalidate, max-age=0', true);
        header('CDN-Cache-Control: no-cache, must-revalidate, max-age=0', true);
        header('Cloudflare-CDN-Cache-Control: no-cache, must-revalidate, max-age=0', true);
    }
    
    // Optional logging
    $dump_logs = 1; // That should be set to 0 in a production environment
    
    if ($dump_logs && !$has_valid_contents) {
        $fh_logs = fopen(dirname(__FILE__).'/logs.txt', 'a');
        fwrite($fh_logs, 'has_valid_contents:'.var_export($has_valid_contents, true).PHP_EOL);
        fwrite($fh_logs, 'headers_list:'.print_r(headers_list(), true).PHP_EOL);
        fwrite($fh_logs, 'html:'.(empty($html) ? 'empty' : 'present').PHP_EOL.PHP_EOL); // Do not dump entire HTML content, it is inefficient
        fclose($fh_logs);
    }
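
A note on the cache-header pattern in the loop above: all three names end in "Cache-Control:", so an unanchored match on that suffix alone already covers them, and it can also match unrelated header names or text inside a header value. A possible tightening, sketched with a hypothetical helper is_cache_control_header, anchors the match at the start of the line:

```php
<?php
// Hypothetical helper: true only when the header line's name is exactly
// Cache-Control, CDN-Cache-Control or Cloudflare-CDN-Cache-Control.
function is_cache_control_header(string $line): bool
{
    $pattern = '/^\s*(?:Cloudflare-CDN-|CDN-)?Cache-Control\s*:/i';
    return preg_match($pattern, $line) === 1;
}

var_dump(is_cache_control_header('Cache-Control: max-age=60'));        // bool(true)
var_dump(is_cache_control_header('cloudflare-cdn-cache-control: x'));  // bool(true)
var_dump(is_cache_control_header('X-My-Cache-Control: y'));            // bool(false)
```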
    

    I checked your new version, and it seems there is no change addressing the following comments:

    1. The method for validating the response from GTranslate is not entirely reliable.
    2. The code writes to logs.txt without checking if it is a debugging mode.
    3. The previous sending of headers is not checked properly.

    On your three points:

    1. The Method for Validating the Response from GTranslate Is Not Entirely Reliable:

      • The current method checks for the presence of 'x-gt-' in headers and <body> in HTML. This is a basic check and might not cover all scenarios where the response from GTranslate is not valid.
      • A more robust validation would require a deeper understanding of the expected response structure from GTranslate. For instance, checking for specific values in the x-gt- headers or the completeness/integrity of the HTML content.
      • However, without specific documentation or guidelines from GTranslate on what constitutes a valid response, this remains a heuristic approach.
    2. The Code Writes to logs.txt Without Checking if It Is a Debugging Mode:

      • The current code writes to logs.txt based on the $dump_logs variable, which is manually set. Ideally, this should be linked to a debugging mode that can be toggled on or off.
      • A better approach might be to use a constant or a setting that can be easily changed without modifying the core logic of the script, like define('GTRANSLATE_DEBUG_MODE', true); at the beginning of your script, and then check this constant before logging.
    3. Previous Sending of Headers Is Not Checked Properly:

      • The script uses the header() function with replace set to true, which should replace any previous headers of the same name. However, if headers have already been sent (e.g., by another part of the WordPress system or plugins), the header() function won't work as expected.
      • For a better way to handle this, you would need to check if headers have already been sent using headers_sent() before attempting to set new headers. Note that headers_sent() will only tell you if headers have been sent, not which specific headers.
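
As one way to make the heuristic in point 1 a little stricter, the sketch below requires an actual header line whose name starts with x-gt-, accepts a body tag with attributes, and looks for a closing html tag. A literal '<body>' search misses tags such as `<body class="home">`, which WordPress themes commonly emit. The function name and the choice of checks are assumptions, not documented GTranslate behavior:

```php
<?php
// Hypothetical validator; the checks are heuristics chosen for illustration.
function looks_like_valid_page(string $raw_headers, string $html): bool
{
    // At least one header line whose name starts with "x-gt-".
    $has_gt_header = preg_match('/^x-gt-[^:\r\n]*:/mi', $raw_headers) === 1;

    // Opening body tag with or without attributes.
    $has_body_open = preg_match('/<body[\s>]/i', $html) === 1;

    // A closing html tag suggests the document was not cut off.
    $has_html_close = stripos($html, '</html>') !== false;

    return $has_gt_header && $has_body_open && $has_html_close;
}

$headers = "HTTP/1.1 200 OK\r\nX-GT-OrigURL: /page/\r\n";
var_dump(looks_like_valid_page($headers, '<html><body class="home">Hi</body></html>')); // bool(true)
var_dump(looks_like_valid_page($headers, ''));                                          // bool(false)
```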

    The script would then have these modifications:

    // Define a constant for debug mode at the beginning of your script
    define('GTRANSLATE_DEBUG_MODE', false); // Set to true for debugging
    
    // [rest of the script]
    
    // Improved validation logic (as an example)
    $has_valid_gtranslate_headers = stripos($response_headers, 'x-gt-') !== false;
    $has_body_tag = stripos($html, '<body>') !== false;
    $has_valid_contents = $has_valid_gtranslate_headers && $has_body_tag;
    
    // [processing headers]
    
    // Send no-cache headers if the content is invalid and headers have not been sent
    if(!$has_valid_contents && !headers_sent()) {
        header('Cache-Control: no-cache, must-revalidate, max-age=0', true);
        header('CDN-Cache-Control: no-cache, must-revalidate, max-age=0', true);
        header('Cloudflare-CDN-Cache-Control: no-cache, must-revalidate, max-age=0', true);
    }
    
    // Debug logging
    if (GTRANSLATE_DEBUG_MODE && !$has_valid_contents) {
        $fh_logs = fopen(dirname(__FILE__).'/logs.txt', 'a');
        fwrite($fh_logs, 'has_valid_contents:'.var_export($has_valid_contents, true).PHP_EOL);
        fwrite($fh_logs, 'headers_list:'.print_r(headers_list(), true).PHP_EOL);
        fwrite($fh_logs, 'html:'.(empty($html) ? 'empty' : 'present').PHP_EOL.PHP_EOL);
        fclose($fh_logs);
    }
    

    Now, the debugging mode is controlled by a constant, which makes it easier to turn on and off without altering the logic of the script.
    The validation of the response from GTranslate is still basic and heuristic-based, as a more detailed validation would require specific details about the expected response structure from GTranslate.
    And the code now checks if headers have already been sent before attempting to set new ones.
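
When headers_sent() does return true, it can also report where output started, which helps track down what sent the headers early. A minimal sketch, with hypothetical function names, that logs the location instead of failing silently:

```php
<?php
// The three header lines used above to prevent caching.
function no_cache_header_lines(): array
{
    $value = 'no-cache, must-revalidate, max-age=0';
    return [
        'Cache-Control: ' . $value,
        'CDN-Cache-Control: ' . $value,
        'Cloudflare-CDN-Cache-Control: ' . $value,
    ];
}

// Hypothetical wrapper: sends the headers and returns true, or logs where
// output started and returns false when it is too late to send them.
function send_no_cache_headers(): bool
{
    if (headers_sent($file, $line)) {
        error_log("Cannot set no-cache headers: output started at $file:$line");
        return false;
    }
    foreach (no_cache_header_lines() as $header_line) {
        header($header_line, true);
    }
    return true;
}
```

In gtranslate.php, send_no_cache_headers() would be called in place of the three header() calls when $has_valid_contents is false.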