curl_multi_exec() works with 7 calls, but I've got over 1000 calls to get through


This is my first time using curl_multi_init() so I'm probably misunderstanding something. Learning to use it properly is more important to me than solving my problem because this particular function will solve a lot of my problems in future.

This particular call is for uploading Etsy photos. Etsy documentation for this call here.

It works fine in Postman. The code snippet Postman generates for "PHP - cURL" works fine. It keeps working fine even after my edits to it.

Trouble is, I've got well over a thousand high resolution images to upload, so running the entire snippet from start to finish, then looping it a thousand times will time out no matter how generous my php.ini settings.

So, line by line, I merged the existing code with an asynchronous (curl_multi) snippet, and I must have done something wrong. This example is almost exactly the live code. I've just deleted/simplified irrelevant things and redacted personal information. (Hopefully I didn't delete/simplify the bug.):

Edit

This code works when limited to 7 calls. This is a very recent discovery, but absolutely critical to solving the question overall.

<?php
include_once 'databaseStuff.php';
include_once 'EtsyTokenStuff.php';
$result = mysqli_query($conn, "SELECT product, listing_id, alt_text, dataStuff;");
$multiCurl = [];
$multiResult = [];
$multiHandle = curl_multi_init();
if (mysqli_num_rows($result) > 0){
    while ($row = mysqli_fetch_assoc($result)){
        for($image = 1; $image <=2; $image++){
            $multiCurl[$row['product'] . "_" . $image] = curl_init();
            curl_setopt_array($multiCurl[$row['product'] . "_" . $image], 
                array(
                    CURLOPT_URL => "https://openapi.etsy.com/v3/application/shops/$myShopNumber/listings/" . $row['listing_id'] . "/images",
                    CURLOPT_RETURNTRANSFER => true,
                    CURLOPT_ENCODING => '',
                    CURLOPT_MAXREDIRS => 10,
                    CURLOPT_TIMEOUT => 0,
                    CURLOPT_FOLLOWLOCATION => true,
                    CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
                    CURLOPT_CUSTOMREQUEST => 'POST',
                    CURLOPT_POSTFIELDS => array(
                        "image" => new CURLFILE(
                            [
                                1 => "img/imagePathStuff/" . $row['product'] . ".jpg",
                                2 => "img/differentImagePathStuff/" . $row['product'] . ".jpg"
                            ][$image]
                        ),
                        // "listing_image_id" =>,
                        "rank" => $image,
                        "overwrite" => true,
                        // "is_watermarked" =>,
                        "alt_text" => $row['alt_text']
                    ),
                    CURLOPT_HTTPHEADER => array(
                        "x-api-key: $myAPIKey",
                        "authorization: Bearer {$etsyAccessToken}"
                    ),
                )
            );
            curl_multi_add_handle($multiHandle, $multiCurl[$row['product'] . "_" . $image]);
        }
    }
    $index = null;
    do {
        curl_multi_exec($multiHandle, $index);
    } while($index > 0);
    foreach($multiCurl as $k => $curlHandle){
        $multiResult[$k] = curl_multi_getcontent($curlHandle);
        curl_multi_remove_handle($multiHandle, $curlHandle);
    }
    curl_multi_close($multiHandle);
}

Once it starts working I'll probably block it out into functions, but I prefer to edit broken code in this format and add the function calls later.

Newer Insights

Having never worked with these functions before, I'm not sure how they're supposed to behave but the behaviour I've noticed:

  • If I limit the number of images uploaded to 7, everything works as intended. But if I run this code, no limit, even the first 7 images won't connect with the server. When I limit to 8 or higher, I hit an internal service error, but I suspect that might be an issue with my sloppy code. I need to look over it a few more times to see why it always crashes at the exact same point.
  • No, it wasn't sloppy code. Commenting out curl_multi_exec removes the error. Commenting out everything below except curl_multi_exec and its loop does not remove the error. Max calls seems to be at 7, no matter which code snippet I borrow and replace. I can't even cause it to reduce to 6 with deliberately sloppy snippets. It's always 7.
  • Opening php.ini and changing memory_limit = 256M to memory_limit = 512M not only fails to fix the problem, but makes the problem worse. Sending 7 results in an Internal Service Error. This was tested in the live environment, so I quickly reverted back to memory_limit = 256M. All damage caused was instantly repaired. I won't be testing that much further if I don't have to.

Older insights

  • The number of loops for the do-while loop varies from hundreds of thousands to millions while trying to upload 4 images. I suspect this is the correct number of loops, since everything else seems to work when it behaves this way. So now I know.
  • This exact code has an Etsy specific problem. Ignore this if you aren't developing code for Etsy's API, but Etsy doesn't like it when you upload two photos to the same listing at the same time. Photos to different listings at the same time, however, is okay. So a loop that covers a single listing will not work.
  • Following the advice of @Kazz, while (false !== ($info = curl_multi_info_read($multiHandle))) { print_r($info); } returns Array ( [msg] => 1 [result] => 7 [handle] => Resource id #1009 ) for each item (with +1 to each Resource id for each result following). 7 corresponds with the error "CURLE_COULDNT_CONNECT".
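A note on the loop count mentioned above: a bare do-while around curl_multi_exec() spins as fast as the CPU allows, which would explain hundreds of thousands of iterations. Below is a minimal sketch of the usual alternative, which waits on curl_multi_select() between calls. The function name runMultiHandle and the $selectTimeout parameter are illustrative assumptions, not part of the original code.

```php
<?php
/**
 * Drive every transfer on a multi handle to completion without busy-waiting.
 * Returns how many times curl_multi_exec() was called.
 * (runMultiHandle and $selectTimeout are hypothetical names for this sketch.)
 */
function runMultiHandle($multiHandle, float $selectTimeout = 1.0): int
{
    $iterations = 0;
    $stillRunning = 0;
    do {
        $status = curl_multi_exec($multiHandle, $stillRunning);
        $iterations++;
        // Old libcurl builds may return CURLM_CALL_MULTI_PERFORM, meaning "call again".
        if ($status !== CURLM_OK && $status !== CURLM_CALL_MULTI_PERFORM) {
            throw new RuntimeException((string) curl_multi_strerror($status));
        }
        if ($stillRunning > 0) {
            // Sleep until there is socket activity or the timeout expires.
            if (curl_multi_select($multiHandle, $selectTimeout) === -1) {
                // select() could not wait on anything; back off briefly instead of spinning.
                usleep(1000);
            }
        }
    } while ($stillRunning > 0);

    return $iterations;
}
```

With this shape the loop should run a handful of times per transfer rather than millions, and the second argument ($stillRunning here, $index in the question) is the number of transfers still in progress, not an error code.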

Earlier insights

  • Although almost every change seems inconsequential, changing the URL to https://google.com causes everything to time out. Therefore, my code at least has access to the internet.
  • Visiting the correct url in browser gives an authentication error, as I'd expect.
  • All of the code executes, start to finish, no fatal errors.
  • The do-while loop executes once then loops once more. (Maybe it's supposed to or maybe it's supposed to loop once per photo. Couldn't get that clarified anywhere.)
  • It's supposed to update photos. Unfortunately the first test was on very minor edits, but trying again including a deliberately wrong photo I at least know that that particular photo didn't update, so probably none of them updated.
  • curl_multi_getcontent($curlHandle) always returns an empty string
  • curl_multi_exec($multiHandle, $index) always returns 0. (My previous claim that it was 1002 was incorrect; 1002 was actually the value of the second argument, $index, after running the function.)
  • This particular call normally has very detailed responses for 201 and at least returns the error for 400, 401, 403, 404, 409, and 500, but I don't think my code is even going far enough to make the call. I haven't even figured out how to get the response codes at all.
  • For a script that transfers well over one thousand high resolution images from my server to Etsy's server, it certainly executes very fast.
  • The $multiHandle seems to work as intended. At the very least, a var_dump($multiHandle) reveals all the correct file names in there.
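On getting at the response codes: one way to see what happened to each transfer is to drain curl_multi_info_read() after the multi handle has finished, then ask each easy handle for its HTTP status. The following is only a sketch. The function name collectResults and the shape of the returned array are assumptions made for illustration; $handlesByKey stands in for the $multiCurl array from the question.

```php
<?php
/**
 * Build a per-handle report once the multi handle has finished running.
 * $handlesByKey maps your own keys (e.g. "product_1") to easy handles.
 * (collectResults and the report layout are hypothetical.)
 */
function collectResults($multiHandle, array $handlesByKey): array
{
    $report = [];
    while (($info = curl_multi_info_read($multiHandle)) !== false) {
        // Find which of our keys this finished handle belongs to.
        $key = array_search($info['handle'], $handlesByKey, true);
        if ($key === false) {
            continue; // not one of ours
        }
        $report[$key] = [
            // cURL-level result: 0 is success, 7 is CURLE_COULDNT_CONNECT, etc.
            'curl_errno' => $info['result'],
            'curl_error' => curl_strerror($info['result']),
            // HTTP status from the server; stays 0 if no response was received.
            'http_code'  => curl_getinfo($info['handle'], CURLINFO_RESPONSE_CODE),
            'body'       => curl_multi_getcontent($info['handle']),
        ];
    }
    return $report;
}
```

A curl_errno of 7 with an http_code of 0 would mean the request never reached the server, which fits the empty strings from curl_multi_getcontent() and the very fast run time described above.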

Here is a list of diagnostic functions I've tried and their outputs, again thanks to @Kazz for the functions.

  • while (false !== ($info = curl_multi_info_read($multiHandle))) { print_r($info); } returns Array ( [msg] => 1 [result] => 7 [handle] => Resource id #1009 ) for each item (with +1 to each Resource id for each result following)
  • print_r(dns_get_record('openapi.etsy.com', DNS_A)); returns Array ( [0] => Array ( [host] => e8520.b.akamaiedge.net [class] => IN [ttl] => 0 [type] => A [ip] => 104.127.77.191 ) )
  • var_dump(exec('ping -c 3 openapi.etsy.com')); returns string(0) ""
  • exec('ping -c 3 openapi.etsy.com', $output); var_dump($output); returns array(0) { }
  • exec('ping -n 3 openapi.etsy.com', $output); var_dump($output); returns array(0) { }
  • this whole thing returns "TCP/IP Connection OK. Attempting to connect to '104.127.77.191' on port '80'...OK. Sending HTTP GET request...OK. Reading response: HTTP/1.1 301 Moved Permanently Server: AkamaiGHost Content-Length: 0 Location: openapi.etsy.com Date: Tue, 21 Feb 2023 02:34:15 GMT Connection: close Closing socket...OK."

It wouldn't surprise me if it's a minor typo causing this. What is it?


Solution

  • So, the first answer I posted improved things from 0 to 7, but now things are improved from 7 to... 90ish? I don't think it consistently fails on the same number, and I'm not investigating further because only the first 10 calls per second are received by the server anyway.

    The second improvement came from updating PHP from version 7 to version 7.4.

    So, limiting curl_multi_exec to 10 calls, I can now loop out all the calls until it eventually hits a 503 error.

    Luckily, @Booboo's solution fixed it.

    So, in summary, the correct solutions are:

    • Limit the number of calls manually
    • Update PHP to at least 7.4. (More recent versions are unavailable for me to test, but more recent may be better.)
    • Use @Booboo's snippet to fix 503 errors. (Give it an upvote if you use it. It's only fair.)
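
    For the "limit the number of calls manually" step, here is a rough sketch of how the uploads could be planned into batches. It is not @Booboo's snippet. It assumes each job is an array with a 'listing_id' key, caps each batch at 10 (the per-second figure observed above), and keeps two images for the same listing out of the same batch, since Etsy rejected simultaneous uploads to one listing. The name planBatches and the job layout are hypothetical.

```php
<?php
/**
 * Split upload jobs into batches of at most $maxPerBatch,
 * never placing two jobs for the same listing in one batch.
 * (planBatches and the ['listing_id' => ..., ...] job shape are hypothetical.)
 */
function planBatches(array $jobs, int $maxPerBatch = 10): array
{
    $batches = []; // each batch is keyed by listing_id while being built
    foreach ($jobs as $job) {
        $listingId = $job['listing_id'];
        $placed = false;
        foreach ($batches as $i => $batch) {
            // First batch that has room and does not already contain this listing.
            if (count($batch) < $maxPerBatch && !isset($batch[$listingId])) {
                $batches[$i][$listingId] = $job;
                $placed = true;
                break;
            }
        }
        if (!$placed) {
            $batches[] = [$listingId => $job];
        }
    }
    // Drop the listing_id keys so each batch is a plain list of jobs.
    return array_map('array_values', $batches);
}
```

    Each batch could then get its own round of curl_multi_add_handle() and curl_multi_exec() calls, with a pause of about a second between batches. Jobs for one listing keep their original relative order, so image 1 is always sent in an earlier batch than image 2.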