xml typescript firebase google-cloud-functions vpc

Cannot receive large XML from external service with scheduled Firebase function


I am using TypeScript as my backend language and have scheduled a function that requests a large XML file (75 MB) from an external service and parses it. I have to set up a SOAP client to receive the XML.

export const updateList: Function =
    functions.region('europe-west3').pubsub.schedule('1 0 1 * *').onRun((_context) => {

const url = 'https://website.org';
const wsdl_options = {
    ntlm: true,
    username: 'username',
    password: 'password',
    domain: "domain",
    workstation: "workstation"
};

soap.createClient(url, { wsdl_options }, (error, client) => {
    if (error) {
        console.log('Error in making soap client', error);
    }
    else {
        client.setSecurity(new soap.NTLMSecurity(wsdl_options.username, wsdl_options.password, wsdl_options.domain));
        const getXml = client.Service.HttpBinding_Service.GetXML;
        try {
            getXml((error, dataXml) => {
                console.log('Inside xml');
                if (error) {
                    console.log(error);
                }
                else {
                    console.log('...done.')
                    processData(dataXml);
                }
            });
        } catch (error) {
            console.log('Something wrong' + error);
        }
    }
})
                
});

After calling getXml, the function stops. When I run the same code locally on my PC, it works well and takes no more than 20 seconds to receive and parse the XML. The external service whitelists a range of IPs, so my local Wi-Fi network was added. I have set up a static IP address for the Firebase function, and the external service provider has added it to the whitelist. To do that, I set up Cloud NAT, a VPC network, Serverless VPC Access, a VPC connector, etc.

So the traffic from the scheduled function goes through the static IP address correctly and the connection is established with no errors, but the function cannot receive the large XML. I also tried fetching a 5 MB XML file; it arrived about 5 minutes after the function was called (I was checking the console logs in Firebase Functions).

How can I solve this issue? Any ideas?

UPD. Updated the code example with full code on 20.06


Solution

  • As suspected, you've started the function correctly, but because you haven't returned a Promise to "keep the function alive", the instance executing your function is put into an "inactive" state as soon as the handler returns. In that state, network requests are blocked and CPU processing is severely throttled. Your local machine isn't subject to such aggressive instance management, so you won't see a similar drop in performance when developing locally. Importantly, an "inactive" function can be terminated at any time, so make sure to finish all of your work before resolving the returned Promise.

    This behaviour is covered in more detail in the documentation and its linked videos.
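
    To make the difference concrete, here is a minimal sketch that does not use Firebase at all. `slowFetch`, `fireAndForget`, `awaited` and `runHandler` are all hypothetical names made up for this illustration; `runHandler` only imitates the idea that the runtime treats a handler as finished once its return value settles.

```typescript
// Hypothetical stand-in for slow network work such as a SOAP request.
function slowFetch(ms: number): Promise<string> {
  return new Promise((resolve) => setTimeout(() => resolve("payload"), ms));
}

// Anti-pattern: starts the work but returns undefined immediately,
// so the caller has no way of knowing when the work is done.
function fireAndForget(log: string[]): void {
  slowFetch(20).then((data) => {
    log.push(`late: ${data}`);
  });
}

// Returns a Promise that settles only after the work has completed.
async function awaited(log: string[]): Promise<void> {
  const data = await slowFetch(20);
  log.push(`done: ${data}`);
}

// Hypothetical imitation of the runtime: waits for whatever the handler
// returns, then snapshots what had been logged by that moment.
async function runHandler(
  handler: (log: string[]) => void | Promise<void>
): Promise<string[]> {
  const log: string[] = [];
  await handler(log);
  return [...log];
}
```

    In this sketch, `runHandler(fireAndForget)` resolves with an empty log because the handler is considered finished before `slowFetch` completes, while `runHandler(awaited)` resolves with a log that contains the `done: payload` entry.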

    Based on your code, you appear to be using the soap package, which conveniently has both a callback-based and a Promise-based API. This allows us to substitute:

    soap.createClient(url, { wsdl_options }, (error, client) => {
        if (error) {
            console.log("Error in making soap client", error);
        } else {
            // do something with client
        }
    });
    

    with just

    const client = await soap.createClientAsync(url, { wsdl_options });
    
    // do something with client
    

    Making use of these Promise-based APIs, your code becomes something similar to:

    import * as functions from "firebase-functions";
    import * as soap from "soap";
    
    async function processData(data) {
      // do something with data
    }
    
    export const updateList = functions
      .region("europe-west3")
      .pubsub.schedule("1 0 1 * *")
      .onRun(async () => { // <- async added here
        const url = "https://website.org";
        const wsdl_options = {
          ntlm: true,
          username: "username",
          password: "password",
          domain: "domain",
          workstation: "workstation",
        };
    
        const client = await soap.createClientAsync(url, { wsdl_options });
    
        client.setSecurity(
          new soap.NTLMSecurity({
            username: wsdl_options.username,
            password: wsdl_options.password,
            domain: wsdl_options.domain
          })
        );
    
        console.log("SOAP client initialized, obtaining XML data...");
    
        // soap's *Async methods resolve with an array:
        // [result, rawResponse, soapHeader, rawRequest]
        const [xmlData] = await client.Service.HttpBinding_Service.GetXMLAsync();
        // if the above isn't available, you could use:
        // const xmlData = await new Promise((resolve, reject) =>
        //   client.Service.HttpBinding_Service.GetXML(
        //     (err, result) => err ? reject(err) : resolve(result)
        //   )
        // );
    
        console.log("obtained XML data, now processing...");
    
        await processData(xmlData);
    
        console.log('Success!');
      });
    

    Note: When I was piecing this together, TypeScript threw an error about NTLMSecurity taking 3 arguments when it should be a single object. So it was rectified above.
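
    The commented-out fallback in the code above wraps a callback-style method in a Promise. A minimal, self-contained sketch of that pattern is shown here; `callAsPromise`, `fakeGetXml`, `fakeFailingGetXml` and `fetchXml` are hypothetical names used only for illustration, with `fakeGetXml` standing in for the real `GetXML` method. Note that the resulting Promise still has to be awaited (or returned) for the function to stay alive.

```typescript
// Node-style callback: error first, result second.
type NodeCallback<T> = (err: Error | null, result?: T) => void;

// Wraps a function that takes only a Node-style callback in a Promise.
function callAsPromise<T>(fn: (cb: NodeCallback<T>) => void): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    fn((err, result) => (err ? reject(err) : resolve(result as T)));
  });
}

// Hypothetical stand-in for a callback-style SOAP method that succeeds.
function fakeGetXml(cb: NodeCallback<string>): void {
  setTimeout(() => cb(null, "<list/>"), 10);
}

// Hypothetical stand-in for a callback-style SOAP method that fails.
function fakeFailingGetXml(cb: NodeCallback<string>): void {
  setTimeout(() => cb(new Error("request failed")), 10);
}

// The await is what keeps the caller waiting until the callback has fired.
async function fetchXml(): Promise<string> {
  const xml = await callAsPromise(fakeGetXml);
  return xml;
}
```

    Because errors passed to the callback become rejections, a single try/catch around the awaited call (or letting the rejection propagate out of the handler) is enough to surface failures.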


    You can read more about Promises in this blog post or by checking out other questions here on Stack Overflow.