Tags: oracle, performance-testing, database-performance

Performance Implications of a PL/SQL Procedure That Makes Many HTTP Requests


I have a PL/SQL procedure in an Oracle database that processes data from the PAYMENTS table using a cursor and, for each row, performs an HTTP request to an external API to retrieve related data as XML. The procedure then parses the XML and saves the data into another table. Here is a simplified version of the code:

CREATE OR REPLACE PROCEDURE ... IS
  CURSOR TRANSACTIONS IS SELECT ... FROM PAYMENTS WHERE ...;
  C1 TRANSACTIONS%ROWTYPE;
  URL VARCHAR2(...);
  LV_RESPONSE_DATA CLOB; -- HOLDS THE RAW HTTP RESPONSE
  L_XMLTYPE XMLTYPE;
BEGIN
  URL := 'http://...';

  OPEN TRANSACTIONS;

  LOOP -- COULD BE 10000 ROWS OR MANY MORE
    FETCH TRANSACTIONS INTO C1;
    EXIT WHEN TRANSACTIONS%NOTFOUND;

    MY_SCHEMA.MY_PACKAGE.MAKE_HTTP_REQUEST_TO_API(URL, C1.ID, LV_RESPONSE_DATA);

    -- PARSE RESPONSE XML
    L_XMLTYPE := XMLTYPE(LV_RESPONSE_DATA);

    SELECT
       EXTRACTVALUE(L_XMLTYPE, ...) AS ...
       ...
       INTO ...
    FROM DUAL;

    -- SAVE DATA FROM RESPONSE XML

    INSERT ...

  END LOOP;

  CLOSE TRANSACTIONS;
END;

I would like to understand the potential performance implications of this procedure and any other related considerations. How can I measure them?

Since the procedure makes an HTTP request and waits for the response before continuing with the next row, what specific problems could occur?

Note: The API is designed to return the data per row and can't be changed.

I have found tools like Wireshark for diagnosing network problems, but I don't yet know how to use them effectively. I have also read about asynchronous vs. synchronous HTTP calls and how the main thread can be blocked, but I am not sure how that applies in an Oracle context.


Solution

    1. A network call can easily be blocked by a firewall, network filtering software, or a routing problem, and never return. You'd want to test this to see whether SQL*Net timeouts will work, but even if they do, they cannot be fully trusted. I've seen plenty of orphaned sessions that got lost on external network calls. This results in indefinite holding of locks, undo, and other resources. You'd probably want a separate asynchronous monitor process to detect this and kill the session if it gets stuck.
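    As a sketch of what such a monitor could look for (this assumes access to v$session; the 10-minute threshold is arbitrary and should be tuned to your expected API response times):

      -- Sessions stuck on an outbound TCP call for more than 10 minutes.
      SELECT SID, SERIAL#, USERNAME, EVENT, SECONDS_IN_WAIT
        FROM V$SESSION
       WHERE EVENT = 'TCP Socket (KGAS)'
         AND SECONDS_IN_WAIT > 600;
      -- The monitor could then terminate each offender with:
      -- ALTER SYSTEM KILL SESSION 'sid,serial#';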

    2. While Oracle is waiting for an HTTP request to return, it reports the wait event "TCP Socket (KGAS)". You can see this in v$session for currently waiting sessions, or in ASH (v$active_session_history) for past ones. You can either sample v$session yourself or use Oracle's own 1-second sampling in ASH, grouped by wait event, to see how much of your procedure's time is tied up waiting on the network vs. CPU and other work inside the database.
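    For example, a breakdown of where a given session's time went over the last 30 minutes (the session_id of 123 is a placeholder for your session; note that querying ASH requires the Diagnostics Pack license):

      SELECT NVL(EVENT, 'ON CPU') AS ACTIVITY,
             COUNT(*)             AS SECONDS -- ASH samples roughly once per second
        FROM V$ACTIVE_SESSION_HISTORY
       WHERE SESSION_ID = 123
         AND SAMPLE_TIME > SYSTIMESTAMP - INTERVAL '30' MINUTE
       GROUP BY EVENT
       ORDER BY SECONDS DESC;

    If most samples fall on "TCP Socket (KGAS)", the procedure is network-bound rather than database-bound.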

    3. Whenever there is a dependency on something heterogeneous outside of the Oracle database, like a network call, it is inadvisable to hold transactions open across those external calls. If your session gets orphaned, or the outside agent simply takes forever to respond, you could be blocking others, perhaps indefinitely. I would suggest adding a COMMIT inside the loop, right after the INSERT. If you need transaction control so that an error rolls back all the inserts (unlikely, I'm guessing), a safe workaround is to use a temporary table, committing as you go, and then do a single INSERT ... SELECT into your final table at the end.
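    A minimal sketch of that workaround (the names STG_API_RESULTS and FINAL_TABLE are placeholders):

      -- One-time setup: a staging table whose rows survive COMMIT
      -- but disappear when the session ends.
      CREATE GLOBAL TEMPORARY TABLE STG_API_RESULTS (...) ON COMMIT PRESERVE ROWS;

      -- Inside the loop: stage and commit each row as you go.
      INSERT INTO STG_API_RESULTS (...) VALUES (...);
      COMMIT;

      -- After the loop: move everything to the final table in one transaction.
      INSERT INTO FINAL_TABLE (...) SELECT ... FROM STG_API_RESULTS;
      COMMIT;

    If an error aborts the run partway through, the final INSERT ... SELECT never executes, so the final table is untouched and the staged rows are simply discarded with the session.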

    4. It's also advisable to do as little as possible local work between external calls so that you can get past your dependency on the remote source as quickly as possible. This is true of database links between Oracle databases as well. The probability of a network hiccup raises the risk level of that portion of your code, so you want to use the network quickly and then be done with it. Try to postpone any heavy local work to afterwards so it is no longer vulnerable to a network problem.

    5. Consider adding exception handling. If the HTTP request (or the XML parsing) fails and raises an exception of some kind, do you want the whole program to abort, or do you want to log the error and move on to the next record, processing what does work? If the latter, you'll need to wrap your calls in an anonymous block with an exception handler:

      LOOP
        BEGIN
          -- HTTP CALL
          -- XML PARSE
          -- ANYTHING ELSE THAT COULD FAIL

          COMMIT;
        EXCEPTION
          WHEN OTHERS THEN
            NULL; -- LOG THE ERROR AND THE RECORD THAT CAUSED IT;
                  -- THE LOOP CONTINUES WITHOUT ABORTING
        END;
      END LOOP;
      
    6. With normal network traffic to/from a client and to/from a database link, you also have session statistics in v$sesstat that provide bytes transmitted to/from those two sources. But I don't believe TCP sockets through UTL_TCP and its higher-level packages like UTL_HTTP, UTL_SMTP, etc. are instrumented that way; my testing suggests there isn't any v$sesstat statistic that tracks this traffic. You could of course place a monitoring shell script on the OS, run netstat once a minute, and gather TCP stats that way at the system level, but then you wouldn't be able to distinguish your application's traffic from anything else using the network. Alternatively, you could capture SYSTIMESTAMP before and after each HTTP request, take the LENGTH of each response, and build your own throughput metric from inside the procedure.
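    For example, a self-instrumentation sketch along those lines (the HTTP_CALL_LOG table and its columns are hypothetical; the HTTP call matches the question's procedure):

      -- Additional declarations:
      L_START   TIMESTAMP;
      L_ELAPSED INTERVAL DAY TO SECOND;

      -- Inside the loop, around the HTTP call:
      L_START := SYSTIMESTAMP;
      MY_SCHEMA.MY_PACKAGE.MAKE_HTTP_REQUEST_TO_API(URL, C1.ID, LV_RESPONSE_DATA);
      L_ELAPSED := SYSTIMESTAMP - L_START;

      -- LENGTH counts characters, a reasonable proxy for bytes received.
      INSERT INTO HTTP_CALL_LOG (CALL_TIME, ROW_ID, RESPONSE_CHARS, ELAPSED)
      VALUES (L_START, C1.ID, LENGTH(LV_RESPONSE_DATA), L_ELAPSED);

    Aggregating this log afterwards gives you per-call latency and response-size trends without any OS-level tooling.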