
The phenomenon of an AWS Lambda function's running time not being affected by cold starts


I have a script to keep two Lambda functions warm:

import json
import time
from datetime import datetime

import boto3

lambda_client = boto3.client('lambda')

PING_DELAY = 60  # seconds between pings

while True:
    try:
        timer = Timer()  # the Timer helper shown in the Appendix below
        timestamp_str = \
            datetime.now().strftime("%d.%m.%Y %H:%M:%S.%f")[:-3]
        print(f"Ping at {timestamp_str} ", end='', flush=True)
        response = lambda_client.invoke(
            FunctionName='myfunc',
            InvocationType='RequestResponse',
            Payload=json.dumps({'command': 'ping'}))
        payload = json.load(response['Payload'])
        if 'errorMessage' in payload:
            raise Exception(payload['errorMessage'])
        else:
            my_time = timer.stop()
            stats = payload['stats']
            print(f"took {my_time}ms.   n_cold: {stats['n_cold']}   "
                  f"total_init: {stats['total_init']}ms", flush=True)
    except Exception as e:
        print(f"AWS Lambda submit failed: {e}", flush=True)

    time.sleep(PING_DELAY)

Every minute, it invokes the myfunc Lambda function, which in turn invokes 109 instances of another Lambda function in parallel. Based on their responses, myfunc reports in its own response how many instances experienced a cold start and their total initialization time. The relevant code for both functions is shown in the Appendix below. Here is part of the output of this script:

Ping at 27.11.2023 11:01:23.180 took 581ms.   n_cold: 0   total_init: 0ms
Ping at 27.11.2023 11:02:23.821 took 527ms.   n_cold: 2   total_init: 11531ms
Ping at 27.11.2023 11:03:24.408 took 486ms.   n_cold: 0   total_init: 0ms
Ping at 27.11.2023 11:04:24.954 took 511ms.   n_cold: 0   total_init: 0ms
Ping at 27.11.2023 11:05:25.525 took 592ms.   n_cold: 1   total_init: 6416ms
Ping at 27.11.2023 11:06:26.119 took 525ms.   n_cold: 0   total_init: 0ms
Ping at 27.11.2023 11:07:26.704 took 502ms.   n_cold: 0   total_init: 0ms
Ping at 27.11.2023 11:08:27.236 took 605ms.   n_cold: 0   total_init: 0ms
Ping at 27.11.2023 11:09:27.901 took 546ms.   n_cold: 0   total_init: 0ms
Ping at 27.11.2023 11:10:28.503 took 497ms.   n_cold: 1   total_init: 6396ms
Ping at 27.11.2023 11:11:29.060 took 489ms.   n_cold: 0   total_init: 0ms

What does not make sense to me is that myfunc returns a response in less than a second even when some of the instances experience a cold start, which entails about six seconds of initialization each. If those initializations happened on demand, the round trip should take at least that long. Only on rare occasions do I get what I would expect when there are cold starts:

Ping at 27.11.2023 12:09:04.775 took 6525ms.   n_cold: 6   total_init: 36829ms

The question: How can we explain the phenomenon of an AWS Lambda function's running time seemingly being unaffected by cold starts?

Appendix. For reference, myfunc processes the stats in the responses as follows:

# Executed in myfunc for each response from the fanned-out invocations;
# n_cold and total_init start at 0.
stats = response['stats']
init = stats['init_time']
if stats['cold_start']:
    n_cold += 1
    assert init > 0
    total_init += init
else:
    assert init == 0
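
The fan-out itself is not part of the question; for context, here is a minimal sketch of how myfunc might invoke the workers in parallel and aggregate these stats, assuming boto3 and a thread pool. The worker function name and payload shape are placeholders, not taken from the actual code:

import json
from concurrent.futures import ThreadPoolExecutor

import boto3

# One client shared across threads; boto3 clients are thread-safe.
lambda_client = boto3.client('lambda')

N_WORKERS = 109  # number of parallel worker invocations, per the question

def invoke_worker(i):
    # Synchronous invocation so the worker's stats can be read back.
    response = lambda_client.invoke(
        FunctionName='worker-func',  # hypothetical name for the second function
        InvocationType='RequestResponse',
        Payload=json.dumps({'task_id': i}))
    return json.load(response['Payload'])

def fan_out():
    n_cold = 0
    total_init = 0
    with ThreadPoolExecutor(max_workers=N_WORKERS) as pool:
        for response in pool.map(invoke_worker, range(N_WORKERS)):
            # Same per-response accounting as shown above.
            stats = response['stats']
            init = stats['init_time']
            if stats['cold_start']:
                n_cold += 1
                assert init > 0
                total_init += init
            else:
                assert init == 0
    return {'n_cold': n_cold, 'total_init': total_init}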

The Lambda function invoked by myfunc is as follows:

import json
import time
import pickle

def ms_now():
    # Current wall-clock time in milliseconds.
    return time.time_ns() // 1_000_000

class Timer:
    def __init__(self):
        self.start = ms_now()

    def stop(self):
        # Elapsed milliseconds since construction.
        return ms_now() - self.start

timer = Timer()
started_init = ms_now()
cold_start = True

... # Some heavy machine learning models are loaded here

init_time = timer.stop()  # stopped after loading, so init_time covers it

def compute(event):
    ... # The actual computation is here

def lambda_handler(event, context):
    global cold_start
    global init_time
    global started_init
    
    # Report the stats captured during module initialization, then reset
    # them so that subsequent (warm) invocations report no cold start.
    stats = {'cold_start': cold_start,
             'started_init': started_init,
             'init_time': init_time}
    cold_start = False
    init_time = 0
    started_init = 0

    stats['started'] = ms_now()
    result = compute(event)
    stats['finished'] = ms_now()
    return {
        'statusCode': 200,
        'headers': {
            'Content-Type': 'application/json'
        },
        'result': result, 
        'stats': stats
    }

P.S. The question at re:Post


Solution

  • It looks to me like the result of proactive initialization of the execution environment for your Lambda functions. It's a feature that is not widely known, but it has been researched and described in great detail by AJ Stuyvenberg: https://aaronstuyvenberg.com/posts/understanding-proactive-initialization

    Consider we’re working with a function which at steady state experiences 100 concurrent invocations. When you deploy a change to your function (or function configuration), AWS can make a pretty reasonable guess that you’ll continue to invoke that same function 100 times concurrently after the deployment finishes. Instead of waiting for each invocation to trigger a cold start, AWS will automatically re-provision (roughly) 100 sandboxes to absorb that load when the deployment finishes. Some users will still experience the full cold start duration, but some won’t (depending on the request duration and when requests arrive).

    The scenario above seems very similar to the one you're describing. To confirm that this is the case, you might want to use the Python example from the linked blog post that detects proactive initializations; a sketch of the idea follows.
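
    A minimal sketch of such a detector, assuming the heuristic described in that post: an on-demand cold start is invoked essentially immediately after initialization, so if the first invocation arrives more than about 10 seconds after the module was loaded, the environment was almost certainly provisioned proactively. The threshold and field names here are illustrative, not an official API:

    import time

    INIT_TIME = time.time()  # wall-clock time when this sandbox initialized
    COLD_START = True        # True only until the first invocation

    def lambda_handler(event, context):
        global COLD_START
        init_type = 'warm'
        if COLD_START:
            COLD_START = False
            # An on-demand cold start is invoked right after initialization;
            # a long gap means the environment was initialized proactively.
            if time.time() - INIT_TIME > 10:
                init_type = 'proactive-init'
            else:
                init_type = 'on-demand cold start'
        return {'init_type': init_type}

    Applied to the numbers above, this would mean the workers that reported ~6400ms of initialization had already finished initializing before the ping arrived, so each invocation only paid the ~500ms compute cost.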