python unit-testing python-asyncio pool

How to ensure that the result was given by async, not by pool


The functions to be tested (I can't see their code; I can only import them):

The file async_data.py:

import asyncio
import socket
import aiohttp


async def get_json(client, uid):
    json_url = 'https://jsonplaceholder.typicode.com/todos/{uid}'.format(uid=uid)
    resp = await client.request('GET', json_url)
    data = await resp.json()
    return data


async def main_async(range_max):
    conn = aiohttp.TCPConnector(family=socket.AF_INET, verify_ssl=True)
    async with aiohttp.ClientSession(trust_env=True, connector=conn) as client:
        tasks = [get_json(client, x) for x in range(range_max)]
        data = await asyncio.gather(*tasks, return_exceptions=True)
        return data

The second file, sync_data.py (the same task in sync mode or using a pool):

import json
import urllib.request
from multiprocessing import Pool


def get_json_url(uid):
    json_url = 'https://jsonplaceholder.typicode.com/todos/{uid}'.format(uid=uid)
    jsondata = {}
    try:
        with urllib.request.urlopen(json_url) as url:
            jsondata = json.loads(url.read().decode())
    except urllib.error.HTTPError:
        pass
    return jsondata


def main_sync(range_max):
    return [get_json_url(uid) for uid in range(range_max)]


def main_pool(range_max):
    with Pool() as pool:
        result = pool.map(get_json_url, range(range_max))
    return result

The main block, where the functions main_async, main_sync and main_pool are treated as black boxes, runs the tests:

import time
import asyncio
from async_data import main_async
from sync_data import main_sync, main_pool

def main():
    total_cnt = 200
    # async block
    # time.perf_counter() measures wall-clock time; time.clock() was removed in Python 3.8
    async_start = time.perf_counter()
    loop = asyncio.get_event_loop()
    try:
        async_data = loop.run_until_complete(main_async(total_cnt))
    finally:
        loop.close()
    async_time = time.perf_counter() - async_start
    # pool block
    pool_start = time.perf_counter()
    pool_data = main_pool(total_cnt)
    pool_time = time.perf_counter() - pool_start
    # sync block
    sync_start = time.perf_counter()
    sync_data = main_sync(total_cnt)
    sync_time = time.perf_counter() - sync_start
    # assert data
    sorted_async = sorted([x.get('id', -1) for x in async_data])
    sorted_pool = sorted([x.get('id', -1) for x in pool_data])
    sorted_sync = sorted([x.get('id', -1) for x in sync_data])
    assert sorted_async == sorted_pool
    assert sorted_async == sorted_sync
    assert sync_time > async_time
    assert sync_time > pool_time
    # AND here I want to ensure that the result was produced by async, not by the pool

if __name__ == '__main__':
    main()

A simple way to test whether the data was received by the async or the sync method is to check the execution time. But how can you test whether the code is using a pool or async?


Solution

  • You can try some mocking for your tests:

    import multiprocessing.pool
    from unittest.mock import patch
    
    ...
    
    with patch(
        'multiprocessing.pool.ApplyResult.get',
        autospec=True,
        wraps=multiprocessing.pool.ApplyResult.get
    ) as patched:
        # time.perf_counter() replaces time.clock(), which was removed in Python 3.8
        async_start = time.perf_counter()
        loop = asyncio.get_event_loop()
        try:
            async_data = loop.run_until_complete(main_async(total_cnt))
        finally:
            loop.close()
        async_time = time.perf_counter() - async_start
        patched.assert_not_called()
    
        ...
    
        pool_start = time.perf_counter()
        pool_data = main_pool(total_cnt)
        pool_time = time.perf_counter() - pool_start
        patched.assert_called()
    

    pool.ApplyResult.get is the method that is called before the value is returned from pool.map (as well as from apply and starmap), so if you aren't sure exactly which multiprocessing method the second tested module uses, you can stick to pool.ApplyResult.get.

    Then the unittest.mock.patch object: it is a tool used in testing, and its purpose is to substitute some method or object, either in the standard library or in third-party libraries. Normally, it prevents the patched method from being called and just returns some predefined value, mimicking the work of the original method.
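
As a rough illustration of that default behaviour, the sketch below patches os.getcwd (chosen arbitrarily for the example) so that the real function is never called and a predefined value comes back instead. The helper name stubbed_cwd and the fake path are hypothetical.

```python
import os
from unittest.mock import patch


def stubbed_cwd():
    # Inside the with block, os.getcwd is replaced by a mock that
    # returns the predefined value and never runs the real function.
    with patch("os.getcwd", return_value="/not/a/real/dir") as fake:
        value = os.getcwd()
        fake.assert_called_once_with()
    # On leaving the block the original os.getcwd is restored.
    return value
```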

    But you can use it in a different manner, with the wraps parameter. If you pass the original method to this parameter, the original method will be called in the process. pool.ApplyResult.get will still hold the patched object instead of the original get method, but the original get is called when the patched object processes the call. So you get both the result of that method and some extra statistics provided by the unittest library, such as assert_called.