Today, I wrote a simple script that lets me benchmark an OpenStack Swift server:
import uuid
from concurrent.futures import ThreadPoolExecutor

import swiftclient


def create():
    client = swiftclient.client.Connection(
        user='', key='',
        authurl='https://auth/', auth_version='2.0',
        tenant_name='',
        os_options={'tenant_id': '',
                    'region_name': ''})
    # Upload one-byte objects with random names, forever.
    while True:
        uid = str(uuid.uuid4())
        client.put_object(container='', obj=uid, contents=b'\x00')


executor = ThreadPoolExecutor(max_workers=100)
for _ in range(100):
    executor.submit(create)
This works well, but I noticed something strange: the process was spiking to more than 400% CPU usage. How is that possible, since the GIL shouldn't allow more than 100% CPU to be used?
The GIL only prevents two threads from executing Python bytecode at the same time (which limits pure-Python code to a single CPU core). But any Python code that calls into C can release the GIL until the C code needs to interact with the Python/C API again, usually when it returns and marshals its results back into Python values. So it's possible to have highly threaded Python applications if they make heavy use of C libraries. In your script, the socket and SSL operations inside swiftclient are exactly such calls, so many threads can run on separate cores in parallel.
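A minimal sketch of this effect, using `time.sleep` (which, like socket I/O, releases the GIL while blocked) in place of the Swift calls: ten threads each sleeping 0.2 s finish in roughly 0.2 s of wall time, not 2 s, because they block concurrently.

```python
import threading
import time


def io_bound():
    # time.sleep releases the GIL while blocking, just like a
    # socket read would, so all ten threads wait concurrently.
    time.sleep(0.2)


start = time.perf_counter()
threads = [threading.Thread(target=io_bound) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start
print(f"10 x 0.2s sleeps took {elapsed:.2f}s of wall time")
```

If the GIL were held during the sleep, the threads would serialize and the total would approach 2 s.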
From the Python wiki on the GIL:
Note that potentially blocking or long-running operations, such as I/O, image processing, and NumPy number crunching, happen outside the GIL. Therefore it is only in multithreaded programs that spend a lot of time inside the GIL, interpreting CPython bytecode, that the GIL becomes a bottleneck.
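The same holds for CPU-bound work done in C: for example, CPython's `hashlib` releases the GIL while hashing buffers larger than about 2 KiB, so a thread pool can keep several cores busy hashing. A small illustration (the 1 MB buffer sizes and worker count are arbitrary choices for the example):

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(data: bytes) -> str:
    # For inputs over ~2047 bytes, hashlib drops the GIL during the
    # C hashing loop, letting threads hash on separate cores.
    return hashlib.sha256(data).hexdigest()


# Eight distinct 1 MB buffers to hash in parallel.
chunks = [bytes([i]) * 1_000_000 for i in range(8)]
with ThreadPoolExecutor(max_workers=8) as pool:
    digests = list(pool.map(digest, chunks))
print(digests)
```

Run this under `top` and you should see the process exceed 100% CPU, for the same reason the Swift benchmark does.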