Tags: python, postgresql, multicore

Beginner question about Python multiprocessing?


I have a number of records in the database that I want to process. Basically, I want to run several regex substitutions over the tokens of each text row and, at the end, write the results back to the database.
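
The per-row work looks roughly like this (the patterns below are placeholders, not my real ones):

import re

# Placeholder substitutions standing in for the real ones;
# each row's text runs through all of them in order.
PATTERNS = [
    (re.compile(r"\s+"), " "),     # collapse runs of whitespace
    (re.compile(r"<[^>]+>"), ""),  # strip markup-like tokens
]

def process_text(text):
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text.strip()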

I wish to know whether multiprocessing speeds up such tasks. I ran

multiprocessing.cpu_count()

and it returned 8. I have tried something like this:

from multiprocessing import Process

processes = []
offset = 0
division = resultsSize // 4  # resultsSize is the total number of rows

for i in range(4):
    if i == 3:
        # the last worker takes whatever rows remain
        limit = resultsSize - (3 * division)
    else:
        limit = division

    # limit and offset indicate the subset of records the
    # function fetches from the db
    p = Process(target=sub_table.processR, args=(limit, offset, i))
    p.start()
    processes.append(p)
    offset += division  # next slice starts right after this one

for p in processes:
    p.join()

but apparently, the time taken is higher than the time required to run it single-threaded. Why is this so? Can someone please enlighten me: is this a suitable case for multiprocessing, or what am I doing wrong here?


Solution

  • Here are a couple of questions:

    1. In your processR function, does it slurp a large number of records from the database at once, or does it fetch one row at a time? (Each individual row fetch is very costly, performance-wise.) See the first sketch after this list for batched fetching.

    2. It may not work for your specific application, but since you are processing "everything", using a database will likely be slower than a flat file. Databases are optimised for logical queries, not sequential processing. In your case, can you export the whole table column to a CSV file, process it, and then re-import the results? The second sketch after this list shows that round trip.
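
    For point 1, here is a minimal sketch of batched fetching with psycopg2's fetchmany(); the DSN, table, and column names (my_table, text_col) are placeholders for your schema:

    import psycopg2

    BATCH_SIZE = 10000  # rows per round trip; tune for your data

    def process_rows(dsn, limit, offset):
        # Fetch one slice of the table in large batches rather
        # than one row at a time.
        conn = psycopg2.connect(dsn)
        cur = conn.cursor()
        cur.execute(
            "SELECT id, text_col FROM my_table ORDER BY id"
            " LIMIT %s OFFSET %s",
            (limit, offset),
        )
        while True:
            rows = cur.fetchmany(BATCH_SIZE)  # one round trip per batch
            if not rows:
                break
            for row_id, text in rows:
                pass  # apply your regex substitutions to text here
        conn.close()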
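
    And for point 2, a sketch of the export/re-import round trip using PostgreSQL's COPY through psycopg2's copy_expert(); my_table, text_col, and the staging table my_table_processed are again placeholders:

    import psycopg2

    def export_to_csv(dsn, path):
        # Dump the rows to a flat CSV file for sequential processing.
        conn = psycopg2.connect(dsn)
        cur = conn.cursor()
        with open(path, "w") as f:
            cur.copy_expert(
                "COPY (SELECT id, text_col FROM my_table) TO STDOUT WITH CSV",
                f,
            )
        conn.close()

    def import_from_csv(dsn, path):
        # Load the processed file back into a staging table.
        conn = psycopg2.connect(dsn)
        cur = conn.cursor()
        with open(path) as f:
            cur.copy_expert("COPY my_table_processed FROM STDIN WITH CSV", f)
        conn.commit()
        conn.close()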

    Hope this helps.