Search code examples
mysqlpython-3.xpymysql

How to divide unique data from an sql server to two different computers


I am looking to use multiple computers to run a python script. I'd like to feed unique data from mysql to each computer running the script. I have a semi-working solution but the problem is when 2 scripts run it at the same, before either can update the status column to "processing" it will choose the same data.

I've tried:

"SELECT * FROM table WHERE status IS NULL FOR UPDATE"

but that just seemed to lock my second connection out of the database completely without allowing it to fetch the data below it for processing.

I have also tried the code below, which works, but only if 2 scripts don't try to access the db at the same time.

vids = []
ids = []
c.execute('SELECT video_id,url FROM videos WHERE status IS NULL LIMIT 100;')
data = c.fetchall()
for row in data:
    vids.append((row[1],row[0]))
    ids.append(row[0])

c.executemany('UPDATE videos SET status="processing" WHERE video_id=%s;', ids)
db.commit()

I'd like for each computer to grab unique sets of data for processing. Script 1 grabs 1-100, script 2 grabs 101-200, script 3 grabs 201-300, etc.

Thanks for the help! Have a great day!


Solution

  • Here's my suggestion

    you can use mysql lock tables, but you need to update your query to:

    c.execute('lock tables videos; UPDATE videos SET status="processing" WHERE video_id in (select t1.video_id from (select video_id, row_number() over (order by video_id) as rn from videos where coalesce(status, '') = '') as t1 where rn <= 100); unlock tables;')
    

    this solution is only applicable for mysql versionn 8.0 up.

    you don't need the for loop on this.