python-2.7 sql-server-2005 rabbitmq pyodbc python-pika

Select nth to nth row while table still have values unselected with python and pyodbc

I have a table with 10,000 rows and I want to select the first 1000 rows and then select again and this time, the next set of rows, which is 1001-2001.

I am using the BETWEEN clause in order to select the range of values. I can also increment the values. Here is my code:

count = cursor.execute("select count(*) from casa4").fetchone()[0]    
ctr = 1
ctr1 = 1000
str1 = ''
while ctr1 <= count:
    sql = "SELECT AccountNo FROM ( \
        SELECT AccountNo, ROW_NUMBER() OVER (ORDER BY Accountno) rownum \
        FROM  casa4 ) seq \
        WHERE seq.rownum BETWEEN " + str(ctr) + " AND " + str(ctr1) + ""
    ctr = ctr1 + 1
    ctr1 = ctr1 + 1000
    cursor.execute(sql)
    sleep(2) #interval in printing of the rows.

for row in cursor:
    str1 = str1 + '|'.join(map(str,row)) + '\n'
print "Records:" + str1 #var in storing the fetched rows from database.
print sql #prints the sql statement(str) and I can see that the var, ctr and ctr1 have incremented correctly. The way I want it.

What I want to achieve is using a messaging queue, RabbitMQ, I will send this rows to another database and I want to speed up the process. Selecting all and sending it to the queue returns an error.

The output of the code is that it returns 1-1000 rows correctly on the 1st but, on the 2nd loop, instead of 1001-2001 rows, it returns 1-2001 rows, 1-3001 and so on.. It always starts on 1.

Solution

I was able to recreate your issue with both pyodbc and pypyodbc. I also tried using

WITH seq (AccountNo, rownum) AS
(
    SELECT AccountNo, ROW_NUMBER() OVER (ORDER BY Accountno) rownum
    FROM casa4
)
SELECT AccountNo FROM seq
WHERE rownum BETWEEN 11 AND 20

When I run that in SSMS I just get rows 11 through 20, but when I run it from Python I get all the rows (starting from 1).

The following code does work using pyodbc. It uses a temporary table named #numbered, and might be helpful in your situation since your process looks like it would do all of its work using the same database connection:

import pyodbc
cnxn = pyodbc.connect("DSN=myDb_SQLEXPRESS")
crsr = cnxn.cursor()
sql = """\
CREATE TABLE #numbered (rownum INT PRIMARY KEY, AccountNo VARCHAR(10))
"""
crsr.execute(sql)
cnxn.commit()
sql = """\
INSERT INTO #numbered (rownum, AccountNo)
SELECT
    ROW_NUMBER() OVER (ORDER BY Accountno) AS rownum,
    AccountNo
FROM casa4
"""
crsr.execute(sql)
cnxn.commit()
sql = "SELECT AccountNo FROM #numbered WHERE rownum BETWEEN ? AND ? ORDER BY rownum"
batchsize = 1000
ctr = 1
while True:
    crsr.execute(sql, [ctr, ctr + batchsize - 1])
    rows = crsr.fetchall()
    if len(rows) == 0:
        break
    print("-----")
    for row in rows:
        print(row)
    ctr += batchsize
cnxn.close()