I'm writing a query on Oracle to update a token in a table, so that I can later do some work on the marked rows.
My problem is that I want to limit the rows updated each time both to a specific batch size and to a group-by result.
It's hard for me to explain this in abstract terms, so I'll make it concrete.
Say that I have a table called 'staging' with the following fields: (Id, IdFile, Grouping, Name, Address1, Address2, Address3, Country, Token)
What I'm trying to do is create a query that builds batches from this info.
For each batch, I update the token so that those records are marked for later work. I want one batch per file, and each batch has a size limit, so I'm using rownum to enforce it. The other constraint is the grouping: rows that share a Grouping value have to go into their own batch.
Let's say my batch size is 5. If I have exactly 3 rows with the grouping column filled, I have to create a batch with only those 3 rows.
This is my biggest headache: when I do the group by, I end up with rows with and without a grouping mixed into a single batch. (When rows share a grouping, I also have to check that the destination address is the same for all of them and create a batch with those rows.)
Can you help me understand how I have to create this query?
What I'm using right now is:
update schema.staging
set token = 'token4'
where id in ( select t.id
              from ( select stage.id, stage.grouping
                     from ( select idFile
                            from ( select a.idFile
                                   from schema.staging a
                                   inner join schema.config c on c.id = a.idcfgpriority
                                   where nvl2(a.token, 0, 1) = 1
                                   group by a.idFile, c.order
                                   order by c.order desc ) files
                            where rownum = 1 ) priorityFile
                     inner join schema.staging stage on stage.idFile = priorityFile.idFile
                     where nvl2(stage.token, 0, 1) = 1
                     group by stage.idFile, stage.token, stage.id,
                              grouping sets ( (stage.grouping),
                                              (stage.name, stage.Address1, stage.Address2,
                                               stage.Address3, stage.Country) )
                     order by stage.grouping ) t
              where rownum <= 5 )
For the following data:
| Id | IdFile | Grouping | Name  | Address1 | Address2 | Address3 | Country | Token |
|----|--------|----------|-------|----------|----------|----------|---------|-------|
| 1  | 1      | null     | Name1 | Address1 | Address1 | Address1 | Country | null  |
| 2  | 1      | 1        | Name1 | Address1 | Address1 | Address1 | Country | null  |
| 3  | 1      | 1        | Name1 | Address1 | Address1 | Address1 | Country | null  |
| 4  | 1      | 1        | Name1 | Address1 | Address1 | Address1 | Country | null  |
| 5  | 1      | 2        | Name1 | Address1 | Address1 | Address1 | Country | null  |
| 6  | 1      | 2        | LALAL | XPTO     | Address1 | Address1 | Country | null  |
| 7  | 1      | null     | Name1 | Address1 | Address1 | Address1 | Country | null  |
| 8  | 1      | null     | Name1 | Address1 | Address1 | Address1 | Country | null  |
| 9  | 1      | null     | Name1 | Address1 | Address1 | Address1 | Country | null  |
| 10 | 2      | null     | Name1 | Address1 | Address1 | Address1 | Country | null  |
| 11 | 2      | null     | Name1 | Address1 | Address1 | Address1 | Country | null  |
| 12 | 2      | null     | Name1 | Address1 | Address1 | Address1 | Country | null  |
| 13 | 2      | null     | Name1 | Address1 | Address1 | Address1 | Country | null  |
| 14 | 2      | null     | Name1 | Address1 | Address1 | Address1 | Country | null  |
| 15 | 2      | null     | Name1 | Address1 | Address1 | Address1 | Country | null  |
I would be expecting 6 batches.
batch#1: 1, 7, 8, 9
batch#2: 2, 3, 4
batch#3: 5
batch#4: 6
batch#5: 10, 11, 12, 13, 14
batch#6: 15
Sooo... Does anyone have any ideas?
Cheers
I'm not sure that I fully understand all the requirements in detail, but based on your sample data and desired output, the following query will work:
SELECT dense_rank() over(ORDER BY "IdFile",
                                  "Grouping" nulls FIRST,
                                  "Name" desc,
                                  "Address1", "Address2", "Address3", "Country", "Token",
                                  row_num) batch_num,
       q.*
FROM (SELECT -- split each group into chunks of at most 5 rows
             trunc((row_number() over(PARTITION BY "IdFile", "Grouping", "Name",
                                                   "Address1", "Address2", "Address3",
                                                   "Country", "Token"
                                      ORDER BY "Id") - 1) / 5) row_num,
             s.*
      FROM sample s) q
ORDER BY "Id";
Result:
| BATCH_NUM | ROW_NUM | ID | IDFILE | GROUPING | NAME | ADDRESS1 | ADDRESS2 | ADDRESS3 | COUNTRY | TOKEN |
|-----------|---------|----|--------|----------|-------|----------|----------|----------|---------|--------|
| 1 | 0 | 1 | 1 | (null) | Name1 | Address1 | Address1 | Address1 | Country | (null) |
| 2 | 0 | 2 | 1 | 1 | Name1 | Address1 | Address1 | Address1 | Country | (null) |
| 2 | 0 | 3 | 1 | 1 | Name1 | Address1 | Address1 | Address1 | Country | (null) |
| 2 | 0 | 4 | 1 | 1 | Name1 | Address1 | Address1 | Address1 | Country | (null) |
| 3 | 0 | 5 | 1 | 2 | Name1 | Address1 | Address1 | Address1 | Country | (null) |
| 4 | 0 | 6 | 1 | 2 | LALAL | XPTO | Address1 | Address1 | Country | (null) |
| 1 | 0 | 7 | 1 | (null) | Name1 | Address1 | Address1 | Address1 | Country | (null) |
| 1 | 0 | 8 | 1 | (null) | Name1 | Address1 | Address1 | Address1 | Country | (null) |
| 1 | 0 | 9 | 1 | (null) | Name1 | Address1 | Address1 | Address1 | Country | (null) |
| 5 | 0 | 10 | 2 | (null) | Name1 | Address1 | Address1 | Address1 | Country | (null) |
| 5 | 0 | 11 | 2 | (null) | Name1 | Address1 | Address1 | Address1 | Country | (null) |
| 5 | 0 | 12 | 2 | (null) | Name1 | Address1 | Address1 | Address1 | Country | (null) |
| 5 | 0 | 13 | 2 | (null) | Name1 | Address1 | Address1 | Address1 | Country | (null) |
| 5 | 0 | 14 | 2 | (null) | Name1 | Address1 | Address1 | Address1 | Country | (null) |
| 6 | 1 | 15 | 2 | (null) | Name1 | Address1 | Address1 | Address1 | Country | (null) |
The inner trunc((row_number(...) - 1) / 5) expression is what enforces the batch size: within each group the rows are numbered from 1 and integer-divided by 5, so rows 1-5 get row_num 0 and row 6 gets row_num 1, which is why Id 15 ends up in its own batch. I have played a little with the ordering to fully reproduce the batch numbering you provided; you can safely discard the desc and nulls FIRST parts. Let me know if the query solves your problem.
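If the numbering matches what you need, you can plug it into your update. Here is a minimal sketch of how that could look (assuming your real table schema.staging with unquoted column names, that a null token means "not yet marked" as in your nvl2 filter, and leaving out your idcfgpriority file selection for brevity):

update schema.staging
set token = 'token4'
where id in (
    select id
    from (
        select q.id,
               dense_rank() over (order by q.idFile,
                                           q.grouping nulls first,
                                           q.name desc,
                                           q.address1, q.address2, q.address3, q.country,
                                           q.chunk) batch_num
        from (
            -- chunk splits each group into pieces of at most 5 rows
            select s.id, s.idFile, s.grouping, s.name,
                   s.address1, s.address2, s.address3, s.country,
                   trunc((row_number() over (partition by s.idFile, s.grouping, s.name,
                                                          s.address1, s.address2, s.address3,
                                                          s.country
                                             order by s.id) - 1) / 5) chunk
            from schema.staging s
            where s.token is null   -- same condition as nvl2(token, 0, 1) = 1
        ) q
    )
    where batch_num = 1   -- mark only the first batch on each run
);

Since rows that already have a token are filtered out, each run would mark the current first batch, so repeated executions walk through the batches one at a time.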