What is the correct way to chunk a list of the following form: ["record_a:", "x"*N, "record_b:", "y"*M, ...], i.e. a list where the start of each record is denoted by a string ending in ":", and each record includes all the elements up until the next record? So the following list:
["record_a:", "a", "b", "record_b:", "1", "2", "3", "4"]
would be split into:
[["record_a", "a", "b"], ["record_b", "1", "2", "3", "4"]]
The list contains an arbitrary number of records, and each record contains an arbitrary number of list items (up until the next record begins or there are no more records). How can this be done efficiently?
Use a generator:
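def chunkRecords(records):
    record = []
    for r in records:
        # endswith() is also safe for empty strings, where r[-1] would raise IndexError
        if r.endswith(':'):
            # a new header starts: emit the record collected so far
            if record:
                yield record
            # start the next record with the header name, minus the colon
            record = [r[:-1]]
        else:
            record.append(r)
    # emit the final record, if any
    if record:
        yield record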
Then loop over that:
for record in chunkRecords(records):
    # record is a list
    print(record)
or turn it into a list again:
records = list(chunkRecords(records))
The latter results in:
>>> records = ["record_a:", "a", "b", "record_b:", "1", "2", "3", "4"]
>>> records = list(chunkRecords(records))
>>> records
[['record_a', 'a', 'b'], ['record_b', '1', '2', '3', '4']]
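Because the generator holds only one record in memory at a time, it should work on any iterable of strings, not just a list. Here is a minimal sketch of that: the function is repeated so the snippet is self-contained, with `endswith` as a small variation that tolerates empty strings. The `lines` iterator and the `by_name` dict are hypothetical names used only for illustration.

```python
def chunkRecords(records):
    record = []
    for r in records:
        if r.endswith(':'):
            # a header starts a new record; emit the previous one first
            if record:
                yield record
            record = [r[:-1]]
        else:
            record.append(r)
    if record:
        yield record

# Hypothetical input: an iterator rather than a list, consumed lazily
lines = iter(["record_a:", "a", "b", "record_b:", "1", "2", "3", "4"])

# Split each chunk into its name and its items, keyed by name
by_name = {name: items for name, *items in chunkRecords(lines)}
print(by_name)
# {'record_a': ['a', 'b'], 'record_b': ['1', '2', '3', '4']}
```

This assumes the input starts with a header; any items before the first header would form a chunk without a name, and its first item would be mistaken for one.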