python list limit chunks slice

Splitting up a Python list into chunks based on length of items


I see a few great posts here on how to split Python lists into chunks, such as how to split an iterable in constant-size chunks. Most posts deal with dividing the list into equal-size chunks, or with joining all the strings in the list together and then limiting the result with normal slice routines.

However, I needed to do something similar based on a character limit: given a list of sentences, group them into chunks without truncating any item in the list.

I was able to churn out some code here:

def _splicegen(maxchars, stringlist):
    """
    Return a list of slices to print based on maxchars string-length boundary.
    """
    count = 0  # start at 0
    slices = []  # master list to append slices to.
    tmpslices = []  # tmp list where we append slice numbers.

    for i, each in enumerate(stringlist):
        itemlength = len(each)
        runningcount = count + itemlength
        if runningcount <= int(maxchars):  # an exact fit still belongs in the current slice
            count = runningcount
            tmpslices.append(i)
        elif runningcount > int(maxchars):
            slices.append(tmpslices)
            tmpslices = []
            count = itemlength
            tmpslices.append(i)
        if i == len(stringlist) - 1:
            slices.append(tmpslices)
    return slices

The output should look something like: Slices is: [[0, 1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12, 13], [14, 15, 16, 17, 18, 19, 20]] (each number references an item in stringlist).

So, as I iterate over this list of lists, I can use something like "".join(stringlist[i] for i in each) to print items 0 through 6 on one line, items 7 through 13 on another, and so on. Sometimes a list might hold only 2 items because each of those two items is very long (together they would still add up to under the limit of 380 characters or whatever).
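
As a minimal sketch of that joining step: the sample `stringlist` and the hand-written index groups below are made up for illustration and are not real output from the function.

```python
# Hypothetical sample data; the real stringlist would hold the sentences.
stringlist = ["alpha ", "beta ", "gamma ", "delta ", "epsilon"]

# Index groups in the same shape that _splicegen returns (written by hand here).
slices = [[0, 1], [2, 3], [4]]

# Look each index up in stringlist and join every group into one line.
lines = ["".join(stringlist[i] for i in group) for group in slices]

for line in lines:
    print(line)
```

Each inner list holds indices rather than strings, so the lookup into `stringlist` has to happen before the join.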

I know that the code is pretty bad and that I should use a generator. I'm just not sure how to do this.

Thanks.


Solution

  • Something like this should work:

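    def _splicegen(maxchars, stringlist):
        """
        Yield lists of indices into stringlist, grouped so that the combined
        length of each group's strings stays within maxchars.
        """
        maxchars = int(maxchars)
        runningcount = 0  # combined length of the items in the current group
        tmpslice = []  # indices collected for the current group
        for i, item in enumerate(stringlist):
            runningcount += len(item)
            if runningcount <= maxchars:
                tmpslice.append(i)
            else:
                if tmpslice:  # skip the empty group when the very first item is over the limit
                    yield tmpslice
                tmpslice = [i]
                runningcount = len(item)
        if tmpslice:  # emit the final, possibly partial, group
            yield tmpslice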
    

    Also see the textwrap module in the standard library.
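
To compare the two routes, here is a rough sketch rather than a definitive implementation. `chunk_strings` is a hypothetical helper (not part of the answer above) that applies the same greedy idea but yields the joined text directly; `textwrap.wrap` is the standard-library call, used here with `break_long_words=False`. The sample sentences and the limit of 20 are made up.

```python
import textwrap


def chunk_strings(maxchars, stringlist):
    """Yield joined chunks of whole items, each at most maxchars long
    (a single item longer than maxchars is yielded on its own)."""
    chunk, length = [], 0
    for item in stringlist:
        # Close the current chunk if adding this item would exceed the limit.
        if chunk and length + len(item) > maxchars:
            yield "".join(chunk)
            chunk, length = [], 0
        chunk.append(item)
        length += len(item)
    if chunk:  # emit whatever is left over
        yield "".join(chunk)


sentences = ["One. ", "Two two. ", "Three three three. ", "Four."]

# Greedy grouping that never splits an item.
chunks = list(chunk_strings(20, sentences))

# textwrap works on one string and may break at any whitespace.
wrapped = textwrap.wrap("".join(sentences), width=20, break_long_words=False)

for line in chunks:
    print(repr(line))
for line in wrapped:
    print(repr(line))
```

Note the difference: `textwrap.wrap` breaks on whitespace between words, so a sentence can end up split across two lines, whereas the generator keeps every list item intact. Which one fits depends on whether the items must stay whole.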