Tags: python, python-3.x, string, split, splice

Slice a string into two chunks of different lengths based on character in Python


So I have a file that looks something like this:

oak
elm
tulip
redbud
birch

/plants/

allium
bellflower
ragweed
switchgrass

All I want to do is split the trees and herbaceous species into two chunks so I can call them separately like this:

print(trees)
oak
elm
tulip
redbud
birch

print(herbs)
allium
bellflower
ragweed
switchgrass

As you can see in the sample data, the two chunks are of unequal length, so I have to split on the divider "/plants/". If I try slicing, the output is still not split into two groups, just separated by blank lines:

for groups in plant_data:
    groups  = groups.strip()
    groups = groups.replace('\n\n', '\n')
    pos = groups.find("/plants/") 
    trees, herbs = (groups[:pos], groups[pos:])
print(trees)
oa
el
tuli
redbu
birc



alliu
bellflowe
ragwee
switchgras

If I try simply splitting, I'm getting lists (which would be okay for my purposes), but they are still not split into the two groups:

for groups in plant_data:
    groups  = groups.strip()
    groups = groups.replace('\n\n', '\n')
    trees = groups.split("/plants/")
print(trees)
['oak']
['elm']
['tulip']
['redbud']
['birch']
['']
['', '']
['']
['allium']
['bellflower']
['ragweed']
['switchgrass']

To remove blank lines, which I thought was the issue, I tried following "How do I remove blank lines from a string in Python?". I also know that a similar question about splitting a string by a character has been asked here: "Python: split a string by the position of a character".

But I'm very confused as to why I can't get these two to split.


Solution

    spam = """oak
    elm
    tulip
    redbud
    birch
    
    /plants/
    
    allium
    bellflower
    ragweed
    switchgrass"""
    
    spam = spam.splitlines()
    idx = spam.index('/plants/')
    trees, herbs = spam[:idx-1], spam[idx+2:]   
    print(trees)
    print(herbs)
    

    Output:

    ['oak', 'elm', 'tulip', 'redbud', 'birch']
    ['allium', 'bellflower', 'ragweed', 'switchgrass']
    

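    Of course, instead of adjusting the indices with idx-1 and idx+2, you can drop the empty strings first (e.g. with a list comprehension), starting again from the original string:

    # spam is the original multi-line string here, not the list built above
    lines = [line for line in spam.splitlines() if line]
    idx = lines.index('/plants/')
    trees, herbs = lines[:idx], lines[idx+1:]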
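
As for why the attempts in the question did not work: the loop there appears to process the data one line at a time, so "/plants/" is missing from almost every line. In that case str.find returns -1, and slicing with [:-1] simply drops the last character, which matches the truncated output shown. A minimal sketch of an alternative, assuming the whole file can be held in memory as one string (the `text` variable below is a stand-in for something like the result of `f.read()`, which is not shown in the question), is to split once on the divider with str.partition:

```python
# Stand-in for the file contents; in practice this might come from f.read().
text = """oak
elm
tulip
redbud
birch

/plants/

allium
bellflower
ragweed
switchgrass
"""

# Why slicing per line failed: find() returns -1 when the divider is absent,
# so line[:-1] silently drops the last character of the line.
assert "oak".find("/plants/") == -1
assert "oak"[:-1] == "oa"

# partition() splits on the first occurrence of the divider and
# returns a 3-tuple: (text before, the divider itself, text after).
before, divider, after = text.partition("/plants/")

# strip() removes the blank lines left on either side of the divider.
trees = before.strip()
herbs = after.strip()

print(trees)
print(herbs)
```

This keeps trees and herbs as strings, so print shows one name per line as in the question. If lists are preferred, calling trees.splitlines() and herbs.splitlines() afterwards gives the same result as the answer above.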