Splitting a list element by a separator | 'Can't convert 'list' object to str implicitly' error (Python)

I have a list json_data:

> print(json_data)
> ['abc', 'bcd/chg', 'sdf', 'bvd', 'wer/ewe', 'sbc & osc']

I need to split every element that contains '/', '&' or 'and' into separate elements. The result I am looking for should look like this:

> ['abc', 'bcd', 'chg', 'sdf', 'bvd', 'wer', 'ewe', 'sbc', 'osc']

The code is:

separators = ['/', 'and', '&']

titles = []
for i in json_data:
    titles.extend([t.strip() for t in i.split(separators)
                  if i.strip() != ''])

When running it, I am getting an error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-15-d0db85078f05> in <module>()
      5 titles = []
      6 for i in json_data:
----> 7     titles.extend([t.strip() for t in i.split(separators)
      8                   if i.strip() != ''])

TypeError: Can't convert 'list' object to str implicitly

How can this be fixed?

Solution

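str.split() accepts a single string as its separator, not a list, which is why passing separators raises the TypeError. To split on several separators at once, regex is your friend: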

>>> import re
>>> pat = re.compile("[/&]|and")
>>> json_data = ['abc', 'bcd/chg', 'sdf', 'bvd', 'wer/ewe', 'sbc & osc']
>>> titles = []
>>> for i in json_data:
...   titles.extend([x.strip() for x in pat.split(i)])
... 
>>> titles
['abc', 'bcd', 'chg', 'sdf', 'bvd', 'wer', 'ewe', 'sbc', 'osc']

The pattern re.compile("[/&]|and") means "create a regular expression matching either [/&] or the literal text 'and'". The character class [/&] matches a single / or &. With that in hand, pat.split(i) splits the string i wherever pat matches.
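If the separators already live in a list, as in the question, the pattern can be built from that list instead of being written by hand. The following is a minimal sketch; the helper name split_on_any is made up for illustration. Like the pattern above, it also splits inside any word that happens to contain "and".

```python
import re

def split_on_any(text, separators):
    """Split text on any of the given literal separators and strip each piece.

    split_on_any is a hypothetical helper, not part of the original answer.
    """
    # re.escape makes each separator match literally;
    # "|" joins the escaped separators into a single alternation.
    pattern = re.compile("|".join(re.escape(s) for s in separators))
    return [piece.strip() for piece in pattern.split(text)]

json_data = ['abc', 'bcd/chg', 'sdf', 'bvd', 'wer/ewe', 'sbc & osc']
separators = ['/', 'and', '&']

titles = []
for item in json_data:
    titles.extend(split_on_any(item, separators))

print(titles)
```

Using re.escape means a separator such as '.' or '|' would be treated as plain text rather than as regex syntax.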

LATE EDIT: We can skip the strip() step by complicating the regex a little. The regex r"\s*[/&]\s*|\s+and\s+" absorbs any whitespace around / or & and requires whitespace on both sides of "and". Splitting on this pattern therefore removes the whitespace next to each separator, and in addition it prevents us from splitting in the middle of a word like "sandwich", should that happen to appear in our data. The r prefix makes it a raw string, so the backslashes reach the regex engine unchanged:

>>> pat = re.compile(r"\s*[/&]\s*|\s+and\s+")
>>> pat.split("beans and rice and sandwiches")
['beans', 'rice', 'sandwiches']
>>>

This simplifies the construction of the list: we no longer need to strip whitespace from the results of the split, which incidentally saves us a pass over each result. Whitespace at the very start or end of an element is not next to a separator, so it is left in place. Given the new pattern, we can write it this way:

>>> titles = []
>>> for i in json_data:
...   titles.extend(pat.split(i))
...
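Putting the pieces together, here is a minimal, self-contained sketch of the whole approach. The names SEPARATOR and split_titles are made up for illustration. The pattern assumed here allows optional whitespace around / and & (so 'bcd/chg' still splits) and requires whitespace around the word "and". Each element is also stripped once up front, and empty pieces are dropped, which is what the `if i.strip() != ''` filter in the question appears to be aiming for.

```python
import re

# Optional whitespace around '/' or '&'; required whitespace around the word 'and'.
# SEPARATOR and split_titles are illustrative names, not from the original answer.
SEPARATOR = re.compile(r"\s*[/&]\s*|\s+and\s+")

def split_titles(items):
    """Split every string in items on '/', '&' or the word 'and'."""
    titles = []
    for item in items:
        # strip() removes leading/trailing whitespace, which the pattern
        # cannot remove because it is not adjacent to a separator.
        pieces = SEPARATOR.split(item.strip())
        # Skip empty strings, e.g. from an empty element or a trailing separator.
        titles.extend(piece for piece in pieces if piece)
    return titles

json_data = ['abc', 'bcd/chg', 'sdf', 'bvd', 'wer/ewe', 'sbc & osc']
print(split_titles(json_data))
print(split_titles(['beans and rice and sandwiches', ' a / b ', '']))
```

The first call should reproduce the list requested in the question; the second shows that "sandwiches" stays intact and that surrounding whitespace and empty elements do not leak into the result.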