python regex list regex-lookarounds unicode-string

Python 2.7 code to extract only the strings from a list containing unicode characters, blank spaces and square brackets


I have a list with some unicode strings, empty strings and square brackets:

alist = [u'[', u'', u'I', u'', u'want, want & want', u'', u'only & only', u'', u'this', u'', u'\\n', u'', u']', u'', u'']

How do I modify the above list using Python 2.7 so that it contains only the relevant string items 'I', 'want, want & want', 'only & only', 'this'?

alist = ['I', 'want, want & want', 'only & only', 'this']

Solution

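  • Never mind. I solved this using the code below:
    crumb_list = []
    for breadcrumb in breadcrumbs:
        # Turn escape sequences such as the literal text \n into real characters, then drop non-ASCII characters
        breadcrumb = breadcrumb.decode('unicode_escape').encode('ascii', 'ignore')
        # Convert the HTML entity back into a plain ampersand
        breadcrumb = breadcrumb.replace('&amp;', '&')
        # Keep only the items that are not empty, a bracket, or a newline
        if breadcrumb not in ('', '[', ']', '\n'):
            crumb_list.append(breadcrumb)
    print "Crumb LIST :", crumb_list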
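
For comparison, the same filtering can be sketched without the decode/encode round trip. This is only a minimal sketch, not the code from the answer. It assumes that the list being cleaned is the alist from the question, and that the only items to discard are empty strings, the two brackets, and the literal text \n (a backslash followed by n). The names UNWANTED and clean_items are hypothetical, and the &amp; replacement is an assumption about what the raw data may contain.

```python
# Sample input copied from the question.
alist = [u'[', u'', u'I', u'', u'want, want & want', u'', u'only & only',
         u'', u'this', u'', u'\\n', u'', u']', u'', u'']

# Placeholder items to drop. u'\\n' is the two-character text backslash + n.
# A real newline needs no entry here because strip() reduces it to u''.
UNWANTED = {u'', u'[', u']', u'\\n'}


def clean_items(items):
    """Return the non-placeholder items with surrounding whitespace removed.

    Hypothetical helper, written for illustration only.
    """
    cleaned = []
    for item in items:
        text = item.strip()
        if text in UNWANTED:
            continue
        # Assumption: raw items may carry the HTML entity &amp; instead of &.
        cleaned.append(text.replace(u'&amp;', u'&'))
    return cleaned


result = clean_items(alist)
print(result)
```

On Python 2.7 the items in result are still unicode objects, so they print with a u prefix. If plain byte strings are needed, each item could be passed through .encode('ascii', 'ignore') as in the answer above, at the cost of silently dropping any non-ASCII characters.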