python list-comprehension

Defining a nested variable within a Python comprehension for use within that comprehension


I have over a thousand files with remarks in HTML format. Some of them have leading spaces, some have extra spaces between words, and there is one specific remark that appears often and that I want to exclude.

I have written a function, strip_tags(), to strip the HTML tags. This accomplishes what I want:

stripped_remarks = [" ".join(strip_tags(rem).split()) for rem in remarks]  # removes extra spaces and HTML tags
stripped_remarks = [rem for rem in stripped_remarks if rem != r'garbage text ***']  # removes the garbage remark from the list

I can make this one line by changing the "if rem" part so it strips the spaces and html tags like it does before "for", but that seems to do the work twice when it's not necessary. Is it possible to do something like this?

stripped_remarks = [" ".join(strip_tags(rem).split()) as strip_rem for rem in remarks if strip_rem != r'garbage text ***']

By defining strip_rem within the comprehension and reusing it for my conditional, I could easily make this one line without stripping the extra spaces or html tags twice. But is it possible?


Solution

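  • Yes, as of Python 3.8. The 'walrus operator' (:=, an assignment expression) lets you bind the cleaned string inside the condition and reuse it as the output expression. The if clause is evaluated before the output expression, so strip_rem is already set by the time it is used: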

    stripped_remarks = [strip_rem for rem in remarks if (strip_rem := " ".join(strip_tags(rem).split())) != r'garbage text ***']
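
To try this outside the original project, here is a minimal runnable sketch. The strip_tags() below is only a hypothetical stand-in (a naive regex, not the asker's function and not a real HTML parser), and the sample remarks are made up for illustration. It also shows one way to get the same result on Python versions before 3.8, by filtering a generator of already-cleaned remarks.

```python
import re


def strip_tags(text):
    # Hypothetical stand-in for the asker's strip_tags():
    # a naive regex that drops anything that looks like a tag.
    return re.sub(r"<[^>]+>", "", text)


GARBAGE = "garbage text ***"

# Made-up sample data, for illustration only.
remarks = [
    "  <p>first   remark</p>",
    "<b>garbage</b>   text ***",
    "second <i>remark</i>  ",
]

# Python 3.8+: the condition binds strip_rem, and the output
# expression reuses it, so each remark is cleaned only once.
walrus = [
    strip_rem
    for rem in remarks
    if (strip_rem := " ".join(strip_tags(rem).split())) != GARBAGE
]

# Any Python 3: clean the remarks in an inner generator,
# then filter the cleaned values in the outer comprehension.
nested = [
    strip_rem
    for strip_rem in (" ".join(strip_tags(rem).split()) for rem in remarks)
    if strip_rem != GARBAGE
]

print(walrus)
print(nested)
```

Both lists come out the same. Note that a name bound with := inside a comprehension lives in the enclosing scope, so strip_rem still exists (holding the last cleaned remark) after the comprehension finishes.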