I have an RDD (created from a text file), and I am creating another RDD from it by filtering for lines that contain a given word.
rdd2 = rdd1.filter(lambda x: word in x)
word is a string generated in a for loop, so I will be searching rdd1 for several words in a loop. For example, if word is 'ebook', filtering rdd1 returns all the lines containing ebook, but it also returns lines containing 'ebooks'.
How can I filter an RDD for an exact word match? rdd2 should contain only the lines with the exact word ebook, not ebooks.
I need to create an intermediate RDD for further processing. Please help.
rdd2 = rdd1.filter(lambda x: word in x.split())
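Using x.split() worked for the exact word match: word in x is a substring test on the whole line, whereas word in x.split() checks whether word equals one of the line's whitespace-separated tokens.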
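
To make the difference concrete, here is a minimal sketch in plain Python (no Spark needed) that compares the substring test with the split-based test on a few made-up lines. The helper names contains_word and contains_word_regex and the sample lines are hypothetical, introduced only for illustration. The regex variant is an alternative that was not part of the original answer; it may be useful if the word can be followed by punctuation, since split() only separates on whitespace.

```python
import re

def contains_word(line, word):
    """True if `word` equals one of the whitespace-separated tokens in `line`."""
    return word in line.split()

def contains_word_regex(line, word):
    """True if `word` occurs in `line` delimited by regex word boundaries.

    Assumes `word` starts and ends with a letter, digit, or underscore,
    otherwise the \\b anchors will not behave as intended.
    """
    return re.search(r"\b" + re.escape(word) + r"\b", line) is not None

# Hypothetical sample data standing in for the lines of rdd1.
lines = ["buy this ebook today", "ebooks on sale", "a great ebook."]
word = "ebook"

# Substring test: also matches "ebooks".
substring_hits = [l for l in lines if word in l]

# Token test: matches only lines where "ebook" is a standalone token,
# so "ebook." (with trailing punctuation) is not matched.
split_hits = [l for l in lines if contains_word(l, word)]

# Word-boundary test: matches "ebook" and "ebook." but not "ebooks".
regex_hits = [l for l in lines if contains_word_regex(l, word)]

print(substring_hits)
print(split_hits)
print(regex_hits)
```

Either helper could presumably be passed to the filter in place of the inline expression, for example rdd1.filter(lambda x: contains_word(x, word)).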