I am trying to search through the following list:
/for_sale/44.97501,46.22024,-124.82303,-123.01166_xy/0-150000_price/LOT%7CLAND_type/9_zm/2_p/
/for_sale/44.97501,46.22024,-124.82303,-123.01166_xy/0-150000_price/LOT%7CLAND_type/9_zm/3_p/
/for_sale/44.97501,46.22024,-124.82303,-123.01166_xy/0-150000_price/LOT%7CLAND_type/9_zm/6_p/
/for_sale/44.97501,46.22024,-124.82303,-123.01166_xy/0-150000_price/LOT%7CLAND_type/9_zm/7_p/
/for_sale/44.97501,46.22024,-124.82303,-123.01166_xy/0-150000_price/LOT%7CLAND_type/9_zm/8_p/
/for_sale/44.97501,46.22024,-124.82303,-123.01166_xy/0-150000_price/LOT%7CLAND_type/9_zm/2_p/
using this code:
next_page = re.compile(r'/(\d+)_p/$')
matches = list(filter(next_page.search, href_search)) #search or .match
for match in matches:
    #refining_nextpage = re.compile()
    print(match.group())
and am getting the following error: AttributeError: 'str' object has no attribute 'group'.
I thought that the parentheses around the \d+ would group the one or more digits. My goal is to get the number preceding "_p/" at the end of the string.
You are filtering your original list, so filter returns the original strings, not the match objects: the search result is only used to decide which strings to keep. If you want the match objects, you need to map the search over the list, then filter out the None results from strings that did not match. For example:
import re

next_page = re.compile(r'/(\d+)_p/$')
# map() yields a match object (or None) per string; filter() drops the Nones
matches = filter(lambda m: m is not None, map(next_page.search, href_search))
for match in matches:
    print(match.group())
Output:
/2_p/
/3_p/
/6_p/
/7_p/
/8_p/
/2_p/
If you only want the number part of the match, use match.group(1) instead of match.group().
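As a rough, self-contained sketch of that last point, here is one way to collect just the page numbers with group(1). The href_search contents are copied from the question; the page_numbers name and the conversion to int are my own assumptions for illustration, not something the question asked for.

```python
import re

# Sample data copied from the question.
prefix = "/for_sale/44.97501,46.22024,-124.82303,-123.01166_xy/0-150000_price/LOT%7CLAND_type/9_zm/"
href_search = [prefix + page for page in ("2_p/", "3_p/", "6_p/", "7_p/", "8_p/", "2_p/")]

next_page = re.compile(r'/(\d+)_p/$')

# page_numbers is a hypothetical name; int() is an assumed convenience.
page_numbers = []
for href in href_search:
    match = next_page.search(href)
    if match is not None:
        # group(1) is the text captured by the parentheses, i.e. the digits only
        page_numbers.append(int(match.group(1)))

print(page_numbers)  # [2, 3, 6, 7, 8, 2]
```

A plain loop avoids the filter/map/lambda combination and keeps each match object next to the string it came from, which can be handy if you also need the original href.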