I am getting some odd behavior that I do not quite understand. I am hoping someone can explain what is going on.
Consider this metadata:
<meta property="og:title" content="This is the Tesla Semi truck">
<meta name="twitter:title" content="This is the Tesla Semi truck">
This line successfully finds ALL "og" properties and returns a list.
opengraphs = doc.html.head.findAll(property=re.compile(r'^og'))
However, this line fails to do the same thing for the twitter cards.
twitterCards = doc.html.head.findAll(name=re.compile(r'^twitter'))
Why does the first line find all the "og" (Open Graph) tags, while the second line fails to find the twitter cards?
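The problem is the name= keyword.
It has a special meaning in findAll(): it is the method's first parameter and matches the tag name, not an attribute called "name". In your code the tag name is meta, so name=re.compile(r'^twitter') looks for tags whose name starts with "twitter" and finds none. property= works because it is not a parameter of findAll(), so it is treated as an attribute filter.
To filter on the name attribute, pass "meta" as the tag name
and put "name" in a dictionary as the second argument (attrs).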
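Applied to the metadata from the question, that could look like the minimal sketch below. It assumes the two meta tags sit inside a regular html/head wrapper (added here only so that doc.html.head resolves) and uses find_all, the current spelling of findAll.

```python
import re
from bs4 import BeautifulSoup

# The wrapper markup is an assumption; only the two meta tags come from the question.
html = '''
<html><head>
<meta property="og:title" content="This is the Tesla Semi truck">
<meta name="twitter:title" content="This is the Tesla Semi truck">
</head></html>
'''
doc = BeautifulSoup(html, 'html.parser')

# 'meta' is the tag name; attrs filters on the attribute that is literally called "name".
twitter_cards = doc.html.head.find_all('meta', attrs={'name': re.compile(r'^twitter')})

for tag in twitter_cards:
    print(tag['name'], '->', tag['content'])
```

Passing the filter through attrs avoids the clash with the name parameter entirely.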
Example with different items.
from bs4 import BeautifulSoup
import re
data='''
<meta property="og:title" content="This is the Tesla Semi truck">
<meta property="twitter:title" content="This is the Tesla Semi truck">
<meta name="twitter:title" content="This is the Tesla Semi truck">
'''
head = BeautifulSoup(data, 'html.parser')  # name the parser explicitly to avoid the "no parser specified" warning
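print(head.findAll(property=re.compile(r'^og'))) # OK: property attribute starts with "og"
print(head.findAll(property=re.compile(r'^tw'))) # OK: property attribute starts with "tw"
print(head.findAll(name=re.compile(r'^meta'))) # OK: name= matches the tag name "meta"
print(head.findAll(name=re.compile(r'^tw'))) # empty: no tag name starts with "tw"
print(head.findAll('meta', {'name': re.compile(r'^tw')})) # OK: name attribute starts with "tw"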
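If you prefer CSS selectors, the same filtering can be sketched with select(), which sidesteps the name= clash because the attribute is part of the selector string. This is an alternative, not something the question requires; it assumes a Beautiful Soup version with select() support for attribute prefix selectors (4.7 or later, which uses the soupsieve package).

```python
from bs4 import BeautifulSoup

data = '''
<meta property="og:title" content="This is the Tesla Semi truck">
<meta property="twitter:title" content="This is the Tesla Semi truck">
<meta name="twitter:title" content="This is the Tesla Semi truck">
'''
soup = BeautifulSoup(data, 'html.parser')

# [attr^="value"] matches attributes whose value starts with the given prefix.
by_name = soup.select('meta[name^="twitter"]')
by_property = soup.select('meta[property^="og"]')

print(by_name)
print(by_property)
```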