XPath matching wrong node

The XPath expression

//*[h1]

returns different results in Python (with lxml) than in Firebug. My code:

import requests
from lxml import html

url = "http://machinelearningmastery.com/naive-bayes-classifier-scratch-python/"
resp = requests.get(url)
page = html.fromstring(resp.content)

node = page.xpath("//*[h1]")
print node
#[<Element center at 0x7fb42143c7e0>]

But Firebug matches a <header> element, which is what I want.

Why does this happen? How do I make my Python code match <header> too?

Solution

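Your request does not set a User-Agent header (requests falls back to its default python-requests one), and the server apparently answers that with 403 Forbidden. The <center> element you matched therefore comes from the error page, not from the article. Add a User-Agent header to the request and it works as expected: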

In [9]: resp = requests.get(url, headers={"User-Agent": "Test Agent"})

In [10]: page = html.fromstring(resp.content)

In [11]: node = page.xpath("//*[h1]")

In [12]: print node
[<Element header at 0x104ff15d0>]
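To see why the same XPath can yield <center> in one case and <header> in the other, here is a minimal offline sketch. It makes several assumptions: the two markup strings are made-up stand-ins for a typical 403 error page and for the article page (not the site's real responses), and it uses the standard library's xml.etree.ElementTree instead of lxml so that it runs without network access. The helper name parents_of_h1 is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed sample markup, NOT the actual responses from the site.
# Many servers wrap the error heading of a 403 page in <center>.
FORBIDDEN_PAGE = (
    "<html><head><title>403 Forbidden</title></head>"
    "<body><center><h1>403 Forbidden</h1></center></body></html>"
)
ARTICLE_PAGE = (
    "<html><body><article><header><h1>Post title</h1></header>"
    "<p>Body text</p></article></body></html>"
)


def parents_of_h1(markup):
    """Return the tag names of all elements that have an <h1> child."""
    root = ET.fromstring(markup)
    # ".//*[h1]" is the ElementTree spelling of the XPath "//*[h1]".
    return [el.tag for el in root.findall(".//*[h1]")]


print(parents_of_h1(FORBIDDEN_PAGE))  # parent of <h1> on the error page
print(parents_of_h1(ARTICLE_PAGE))    # parent of <h1> on the article page
```

The expression is the same in both calls; only the document differs. In the real script, checking resp.status_code (or calling resp.raise_for_status()) before parsing would surface the 403 right away instead of letting the error page be parsed silently.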
