I wonder whether Bio.Entrez's efetch() retrieves all metadata of a PubMed article, given a PMID as input. By "all metadata", I mean: does PubMed hold any metadata beyond what efetch() retrieves?
For example, for PMID 23954024, efetch() retrieves an abstract that contains a bit less information than the abstract on PubMed's website (http://www.ncbi.nlm.nih.gov/pubmed/23954024):
efetch():
"AbstractText": [
"Rotator cuff tendinopathy is a common source of shoulder pain characterised by persistent and/or recurrent problems for a proportion of sufferers. The aim of this study was to pilot the methods proposed to conduct a substantive study to evaluate the effectiveness of a self-managed loaded exercise programme versus usual physiotherapy treatment for rotator cuff tendinopathy.",
"A single-centre pragmatic unblinded parallel group pilot randomised controlled trial.",
"One private physiotherapy clinic, northern England.",
"Twenty-four participants with rotator cuff tendinopathy.",
"The intervention was a programme of self-managed loaded exercise. The control group received usual physiotherapy treatment.",
"Baseline assessment comprised the Shoulder Pain and Disability Index (SPADI) and the Short-Form 36, repeated three months post randomisation.",
"The recruitment target was met and the majority of participants (98%) were willing to be randomised. 100% retention was attained with all participants completing the SPADI at three months. Exercise adherence rates were excellent (90%). The mean change in SPADI score was -23.7 (95% CI -14.4 to -33.3) points for the self-managed exercise group and -19.0 (95% CI -6.0 to -31.9) points for the usual physiotherapy treatment group. The difference in three month SPADI scores was 0.1 (95% CI -16.6 to 16.9) points in favour of the usual physiotherapy treatment group.",
"In keeping with previous research which indicates the need for further evaluation of self-managed loaded exercise for rotator cuff tendinopathy, these methods and the preliminary evaluation of outcome offer a foundation and stimulus to conduct a substantive study."
],
http://www.ncbi.nlm.nih.gov/pubmed/23954024:
Abstract
OBJECTIVES:
Rotator cuff tendinopathy is a common source of shoulder pain characterised by persistent and/or recurrent problems for a proportion of sufferers. The aim of this study was to pilot the methods proposed to conduct a substantive study to evaluate the effectiveness of a self-managed loaded exercise programme versus usual physiotherapy treatment for rotator cuff tendinopathy.
DESIGN:
A single-centre pragmatic unblinded parallel group pilot randomised controlled trial.
SETTING:
One private physiotherapy clinic, northern England.
PARTICIPANTS:
Twenty-four participants with rotator cuff tendinopathy.
INTERVENTIONS:
The intervention was a programme of self-managed loaded exercise. The control group received usual physiotherapy treatment.
MAIN OUTCOMES:
Baseline assessment comprised the Shoulder Pain and Disability Index (SPADI) and the Short-Form 36, repeated three months post randomisation.
RESULTS:
The recruitment target was met and the majority of participants (98%) were willing to be randomised. 100% retention was attained with all participants completing the SPADI at three months. Exercise adherence rates were excellent (90%). The mean change in SPADI score was -23.7 (95% CI -14.4 to -33.3) points for the self-managed exercise group and -19.0 (95% CI -6.0 to -31.9) points for the usual physiotherapy treatment group. The difference in three month SPADI scores was 0.1 (95% CI -16.6 to 16.9) points in favour of the usual physiotherapy treatment group.
CONCLUSIONS:
In keeping with previous research which indicates the need for further evaluation of self-managed loaded exercise for rotator cuff tendinopathy, these methods and the preliminary evaluation of outcome offer a foundation and stimulus to conduct a substantive study.
(The section labels OBJECTIVES, DESIGN, SETTING, etc. are missing from efetch()'s abstract.)
What other metadata does efetch() miss, and is there any way to programmatically retrieve the missing information?
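The information is not missing. The section labels are stored as attributes of each AbstractText element in the XML that efetch() returns: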
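from Bio import Entrez

Entrez.email = "[email protected]"
# For PubMed, the XML format is normally requested with retmode rather than rettype
handle = Entrez.efetch(db="pubmed", id="23954024", retmode="xml")
print(handle.read())  # raw XML, including the element attributes
handle.close()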
Part of the output:
<Abstract>
<AbstractText Label="OBJECTIVES" NlmCategory="OBJECTIVE">Rotator cuff tendinopathy is a common source of shoulder pain characterised by persistent and/or recurrent problems for a proportion of sufferers. The aim of this study was to pilot the methods proposed to conduct a substantive study to evaluate the effectiveness of a self-managed loaded exercise programme versus usual physiotherapy treatment for rotator cuff tendinopathy.</AbstractText>
<AbstractText Label="DESIGN" NlmCategory="METHODS">A single-centre pragmatic unblinded parallel group pilot randomised controlled trial.</AbstractText>
<AbstractText Label="SETTING" NlmCategory="METHODS">One private physiotherapy clinic, northern England.</AbstractText>
[...]
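To get the labels back programmatically, one option is to parse the raw XML yourself. The following is a minimal sketch using only the standard library; `SAMPLE_XML` is a truncated copy of the output above and `labelled_abstract` is a hypothetical helper name, not part of Biopython. It assumes the element and attribute names shown in the output (`AbstractText`, `Label`).

```python
import xml.etree.ElementTree as ET

# Truncated sample copied from the efetch() output above. A real response
# nests <Abstract> deeper inside the document (assumption: the element and
# attribute names are the same as in this fragment).
SAMPLE_XML = """<Abstract>
<AbstractText Label="DESIGN" NlmCategory="METHODS">A single-centre pragmatic unblinded parallel group pilot randomised controlled trial.</AbstractText>
<AbstractText Label="SETTING" NlmCategory="METHODS">One private physiotherapy clinic, northern England.</AbstractText>
</Abstract>"""


def labelled_abstract(xml_text):
    """Return a list of (label, text) pairs, one per AbstractText element."""
    root = ET.fromstring(xml_text)
    sections = []
    # iter() walks the whole tree, so the nesting depth does not matter
    for node in root.iter("AbstractText"):
        label = node.get("Label")  # None when the abstract is unstructured
        text = "".join(node.itertext()).strip()
        sections.append((label, text))
    return sections


for label, text in labelled_abstract(SAMPLE_XML):
    prefix = f"{label}: " if label else ""
    print(prefix + text)
```

Passing the result of `handle.read()` to the helper instead of `SAMPLE_XML` should work the same way. As far as I know, the elements returned by Biopython's `Entrez.read()` also keep these XML attributes in an `.attributes` dictionary, but that route is not shown here.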