I need to extract a list of accession numbers. I am a PhD student in biology working with the GEO database on the NCBI website, which gives me gene datasets. Each dataset has an accession number, generally starting with "GSE" followed by digits. I would like to extract the list of accession numbers shown on the page after my search.
Here is a screenshot of what I would like to extract (highlighted in yellow), from the page: https://www.ncbi.nlm.nih.gov/gds/?term=brain.
Is it possible to extract it by writing a script in the console when I use Inspect Element? Or is there any other way?
Sorry if I have used any of these terms incorrectly; I am not a developer.
Thank you for your help!
That's rather easy. If we look at the HTML of that page, we can see that the "Series Accession: ... ID: ..." results are wrapped in <div> elements with the CSS class resc.
To obtain those:
Array.from(document.getElementsByClassName("resc"))
Looking further, the actual results are wrapped in a pair of <dd> elements, where the first element holds the accession number.
So it's just a matter of going over all the <div class="resc"> elements and logging the text of the first <dd> element inside each one, which can be retrieved using the .innerText property.
Executing the following line will output all the numbers to the console:
Array.from(document.getElementsByClassName("resc")).forEach(result => {console.log(result.getElementsByTagName("dd")[0].innerText)})
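If you would rather get one clean list to copy and paste instead of separate console lines, the same idea can be sketched as below. This is only a rough variant, not a tested recipe for the GEO page: the helper name extractAccessions is made up for illustration, and the pattern assumes you only want IDs of the form "GSE" followed by digits, so other record types would be skipped.

```javascript
// Hypothetical helper (not part of any library): collect unique
// GSE-style accession IDs from an array of text strings.
function extractAccessions(texts) {
  const seen = new Set();
  for (const text of texts) {
    // Assumption: an accession is "GSE" followed by one or more digits.
    const matches = text.match(/\bGSE\d+\b/g) || [];
    matches.forEach(id => seen.add(id));
  }
  // A Set keeps insertion order, so the page order is preserved.
  return Array.from(seen);
}

// Only runs where a page is available, e.g. the browser console.
if (typeof document !== "undefined") {
  const texts = Array.from(document.getElementsByClassName("resc"))
    .map(result => result.getElementsByTagName("dd")[0])
    .filter(dd => dd !== undefined) // skip results without a <dd>
    .map(dd => dd.innerText);
  // One accession per line, easy to copy into a spreadsheet.
  console.log(extractAccessions(texts).join("\n"));
}
```

Paste the whole block into the console on the results page. It only sees the results currently displayed, so repeat it on each page of results if your search spans several.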