I'm trying to scrape data from a bitcoin transaction page. I want to parse the page source and get the amount being sent. I'm currently using the BeautifulSoup
library to achieve this. The class that contains the amount being sent is also used by several other elements, so it's hard to navigate to the right one.
This is my code so far:
from bs4 import BeautifulSoup as bfs
import requests
r = requests.get(
'https://blockchair.com/bitcoin/transaction/744cb5177faaa8f3376d09657a68b719e42ab48b60cb78730a31c7bce91fd9c8')
html = r.text
soup = bfs(html, 'html.parser')
result = ''
for item in soup.find_all('span', string='0'):
    result = soup.find_all('span', {'class': 'grey'})
print(result)
This returns:
[<span class="ml-2 grey"><svg aria-hidden="true" class="svg-inline--fa fa-link fa-w-16" data-icon="link" data-prefix="fas" focusable="false" role="img" viewbox="0 0 512 512" xmlns="http://www.w3.org/2000/svg"><path d="M326.612 185.391c59.747 59.809 58.927 155.698.36 214.59-.11.12-.24.25-.36.37l-67.2 67.2c-59.27 59.27-155.699 59.262-214.96 0-59.27-59.26-59.27-155.7 0-214.96l37.106-37.106c9.84-9.84 26.786-3.3 27.294 10.606.648 17.722 3.826 35.527 9.69 52.721 1.986 5.822.567 12.262-3.783 16.612l-13.087 13.087c-28.026 28.026-28.905 73.66-1.155 101.96 28.024 28.579 74.086 28.749 102.325.51l67.2-67.19c28.191-28.191 28.073-73.757 0-101.83-3.701-3.694-7.429-6.564-10.341-8.569a16.037 16.037 0 0 1-6.947-12.606c-.396-10.567 3.348-21.456 11.698-29.806l21.054-21.055c5.521-5.521 14.182-6.199 20.584-1.731a152.482 152.482 0 0 1 20.522 17.197zM467.547 44.449c-59.261-59.262-155.69-59.27-214.96 0l-67.2 67.2c-.12.12-.25.25-.36.37-58.566 58.892-59.387 154.781.36 214.59a152.454 152.454 0 0 0 20.521 17.196c6.402 4.468 15.064 3.789 20.584-1.731l21.054-21.055c8.35-8.35 12.094-19.239 11.698-29.806a16.037 16.037 0 0 0-6.947-12.606c-2.912-2.005-6.64-4.875-10.341-8.569-28.073-28.073-28.191-73.639 0-101.83l67.2-67.19c28.239-28.239 74.3-28.069 102.325.51 27.75 28.3 26.872 73.934-1.155 101.96l-13.087 13.087c-4.35 4.35-5.769 10.79-3.783 16.612 5.864 17.194 9.042 34.999 9.69 52.721.509 13.906 17.454 20.446 27.294 10.606l37.106-37.106c59.271-59.259 59.271-155.699.001-214.959z" fill="currentColor"></path></svg></span>, <span class="ml-2 grey"><svg aria-hidden="true" class="svg-inline--fa fa-copy fa-w-14" data-icon="copy" data-prefix="fas" focusable="false" role="img" viewbox="0 0 448 512" xmlns="http://www.w3.org/2000/svg"><path d="M320 448v40c0 13.255-10.745 24-24 24H24c-13.255 0-24-10.745-24-24V120c0-13.255 10.745-24 24-24h72v296c0 30.879 25.121 56 56 56h168zm0-344V0H152c-13.255 0-24 10.745-24 24v368c0 13.255 10.745 24 24 24h272c13.255 0 24-10.745 24-24V128H344c-13.2 
0-24-10.8-24-24zm120.971-31.029L375.029 7.029A24 24 0 0 0 358.059 0H352v96h96v-6.059a24 24 0 0 0-7.029-16.97z" fill="currentColor"></path></svg></span>, <span class="grey ml-2">(<span>a month ago</span>)</span>, <span class="grey">.00319198</span>, <span class="grey">.00023052</span>, <span class="grey">NO</span>, <span class="grey">.00296146</span>, <span class="grey">.00102453</span>, <span class="grey">.00025613</span>, <span class="grey">.00102453</span>, <span class="grey">NO</span>, <span class="grey">NO</span>, <span class="mr-18 ml-8 grey"><svg aria-hidden="true" class="cursor svg-inline--fa fa-clock fa-w-16" data-icon="clock" data-prefix="far" focusable="false" role="img" viewbox="0 0 512 512" xmlns="http://www.w3.org/2000/svg"><path d="M256 8C119 8 8 119 8 256s111 248 248 248 248-111 248-248S393 8 256 8zm0 448c-110.5 0-200-89.5-200-200S145.5 56 256 56s200 89.5 200 200-89.5 200-200 200zm61.8-104.4l-84.9-61.7c-3.1-2.3-4.9-5.9-4.9-9.7V116c0-6.6 5.4-12 12-12h32c6.6 0 12 5.4 12 12v141.7l66.8 48.6c5.4 3.9 6.5 11.4 2.6 16.8L334.6 349c-3.9 5.3-11.4 6.5-16.8 2.6z" fill="currentColor"></path></svg></span>, <span class="grey">.00319198</span>, <span class="grey">.00031492</span>, <span class="ml-18 mr-8 grey"><svg aria-hidden="true" class="cursor svg-inline--fa fa-clock fa-w-16" data-icon="clock" data-prefix="far" focusable="false" role="img" viewbox="0 0 512 512" xmlns="http://www.w3.org/2000/svg"><path d="M256 8C119 8 8 119 8 256s111 248 248 248 248-111 248-248S393 8 256 8zm0 448c-110.5 0-200-89.5-200-200S145.5 56 256 56s200 89.5 200 200-89.5 200-200 200zm61.8-104.4l-84.9-61.7c-3.1-2.3-4.9-5.9-4.9-9.7V116c0-6.6 5.4-12 12-12h32c6.6 0 12 5.4 12 12v141.7l66.8 48.6c5.4 3.9 6.5 11.4 2.6 16.8L334.6 349c-3.9 5.3-11.4 6.5-16.8 2.6z" fill="currentColor"></path></svg></span>, <span class="grey">.00264654</span>, <span class="ml-18 mr-8 grey"><svg aria-hidden="true" class="cursor svg-inline--fa fa-clock fa-w-16" data-icon="clock" data-prefix="far" focusable="false" 
role="img" viewbox="0 0 512 512" xmlns="http://www.w3.org/2000/svg"><path d="M256 8C119 8 8 119 8 256s111 248 248 248 248-111 248-248S393 8 256 8zm0 448c-110.5 0-200-89.5-200-200S145.5 56 256 56s200 89.5 200 200-89.5 200-200 200zm61.8-104.4l-84.9-61.7c-3.1-2.3-4.9-5.9-4.9-9.7V116c0-6.6 5.4-12 12-12h32c6.6 0 12 5.4 12 12v141.7l66.8 48.6c5.4 3.9 6.5 11.4 2.6 16.8L334.6 349c-3.9 5.3-11.4 6.5-16.8 2.6z" fill="currentColor"></path></svg></span>, <span class="bold grey">NO</span>]
The expected output is:
.00319198
What should I do?
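You can find the "amount sent" directly with a CSS selector: .mr-2 span span.grey
Read the selector from right to left: "find a span with the class grey, inside another span, inside an element with the class mr-2". Because select_one() returns only the first match, you get a single element instead of the whole list of grey spans.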
import requests
from bs4 import BeautifulSoup
url = "https://blockchair.com/bitcoin/transaction/744cb5177faaa8f3376d09657a68b719e42ab48b60cb78730a31c7bce91fd9c8"
soup = BeautifulSoup(requests.get(url).content, "html.parser")
print(soup.select_one(".mr-2 span span.grey").text)
Output:
.00319198
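If you want to see how the descendant selector narrows things down without hitting the live site, here is a minimal offline sketch. The HTML snippet is hypothetical (loosely modelled on the spans in the question's output, not copied from the real page), and first_text is just an illustrative helper name. It also guards against select_one() returning None, which is what happens when nothing matches, for example if the page layout changes.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real page is more complex.
html = """
<div class="mr-2">
  <span>
    <span>0</span><span class="grey">.00319198</span>
  </span>
</div>
<div class="other">
  <span><span class="grey">.00023052</span></span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def first_text(soup, selector):
    """Return the stripped text of the first match, or None if nothing matches."""
    tag = soup.select_one(selector)  # select_one() gives None when there is no match
    return tag.get_text(strip=True) if tag is not None else None


# Both grey spans share the class, but only one sits under .mr-2.
amount = first_text(soup, ".mr-2 span span.grey")
missing = first_text(soup, ".no-such-class span.grey")

print(amount)   # .00319198
print(missing)  # None
```

Checking for None before reading .text avoids an AttributeError when the selector stops matching.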