I am currently struggling to create a program that scrapes data from the table on https://coinmarketcap.com. I see that I am in a bit over my head. However, I am trying to learn how it all works to be able to do it on my own. So far, my program prints each cryptocurrency's rank, name, and ticker symbol. Now, I am working to scrape the dynamically changing price from the table. Here is my code:
import requests
from bs4 import BeautifulSoup
url = "https://coinmarketcap.com"
soup = BeautifulSoup(requests.get(url).content, "html.parser")
rank = 1
for td in soup.select("td:nth-of-type(3)"):
    t = " ".join(tag.text for tag in td.select("p, span")).strip()
    print(rank, "|", end =" "); print("{:<30} {:<10}".format(*t.rsplit(maxsplit=1)))
    rank = rank + 1
    for td in soup.select("td:nth-of-type(4)"):
        t = " ".join(tag.text for tag in td.select("a")).strip()
    print("{}_1d".format(t.rsplit(maxsplit=1)))
This prints as follows:
1 | Bitcoin BTC
[]_1d
2 | Ethereum ETH
[]_1d
3 | Tether USDT
[]_1d
4 | Binance Coin BNB
[]_1d
and so on...
How can I have it print the current price of the crypto and not just literal text? I can figure out the formatting on my own, just need help displaying the actual data. Any help is greatly appreciated. And if you can explain your solution, that would be even more helpful.
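I found the following issue in your code. The line
print("{}_1d".format(t.rsplit(maxsplit=1)))
is outside the inner for loop, so it only prints the last value of t, which is empty. Calling rsplit on an empty string returns an empty list, which is why you see []_1d after every coin.
Moving the print inside the inner loop is not enough on its own, because it would then print every value of t for every coin. Instead, collect both columns first and print them side by side. I have slightly modified your code to do that, and to read each price cell's text directly with td.text: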
import requests
from bs4 import BeautifulSoup
url = "https://coinmarketcap.com"
soup = BeautifulSoup(requests.get(url).content, "html.parser")
# Collect the name/ticker column and the price column separately.
t1, t2 = [], []
for td in soup.select("td:nth-of-type(3)"):
    t1.append(" ".join(tag.text for tag in td.select("p, span")).strip())
for td in soup.select("td:nth-of-type(4)"):
    t2.append(td.text)
# Print the two columns row by row. The rank starts at 1 and is
# incremented only after the row has been printed.
rank = 1
for i in range(len(t1)):
    print(rank, "|", end=" ")
    print("{:<30} {:<10}".format(*t1[i].rsplit(maxsplit=1)))
    print("{}_1d".format(t2[i]))
    rank = rank + 1
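Another way to keep each name and price together is to loop over the table rows and pick both cells out of the same row, rather than building two separate lists. Below is a minimal sketch of that idea. It runs against a small inline HTML snippet so it does not depend on the live site. The snippet, the coin names and prices in it, and the function name extract_rows are all made up for illustration, and I am assuming the real table keeps the name in the third cell and the price in the fourth, as in your selectors.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page: the markup and values below are
# invented for illustration and are NOT copied from coinmarketcap.com.
SAMPLE_HTML = """
<table><tbody>
  <tr><td></td><td>1</td><td><p>Alpha Coin</p><p>AAA</p></td><td><a>$1.00</a></td></tr>
  <tr><td></td><td>2</td><td><p>Beta Coin</p><p>BBB</p></td><td><a>$2.50</a></td></tr>
</tbody></table>
"""


def extract_rows(html):
    """Return (rank, name, ticker, price) tuples, one per usable table row."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for rank, tr in enumerate(soup.select("tbody tr"), start=1):
        # Select both cells from the SAME row so they cannot get out of step.
        name_cell = tr.select_one("td:nth-of-type(3)")
        price_cell = tr.select_one("td:nth-of-type(4)")
        if name_cell is None or price_cell is None:
            continue  # skip rows that lack either cell
        text = " ".join(tag.text for tag in name_cell.select("p, span")).strip()
        parts = text.rsplit(maxsplit=1)
        if len(parts) != 2:
            continue  # skip rows without both a name and a ticker
        name, ticker = parts
        price = price_cell.get_text(strip=True)
        rows.append((rank, name, ticker, price))
    return rows


for rank, name, ticker, price in extract_rows(SAMPLE_HTML):
    print("{} | {:<30} {:<10} {}".format(rank, name, ticker, price))
```

To try it on the real page you would pass requests.get(url).content to extract_rows instead of SAMPLE_HTML. Rows that are missing a cell are skipped rather than raising an error, so the rank printed is the row's position in the table.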