Using Python, I want to scrape the executive shareholding details of publicly listed companies from the Internet. I first created the full list of stock codes (digits only), called target_list_onlynumber, which looks like this:
Here is the website I am trying to scrape: https://data.eastmoney.com/executive/000001.html. It corresponds to stock code <000001>. For a stock with code i, the corresponding page is https://data.eastmoney.com/executive/i.html. Running the following code gives me the DataFrame I want:
import pandas as pd
import requests

df = pd.DataFrame(
    requests.get('https://datacenter-web.eastmoney.com/api/data/v1/get?reportName=RPT_EXECUTIVE_HOLD_DETAILS&columns=ALL&filter=(SECURITY_CODE%3D"000001")')\
    .json().get('result').get('data'))
print(df)
The resulting dataframe looks like this:
Now I want to write a for loop that builds the DataFrame for every stock whose code is in target_list_onlynumber, where df_i is the DataFrame for the stock with code i (for example, df_601857 is the DataFrame for stock 601857). Here is what I tried, modeled on the code above for stock <000001>:
df_list = []
for i in target_list_onlynumber:
    exec_data = requests.get('https://datacenter-web.eastmoney.com/api/data/v1/get?reportName=RPT_EXECUTIVE_HOLD_DETAILS&columns=ALL&filter=(SECURITY_CODE%3D"{i}")')\
    .json().get('result').get('data')
    df_i = pd.DataFrame(exec_data)
    df_list.append(df_i)
    print(df_i)
But the outcome is:
Traceback (most recent call last):
  File "/tmp/jqcore/jqboson/jqboson/core/entry.py", line 379, in _run
    engine.start()
  File "/tmp/jqcore/jqboson/jqboson/core/engine.py", line 231, in start
    self._dispatcher.start()
  File "/tmp/jqcore/jqboson/jqboson/core/dispatcher.py", line 280, in start
    self._run_loop()
  File "/tmp/jqcore/jqboson/jqboson/core/dispatcher.py", line 240, in _run_loop
    self._loop.run()
  File "/tmp/jqcore/jqboson/jqboson/core/loop/loop.py", line 107, in run
    self._handle_queue()
  File "/tmp/jqcore/jqboson/jqboson/core/loop/loop.py", line 153, in _handle_queue
    message.callback(**message.callback_data)
  File "/tmp/jqcore/jqboson/jqboson/core/mds/market_data_subscriber.py", line 228, in broadcast
    consumer.send(market_data)
  File "/tmp/jqcore/jqboson/jqboson/core/mds/market_data_consumer_manager.py", line 59, in consumer_gen
    msg_callback()
  File "/tmp/jqcore/jqboson/jqboson/core/mds/market_data_consumer_manager.py", line 52, in msg_callback
    callback(market_data)
  File "/tmp/jqcore/jqboson/jqboson/core/mds/market_data_consumer_manager.py", line 122, in wrapper
    result = callback(*args, **kwargs)
  File "/tmp/jqcore/jqboson/jqboson/core/strategy.py", line 474, in _wrapper
    self._context.current_dt
  File "/tmp/strategy/user_code.py", line 90, in handle_data
    .json().get('result').get('data')
AttributeError: 'NoneType' object has no attribute 'get'
I don't understand why the code for one stock succeeds and the for-loop fails. Can someone help me fix it please?
To substitute the loop variable i into the URL, the string has to be an f-string, that is, it needs an "f" prefix before the opening quote.
Instead of:
exec_data = requests.get('https://datacenter-web.eastmoney.com/api/data/v1/get?reportName=RPT_EXECUTIVE_HOLD_DETAILS&columns=ALL&filter=(SECURITY_CODE%3D"{i}")')\
.json().get('result').get('data')
write:
exec_data = requests.get(f'https://datacenter-web.eastmoney.com/api/data/v1/get?reportName=RPT_EXECUTIVE_HOLD_DETAILS&columns=ALL&filter=(SECURITY_CODE%3D"{i}")')\
.json().get('result').get('data')
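The f prefix tells Python to treat the string as a formatted string literal and to replace each expression in curly brackets with its value. Without it, the literal text {i} is sent as the stock code, .get('result') returns None for that response, and the chained .get('data') raises the AttributeError shown in the traceback.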
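To make the difference concrete, here is a minimal sketch that needs no network access. The helper names build_url and extract_rows are hypothetical (they are not part of requests, pandas, or the site's API). The second helper assumes a failed lookup comes back as a JSON object whose 'result' is missing or null, which is what the traceback suggests.

```python
def build_url(code):
    """Build the request URL for one stock code (hypothetical helper)."""
    # Only the last piece is an f-string; adjacent literals are concatenated.
    return (
        "https://datacenter-web.eastmoney.com/api/data/v1/get"
        "?reportName=RPT_EXECUTIVE_HOLD_DETAILS&columns=ALL"
        f'&filter=(SECURITY_CODE%3D"{code}")'
    )


def extract_rows(payload):
    """Return the rows from a decoded JSON payload, or [] if there are none.

    Assumption: a failed lookup yields a dict whose 'result' is missing or None.
    """
    result = (payload or {}).get("result") or {}
    return result.get("data") or []


i = "000001"
plain = 'SECURITY_CODE%3D"{i}"'       # no prefix: the braces stay literal
formatted = f'SECURITY_CODE%3D"{i}"'  # f prefix: {i} is replaced by its value

print(plain)
print(formatted)
print(build_url("601857"))
print(extract_rows({"result": None}))
```

Inside the loop, this could be used as rows = extract_rows(requests.get(build_url(i)).json()) followed by pd.DataFrame(rows), so a code with no data yields an empty DataFrame instead of an exception. Storing each DataFrame in a dict keyed by the stock code is usually easier to work with than separate variables such as df_601857.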