Tags: python-2.7, web-scraping, python-requests, get, json, infinite-scroll

Sending JSON requests in Python


I am trying to send some JSON requests to scrape an infinite-scroll box like the one at this link. Its JSON link is:

http://www.marketwatch.com/news/headline/getheadlines?ticker=XOM&countryCode=US&dateTime=12%3A00+a.m.+Nov.+8%2C+2016&docId=&docType=2007&sequence=6e09aca3-7207-446e-bb8a-db1a4ea6545c&messageNumber=1826&count=10&channelName=%2Fnews%2Fpressrelease%2Fcompany%2Fus%2Fxom&topic=&_=1479366266513

Some of the parameters are not necessary, so I created a dictionary of the effective parameters. For example, the parameter count is the number of items that are shown on each scroll. My code is:

import json
import requests

parameters = {'countryCode':'US','dateTime':'', 'docId':'','sequence':'6e09aca3-7207-446e-bb8a-db1a4ea6545c', 
         'messageNumber':'1826','count':'10','channelName':'', 'topic':'_:1479366266513' }
data = json.dumps(parameters)
firstUrl = "http://www.marketwatch.com/investing/stock/xom"
html = requests.post(firstUrl, params = data).text 

My problem is that the parameters seem to have no effect on the request. When I remove all of the parameters, I get the same page (the firstUrl link) as when I include all of them. Do you have any idea why this happens and how I can fix it?


Solution

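  • I think the firstUrl you are using is not correct: it is the stock page, not the JSON endpoint from your link. Moreover, you should use requests.get instead of requests.post, and pass the parameters dictionary directly to params rather than a json.dumps string. You should send the same parameters as in your link.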

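    import requests
    
    # The question's parameters plus ticker. In the link, topic is empty
    # and _ is a separate parameter, so they are split here.
    parameters = {'ticker': 'XOM', 'countryCode': 'US', 'dateTime': '', 'docId': '',
                  'sequence': '6e09aca3-7207-446e-bb8a-db1a4ea6545c',
                  'messageNumber': '1826', 'count': '10', 'channelName': '',
                  'topic': '', '_': '1479366266513'}
    # The JSON endpoint from the link, not the stock page URL.
    firstUrl = "http://www.marketwatch.com/news/headline/getheadlines"
    # requests encodes the dictionary into the query string of the GET request.
    response = requests.get(firstUrl, params=parameters)
    print(response.json())  # array of size 10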
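
As a rough sketch of "send the same parameters as in your link", the dictionary can also be built from the link itself instead of by hand, which avoids mistakes such as merging topic and _ into one value. This uses only the standard library (Python 3's urllib.parse; on Python 2.7 the same functions live in urlparse and urllib). The names link, endpoint, as_query and as_json are just illustrative, not part of the original code.

```python
import json
from urllib.parse import urlsplit, parse_qsl, urlencode

# The JSON link from the question.
link = ("http://www.marketwatch.com/news/headline/getheadlines"
        "?ticker=XOM&countryCode=US&dateTime=12%3A00+a.m.+Nov.+8%2C+2016"
        "&docId=&docType=2007&sequence=6e09aca3-7207-446e-bb8a-db1a4ea6545c"
        "&messageNumber=1826&count=10"
        "&channelName=%2Fnews%2Fpressrelease%2Fcompany%2Fus%2Fxom"
        "&topic=&_=1479366266513")

parts = urlsplit(link)

# The endpoint is the link without its query string.
endpoint = "{}://{}{}".format(parts.scheme, parts.netloc, parts.path)

# keep_blank_values=True keeps empty parameters such as docId and topic.
# Values are URL-decoded, so they can be re-encoded later without doubling.
parameters = dict(parse_qsl(parts.query, keep_blank_values=True))

print(endpoint)
for name, value in parameters.items():
    print(name, "=", repr(value))

# A dictionary is encoded as key=value pairs joined by "&" ...
as_query = urlencode({"ticker": "XOM", "count": "10"})
# ... whereas json.dumps produces a JSON document, which is not a query string.
as_json = json.dumps({"ticker": "XOM", "count": "10"})
print(as_query)
print(as_json)
```

The resulting dictionary can then be passed as requests.get(endpoint, params=parameters), and individual entries (for example count) changed before sending.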