When I crawl data from a website with this code:
headers = {"user-agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36",
"X-Requested-With":"XMLHttpRequest"}
req = requests.get("http://my089.p2peye.com/shuju?&type=new_borrow_paid&flag=2", headers = headers)
req.text is a string that contains these characters:
\\u7ea2\\u5cad\\u521b\\u6295
But what I want is a string like this:
\u7ea2\u5cad\u521b\u6295
How can I remove the extra "\" before "\u7ea2" so that the Unicode string displays correctly on my screen?
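The response you get from the server is encoded as JSON, which writes non-ASCII characters as \uXXXX escape sequences. That's where the backslashes come from; they most likely appear doubled because you are looking at the repr of the string.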
You need to decode the JSON to get the data structure it represents.
import requests
import json
headers = {
"user-agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36",
"X-Requested-With":"XMLHttpRequest"
}
response = requests.get("http://my089.p2peye.com/shuju?&type=new_borrow_paid&flag=2", headers = headers)
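# Parse the JSON body; this turns the \uXXXX escapes into real characters.
data = json.loads(response.text)
print(data['message'])
# prints: 数据查询成功 (i.e. "data query successful")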
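
To see the effect without hitting the server, here is a minimal offline sketch. The `raw` string is a made-up stand-in for `response.text` (the real payload's keys and structure are not shown in the question, so `"message"` holding these characters is only an assumption for illustration).

```python
import json

# Hypothetical stand-in for response.text: a JSON document whose string
# value uses \uXXXX escapes. The raw-string prefix keeps the backslashes
# as literal characters, just as they arrive over the wire.
raw = r'{"message": "\u7ea2\u5cad\u521b\u6295"}'

# repr() shows every literal backslash doubled, which is one way the
# text can end up looking like \\u7ea2 on screen.
print(repr(raw))

# json.loads interprets the escapes and returns real Unicode characters.
data = json.loads(raw)
print(data["message"])

# When serializing again, ensure_ascii=False keeps the characters
# readable instead of re-escaping them as \uXXXX.
print(json.dumps(data, ensure_ascii=False))
```

With requests, `response.json()` does the same parsing as `json.loads(response.text)` in one step.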