
cURLpp: 400 Bad Request when running a Wikipedia query


I want to extract the abstract of a Wikipedia page using the following URL:

https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro=&explaintext=&titles=Stack%20Overflow

I am using the cURLpp library to send the request:

#include <curlpp/cURLpp.hpp>
#include <curlpp/Easy.hpp>
#include <curlpp/Options.hpp>
#include <curlpp/Exception.hpp>
#include <sstream>
#include <fstream>
#include <iostream>

using namespace std;

int main()
{
    string query = "Aldous Huxley";  // page title, appended to the URL as-is
    curlpp::Cleanup myCleanup;       // RAII initialization and cleanup of cURLpp
    std::ostringstream os;
    string url = "https://en.wikipedia.org/w/api.php?format=xml&action=query&prop=extracts&exintro=&explaintext=&titles=";
    // Streaming a Url option performs the request and writes the response body to os.
    os << curlpp::options::Url(std::string(url + query));

    string asAskedInQuestion = os.str();
    cout << asAskedInQuestion << endl;
    return 0;
}

Output

Every time that I try to query Aldous Huxley I get the following error:

<html>
<head><title>400 Bad Request</title></head>
<body bgcolor="white">
<center><h1>400 Bad Request</h1></center>
<hr><center>nginx/1.9.4</center>
</body>
</html>

Every other query works correctly. Why am I getting this error? For example, when I query Anarchism I get the following output:

<api batchcomplete="">
<query>
<pages>
<page _idx="12" pageid="12" ns="0" title="Anarchism">
<extract xml:space="preserve">
Anarchism is a political philosophy that advocates stateless societies often defined as self-governed voluntary institutions, but that several authors have defined as more specific institutions based on non-hierarchical free associations. Anarchism holds the state to be undesirable, unnecessary, or harmful. While anti-statism is central, anarchism entails opposing authority or hierarchical organisation in the conduct of human relations, including, but not limited to, the state system. As an anti-dogmatic philosophy, anarchism draws on many currents of thought and strategy. Anarchism does not offer a fixed body of doctrine from a single particular world view, instead fluxing and flowing as a philosophy. There are many types and traditions of anarchism, not all of which are mutually exclusive. Anarchist schools of thought can differ fundamentally, supporting anything from extreme individualism to complete collectivism. Strains of anarchism have often been divided into the categories of social and individualist anarchism or similar dual classifications. Anarchism is usually considered a radical left-wing ideology, and much of anarchist economics and anarchist legal philosophy reflect anti-authoritarian interpretations of communism, collectivism, syndicalism, mutualism, or participatory economics.
</extract>
</page>
</pages>
</query>
</api>

Solution

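  • You need to URL-encode the query string. The space in "Aldous Huxley" is not allowed in a URL, so the server rejects the request with 400 Bad Request; "Anarchism" works because it contains no characters that need encoding. Use e.g.

    string query = "Aldous%20Huxley";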
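
Hard-coding %20 only covers this one title. As a minimal sketch of encoding arbitrary titles, the helper below percent-encodes every byte that is not an RFC 3986 unreserved character. The name urlEncode is hypothetical and not part of cURLpp; it uses only the standard library and assumes the default "C" locale.

```cpp
#include <cctype>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper (not part of cURLpp): percent-encode every byte that is
// not an RFC 3986 "unreserved" character (letters, digits, '-', '_', '.', '~').
std::string urlEncode(const std::string& value)
{
    std::ostringstream encoded;
    encoded << std::uppercase << std::hex << std::setfill('0');
    for (unsigned char c : value) {
        if (std::isalnum(c) || c == '-' || c == '_' || c == '.' || c == '~') {
            encoded << c;  // safe character, copy through unchanged
        } else {
            // Everything else becomes %XX with two uppercase hex digits.
            encoded << '%' << std::setw(2) << static_cast<int>(c);
        }
    }
    return encoded.str();
}
```

In the question's code this would be used as `string query = urlEncode("Aldous Huxley");` before the title is appended to the URL. If you would rather not hand-roll the encoding, libcurl provides curl_easy_escape, and cURLpp appears to wrap it as curlpp::escape; check the headers of your installed version before relying on that.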