I am writing a Java program that uses the FreeCite API (a citation extraction service). The API guide gives an example in Ruby, shown below. I have been trying to call the API from Java (Apache HttpClient) for days, but it does not work as expected.
Code:
require 'net/http'
Net::HTTP.start('localhost', 3000) do |http|
response = http.post('/citations/create',
'citation=A. Bookstein and S. T. Klein, \
Detecting content-bearing words by serial clustering, \
Proceedings of the Nineteenth Annual International ACM SIGIR Conference \
on Research and Development in Information Retrieval, \
pp. 319327, 1995.',
'Accept' => 'text/xml')
puts "Code: #{response.code}"
puts "Message: #{response.message}"
puts "Body:\n #{response.body}"
end
n.b.: localhost refers to FreeCite. The expected response code is 201, and the response is XML.
Result:
<citations>
<citation valid=true>
<authors>
<author>I S Udvarhelyi</author>
<author>C A Gatsonis</author>
<author>A M Epstein</author>
<author>C L Pashos</author>
<author>J P Newhouse</author>
<author>B J McNeil</author>
</authors>
<title>Acute Myocardial Infarction in the Medicare population: process of care and clinical outcomes</title>
<journal>Journal of the American Medical Association</ journal>
<pages>18--2530</pages>
<year>1992</year>
<raw_string>Udvarhelyi, I.S., Gatsonis, C.A., Epstein, A.M., Pashos, C.L., Newhouse, J.P. and McNeil, B.J. Acute Myocardial Infarction in the Medicare population: process of care and clinical outcomes. Journal of the American Medical Association, 1992; 18:2530-2536.</raw_string>
<ctx:context-objects xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xsi:schemaLocation='info:ofi/fmt:xml:xsd:ctx http://www.openurl.info/registry/docs/info:ofi/fmt:xml:xsd:ctx' xmlns:ctx='info:ofi/fmt:xml:xsd:ctx'>
<ctx:context-object timestamp='2008-07-11T00:57:33-04:00'
encoding='info:ofi/enc:UTF-8' version='Z39.88-2004' identifier=''>
<ctx:referent>
<ctx:metadata-by-val>
<ctx:format>info:ofi/fmt:xml:xsd:journal</ctx:format>
<ctx:metadata>
<journal xmlns:rft='info:ofi/fmt:xml:xsd:journal' xsi:schemaLocation='info:ofi/fmt:xml:xsd:journal http://www.openurl.info/registry/docs/info:ofi/fmt:xml:xsd:journal'>
<rft:atitle>Acute Myocardial Infarction in the Medicare population: process of care and clinical outcomes</rft:atitle>
<rft:spage>18</rft:spage>
<rft:date>1992</rft:date>
<rft:stitle>Journal of the American Medical Association</rft:stitle>
<rft:genre>article</rft:genre>
<rft:epage>2530</rft:epage>
<rft:au>I S Udvarhelyi</rft:au>
<rft:au>C A Gatsonis</rft:au>
<rft:au>A M Epstein</rft:au>
<rft:au>C L Pashos</rft:au>
<rft:au>J P Newhouse</rft:au>
<rft:au>B J McNeil</rft:au>
</journal>
</ctx:metadata>
</ctx:metadata-by-val>
</ctx:referent>
</ctx:context-object>
</ctx:context-objects>
</citation>
</citations>
Code:
import java.io.IOException;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.util.ArrayList;
import java.util.List;
import org.apache.commons.io.IOUtils;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.NameValuePair;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.HttpClient;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.message.BasicNameValuePair;
public class HttpClientTest {
public static void main(String[] args) throws UnsupportedEncodingException {
HttpClient httpclient = HttpClients.createDefault();
HttpPost httppost = new HttpPost("http://freecite.library.brown.edu/citations/create");
// Request parameters and other properties.
List<NameValuePair> params = new ArrayList<NameValuePair>();
params.add(new BasicNameValuePair("citation", "A. Bookstein and S. T. Klein, Detecting content-bearing words by serial clustering, "
+ "Proceedings of the Nineteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 319327, 1995."));
httppost.setEntity(new UrlEncodedFormEntity(params, "UTF-8"));
//Execute and get the response.
HttpResponse response = null;
try {
response = httpclient.execute(httppost);
response.setHeader("Content-Type", "text/xml");
} catch (ClientProtocolException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
HttpEntity entity = response.getEntity();
System.out.println(response.getStatusLine());
if (entity != null) {
InputStream instream = null;
try {
instream = entity.getContent();
// NB: does not close inputStream, you can use IOUtils.closeQuietly for that
String theString = IOUtils.toString(instream, "UTF-8");
IOUtils.closeQuietly(instream);
System.out.println(theString);
} catch (UnsupportedOperationException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
try {
// do something useful
} finally {
try {
instream.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
}
}
Result:
I got the entire HTML page, instead of the XML; and response code 200 instead of 201.
HTTP/1.1 200
<script src="/javascripts/prototype.js?1218559878" type="text/javascript"></script>
<link href="/stylesheets/citation.css?1218559878" media="screen" rel="stylesheet" type="text/css" />
<table>
<tr>
<td>
<span class="citation"> <span class="authors"> <span class="author"> A Bookstein</span> <span class="author"> S T Klein</span> </span> <span class="title"> Detecting content-bearing words by serial clustering</span> <span class="booktitle"> Proceedings of the Nineteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval</span> <span class="pages"> 319327</span> <span class="year"> 1995</span> <br> <span class="raw_string"> A. Bookstein and S. T. Klein, Detecting content-bearing words by serial clustering, Proceedings of the Nineteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 319327, 1995.</span> </span>
<br>
<code> <ctx:context-objects xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xsi:schemaLocation='info:ofi/fmt:xml:xsd:ctx http://www.openurl.info/registry/docs/info:ofi/fmt:xml:xsd:ctx' xmlns:ctx='info:ofi/fmt:xml:xsd:ctx'><ctx:context-object timestamp='2016-10-29T02:43:38-04:00' encoding='info:ofi/enc:UTF-8' version='Z39.88-2004' identifier=''><ctx:referent><ctx:metadata-by-val><ctx:format>info:ofi/fmt:xml:xsd:book</ctx:format><ctx:metadata><book xmlns:rft='info:ofi/fmt:xml:xsd:book' xsi:schemaLocation='info:ofi/fmt:xml:xsd:book http://www.openurl.info/registry/docs/info:ofi/fmt:xml:xsd:book'><rft:atitle>Detecting content-bearing words by serial clustering</rft:atitle><rft:date>1995</rft:date><rft:btitle>Proceedings of the Nineteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval</rft:btitle><rft:genre>proceeding</rft:genre><rft:pages>319327</rft:pages><rft:au>A Bookstein</rft:au><rft:au>S T Klein</rft:au></book></ctx:metadata></ctx:metadata-by-val></ctx:referent></ctx:context-object></ctx:context-objects> </code>
</td>
<td bgcolor="FF9999" class='choose_option'>
<input id="unusable"
name="citation_rating_13655375"
type="radio"
value="unusable"
onclick="new Ajax.Request('/citations/set_rating/13655375', {parameters:{rating: this.value} }); return true;"
/>
<label for='unusable'>unusable</label>
</td>
<td bgcolor="FFFFCC" class='choose_option'>
<input id="usable"
name="citation_rating_13655375"
type="radio"
value="usable"
onclick="new Ajax.Request('/citations/set_rating/13655375', {parameters:{rating: this.value} }); return true;"
/>
<label for='usable'>good enough</label>
</td>
<td bgcolor="CCFFCC" class='choose_option'>
<input id="perfect"
name="citation_rating_13655375"
type="radio"
value="perfect"
onclick="new Ajax.Request('/citations/set_rating/13655375', {parameters:{rating: this.value} }); return true;"
/>
<label for='perfect'>perfect</label>
</td>
</tr>
</table>
<br>
Key:
<span title="author" class="author">Authors</span>
<span title="title" class="title">Title</span>
<span title="journal" class="journal">Journal</span>
<span title="booktitle" class="booktitle">Booktitle</span>
<span title="editor" class="editor">Editor</span>
<span title="volume" class="volume">Volume</span>
<span title="publisher" class="publisher">Publisher</span>
<span title="institution" class="institution">Institution</span>
<span title="location" class="location">Location</span>
<span title="number" class="number">Number</span>
<span title="pages" class="pages">Pages</span>
<span title="year" class="year">Year</span>
<span title="tech" class="tech">Tech</span>
<span title="note" class="note">Note</span>
<br>
<span class="raw_string">Original citation string</span>
<br>
<code>ContextObject</code>
<br>
<a href="/welcome">Home</a>
n.b.: Inside the <code> tag above, there is this XML data:
<rft:atitle>Detecting content-bearing words by serial clustering</rft:atitle>
<rft:date>1995</rft:date>
<rft:btitle>Proceedings of the Nineteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval</rft:btitle>
<rft:genre>proceeding</rft:genre>
<rft:pages>319327</rft:pages>
<rft:au>A Bookstein</rft:au>
<rft:au>S T Klein</rft:au>
Question: Where is the error and how could I fix this to get an XML response (w/ response code 201)?
Here is what you are doing in Ruby ...
response = http.post('/citations/create',
'citation=A. Bookstein and S. T. Klein, \
Detecting content-bearing words by serial clustering, \
Proceedings of the Nineteenth Annual International ACM SIGIR Conference \
on Research and Development in Information Retrieval, \
pp. 319327, 1995.',
'Accept' => 'text/xml')
Here is what you are doing in Java
HttpClient httpclient = HttpClients.createDefault();
HttpPost httppost = new HttpPost(
"http://freecite.library.brown.edu/citations/create");
// Request parameters and other properties.
List<NameValuePair> params = new ArrayList<NameValuePair>();
params.add(new BasicNameValuePair("citation",
"A. Bookstein and S. T. Klein, Detecting content-bearing " +
"words by serial clustering, " +
"Proceedings of the Nineteenth Annual International ACM SIGIR " +
"Conference on Research and Development in Information " +
"Retrieval, pp. 319327, 1995."));
httppost.setEntity(new UrlEncodedFormEntity(params, "UTF-8"));
...
response = httpclient.execute(httppost);
response.setHeader("Content-Type", "text/xml");
See the difference?
In the Java case, you are setting Content-type instead of Accept, and you are setting it on the Response object rather than the HttpPost object.
Now, Accept and Content-type mean different things. The first one says "I want you to send me something of this type". The second one says "I am sending you something of this type".
And, of course, setting a content type on a Response that you have just received is worse than useless. It is actually clobbering the real content type in the response ... which was probably "text/html", because your request didn't specify anything.
You should actually be calling
httppost.setHeader("Accept", "text/xml");
before the execute call.
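To make the Accept versus Content-type distinction concrete without needing the Apache library on the classpath, here is a minimal sketch that uses the JDK's built-in java.net.http classes (Java 11+) instead of Apache HttpClient. That swap is my assumption for illustration, not what the question uses. The class name AcceptHeaderSketch and the helper buildRequest are hypothetical. The sketch only builds the request and prints its headers; it does not contact FreeCite, so it says nothing about what the server actually returns.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

public class AcceptHeaderSketch {

    // Hypothetical helper: builds (but does not send) a form POST that asks for XML back.
    static HttpRequest buildRequest(String citation) {
        // Form-encode the single "citation" parameter, as UrlEncodedFormEntity does in the question.
        String body = "citation=" + URLEncoder.encode(citation, StandardCharsets.UTF_8);

        return HttpRequest.newBuilder(URI.create("http://freecite.library.brown.edu/citations/create"))
                // Content-Type describes the body WE are sending.
                .header("Content-Type", "application/x-www-form-urlencoded")
                // Accept describes the representation we want the server to send back.
                .header("Accept", "text/xml")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest(
                "A. Bookstein and S. T. Klein, Detecting content-bearing words by serial clustering, 1995.");

        // Inspect the request before it would be sent: both headers live on the request, not the response.
        System.out.println(request.method() + " " + request.uri());
        request.headers().map().forEach((name, values) -> System.out.println(name + ": " + values));
    }
}
```

Both headers are set on the outgoing request before it is executed, which is the same thing the httppost.setHeader call above does in Apache HttpClient.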