I'm trying to use Spring WebFlux to create an HTTP endpoint that streams GitHub users from GitHub's API. I tried to do what is described here and here, but it seems that expand is not fetching the second page of results from GitHub's API. What am I doing wrong? Here's the code I currently have:
@RestController
@RequestMapping("/user")
public class GithubUserController {

    private static final String GITHUB_API_URL = "https://api.github.com";

    private final WebClient client = WebClient.create(GITHUB_API_URL);

    @GetMapping(value = "/search/stream", produces = MediaType.APPLICATION_STREAM_JSON_VALUE)
    public Flux<GithubUser> search(
            @RequestParam String location,
            @RequestParam String language,
            @RequestParam String followers) {
        return fetchUsers(
                uriBuilder ->
                    uriBuilder
                        .path("/search/users")
                        .queryParam(
                            "q",
                            String.format(
                                "location:%s+language:%s+followers:%s", location, language, followers))
                        .build())
            .expand(
                response -> {
                    var links = response.headers().header("link");
                    Pattern p = Pattern.compile("<(.*)>; rel=\"next\".*");
                    for (String link : links) {
                        Matcher m = p.matcher(link);
                        if (m.matches()) {
                            return client.get().uri(m.group(1)).exchange();
                        }
                    }
                    return Flux.empty();
                })
            .flatMap(response -> response.bodyToFlux(GithubUsersResponse.class))
            .flatMap(parsedResponse -> Flux.fromIterable(parsedResponse.getItems()))
            .log();
    }

    private Mono<ClientResponse> fetchUsers(Function<UriBuilder, URI> url) {
        return client.get().uri(url).exchange();
    }
}
I can see that the regex for the second page works, because if I add a print inside the if, it gets printed. However, if I test this in the browser or in Postman, I only get the first page of results returned by GitHub's API:
{"login":"chrisbanes","id":"227486"}
{"login":"keyboardsurfer","id":"336005"}
{"login":"lucasr","id":"730395"}
{"login":"hitherejoe","id":"3879281"}
{"login":"StylingAndroid","id":"933874"}
{"login":"rstoyanchev","id":"401908"}
{"login":"RichardWarburton","id":"328174"}
{"login":"slightfoot","id":"906564"}
{"login":"tomwhite","id":"85085"}
{"login":"jstrachan","id":"30140"}
{"login":"wakaleo","id":"55986"}
{"login":"cesarferreira","id":"277426"}
{"login":"kevalpatel2106","id":"20060162"}
{"login":"jodastephen","id":"213212"}
{"login":"caveofprogramming","id":"19751656"}
{"login":"AlmasB","id":"3594742"}
{"login":"scottyab","id":"404105"}
{"login":"makovkastar","id":"1076309"}
{"login":"salaboy","id":"271966"}
{"login":"blundell","id":"655860"}
{"login":"PierfrancescoSoffritti","id":"7457011"}
{"login":"0xddr","id":"4354177"}
{"login":"irsdl","id":"1798313"}
{"login":"andreban","id":"1733592"}
{"login":"TWiStErRob","id":"2906988"}
{"login":"geometer","id":"344328"}
{"login":"neomatrix369","id":"1570917"}
{"login":"nebraslabs","id":"32421477"}
{"login":"lucko","id":"8352868"}
{"login":"isabelcosta","id":"11148726"}
The link header in the GitHub API provides the URI in an already-escaped format. The String you pass to client.get().uri() is expected to be unescaped, so the already-escaped string gets escaped a second time, and you end up with a URL that returns nothing.
Instead, you probably want to use something similar to:
if (m.matches()) {
    return client.get().uri(URI.create(m.group(1))).exchange();
}
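To see the double-escaping effect in isolation, here is a minimal sketch that uses only java.net.URI. It is an analogy rather than WebClient's actual internals: the multi-argument URI constructors always quote '%', which mimics what happens when an already-escaped string is escaped again, whereas URI.create parses the string as-is. The class name DoubleEscapeDemo and the sample URL are made up for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

class DoubleEscapeDemo {

    // Hypothetical "next" link, already percent-encoded as it might arrive in a Link header.
    static final String NEXT =
        "https://api.github.com/search/users?q=location%3Alondon&page=2";

    // Parses the string as-is: existing escapes such as %3A are preserved.
    static String parsedAsIs() {
        return URI.create(NEXT).toString();
    }

    // Treats the query as raw text: the multi-argument constructors always quote '%',
    // so %3A becomes %253A.
    static String escapedAgain() {
        URI parsed = URI.create(NEXT);
        try {
            return new URI(parsed.getScheme(), parsed.getAuthority(), parsed.getPath(),
                           parsed.getRawQuery(), null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parsedAsIs());
        System.out.println(escapedAgain());
    }
}
```

The second URL asks the server for the literal text "location%3Alondon" instead of "location:london", which is why a double-escaped query can come back empty.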
Side note - your regular expression should also allow any number of characters before the "next" link, otherwise you'll be unable to go past the second page (from page two onwards the header typically lists other links, such as "prev", before "next"). So you probably want to prepend .* to it:
Pattern p = Pattern.compile(".*<(.*)>; rel=\"next\".*");
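As a quick check of that pattern outside of WebFlux, here is a small sketch that runs both versions of the regex against a sample Link header. The header value is a made-up example in the shape GitHub uses for a middle page, and LinkHeaderDemo, nextLink and originalCapture are hypothetical names used only for this illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LinkHeaderDemo {

    // Pattern from the question: anchored at the first '<' in the header.
    static final Pattern ORIGINAL = Pattern.compile("<(.*)>; rel=\"next\".*");

    // Pattern with the leading .* so links before rel="next" are skipped.
    static final Pattern FIXED = Pattern.compile(".*<(.*)>; rel=\"next\".*");

    // Returns the URL tagged rel="next", or empty if the header has none.
    static Optional<String> nextLink(String linkHeader) {
        Matcher m = FIXED.matcher(linkHeader);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // Returns whatever the question's pattern captures, for comparison.
    static Optional<String> originalCapture(String linkHeader) {
        Matcher m = ORIGINAL.matcher(linkHeader);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed sample header for page 2 of a search.
        String page2 =
            "<https://api.github.com/search/users?q=x&page=1>; rel=\"prev\", "
            + "<https://api.github.com/search/users?q=x&page=3>; rel=\"next\", "
            + "<https://api.github.com/search/users?q=x&page=34>; rel=\"last\", "
            + "<https://api.github.com/search/users?q=x&page=1>; rel=\"first\"";

        System.out.println(nextLink(page2));        // only the page=3 URL
        System.out.println(originalCapture(page2)); // spans the "prev" link too
    }
}
```

With this sample, the question's pattern still matches, but its capture group runs from the "prev" URL through to the "next" URL, which is not a usable URI. The version with the leading .* captures only the "next" URL.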
Second side note - GitHub's API is rate limited (heavily rate limited if you're unauthenticated), so you may well run into those limits. You'll probably want to handle that situation gracefully somehow, but that's a reasonably big topic that's beyond the scope of this question.
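As one possible starting point, and only a sketch: GitHub reports the remaining quota and the reset time in the X-RateLimit-Remaining and X-RateLimit-Reset response headers (the latter in UTC epoch seconds). The helper below, with the hypothetical names RateLimitDemo and waitTime, works out how long to pause before the next request. Reading the headers from the response and actually delaying the next page request are left out.

```java
import java.time.Duration;
import java.time.Instant;

class RateLimitDemo {

    // Given the parsed values of X-RateLimit-Remaining and X-RateLimit-Reset
    // (epoch seconds), returns how long to wait before sending another request.
    static Duration waitTime(long remaining, long resetEpochSeconds, Instant now) {
        if (remaining > 0) {
            return Duration.ZERO; // quota left, no need to wait
        }
        Duration untilReset = Duration.between(now, Instant.ofEpochSecond(resetEpochSeconds));
        // If the reset time is already in the past, there is nothing to wait for.
        return untilReset.isNegative() ? Duration.ZERO : untilReset;
    }

    public static void main(String[] args) {
        Instant now = Instant.ofEpochSecond(940);
        System.out.println(waitTime(0, 1000, now)); // quota exhausted, reset in 60 seconds
        System.out.println(waitTime(5, 1000, now)); // quota left
    }
}
```

Passing the current time in as a parameter keeps the helper easy to test. In the reactive pipeline the result could feed a delay before the next page is requested, but that wiring is an assumption and not something the answer above prescribes.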