Search code examples
excelpowerbipowerquerym

How can I setup Pagination in Excel Power Query?


I am importing financial data using JSON from the web into excel, but as the source uses pagination (giving 50 results per page I need to implement pagination in order to import all the results.

The data source is JSON:

    https://localbitcoins.com//sell-bitcoins-online/VES/.json?page=1 
    or https://localbitcoins.com//sell-bitcoins-online/VES/.json?page=2

?page=1, ?page=2, ?page=3

I use the following code to implement pagination, but receive an error:

= (page as number) as table =>
let
    Source = Json.Document(Web.Contents("https://localbitcoins.com//sell-bitcoins-online/VES/.json?page="  & Number.ToText(page) )),
    Data1 = Source{1}[Data],
    RemoveBottom = Table.RemoveLastN(Data1,3)
in
    RemoveBottom

When I envoke a parameter (1 for page 1) to test it I get the following error and I can't seem to find out why?

An error occurred in the ‘GetData’ query. Expression.
Error: We cannot convert a value of type Record to type List.
Details:
    Value=Record
    Type=Type

For the record, I try to include page handling using ListGenerate:

= List.Generate( ()=>
[Result= try GetData(1) otherwise null, page = 1],
        each [Result] <> null,
        each [Result = try GetData([page]+1) otherwise null, Page = [Page]+1],
        each [Result])

What is the default way to implement pagination using Power Query in MS Excel?


Solution

  • I realise you asked this nearly a month ago and may have since found an answer, but will respond anyway in case it helps someone else.

    This line Data1 = Source{1}[Data] doesn't make sense to me, since I think Source will be a record and you can't use {1} positional lookup syntax with records.

    The code below returns 7 pages for me. You may want to check if it's getting all the pages you need/expect.

    let
        getPageOfData = (pageNumber as number) =>
            let
                options = [
                    Query = [page = Number.ToText(pageNumber)]
                ],
                url = "https://localbitcoins.com/sell-bitcoins-online/VES/.json",
                response = Web.Contents(url, options),
                deserialised = Json.Document(response)
            in deserialised,
        responses = List.Generate(
            () => [page = 1, response = getPageOfData(page), lastPage = null],
            each [lastPage] = null or [page] <= [lastPage],
            each [
                page = [page] + 1,
                response = getPageOfData(page),
                lastPage = if [lastPage] = null then if Record.HasFields(response[pagination], "next") then null else page else [lastPage]
            ],
            each [response]
        )
    in
        responses
    

    In List.Generate, my selector only picks the [response] field to keep things simple. You could drill deeper into the data either within selector itself (e.g. each [response][data][ad_list]) or create a new step/expression and use List.Transform to do so.

    After a certain amount of drilling down and transforming, you might see some data like:

    Output

    but that depends on what you need the data to look like (and which columns you're interested in).


    By the way, I used getPageOfData in the query above, but this particular API was including the URL for the next page in its responses. So pages 2 and thereafter could have just requested the URL in the response (rather than calling getPageOfData).