Search code examples
javaspring-bootgoogle-cloud-platformgoogle-bigqueryspring-cloud-gcp-bigquery

How to get paginated data from BigQuery in Java?


I am confused in how to fetch the paginated data from bigquery in Java,

I followed the docs from bigquery but it's not clear for me

// Identify the table itself
        TableId tableId = TableId.of("dataset", "table");

        // Page over 100 records. If you don't need pagination, remove the pageSize parameter.
        TableResult result = bigQuery.listTableData(tableId, BigQuery.TableDataListOption.pageSize(100));

        // Print the records
        result
            .iterateAll()
            .forEach(
                    row -> {
                        row.forEach(fieldValue -> System.out.print(fieldValue.toString() + ", "));
                        System.out.println();
                    });

I have doubt this query will be giving only first 100 rows, whereas I want to fetch all the data from table, for example in page 1 -> 100 rows, page 2 -> 100 rows and so on untill all data is fetched from table.

Please help me understand where I am wrong.. in the table there is 20 Million records


Solution

  • I have doubts this query will be giving only the first 100 rows.

    No, it will give all the rows which are returned by your query.

    The function iterateAll() iterates through all the records at once and gives you all the rows from the result. Whereas getValues() function will give only paged records of the current page.

    You can try the below code snippet for testing and understanding:

            // Page over 100 records. If you don't need pagination, remove the pageSize
            // parameter.
            TableResult results = bigquery.listTableData(tableId, TableDataListOption.pageSize(100));
    
            // .iterateAll() will iterate through the all the records at once
            int rowCount=0;
            for (FieldValueList row : results.iterateAll()) {
                rowCount++;
            }
            System.out.println("Total Fetched "+ rowCount +" records");
            
            // .getValues() will only iterate through the all the records of the current page
            int pageCount=0;
            rowCount=0;
            do{
                System.out.println("page number: "+(++pageCount));                
                for (FieldValueList row : results.getValues()) {
                    rowCount++;
                }
                System.out.println("Fetched "+rowCount+" records");
                results = results.getNextPage();
                rowCount=0;
            }while(results.hasNextPage());
    
            //for the last page
            System.out.println("page number: "+(++pageCount));
            for (FieldValueList row : results.getValues()) {
                rowCount++;
            }
            System.out.println("Fetched "+rowCount+" records");
    
            System.out.println("Query ran successfully");
    

    Result:

    enter image description here

    You can follow this cloud doc for more information.