Search code examples
google-bigquerydatabase-performance

BigQuery performance: Is this correct?


Folks, I'm using BigQuery as a superfast database for my analytics queries, but I'm very disappointed with its performance.

Let me show you the numbers:

  • Just one Table at "from" clause
  • Select about 15 fields with group by each, about 5 fields with SUM()
  • Total table rows: 3.7 millions
  • Total rows returned: 830K

When I execute this query on BigQuery's console, it takes about 1 minute to process. Is this ok for you? I was expecting that it will return in about 2 seconds... If I execute this query on a columnar database, like Sybase IQ, it takes less than 2 seconds.


Solution

  • Big Query is a highly scalable database, before being a "super fast" database. It's designed to process HUGE amount of data distributing the processing among several different machines using a technique named Dremel. Because it's designed to use several machines and parallel processing, you should expect to have super-scalability with a good performance.

    For example: analyzing all the wikipedia revisions in 5-10 seconds isn't bad, is it? But even a much smaller table would take about the same time.

    Sybase IQ is often installed in a single database and it doesn't use Dremel. That said, it's going to be faster than Big Query in many scenarios...as designed.

    Cheers!