These are some notes on the tradeoffs and best practices between On Demand pricing vs Editions pricing.

Pricing models

BigQuery currently offers two very different pricing models

Editions

With Editions you are charged for "compute time" by slot-hour. A slot is a virtual CPU that BigQuery uses to execute queries.

Within Editions, you can purchase a "committment" for a lower price if you always have stuff running, with the caveat that you are charged the entire time (it doesn't scale up or down at will, and so for a yearly committment for 100 slots you end up paying for 100 * 365 * 24 slot hours, even if you don't use them).

Outside the committed capacity, you can use an autoscaling reservation, which sets up the minimum (could be 0) and the maximum slots that BigQuery can use. Slots are scaled up and down based on compute requirements for the queries.

On Demand

With On Demand you are charged for "bytes processed". The compute capacity that GCP gives you is about 2000 slots, but you are not being charged for it.

Best practices based on the pricing models

Minimize bytes scanned (especially on on demand)

  1. Do not select *, but only select the columns you need.
  2. Partition/Cluster the tables so that you only scan the minimum amount of rows needed (though this increases compute on write, so it's not always helpful with editions, see note on compute used).
  3. Follow the best practices in the bigquery docs (reduce data processed)

Minimize compute used (especially for editions)

  1. Optimize your joins, see my note about semi hash joins.
  2. Pay attention to when clustering and partitioning are more harmful than helpful. Take a look at the execution plan of the jobs that produce/update tables with partitioning and clustering. If the table is large, a lot of slot time is going to be spent on sorting the data to match the clustering.
  3. Follow the best practices in the bigquery docs (optimize query ops)

Right-size your slots autoscaling (editions only)

Pay particular attention to the jobs where you have a "contention" warning, and dig through the steps in the execution plan.

Check that the "Wait ms" stat and compare it with "Read", "Write" and "Compute". "Wait" is time that BigQuery spends waiting for slots to be available.

Given that with autoscaling you are charged by the time allocated, and slots are allocated even when they are not used, wait time still counts towards the cost.

If wait time is too high, this might mean that you need to either:

  • serialize the queries so that they are no longer in contention for slots
  • increase the max in the autoscaling reservation to reduce contention

Don't increase the autoscaling reservation maximum too much, as from experience the bigquery autoscaler is very eager to use as many slots as it can to run the query as quickly as possible, but scaling down takes time (that you are billed for), and the minimum interval of 1 min adds up quickly if you are using many thousands of slots.

Run queries with the model that's cheapest (if possible)

This sounds obvious, but Google doesn't make it exactly easy. You'll need to:

  1. estimate the price of a query in both modes.
  2. run the query in a dedicated project based on which model you choose (you cannot mix modes within a GCP project)

For 1 (estimate the price of a query in both modes), I like to use variations of the following query:

SELECT
job_id, -- the bigquery job id that identifies a run
query, -- the bigquery query text
destination_table.table_id, -- the name of the table (if the query writes to a table)
start_time,
SUM(total_slot_ms) / (1000 * 60 * 60) * 0.06 as editions_cost -- unit cost (here $0.06/slot_hour) could be different based on the edition chosen and gcp discounts
SUM(total_bytes_billed) / POW(1024, 4) * 6.25 as on_demand_cost -- unit cost (here $6.25/tb) could be different based on the edition chosen and gcp discounts
FROM `{your_project}`.`{your_region}`.INFORMATION_SCHEMA.JOBS_BY_PROJECT -- or JOBS_BY_ORGANIZATION to see the whole company, but then you'll need to remove the query field
GROUP BY ALL

Note that the prices here are estimated, because google bills by slots "assigned" not slots used so sometimes slot price is always slightly higher than the estimate (it takes time to scale up and down, and google bills a minimum of 1 minute even if a query runs for 3 seconds).

For 2 (run the query in a dedicated project), you'll need to set up two separate GCP projects, one with the reservation (or the reservation assigned from another project) and the other without.