2025  2

March  1

Avoiding select * in BigQuery

A quick workaround to avoid people doing "SELECT *" on your tables (which I consider harmful, see stackoverflow) in BigQuery is the following:

  1. Create a policy tag in the policy tags page. I create it under a new taxonomy "sql_rules", and call it "no_select_star"
  2. Once created, in the policy tag taxonomy page I enable the "enforce access control" option. If you need users to bypass this policy (e.g. needs to run bigquery copy operations) you can add the "fine graded reader" role to the principal associated with the policy tag in the same page.
  3. In the table you want to protect, add a dummy column (I call it "do_not_select_star") and assign it the policy tag
  4. Done! From now on, SELECT * will throw an error.

For more info see Google Cloud docs

February  1

2024  5

August  1

July  1

June  2

BigQuery performance best practice: use semi joins when possible

SQL is an amazing language, it lets you declaratively say what you want, and the engine figures out for you the best way to return it to you. Or should I say, it figures out the best way to return it to you given the information it has and the capabilities of the engine itself.

In this post, we鈥檒l discuss a performance optimization technique for BigQuery (also other advanced enough Enteprise Data Warehouses and databases support SEMI JOINS, but I'll focus on BigQuery since it's the one I use the most these days): using semi joins.

Continue reading

May  1

2021  1

September  1

2020  3

September  1

May  1

How to get started with Nix

I started learning Nix a few months ago, when it was introduced to the company I work for by some brilliant coworkers. I'm still learning and exploring this space, but I thought I'd share a few tips and resources that helped me start this journey to help newbies like me start working with this tool.

Continue reading

February  1

2019  1

January  1