Chapter 4 Temporary tables, views, stored proceedures and variables

In addition to creating permanent tables and modifying their structure and content, BigQuery also allows you to create temporary tables, views, and stored procedures. These are useful for situations where you need to manipulate data temporarily, or when you want to create reusable SQL code.

4.1 Temporary Tables

Temporary tables are tables that exist only for a short duration (24 hours as standard in BigQuery). BigQuery uses temporary tables to cache query results that aren’t written to a permanent table. The tables are created in a special dataset and named randomly. After a query finishes, the temporary table exists for up to 24 hours. To view table structure and data, go to the BigQuery console, click Personal history, and choose the query that created the temporary table.

You can also create your own temporary tables for multistatement queries. They can be useful when you want to perform intermediate calculations on a subset of data before using that subset in another query. When you are finished with the temporary table, you can delete it manually or wait for BigQuery to delete it after 24 hours.To create a temporary table, use the CREATE TEMPORARY TABLE statement:

CREATE TEMPORARY TABLE temp_table AS
SELECT *
FROM permanent_table
WHERE condition;

This statement creates a temporary table called temp_table that contains the results of the SELECT statement. Note that the column names and data types are inherited from the source table. After a query finishes, the temporary table exists for up to 24 hours. To view table structure and data, go to the BigQuery console, click Personal history, and choose the query that created the temporary table.

CREATE TEMPORARY TABLE hurricanes AS
SELECT * 
FROM `dft-stats-gcp-showcase.TS_Training.tst_storms` 
WHERE status = 'hurricane';

You must call the temporary table later in your SQL script, or Big Query will throw an error.

4.1.1 Exercise

20:00

  1. Using temporary tables and joins, show the largest planet by population for on which each species can reside. I.e. out of the X home planets of Humans, which has the highest population.

4.2 Views

Views are virtual tables that are defined by a SQL query. They can be used to simplify complex queries by breaking them down into smaller, more manageable parts, and can also be used to provide a consistent view of the data to different users. To create a view, use the CREATE VIEW statement:

CREATE VIEW view_name AS
SELECT columns
FROM tables
WHERE condition;

This statement creates a view called view_name that contains the results of the SELECT statement. Note that views do not contain any data themselves, but rather store the SQL query that defines the view. Whenever the view is queried, the query is executed against the underlying tables.

Views can be useful in a number of situations. Here are a few examples:

  • Simplifying complex queries: If you have a complex SQL query that you need to use frequently, you can create a view based on that query to simplify your code. This can make your code more readable and easier to maintain.

  • Restricting access to sensitive data: If you have a table that contains sensitive data, you can create a view that only exposes the data that users need to see. This can help to protect your data and prevent unauthorized access.

  • Aggregating data: If you need to aggregate data from multiple tables or queries, you can create a view that combines the data in a single table. This can simplify your code and make it easier to work with the aggregated data.

  • Joining tables: If you need to join multiple tables together in a specific way, you can create a view that encapsulates the join logic. This can make your code more modular and easier to understand.

  • Filtering data: If you frequently need to query a table with a specific set of filters, you can create a view that applies those filters. This can save you time by eliminating the need to specify the filters every time you query the table.

Overall, views can be a powerful tool for simplifying complex queries, controlling access to data, and improving code modularity and maintainability. By creating views based on frequently used SQL queries, you can streamline your code and make it easier to work with your data.

CREATE VIEW `dft-stats-gcp-showcase.TS_Training.hurricane_view` AS
SELECT * 
FROM `dft-stats-gcp-showcase.TS_Training.tst_storms` 
WHERE status = 'hurricane';

4.2.1 Exercise

10:00

  1. Using what we learned create your own View in your own dataset.
  2. Query your view as part of a new SQL script

4.3 Stored Procedures

Stored procedures are reusable blocks of SQL code that can be called from within other SQL queries. They are useful for encapsulating complex logic, and can help to make your SQL code more modular and maintainable. To create a stored procedure, use the CREATE PROCEDURE statement:

CREATE PROCEDURE procedure_name(parameter_list)
BEGIN
  -- SQL statements here
END;

This statement creates a stored procedure called procedure_name that takes a list of parameters and contains the SQL statements defined within the BEGIN and END blocks. Note that the parameter list is optional, and that stored procedures can be called from within other SQL queries using the CALL statement:

CALL procedure_name(parameter_list);

The example below will insert a droid into our training record every time we call the proceedure.

create procedure TS_Training.insert_droids()
BEGIN
INSERT INTO`dft-stats-gcp-showcase.TS_Training.training_record` 
VALUES("grey", "15th March", "Droid");
END

Call this proceedure like:

CALL TS_Training.insert_droids();

4.3.1 Exercise

10:00

  1. Using what we learned create your own stored proceedure in your own dataset.

4.4 Using Variables

Variables are named values that can be used to store values for later use in SQL queries. In BigQuery, you can declare variables using the DECLARE statement:

DECLARE variable_name datatype [DEFAULT default_value];

The datatype is the data type of the variable, and the default_value is an optional default value for the variable. You can assign a value to the variable using the SET statement:

SET variable_name = expression;

The expression can be any valid SQL expression that evaluates to the same data type as the variable. Once you’ve declared and assigned a value to a variable, you can use it in any subsequent SQL statements in the same session.

Variables can be useful in situations where you need to use the same value in multiple places within a single SQL query. For example, if you need to perform a series of calculations on a single value, you can declare a variable to store that value and use it in each calculation. This can help to simplify your code and make it more readable.

Here’s an example of how to use variables in a BigQuery SQL query:

DECLARE name STRING DEFAULT 'Droid';

INSERT INTO`dft-stats-gcp-showcase.TS_Training.training_record` 
VALUES("grey", "15th March", name);

By using variables in this way, we can make the SQL code more readable and easier to understand. Additionally, if we need to change the name we can simply update the variable values at the top of the query, rather than modifying the SQL code in multiple places.