How to Restore Data Accidentally Deleted from Google BigQuery

Detailed instructions to help you restore accidentally deleted data and not get any gray hairs ahead of schedule

OWOX
Towards Data Science

--

Have you ever accidentally deleted an important table, view, or whole dataset in Google BigQuery? If you have, you know that you cannot continue working with tables and queries containing the deleted data.

The OWOX team has created a detailed step-by-step instructions to restore deleted data and some steps to do to prevent similar problems in the future.

You can restore a deleted table if it hasn’t been more than seven days since it was deleted and if you know its name and the name of the dataset from which it was deleted.

To restore a table, open the Cloud Shell command line in the Google BigQuery interface:

Image by Author

Enter the following command:

bq cp mydataset.mytable@-3600000 mydataset.newtable

Example of the same command with table and dataset names:

bq cp OWOXBI_Reports.123_Transactions_withModels@-13600000 OWOXBI_Reports_Restore.123_Transactions_withModels

Where:

  • OWOXBI_Reports.123­_Transactions­_withModels — is the dataset and the table that has been deleted.
  • OWOXBI_Reports_Restore.123­_Transactions­_withModels — is the dataset and the name of the table within it where you want to restore the data.
  • @-13600000 — a distance in the past (13,600,000 milliseconds ago) when the table you are looking for still existed. For example, if a table was deleted 30 minutes ago, it’s sufficient to set the time interval @ -3600000, which is an hour ago (60 seconds ×60 minutes × 1000).

After entering the request, authorization is required. Click the Authorize button:

Image by Author

After the command is successfully run, the table will be restored:

Image by Author

You’ll see text similar to this:

Waiting on bqjob_r4ca30008c2e3147d_0000017af0d58e5e_1 ... (0s) Current status: DONE Table 'summer-drive-112011:OWOXBI_Reports_Restore.test_table@-600000' successfully copied to 'summer-drive-112011:OWOXBI_Reports_Restore.test_table'

How to restore a deleted view

You cannot restore a remote view in the way described above. That method is only suitable for tables.

Google support recommends using Logs Explorer to restore a deleted view.

To find queries that updated a remote view, run the following query in Google Cloud Platform logs:

resource.type="bigquery_resource" protoPayload.methodName="tableservice.update" protoPayload.serviceData.tableUpdateRequest.resource.tableName.tableId="custom_events_attribution_VIEW"

Where:

  • tableservice.update — is the command to display the view update in logs
  • custom_events_attribution_VIEW — the name of the view

In the Select time range settings set the period during which changes could be made to the view (for example, one year):

Image by Author

When the command is executed, all queries that updated the view you are looking for will be displayed. Select the last timestamp query:

Image by Author

Copy the query from the log and recreate the view.

Image by Author

In addition, you can find the query that created the view you were looking for (before that, we searched for queries that updated the view). To do this, use the following command:

resource.type="bigquery_resource" protoPayload.methodName="tableservice.insert" protoPayload.serviceData.tableInsertRequest.resource.tableName.tableId="query_name_VIEW"
Image by Author

We recommend that you set the longest possible period to search the logs for the required entry.

If a view was created before logs began to be recorded, it won’t be possible to restore it. In this case, the tableservice.update command we described above can help.

How to restore a deleted dataset

If you delete a dataset, you won’t be able to restore it. You must create a new dataset with the same name, restore the tables and views that were in the deleted dataset, and move them to the new dataset.

In the first 24 hours after deleting a dataset, you cannot create a new dataset with the same name, but you can still find the names of tables and views that were part of the deleted dataset. However, the tables and views themselves will no longer be visible.

You can find views from a deleted dataset by name by searching in BigQuery:

Image by Author

The list of tables cannot be found through search, as tables don’t have a common component in their names.

To find a table name from a deleted dataset, use the following SQL query. In the query, specify the period during which the tables can be modified (creation_time BETWEEN). As a result, the query will return tables to which changes have been made within what time period. These will be the tables from the deleted dataset.

SELECT 
*
FROM (
SELECT
query,
user_email, CONCAT(destination_table.project_id,".",destination_table.dataset_id,".",destination_table.table_id) AS destination_table,
COUNT(job_id) AS job_ids,
MAX(creation_time) AS lastUpdateDate
FROM
region-us.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE
creation_time BETWEEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 180 DAY)
AND TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 0 DAY)
AND state = 'DONE'
AND CONCAT(destination_table.project_id,".",destination_table.dataset_id,".",destination_table.table_id) LIKE "%OWOXBI_Reports.%"
GROUP BY
1,
2,
3
ORDER BY
5 DESC)

As a result of the request, you’ll receive the following information:

  • query — the text of the query that made the change to destination_table
  • user _email — the user who started the destination_table update request
  • destination_table — the table that has been updated
  • job_ids — the number of query starts during the specified time interval (creation_time BETWEEN …)
  • lastUpdateDate — the last time the query was started (MAX(creation_time) AS lastUpdateDate)

Twenty-four hours after you delete a dataset, you can create a new dataset with the same name and begin restoring tables and views.

What to do to prevent similar problems in the future

  1. Turn on log collection in your Google BigQuery project when you start working with the project.
  2. Do not name a table and view the same as a dataset.
  3. Double-check the name of any table you want to delete from Google BigQuery.
  4. Delete tables via queries, not via the user interface.
  5. Add the entity type to dataset, table, and view names. For example:
  • OWOXBI_report_dataset
  • OWOXBI_report_table
  • OWOXBI_report_view

You can delete a table using a query:

DROP TABLE `summer-drive-112011.OWOXBI_Reports_Restore.test_table`

You can also delete a view using a query:

DROP VIEW `summer-drive-112011.OWOXBI_Reports_Restore.test_VIEW`

What to do to reduce the scale of the problem if it happens again

  • Do not change queries in Schedule and Google Apps Script; the next day, restore all views and tables, after which Schedule and Google Apps Script will work correctly.
  • If the view or table was involved in setting up the attribution model, you will need to restart the attribution calculation after restoring the data to its previous location.

Useful links on the topic:

Restoring deleted tables

Deleting a table

Managing views

How to instantly recover a table in Google BigQuery

--

--