Cloud Firestore Automatically Schedule Data Backup and Restore (Export and Import)
In the previous article, we learned how to export data from Cloud Firestore into Cloud Storage and import data from Cloud Storage back into Cloud Firestore. A very common requirement is to have these backup and restore operations run on a schedule, for instance, backing up the entire database daily without any manual intervention.
To schedule regular imports and exports, we could use the gcloud CLI utility or the Cloud SDK scripts (as shown in the previous article) with cron jobs. But there's an even better way to do the same with a Firebase Function that is triggered by a Cloud Scheduler job at regular intervals.
Export a function from your functions/index.js that looks like this:
const functions = require('firebase-functions');
const firestore = require('@google-cloud/firestore');
const client = new firestore.v1.FirestoreAdminClient();

exports.scheduledFirestoreExport = functions.pubsub
  .schedule('every 24 hours')
  .onRun(async (context) => {
    const projectId = process.env.GCP_PROJECT || process.env.GCLOUD_PROJECT;
    const databaseName = client.databasePath(projectId, '(default)');
    // Replace BUCKET_NAME with the actual value
    const bucket = `gs://[BUCKET_NAME]`;

    try {
      const responses = await client.exportDocuments({
        name: databaseName,
        outputUriPrefix: bucket,
        // An empty array exports the entire database
        collectionIds: [],
      });
      const res = responses[0];
      console.log(`Export Operation Response: ${res.name}`);
    } catch (err) {
      console.error(err);
      throw new Error('Export Operation Failed');
    }
  });
Most of the export logic above is borrowed from the previous article. As far as the scheduling logic goes, we create a scheduled function using functions.pubsub.schedule('your schedule').onRun((context) => { ... }). In this case we schedule the function to run once every 24 hours (daily). Under the hood, this creates a Pub/Sub topic and a Cloud Scheduler job that publishes to that topic on the specified schedule, which in turn runs the function.
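For example, if you want the export to run at a specific time of day rather than on a rolling interval, the same builder accepts a unix-cron expression and an optional time zone. A minimal sketch (the cron expression and time zone below are illustrative placeholders, not values from this article):
exports.scheduledFirestoreExport = functions.pubsub
  // unix-cron syntax: run every day at 3 AM
  .schedule('0 3 * * *')
  // Optional: evaluate the schedule in a specific time zone
  .timeZone('America/New_York')
  .onRun(async (context) => {
    // ... same export logic as in the snippet above
  });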
You may want to make the following changes in the code snippet above:
- Change BUCKET_NAME to the desired bucket name value.
- Modify every 24 hours to something more suitable for your needs. You can use the unix-cron syntax (as sketched above) or the App Engine cron.yaml format.
- If you want to export specific collections (or collection groups), specify their IDs in collectionIds instead of an empty array.
- If you want to import documents instead of exporting, feel free to borrow the Client SDK importDocuments code from the previous article (see the sketch after this list).
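For completeness, here is a minimal sketch of what the import variant could look like, assuming the same FirestoreAdminClient set up earlier. The inputUriPrefix must point at a folder produced by a previous export; the path below is a placeholder:
exports.scheduledFirestoreImport = functions.pubsub
  .schedule('every 24 hours')
  .onRun(async (context) => {
    const projectId = process.env.GCP_PROJECT || process.env.GCLOUD_PROJECT;
    const databaseName = client.databasePath(projectId, '(default)');
    // Must point to a folder created by a previous export operation
    const backupPrefix = `gs://[BUCKET_NAME]/[EXPORT_FOLDER]`;

    try {
      const responses = await client.importDocuments({
        name: databaseName,
        inputUriPrefix: backupPrefix,
        // An empty array imports every collection present in the export
        collectionIds: [],
      });
      console.log(`Import Operation Response: ${responses[0].name}`);
    } catch (err) {
      console.error(err);
      throw new Error('Import Operation Failed');
    }
  });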
Finally, deploy the scheduled Cloud Function code to Firebase:
# Deploy all functions
$ firebase deploy --only functions
# Or deploy just a single function
$ firebase deploy --only functions:scheduledFirestoreExport
Configure Appropriate Permissions
Your Cloud Function will use the project's default service account to perform the import and export operations. The service account looks like this:
PROJECT_ID@appspot.gserviceaccount.com
Performing import and export operations requires permissions that are covered by the following two roles:
- Storage Admin – The default service account already has this role's permissions for buckets in the same project, so you need not do anything here.
- Cloud Datastore Import Export Admin – You will have to make sure this role is assigned to the service account so that the required permissions are granted.
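If you prefer granting the role from the command line, a binding along these lines should work (a sketch; PROJECT_ID is a placeholder for your actual project ID):
# Grant the import/export role to the project's default service account
$ gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="serviceAccount:PROJECT_ID@appspot.gserviceaccount.com" \
    --role="roles/datastore.importExportAdmin"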
Trigger a Test Run
Once the function has been deployed, a Cloud Scheduler job will be created that triggers the Cloud Function at the specified interval. For testing though, you may want to trigger a run manually. To do that, follow these steps:
- Go to https://console.cloud.google.com/cloudscheduler
- Click on Actions > Force a job run for the relevant scheduled job
Then to view the function logs:
- Go to https://console.cloud.google.com/functions
- Click on the relevant function
- Click on the LOGS tab
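Both of these can likely be done from the command line as well. The job name below is an assumption based on the firebase-schedule-<functionName>-<region> naming convention Firebase uses for the scheduler jobs it creates, so verify it with the list command first:
# Find the scheduler job Firebase created for the function
$ gcloud scheduler jobs list

# Force a run of that job (the job name and location here are placeholders)
$ gcloud scheduler jobs run firebase-schedule-scheduledFirestoreExport-us-central1 --location=us-central1

# View the function's recent logs
$ firebase functions:log --only scheduledFirestoreExport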