This page describes what a backup is, how it works, some common use cases, and best practices when creating and using backups. To learn how to create and manage backups, as well as how to restore a Filestore instance from a backup, see Back up data for disaster recovery.
What is a backup?
A Filestore backup is a copy of a file share that includes all file data and metadata of the file share from the point in time when the backup is created.
After creating a backup of a file share, you can modify or delete the original file share without affecting the backup.
You can use a backup to restore a file share to a new Filestore instance or, for basic tier instances, to the source instance or to an existing instance.
Backups are regional resources that remain within the region that you specify at the time of creation. You can create backups in the same region as the Filestore instance or to another region to help reduce the risk of data loss.
Backups are globally addressable and can be used to restore file shares to any region, but they cannot be shared across projects.
Network transfer charges apply to cross-region network traffic. For details, see the Pricing page.
Backup creation
The first backup you create is a complete copy of all file data and metadata on a file share. Each subsequent backup copies any successive changes made to the data since the previous backup.
A group of backups associated with the same instance, region, and CMEK (if used) is called a backup chain.
A backup chain resides in a single Cloud Storage bucket and region and can be located outside of the region used to store the source instance.
All service tiers support multiple backup chains, allowing you to store an instance's backups in multiple regions.
Every time a backup is created, the previous backup is scanned for both differential and incremental changes:
- Differential changes: changes made to files on the share, such as file edits, additions, or deletions.
- Incremental changes: changes to storage in the bucket where backup data is located. This might include deduplication of data previously referenced in the chain.
Every time you save a backup to the same backup chain, the previous backup is scanned for differential and incremental changes. In such cases, a full copy is not needed.
However, storing an instance's data to multiple backup chains implies that you are saving and storing backups to alternating locations.
Every time you create a new backup to an alternating location, a complete copy
of the backup is generated again. Expect higher latency on backup create
operations when alternating between backup chains.
Unchanged data contained in previous backups is referenced in, but not copied to, newer backups. If an older backup is deleted, its unique data is copied to the next most recent backup and all internal data references are automatically updated.
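The reference-and-merge behavior described above can be illustrated with a toy model. This is only a sketch of the general incremental-backup idea, not Filestore's implementation: the `BackupChain` class, its method names, and the use of file paths mapped to string contents are all assumptions made for illustration.

```python
class BackupChain:
    """Toy incremental backup chain (hypothetical; not Filestore's internals).

    A file share is modeled as a dict of path -> content string.
    """

    def __init__(self):
        # Oldest backup first. Each backup records the full list of paths it
        # contains ("manifest") but stores only the data that changed since
        # the previous backup ("owned").
        self.backups = []

    def _index(self, name):
        return next(i for i, b in enumerate(self.backups) if b["name"] == name)

    def create(self, name, share):
        previous = self.restore(self.backups[-1]["name"]) if self.backups else {}
        # Only new or changed files are copied; unchanged files are referenced.
        owned = {p: c for p, c in share.items() if previous.get(p) != c}
        self.backups.append({"name": name, "manifest": set(share), "owned": owned})

    def restore(self, name):
        index = self._index(name)
        result = {}
        for path in self.backups[index]["manifest"]:
            # Walk back through the chain to the most recent stored copy.
            for backup in reversed(self.backups[: index + 1]):
                if path in backup["owned"]:
                    result[path] = backup["owned"][path]
                    break
        return result

    def delete(self, name):
        index = self._index(name)
        removed = self.backups.pop(index)
        if index < len(self.backups):
            successor = self.backups[index]
            # Data that the next backup still references moves into it.
            for path, content in removed["owned"].items():
                if path in successor["manifest"] and path not in successor["owned"]:
                    successor["owned"][path] = content
```

In this model, the first backup stores every file, a second backup stores only edited or added files, and deleting the first backup moves the still-referenced data into the second so that it remains restorable.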
Internally, a backup chain's history is tracked using snapshots, which consume capacity on the source instance.
Backup creation is instantaneous, but it takes a period that's proportional to the amount of data being copied before the backup is available for use. During this period, the backup transitions through three states:
State | Duration | Description |
---|---|---|
Creating | A few seconds | Capturing the current state of the file share. Any new changes to file share data may or may not be included in the backup. Stable writes acknowledged by the instance before the backup is initiated are included. |
Finalizing | Depends on size | Uploading data to the backup. Any new changes to file share data are not included in the backup. |
Ready | Until the backup is deleted | The backup is ready for use. |
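A script that depends on a backup usually needs to wait for the Ready state. The following is a minimal polling sketch under stated assumptions: `get_state` is a hypothetical callable that you supply (for example, a wrapper around whatever client or CLI you use to describe the backup), and the upper-cased state names are assumed from the table above rather than taken from a specific API.

```python
import time

# Assumed state names, upper-cased from the table above.
IN_PROGRESS_STATES = {"CREATING", "FINALIZING"}
READY_STATE = "READY"


def wait_until_ready(get_state, poll_seconds=10.0, timeout_seconds=3600.0,
                     sleep=time.sleep):
    """Poll get_state() until it reports READY; return the seconds spent waiting.

    get_state is a hypothetical caller-supplied function that returns the
    backup's current state as a string.
    """
    waited = 0.0
    while True:
        state = get_state().upper()
        if state == READY_STATE:
            return waited
        if state not in IN_PROGRESS_STATES:
            # Any other state is treated as a failure in this sketch.
            raise RuntimeError(f"unexpected backup state: {state}")
        if waited >= timeout_seconds:
            raise TimeoutError(f"backup still {state} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

Because the Finalizing period depends on the amount of data being uploaded, choose a timeout that reflects the size of your file share.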
After creation, basic tier backups are automatically compressed to reduce cost. Instance performance may be reduced while creating a backup for instances in zonal, regional, and enterprise service tiers. Creating a backup does not affect the availability or performance of basic tier instances.
Addressing redundant data
By default, backups are incremental to avoid billing for redundant data and to minimize the use of storage space. To ensure the reliability of the underlying change history, a backup may occasionally capture a full copy of the instance.
For more information, see Compare snapshots and backups.
Backup deletion
Backups are project-level resources, not a sub-resource of the source instance, and require their own separate storage. As a result, a backup's lifecycle is not tied to that of the source instance. Deleting the source does not delete its associated backups. If you want to delete a backup, you must explicitly perform a delete operation on the backup, not the instance.
Be sure to delete any unwanted backups. If a source instance is deleted, any remaining backups continue to accrue fees.
Deleting a backup is permanent and can't be undone.
Backup consistency
Filestore backups have NFSv3 and NFSv4.1 consistency semantics. Before a backup is initiated, any write that the Filestore instance acknowledges as written to stable storage or that is followed by an acknowledged COMMIT is included in the backup. For details, see NFSv3 RFC-1813 section 3.3.7 or About supported file system protocols.
Common use cases
The following sections describe common use cases for backups.
Backing up data for disaster recovery
Imagine that you have a Filestore instance in us-west1-c, and you want to protect your data against disasters that affect this region. You can schedule a job that regularly creates backups of this instance to a remote region, say us-east1. If a disaster occurs involving us-west1-c, you can create a new instance in another location from any previous backup.
Backing up data to protect against accidental changes
If you want to protect your Filestore data against unintended changes, you can schedule a job that regularly creates backups of the instance. If you lose data, you can browse through the list of backups to identify the one with the version of the file needed. Then, you can create a new Filestore instance from the backup, mount it to the same client as the original instance, and copy the file over.
Before copying the file over, you can use the Linux diff command on the two mount points to check the differences between the data on the original instance and the data restored from the backup. After the data is recovered, you can delete the restored instance and create a new backup to preserve your data's present state for future use.
Alternatively, you can do an in-place restore where the backup data is directly restored to the original Filestore instance, replacing all data on it with data from the backup. We recommend that you create a backup of the latest data before performing an in-place restoration because any data that has not been backed up is lost.
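If you prefer to script the comparison between the original mount point and the restored one instead of reading diff output, the following sketch uses Python's standard filecmp module. The function name `compare_trees` and the two directory arguments are assumptions for illustration; pass the paths where you mounted each instance.

```python
import filecmp
import os


def compare_trees(original, restored):
    """Return (only_in_original, only_in_restored, differing) as sorted lists
    of paths relative to the two mount points."""
    only_original, only_restored, differing = [], [], []

    def walk(cmp, prefix):
        only_original.extend(os.path.join(prefix, n) for n in cmp.left_only)
        only_restored.extend(os.path.join(prefix, n) for n in cmp.right_only)
        # Note: dircmp treats files with identical type, size, and mtime as
        # equal without reading them; other files are compared by content.
        differing.extend(os.path.join(prefix, n) for n in cmp.diff_files)
        for name, sub in cmp.subdirs.items():
            walk(sub, os.path.join(prefix, name))

    walk(filecmp.dircmp(original, restored), "")
    return sorted(only_original), sorted(only_restored), sorted(differing)
```

Files listed as only in the original or as differing are candidates to copy back from the restored instance, depending on which version you want to keep.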
Creating clones for development and testing
Imagine you have a database set up on a Filestore instance that serves production traffic. If you want to run a test with a database as an input, you can create a new Filestore instance from a backup of the production instance for the test. In this way, test usage does not interfere with production.
Similarly, you can use backups for offline analysis and investigation without affecting production.
Migrating data
After you create a Filestore instance, you cannot change its location or service tier. To migrate your data to another region, you can create a backup of it and use the backup to create a new Filestore instance or restore it to an existing instance.
Also, when you create a new Filestore instance from a backup, you can choose between basic HDD and basic SSD tiers regardless of the tier of the source instance.
Feature limitations
Filestore backups are generally available (GA) for all service tiers.
The following limitations apply:
Filestore backups cannot be combined with the Filestore multishares feature.
Users should create a new backup or backups to replace those created in Preview. Backups created in Preview are subject to deletion. Backups created in Preview reflect feature behavior available at the time of creation. Existing backups are not updated when new capabilities are released.
The following sections cover other feature limitations related to performance, storage, capacity, encryption, and other topics in detail:
Performance
- Numerous changes made through many hard links on the same file (for example, tens or hundreds of thousands) may degrade performance.
- For highly utilized instances, performance may be reduced by as much as 15% while a backup is uploaded. Basic tier instance performance is not affected by backup create operations.
- Storing an instance's data to multiple backup chains does impact backup performance. Expect higher latency on backup create operations when alternating between backup chains.
- Instance operations such as instance restore or instance delete may be delayed until a backup create operation completes.
- In some cases, delete operations may take up to 24 hours to complete.
Operations concurrency
- Backup delete operations associated with the same source instance must be performed one at a time.
  - Bulk backup delete operations within a backup chain are not supported. While a delete operation is pending, any new delete operations within the same backup chain return a RESOURCE_EXHAUSTED error. This is regardless of whether the source instance has been deleted.
  - If the source instance has been deleted, users receive a similar FAILED_PRECONDITION error.
  - This limitation applies to every service tier but basic SSD and basic HDD.
  - Note that Filestore does support concurrent backup delete operations when backups reference separate source instances. For example, an instance labeled Source1 has backup data referenced in Backup1 and Backup2. Source2 has backup data referenced in Backup3 and Backup4. Backup1 and Backup2 can't be deleted in parallel; however, Backup2 and Backup3 can.
For more information, see Rate limits for backups.
- Backup create and backup delete operations initiated within the same backup chain can run concurrently. However, users can't complete a backup create operation while the most recent backup is being deleted.
  - If the user attempts to create a new backup of the instance while the most recent backup is being deleted, they will receive a FAILED_PRECONDITION error. For example, if Source1 has a backup chain composed of Backup1 and Backup2, and the user begins a create operation for Backup3, they won't be able to delete Backup2 until the create operation completes. This is because the most recent backup contains the most critical data needed to successfully complete the backup create operation.
For more information regarding operation rate limits, see Operation rate limits for backups.
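When cleaning up many backups, the one-at-a-time rule per source instance suggests grouping deletions into waves. The sketch below only plans the order and makes no API calls; the function name `plan_delete_waves` and the (backup name, source instance) input format are assumptions for illustration.

```python
from collections import defaultdict


def plan_delete_waves(backups):
    """Group backups into waves that can each be deleted in parallel.

    backups: iterable of (backup_name, source_instance) pairs.
    Each wave contains at most one backup per source instance, so no two
    deletions for the same source run at the same time.
    """
    queues = defaultdict(list)
    for name, source in backups:
        queues[source].append(name)

    waves = []
    while any(queues.values()):
        # Take the next pending backup from every source that still has one.
        waves.append([queue.pop(0) for queue in queues.values() if queue])
    return waves


# The Source1/Source2 example from this section:
example = [
    ("Backup1", "Source1"),
    ("Backup2", "Source1"),
    ("Backup3", "Source2"),
    ("Backup4", "Source2"),
]
print(plan_delete_waves(example))
```

Each wave should finish before the next one starts. Given that delete operations can take a long time in some cases, a real cleanup job would also need to poll for completion between waves.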
Storage
- Backup restore operations to the source instance, or to an existing instance, are not supported for zonal, regional, and enterprise instances. If you want to restore a backup of an instance in any of these service tiers, you must create a new instance.
- The new instance must match the source instance's service tier and capacity range. For example, if the source was created using the zonal service tier with lower capacity range, the new instance must use the same service tier and capacity range.
  - If you need to create an instance using the legacy high scale SSD service tier, you must run your operations directly through the Filestore API.
  - If you need to create an instance using the legacy enterprise service tier, you can run your operations directly through the Filestore API or from the Restore backup > New instance page in the Google Cloud console. For example, if you want to create a regional resource with 10 TiB instance capacity, you must use the legacy enterprise service tier.
- Backup operations, such as restore, edit, or delete, may not be available for select backups created in Preview.
- Once a RestoreInstance operation is applied to a regional or enterprise instance, you won't be able to create snapshots with the same names as snapshots that existed prior to the operation.
- Attempts to restore an instance from a backup while either a backup deletion or a snapshot deletion is in progress will fail.
- If the deletion of a backup fails, the status is marked as invalid. In such cases, you will need to retry the delete operation.
Capacity
Each backup occupies instance capacity. This capacity varies relative to the scope of changes made to the data since the last backup was created.
More specifically, when a backup is created, Filestore creates an internal snapshot of the file system which also occupies a portion of available instance capacity.
Snapshot size is also relative to the scope of changes made to data within the share since the last backup was created. This snapshot continues to exist until the next subsequent backup is created and uploaded.
All data referenced by the backup persists in the state as it was when captured and continues to take up capacity from the file system. So for example, if you were to delete data from the mounted file system, that action itself won't free up capacity. Instead, to do so, you would create a new backup after deleting or overwriting significant amounts of data.
For a detailed description of differential and incremental changes and how they are handled, see Backup creation.
To anticipate sufficient capacity for your workloads, consider applying one of the following:
- Increase instance capacity for workloads with significant, frequent data changes or a high change rate.
Encryption
When using CMEK to encrypt your backup chains, the following limitations apply:
- An entire backup chain is encrypted using the same CMEK.
- A CMEK must reside in the same region as the resource it encrypts.
- If storing a backup chain in a region separate from the source instance, you may need to apply separate keys, one for the source and one for the backup chain.
  - All service tiers support multiple backup chains, or the ability to store an instance's backups in multiple regions. If electing to use CMEK for encryption, a CMEK key must reside in the same region as the resource it encrypts. If you're storing backups in a region separate from the source, and the CMEK is not a multiregion key, you must use separate CMEK keys. For more information, see CMEK restrictions and Choosing the best CMEK location.
- A single CMEK is applied to the Cloud Storage bucket where the backup chain is stored and cannot be combined or replaced.
- CMEK support is not available for basic tier backups.
For more information, see CMEK support for backup chains.
Protocols
- When restoring a backup, the new instance must use the same protocol as the source instance.
Best practices
The following sections cover recommended best practices.
Preparing your file share for the best backup consistency
The quality of a backup depends on the ability of your application to recover from backups that are created during heavy write workloads. In most situations, you can create backups that have good consistency even while your applications write data to the file share. However, if your applications require strict consistency, we recommend doing one or more of the following:
- Use sync mount. For more information, see "The sync mount option" section in nfs(5). Alternatively, you can open files with the O_DIRECT|O_SYNC flags. For more information, see open(2).
- Pause applications or operating system processes that write data to the file share and cause them to flush their changes to the file share before initiating the backup. For more information, see fsync(2).
- If your applications require consistency between multiple shares, pause all applications on all instances that are writing to all file shares and create backups of all file shares before resuming your applications.
- If you require application level consistency, stop your applications and unmount the file share before creating a backup.
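As an illustration of the flush step in the list above, an application can force its buffered writes out before a backup is initiated. This is a minimal sketch, assuming the file lives on a mounted file share; the helper name `write_durably` is hypothetical, and the sketch uses fsync rather than the O_DIRECT|O_SYNC open flags because O_DIRECT places alignment requirements on buffers.

```python
import os


def write_durably(path, data):
    """Write bytes to path and ask the OS to push them to stable storage."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # move data from Python's buffer to the OS
        os.fsync(f.fileno())  # ask the OS to flush to storage; see fsync(2)
```

A backup initiated after `write_durably` returns should then include the written data, in line with the consistency semantics described in Backup consistency.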
Using existing backups as a baseline for new backups to reduce backup creation time
Existing backups of a file share within a region are used as baselines for creating new backups of the file share, reducing backup creation time. Therefore, we recommend that you do the following:
- Take a new backup of a file share before you delete the previous backup of that file share.
- Wait for new backups to be in the Ready state before creating subsequent backups of the same file share.
Scheduling backups during off-peak hours to reduce backup creation time
Creating backups during off-peak hours reduces the time that it takes to create a backup. If you schedule regular backups of your file shares, we recommend scheduling them during off-peak hours when possible.
Peak hours for backup creation are the end of each business day and midnight in the region where the Filestore instance is located. We recommend creating your backups either in the early morning or during the business day.
Organizing your data on separate Filestore instances to maximize efficiency
The more data on the file share, the larger the backup and the more it costs. To back up only the data that you need to back up, we recommend organizing your data on separate file shares, namely:
- Storing critical data with different write patterns or with different backup requirements on different file shares.
- Reducing the number of backups that you need to create by keeping similar data in one file share.
Quota
A quota limit exists regarding the number of backups per region for basic SSD and basic HDD service tiers.
Backup quota limits don't apply to zonal, regional, and enterprise service tiers.
For more information, see Service tiers and quota.
Get started with Filestore backups
To get started using the feature, see Back up data for disaster recovery.
What's next
- Learn how to back up and restore file shares.
- Learn how to schedule backups using Cloud Scheduler.
- Learn about Google Cloud regions and zones.
- Learn about backup pricing.