Note
This feature is not available for M0 Free clusters and Flex clusters. To learn more about which features are unavailable, see Atlas M0 (Free Cluster) Limits.
You can restore data archived to S3 or Google Cloud Storage buckets by using mongoimport and mongorestore. This page provides a sample procedure that imports the archived data and rebuilds indexes by using the MongoDB Database Tools together with the AWS CLI or the gcloud CLI, depending on where the data is stored.
Prerequisites
Before you begin, you must:
Install the MongoDB Database Tools, which include mongoimport and mongorestore.
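If you want to confirm that the tools are installed and on your PATH, a quick check similar to the following works; the exact output varies by Database Tools version:
# print the installed Database Tools versions to confirm the binaries are available
mongoimport --version
mongorestore --version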
Procedure
Copy the data from the S3 bucket to a local folder by using the AWS CLI, and then extract the data.
aws s3 cp s3://<bucketName>/<prefix> <downloadFolder> --recursive
gunzip -r <downloadFolder>
where:
| <bucketName> | Name of the AWS S3 bucket. |
| <prefix> | Path to the archived data in the bucket. See the following example for the path format. |
| <downloadFolder> | Path to the local folder where you want to copy the archived data. |
For example, run a command similar to the following:
Example
aws s3 cp s3://export-test-bucket/exported_snapshots/1ab2cdef3a5e5a6c3bd12de4/12ab3456c7d89d786feba4e7/myCluster/2021-04-24T0013/1619224539 mybucket --recursive
gunzip -r mybucket
Copy and store the following script in a file named massimport.sh.
#!/bin/bash
regex='/(.+)/(.+)/.+'
dir=${1%/}
connstr=$2

# iterate through the subdirectories of the downloaded and extracted snapshot
# export and restore the docs with mongoimport
find $dir -type f -not -path '*/\.*' -not -path '*metadata\.json' | while read line ; do
  [[ $line =~ $regex ]]
  db_name=${BASH_REMATCH[1]}
  col_name=${BASH_REMATCH[2]}
  mongoimport --uri "$connstr" --mode=upsert -d $db_name -c $col_name --file $line --type json
done

# create the required directory structure and copy/rename files as needed for
# mongorestore to rebuild indexes on the collections from exported snapshot
# metadata files and feed them to mongorestore
find $dir -type f -name '*metadata\.json' | while read line ; do
  [[ $line =~ $regex ]]
  db_name=${BASH_REMATCH[1]}
  col_name=${BASH_REMATCH[2]}
  mkdir -p ${dir}/metadata/${db_name}/
  cp $line ${dir}/metadata/${db_name}/${col_name}.metadata.json
done
mongorestore "$connstr" ${dir}/metadata/

# remove the metadata directory because we do not need it anymore and this
# returns the snapshot directory in an identical state as it was prior to the import
rm -rf ${dir}/metadata/
Here:
--mode=upsert enables mongoimport to handle duplicate documents from an archive.
--uri specifies the connection string for the Atlas cluster.
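The regular expression in the script assumes that the extracted export nests each file under a database-name and a collection-name subdirectory of the download folder. As a rough sketch (the names below are placeholders; the file names in your export will differ):
<downloadFolder>/
    <dbName>/
        <collectionName>/
            <fileName>.json            # documents that mongoimport upserts
            <fileName>.metadata.json   # collection metadata that mongorestore uses to rebuild indexes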
Run the massimport.sh utility to import the archived data into the Atlas cluster.
sh massimport.sh <downloadFolder> "mongodb+srv://<connectionString>"
where:
| <downloadFolder> | Path to the local folder where you copied the archived data. |
| <connectionString> | Connection string for the Atlas cluster. |
For example, run a command similar to the following:
Example
sh massimport.sh mybucket "mongodb+srv://<myConnString>"
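Optionally, you can spot-check the restored data after the import completes. The following is a minimal sketch that assumes you substitute a database and collection name from your own archive; it connects with mongosh and counts the documents in one restored collection:
# count the documents in one restored collection (replace the placeholders with your own names)
mongosh "mongodb+srv://<myConnString>" --eval 'db.getSiblingDB("<dbName>").getCollection("<collectionName>").countDocuments()'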
Copy the data from the Google Cloud Storage bucket to a local folder by using the gcloud CLI, and then extract the data.
gsutil -m cp -r "gs://<bucketName>/<prefix>" <downloadFolder>
gunzip -r <downloadFolder>
where:
| <bucketName> | Name of the Google Cloud Storage bucket. |
| <prefix> | Path to the archived data in the bucket. See the following example for the path format. |
| <downloadFolder> | Path to the local folder where you want to copy the archived data. |
Example
gsutil -m cp -r gs://export-test-bucket/exported_snapshots/1ab2cdef3a5e5a6c3bd12de4/12ab3456c7d89d786feba4e7/myCluster/2021-04-24T0013/1619224539 mybucket
gunzip -r mybucket
Copy and store the following script in a file named massimport.sh.
#!/bin/bash
regex='/(.+)/(.+)/.+'
dir=${1%/}
connstr=$2

# iterate through the subdirectories of the downloaded and extracted snapshot
# export and restore the docs with mongoimport
find $dir -type f -not -path '*/\.*' -not -path '*metadata\.json' | while read line ; do
  [[ $line =~ $regex ]]
  db_name=${BASH_REMATCH[1]}
  col_name=${BASH_REMATCH[2]}
  mongoimport --uri "$connstr" --mode=upsert -d $db_name -c $col_name --file $line --type json
done

# create the required directory structure and copy/rename files as needed for
# mongorestore to rebuild indexes on the collections from exported snapshot
# metadata files and feed them to mongorestore
find $dir -type f -name '*metadata\.json' | while read line ; do
  [[ $line =~ $regex ]]
  db_name=${BASH_REMATCH[1]}
  col_name=${BASH_REMATCH[2]}
  mkdir -p ${dir}/metadata/${db_name}/
  cp $line ${dir}/metadata/${db_name}/${col_name}.metadata.json
done
mongorestore "$connstr" ${dir}/metadata/

# remove the metadata directory because we do not need it anymore and this
# returns the snapshot directory in an identical state as it was prior to the import
rm -rf ${dir}/metadata/
Here:
--mode=upsert enables mongoimport to handle duplicate documents from an archive.
--uri specifies the connection string for the Atlas cluster.
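To rebuild indexes, the second half of the script stages the exported metadata files into a temporary metadata directory in the layout that mongorestore expects, runs mongorestore against it, and then removes the directory. Roughly, the staged structure looks like this (placeholder names shown):
<downloadFolder>/metadata/
    <dbName>/
        <collectionName>.metadata.json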
Run the massimport.sh utility to import the archived data into the Atlas cluster.
sh massimport.sh <downloadFolder> "mongodb+srv://<connectionString>"
where:
| <downloadFolder> | Path to the local folder where you copied the archived data. |
| <connectionString> | Connection string for the Atlas cluster. |
Example
sh massimport.sh mybucket "mongodb+srv://<myConnString>"