EBS snapshots play an important role in backing up your EC2 instances. In this article, we automate EBS volume snapshots using Lambda functions and CloudWatch Events: one function takes a daily backup, and a second one deletes snapshots after a set retention period.
Prerequisites
- AWS account
- IAM user in AWS account
- The IAM user must have permission to create Lambda functions.
- An IAM role that defines which permissions our Lambda function has.
- CloudWatch permission to create rules.
To Create IAM Role
Log in to the AWS console with the root account or with an IAM user who has permission to create roles.
- Click on Services and select IAM.
- The IAM page will open. From the left panel, select Roles and click on Create role.
- On the Create role page, select Lambda as the service (the role will be assumed by our Lambda function).
- After you select Lambda, the console asks you to attach permission policies to the role. Instead of picking an existing policy, we will create our own custom policy.
We want our function to be able to perform the following:
- Read all types of information from EC2 (we give it full Describe access)
- Create and delete snapshots
- Write to CloudWatch Logs
You can create this policy with the visual editor or by providing JSON. You can use the JSON policy below.
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": "logs:*",
          "Resource": "*"
        },
        {
          "Effect": "Allow",
          "Action": "ec2:Describe*",
          "Resource": "*"
        },
        {
          "Effect": "Allow",
          "Action": [
            "ec2:CreateSnapshot",
            "ec2:DeleteSnapshot",
            "ec2:CreateTags",
            "ec2:DeleteTags",
            "ec2:ModifySnapshotAttribute"
          ],
          "Resource": [
            "*"
          ]
        }
      ]
    }
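If you would rather script this step than click through the console, the same policy can be built in Python and handed to Boto3. The following is a minimal sketch, not part of the original walkthrough: the helper names `snapshot_permissions_policy`, `lambda_trust_policy`, and `create_snapshot_role` are hypothetical, and the role and policy names are placeholders you would choose yourself. The trust policy corresponds to selecting Lambda as the service in the console.

```python
import json


def snapshot_permissions_policy():
    """Return the permissions policy from this article as a Python dict."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "logs:*", "Resource": "*"},
            {"Effect": "Allow", "Action": "ec2:Describe*", "Resource": "*"},
            {
                "Effect": "Allow",
                "Action": [
                    "ec2:CreateSnapshot",
                    "ec2:DeleteSnapshot",
                    "ec2:CreateTags",
                    "ec2:DeleteTags",
                    "ec2:ModifySnapshotAttribute",
                ],
                "Resource": ["*"],
            },
        ],
    }


def lambda_trust_policy():
    """Trust policy that lets the Lambda service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "lambda.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_snapshot_role(iam_client, role_name, policy_name):
    """Create the role and attach the policy inline.

    iam_client is expected to be boto3.client('iam'); role_name and
    policy_name are placeholders chosen by the caller.
    """
    iam_client.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(lambda_trust_policy()),
    )
    iam_client.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        PolicyDocument=json.dumps(snapshot_permissions_policy()),
    )
```

Passing the client in as an argument keeps the policy builders usable without AWS credentials, which makes them easy to inspect before anything is created.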
Tagging
The Lambda function uses EC2 instance tags to identify which instances to snapshot.
- From EC2 Services, select the EC2 instance for which we want to create a snapshot.
- In the Tags section, add the tag below.
- The name of the tag is “auto_snapshot” and its value is “true”.
- Save the changes by clicking the Save button.
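The same tag can also be applied from code. The sketch below is an illustration under assumptions: `auto_snapshot_tag_request` and `is_marked_for_snapshot` are hypothetical helper names, and the instance ID in the comment is a made-up placeholder. The first helper builds the arguments for the EC2 client's `create_tags` call; the second checks an instance record, as returned by `describe_instances`, for the tag.

```python
def auto_snapshot_tag_request(instance_id):
    """Build keyword arguments for the EC2 client's create_tags call."""
    return {
        "Resources": [instance_id],
        "Tags": [{"Key": "auto_snapshot", "Value": "true"}],
    }


def is_marked_for_snapshot(instance):
    """Return True if an instance record carries the tag auto_snapshot=true."""
    tags = instance.get("Tags", [])
    return any(
        t["Key"] == "auto_snapshot" and t["Value"] == "true" for t in tags
    )


# With Boto3 installed and credentials configured, the request could be sent as:
#   import boto3
#   boto3.client("ec2").create_tags(
#       **auto_snapshot_tag_request("i-0123456789abcdef0"))  # placeholder ID
```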
Lambda function to create a Snapshot
Navigate to the AWS Lambda Management Console from Services.
- Click on Create function. A new page will open.
- Select Author from scratch.
- Enter a name for the Lambda function in Function name.
- Choose a Python 3 runtime. This walkthrough was written with Python 3.6, which Lambda no longer offers for new functions, so pick a currently supported Python 3 version.
- For the role, select Use an existing role and choose the role we made earlier.
Code
I am using Boto3, the AWS SDK for Python, which provides programmatic access to create, configure, and manage AWS services from Python.
    import collections
    import datetime
    from pprint import pprint

    import boto3

    ec = boto3.client('ec2')


    def lambda_handler(event, context):
        # Find every instance that carries the tag auto_snapshot=true
        reservations = ec.describe_instances(
            Filters=[
                {'Name': 'tag:auto_snapshot', 'Values': ['true']},
            ]
        ).get('Reservations', [])

        instances = sum([[i for i in r['Instances']] for r in reservations], [])
        print("Found {0} instances that need backing up".format(len(instances)))

        to_tag = collections.defaultdict(list)

        for instance in instances:
            pprint(instance)

            # Use the instance's Retention tag if present, else 15 days
            try:
                retention_days = [
                    int(t.get('Value')) for t in instance['Tags']
                    if t['Key'] == 'Retention'][0]
            except IndexError:
                retention_days = 15

            # Fall back to the instance ID when there is no Name tag
            instancename = instance['InstanceId']
            for tags in instance['Tags']:
                if tags["Key"] == 'Name':
                    instancename = tags["Value"]

            for dev in instance['BlockDeviceMappings']:
                if dev.get('Ebs', None) is None:
                    continue
                vol_id = dev['Ebs']['VolumeId']
                print("Found EBS volume {0} on instance {1}".format(
                    vol_id, instance['InstanceId']))

                snap = ec.create_snapshot(VolumeId=vol_id)
                to_tag[retention_days].append(snap['SnapshotId'])

                print("Retaining snapshot {0} of volume {1} from instance {2} for {3} days for {4}".format(
                    snap['SnapshotId'], vol_id, instance['InstanceId'],
                    retention_days, instancename))

                delete_date = datetime.date.today() + datetime.timedelta(days=retention_days)
                delete_fmt = delete_date.strftime('%Y-%m-%d')
                print("Will delete {0} snapshots on {1}".format(
                    len(to_tag[retention_days]), delete_fmt))

                # Tag the snapshot with its delete date and source instance
                ec.create_tags(
                    Resources=[snap['SnapshotId']],
                    Tags=[
                        {'Key': 'automatic-snapshot-delete-on', 'Value': delete_fmt},
                        {'Key': 'Name', 'Value': instancename},
                        {'Key': 'Instance ID', 'Value': instance['InstanceId']},
                    ])
Every time the above function runs, it performs the following steps:
- Creates a snapshot of every EBS volume attached to each EC2 instance tagged auto_snapshot: true.
- Marks each snapshot for later deletion. The deletion itself is handled by the delete function described later in this article.
- You can use a tag to set the retention period. Create a tag on the EC2 instance with the name “Retention” (the key the code reads) and the value “10” (use any number of days that suits your requirements).
- You can also change the default in the Lambda function, for example retention_days = 10. If you do not change the code above, the retention period defaults to 15 days.
- The create function adds an “automatic-snapshot-delete-on” tag to each snapshot, with a date calculated from the retention days. The delete function deletes snapshots older than the retention period.
- We need to change Basic settings for the Lambda function. The default timeout is 3 seconds, so increase it (I used 59 seconds); otherwise, the create function can time out before it finishes creating the snapshots.
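The retention logic described in the list above can be sketched in isolation, without calling AWS. This is a simplified illustration rather than the function itself: `retention_days_for` and `delete_on` are hypothetical helper names, and the sample instance record is made up.

```python
import datetime

DEFAULT_RETENTION_DAYS = 15  # same default as the create function


def retention_days_for(instance, default=DEFAULT_RETENTION_DAYS):
    """Read the Retention tag from an instance record, else use the default."""
    for tag in instance.get("Tags", []):
        if tag["Key"] == "Retention":
            return int(tag["Value"])
    return default


def delete_on(today, retention_days):
    """Return the automatic-snapshot-delete-on tag value (YYYY-MM-DD)."""
    return (today + datetime.timedelta(days=retention_days)).strftime("%Y-%m-%d")


# Made-up instance record with a 10-day retention tag
instance = {"Tags": [{"Key": "Retention", "Value": "10"}]}
print(delete_on(datetime.date(2024, 1, 1), retention_days_for(instance)))
# prints 2024-01-11
```

A snapshot taken on 1 January with a 10-day retention is therefore tagged for deletion on 11 January; without the tag, the 15-day default gives 16 January.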
CloudWatch Rule to create Snapshot
The Lambda function is created; now we use a CloudWatch rule to run it automatically.
- Navigate to the CloudWatch Management Console from Services.
- From the CloudWatch page, go to Rules in the left panel.
- Click on Create rule. A new page will open.
- Configure the rule with a schedule, for example the cron expression cron(0 8 * * ? *).
This schedule creates an EC2 snapshot every day at 8:00 am (GMT). You can adjust the cron expression to your requirements.
Under Targets, select the Lambda function that the rule should invoke.
You can also use the Fixed rate of option if you want a snapshot after a fixed number of minutes, hours, or days.
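Both scheduling options come down to a schedule expression string. The sketch below shows how the two forms are written; `daily_cron` and `fixed_rate` are hypothetical helper names introduced only for illustration, and the resulting string is what you would pass as `ScheduleExpression` to the `put_rule` call of Boto3's `events` client.

```python
def daily_cron(hour, minute=0):
    """Schedule expression that fires once a day at hour:minute (UTC)."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    # Fields: minutes hours day-of-month month day-of-week year
    return "cron({0} {1} * * ? *)".format(minute, hour)


def fixed_rate(value, unit):
    """Schedule expression for a fixed rate; unit is 'minute', 'hour' or 'day'."""
    if unit not in ("minute", "hour", "day"):
        raise ValueError("unit must be 'minute', 'hour' or 'day'")
    if value < 1:
        raise ValueError("value must be a positive integer")
    # The unit is singular for a value of 1 and plural otherwise
    return "rate({0} {1}{2})".format(value, unit, "" if value == 1 else "s")


print(daily_cron(8))           # cron(0 8 * * ? *)
print(fixed_rate(12, "hour"))  # rate(12 hours)
```

Cron expressions are evaluated in UTC, so a snapshot at 8:00 am local time needs the hour shifted accordingly.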
- Go back to the Lambda Management Console.
- Select the recently created function. Its page will open.
- Click on the Add trigger button in the Designer section.
- Select CloudWatch Events as the trigger.
- Choose the recently created CloudWatch rule from the existing rules and click on Add.
- The Lambda function that creates snapshots for EC2 instances is now complete. You can test it by clicking the Test button.
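Adding the trigger in the console does two things behind the scenes: it allows the rule to invoke the function, and it registers the function as a target of the rule. A rough sketch of the equivalent requests follows; `trigger_requests` is a hypothetical helper, and the rule name, function name, and ARNs in the comments are placeholders, not values from this article.

```python
def trigger_requests(rule_name, rule_arn, function_name, function_arn):
    """Build the two requests that connect a schedule rule to a Lambda function.

    Returns (permission, targets):
      permission -> keyword arguments for the Lambda client's add_permission
      targets    -> keyword arguments for the events client's put_targets
    """
    permission = {
        "FunctionName": function_name,
        # Any ID unique within the function's policy (letters, digits, - and _)
        "StatementId": rule_name + "-invoke",
        "Action": "lambda:InvokeFunction",
        "Principal": "events.amazonaws.com",
        "SourceArn": rule_arn,
    }
    targets = {
        "Rule": rule_name,
        "Targets": [{"Id": "1", "Arn": function_arn}],
    }
    return permission, targets


# With Boto3 and credentials available, the requests could be applied like this
# (all names are placeholders):
#   import boto3
#   events = boto3.client("events")
#   lam = boto3.client("lambda")
#   rule_arn = events.put_rule(
#       Name="daily-snapshot",
#       ScheduleExpression="cron(0 8 * * ? *)")["RuleArn"]
#   permission, targets = trigger_requests(
#       "daily-snapshot", rule_arn, "create-snapshot", function_arn)
#   lam.add_permission(**permission)
#   events.put_targets(**targets)
```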
Lambda function to Delete a Snapshot
It is important to pair the create function with a delete function. We are creating many snapshots (depending on our requirements), and if we do not delete them, they keep accumulating until we remove them manually.
Follow the steps below to automate the delete function.
- Click on Create function. A new page will open.
- Select Author from scratch.
- Enter a name for the Lambda function in Function name.
- Choose a Python 3 runtime. This walkthrough was written with Python 3.6, which Lambda no longer offers for new functions, so pick a currently supported Python 3 version.
- For the role, select Use an existing role and choose the role we made earlier.
Code
    import datetime

    import boto3

    # Set the global variables
    globalVars = {}
    globalVars['RetentionDays'] = "15"

    ec = boto3.client('ec2')


    def lambda_handler(event, context):
        # 'self' limits the search to snapshots owned by the account the
        # function runs in, so no account ID lookup is needed
        owner_ids = ['self']

        # Snapshots started on or before this date are past the retention period
        retention_date = (
            datetime.date.today() -
            datetime.timedelta(days=int(globalVars['RetentionDays']))
        ).strftime('%Y-%m-%d')

        # Snapshots whose delete-on tag is today's date
        delete_on = datetime.date.today().strftime('%Y-%m-%d')
        filters = [
            {'Name': 'tag:automatic-snapshot-delete-on', 'Values': [delete_on]},
        ]
        snapshot_response = ec.describe_snapshots(OwnerIds=owner_ids, Filters=filters)
        print("snapshot_response {0}".format(snapshot_response))

        # Collect IDs in a set so no snapshot is deleted twice
        to_delete = {snap['SnapshotId'] for snap in snapshot_response['Snapshots']}

        # Note: this also picks up every snapshot in the account older than
        # the retention period, whether or not it carries the delete-on tag
        all_snap = ec.describe_snapshots(OwnerIds=owner_ids)
        for snap in all_snap['Snapshots']:
            snapshot_time = snap['StartTime'].strftime('%Y-%m-%d')
            if snapshot_time <= retention_date:
                to_delete.add(snap['SnapshotId'])

        for snapshot_id in sorted(to_delete):
            print("Deleting snapshot {0}".format(snapshot_id))
            ec.delete_snapshot(SnapshotId=snapshot_id)
Every time the above function runs, it performs the following steps:
- Deletes the snapshots whose “automatic-snapshot-delete-on” tag matches the current date.
- Deletes all snapshots owned by the account that are older than the RetentionDays value, which defaults to “15” days. This applies whether or not a snapshot carries the tag, so make sure that is what you want.
- Also change Basic settings for the Lambda function. The default timeout is 3 seconds, so increase it (I used 59 seconds); otherwise, the delete function can time out before it finishes deleting the snapshots.
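If deleting every old snapshot in the account is broader than you want, one alternative is to consider only snapshots that carry the delete-on tag. The sketch below shows that selection step on plain dictionaries shaped like the `Snapshots` entries returned by `describe_snapshots`. It is a stricter variant offered for illustration, not the logic of the function above; `expired_snapshot_ids` is a hypothetical helper and the sample data is made up.

```python
import datetime

DELETE_TAG = "automatic-snapshot-delete-on"


def expired_snapshot_ids(snapshots, today):
    """Return IDs of snapshots whose delete-on tag date is today or earlier."""
    expired = []
    for snap in snapshots:
        tags = {t["Key"]: t["Value"] for t in snap.get("Tags", [])}
        delete_on = tags.get(DELETE_TAG)
        if delete_on is None:
            continue  # not created by the backup function, so leave it alone
        if datetime.datetime.strptime(delete_on, "%Y-%m-%d").date() <= today:
            expired.append(snap["SnapshotId"])
    return expired


# Made-up sample data
snapshots = [
    {"SnapshotId": "snap-1", "Tags": [{"Key": DELETE_TAG, "Value": "2024-01-10"}]},
    {"SnapshotId": "snap-2", "Tags": [{"Key": DELETE_TAG, "Value": "2024-02-01"}]},
    {"SnapshotId": "snap-3"},  # untagged manual snapshot
]
print(expired_snapshot_ids(snapshots, datetime.date(2024, 1, 15)))
# prints ['snap-1']
```

Comparing with "today or earlier" rather than "exactly today" also catches snapshots that were missed on a day when the function did not run.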
CloudWatch Rule
Now that the Lambda function is created, we can use a CloudWatch rule to automate the delete function as well.
- Navigate to the CloudWatch Management Console from Services.
- From the CloudWatch page, go to Rules in the left panel.
- Click on Create rule. A new page will open.
- Configure a schedule for the rule and select the delete function as its target.
With this rule in place, the delete function runs on schedule and removes EC2 snapshots that are past the retention period. For a 10-day retention, set RetentionDays to "10" in the delete function; the code above defaults to 15.
- Go back to the Lambda Management Console.
- Select the recently created delete function. Its page will open.
- Click on the Add trigger button in the Designer section.
- Select CloudWatch Events as the trigger.
- Choose the recently created CloudWatch rule from the existing rules and click on Add.
The Lambda function that deletes snapshots for EC2 instances is now complete.