Plan for Disaster Recovery

Define recovery objectives for downtime and data loss

  • RTO:
    • Recovery Time Objective = How much time an application can be down without causing significant damage to the business.
    • Maximum acceptable delay between the interruption of service and restoration of service.
  • RPO:
    • Recovery Point Objective = How much data can be lost before significant harm to the business occurs.
    • Maximum acceptable amount of time since the last data recovery point.
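
As a worked illustration of the two objectives, the sketch below measures the data-loss window and the downtime for a single incident and compares them with the targets. The function names, timestamps, and target values are hypothetical and chosen only for the example.

```python
from datetime import datetime, timedelta

def measure_recovery(last_recovery_point, outage_start, service_restored):
    """Return (data_loss_window, downtime) for one incident."""
    # Data written after the last recovery point is lost: compare with RPO.
    data_loss_window = outage_start - last_recovery_point
    # Time between interruption and restoration: compare with RTO.
    downtime = service_restored - outage_start
    return data_loss_window, downtime

def meets_objectives(data_loss_window, downtime, rpo, rto):
    """True only if both the RPO and the RTO were respected."""
    return data_loss_window <= rpo and downtime <= rto

# Hypothetical incident: backup at 02:00, outage at 05:30, restored at 07:00.
loss, down = measure_recovery(
    last_recovery_point=datetime(2024, 1, 1, 2, 0),
    outage_start=datetime(2024, 1, 1, 5, 30),
    service_restored=datetime(2024, 1, 1, 7, 0),
)
# Assumed targets for the example: RPO of 4 hours, RTO of 2 hours.
ok = meets_objectives(loss, down, rpo=timedelta(hours=4), rto=timedelta(hours=2))
```

Here the data-loss window is 3 hours 30 minutes and the downtime is 1 hour 30 minutes, so both assumed targets are met. Tightening the RPO to 1 hour would make the same incident a miss, which is why the backup frequency has to be at least as tight as the RPO.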

Use defined recovery strategies to meet the recovery objectives

  • Backup and restore:
    • RPO/RTO = Hours
    • Back up your data and applications using point-in-time backups into the DR Region.
      • Restore this data when necessary to recover from a disaster.
  • Pilot light:
    • RPO/RTO = 10s of minutes
    • Replicate your data from one Region to another and provision a copy of your core workload infrastructure.
  • Warm standby:
    • RPO/RTO = Minutes
    • Maintain a scaled-down but fully functional version of your workload always running in the DR Region.
  • Multi-site (active-active):
    • RPO/RTO = Near zero
    • Workload runs on AWS as well as on your existing on-site infrastructure in an active-active configuration.
  • Multi-region:
    • RPO/RTO = Near zero
    • Your workload is deployed to, and actively serving traffic from, multiple AWS Regions.
    • Requires you to synchronize data across Regions.
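
One way to read the list above is as a ladder: pick the least demanding strategy whose typical recovery range still satisfies the tighter of the two objectives. The sketch below encodes that idea. The numeric thresholds are assumptions made for illustration (the notes only give rough ranges such as "Hours" or "Minutes"), and `pick_strategy` is a hypothetical helper, not an AWS API.

```python
from datetime import timedelta

# Ordered from least to most demanding to run. Each threshold is an ASSUMED
# lower bound on the objective that the strategy can typically satisfy.
STRATEGIES = [
    ("Backup and restore", timedelta(hours=1)),      # RPO/RTO in hours
    ("Pilot light", timedelta(minutes=10)),          # RPO/RTO in 10s of minutes
    ("Warm standby", timedelta(minutes=1)),          # RPO/RTO in minutes
    ("Multi-site / multi-Region (active-active)", timedelta(0)),  # near zero
]

def pick_strategy(rpo, rto):
    """Return the first strategy whose range fits the tighter objective."""
    tightest = min(rpo, rto)
    for name, threshold in STRATEGIES:
        if tightest >= threshold:
            return name
    # Unreachable for non-negative objectives: the last threshold is zero.
    raise ValueError("objectives must be non-negative durations")

choice = pick_strategy(rpo=timedelta(minutes=30), rto=timedelta(hours=2))
```

With a 30-minute RPO and a 2-hour RTO, the tighter objective is 30 minutes, so the sketch selects pilot light. In practice cost, data synchronization effort, and operational complexity also weigh on the choice.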

Implement features such as backups, Multi-AZ deployments, and replication to support disaster recovery

Automate recovery

  • Use AWS or third-party tools to automate system recovery and route traffic to the DR site or Region.
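
The routing part of automated recovery can be pictured as a small state machine: health checks probe the primary, and after several consecutive failures traffic is pointed at the DR Region. The sketch below is a plain-Python simulation of that logic only. The class name, the threshold of three failures, and the Region names are assumptions for illustration; it makes no AWS calls and does not represent how any particular AWS service is implemented.

```python
from dataclasses import dataclass

@dataclass
class FailoverRouter:
    primary: str
    standby: str
    failure_threshold: int = 3      # assumed number of consecutive failures
    consecutive_failures: int = 0
    active: str = ""

    def __post_init__(self):
        # Traffic starts on the primary Region.
        self.active = self.primary

    def record_health_check(self, primary_healthy: bool) -> str:
        """Record one health-check result and return the active target."""
        if primary_healthy:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.failure_threshold:
                # Fail over. This sketch never fails back automatically.
                self.active = self.standby
        return self.active

router = FailoverRouter(primary="us-east-1", standby="us-west-2")
checks = [True, False, False, True, False, False, False, True]
results = [router.record_health_check(healthy) for healthy in checks]
```

Requiring several consecutive failures keeps a single dropped health check from triggering a failover. Leaving failback as a manual step is a deliberate choice in this sketch, since the primary should normally be verified and resynchronized before it takes traffic again.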