AWS Certified Solutions Architect Professional – Study Guide – Domain 5.0: Data Storage for a complex large scale deployment
5.1 Demonstrate ability to make architectural trade off decisions involving storage options
- S3 is highly available object-based storage, with data replicated within a region
- S3 can be optimised for specific use cases and costs
- Glacier is cheaper object-based storage, but retrieving data from it takes several hours, so it is not suitable for quick recovery. Archiving is a good use case
- Backup S3 by copying data to another bucket in another account (cross-account access)
- EBS is block level storage used with EC2 instances. Depending on the use case, EBS can provide magnetic storage (cheaper but slower) or SSD storage and is suitable for persistent storage and random read/write workloads
- EBS provides 99.999% availability and is AZ specific
- EBS and S3 offer encryption at rest
- S3 offers versioning functionality
- EBS offers snapshot functionality. Snapshots only copy the updated blocks and do not affect performance. Snapshots may also be used to create a new volume that has been resized, or you can also change storage type from GP2 to Magnetic or Provisioned IOPS, for example
- You can create EBS Magnetic volumes from 1 GiB to 1 TiB in size; you can create EBS General Purpose (SSD) and Provisioned IOPS (SSD) volumes up to 16 TiB in size. You can mount these volumes as devices on your Amazon EC2 instances. You can mount multiple volumes on the same instance, but each volume can be attached to only one instance at a time
- Delete on terminate flag can be changed at any time
- With General Purpose (SSD) volumes, your volume receives a base performance of 3 IOPS/GiB, with the ability to burst to 3,000 IOPS for extended periods of time. Burst credits are accumulated over time, much like T2 instances and CPU.
- General Purpose (SSD) volumes are ideal for a broad range of use cases such as boot volumes, small and medium size databases, and development and test environments. General Purpose (SSD) volumes support up to 10,000 IOPS and 160 MB/s of throughput
- With Provisioned IOPS (SSD) volumes, you can provision a specific level of I/O performance. Provisioned IOPS (SSD) volumes support up to 20,000 IOPS and 320 MB/s of throughput. This allows you to predictably scale to tens of thousands of IOPS per EC2 instance
- EBS volumes are created in a specific Availability Zone, and can then be attached to any instances in that same Availability Zone. To make a volume available outside of the Availability Zone, you can create a snapshot and restore that snapshot to a new volume anywhere in that region. You can copy snapshots to other regions and then restore them to new volumes there, making it easier to leverage multiple AWS regions for geographical expansion, data center migration, and disaster recovery
- Performance metrics, such as bandwidth, throughput, latency, and average queue length, are available through the AWS Management Console. These metrics are provided by Amazon CloudWatch
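The General Purpose (SSD) baseline and burst figures above can be turned into a quick calculation. This is a minimal sketch, not an official formula: the 3 IOPS/GiB baseline, 3,000 IOPS burst and 10,000 IOPS ceiling come from the notes, while the 100 IOPS floor and the 5.4 million I/O credit bucket are assumptions taken from AWS documentation of the same era.

```python
# Figures stated in the notes: 3 IOPS per GiB baseline, 3,000 IOPS burst,
# 10,000 IOPS ceiling for General Purpose (SSD) volumes.
BASE_IOPS_PER_GIB = 3
BURST_IOPS = 3000
MAX_IOPS = 10000

# Assumptions (not in the notes): a 100 IOPS floor and a full credit
# bucket of 5.4 million I/O credits.
MIN_IOPS = 100
FULL_CREDIT_BUCKET = 5_400_000


def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS for a volume of the given size in GiB."""
    return max(MIN_IOPS, min(MAX_IOPS, BASE_IOPS_PER_GIB * size_gib))


def gp2_burst_seconds(size_gib: int) -> float:
    """Seconds a full credit bucket lasts when running at the burst rate.

    Returns infinity when the baseline already meets or exceeds the burst
    rate, because such a volume never drains its bucket.
    """
    baseline = gp2_baseline_iops(size_gib)
    if baseline >= BURST_IOPS:
        return float("inf")
    # Credits drain at the burst rate while refilling at the baseline rate.
    return FULL_CREDIT_BUCKET / (BURST_IOPS - baseline)


if __name__ == "__main__":
    for size in (20, 100, 500, 1000):
        print(size, gp2_baseline_iops(size), gp2_burst_seconds(size))
```

Under these assumptions, smaller volumes lean on burst credits the most, which is why they suit boot volumes and dev/test workloads rather than sustained heavy I/O.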
- Amazon S3 provides read-after-write consistency for PUTS of new objects in your S3 bucket and eventual consistency for overwrite PUTS and DELETES in all regions
- Amazon CloudFront is a web service that speeds up distribution of your static and dynamic web content, for example, .html, .css, .php, image, and media files, to end users. CloudFront delivers your content through a worldwide network of edge locations.
- When an end user requests content that you’re serving with CloudFront, the user is routed to the edge location that provides the lowest latency, so content is delivered with the best possible performance.
- If the content is already in that edge location, CloudFront delivers it immediately. If the content is not currently in that edge location, CloudFront retrieves it from an Amazon S3 bucket or an HTTP server (for example, a web server) that you have identified as the source for the definitive version of your content
- If you need to design for a high number (> 300 per second) of GETs (reads), then CloudFront can be an appropriate option. It caches content from S3 and supports both RTMP and web distributions
- HTTPS can be used on a CloudFront URL (distro.cloudfront.net) and then from CF to the origin itself if it is an HTTPS web server. For an S3 origin, CF forwards requests using the same protocol as the viewer, so if the viewer connection to CF is HTTPS, then the onward traffic to S3 is also HTTPS (“Match Viewer” setting)
- You can use a custom SSL certificate on the CF distribution if you want it to match your domain name. You can upload this or use Certificate Manager to do this. The CA must be supported by Mozilla in order to work with CF
- If you anticipate that your workload will consistently exceed 100 requests per second, you should avoid sequential key names. If you must use sequential numbers or date and time patterns in key names, add a random prefix to the key name. The randomness of the prefix more evenly distributes key names across multiple index partitions
- S3 versioning has three states (off, on, suspended); once enabled on a bucket it can be suspended but cannot be removed
- MFA deletion policies can be used
- Bucket policies can secure S3 contents
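The advice above on avoiding sequential key names can be sketched as a small helper. This is one possible approach, not the only one: instead of a truly random prefix it derives the prefix from a hash of the key, which spreads names out in the same way but lets the full key be recomputed later. The function name `randomized_key` and the 4-character prefix length are illustrative assumptions.

```python
import hashlib


def randomized_key(key: str, prefix_len: int = 4) -> str:
    """Prepend a short hash-derived prefix to an S3 key name.

    The prefix is taken from the SHA-256 hex digest of the key, so it looks
    random across sequential keys but is reproducible for a given key.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}/{key}"


if __name__ == "__main__":
    # Sequential, timestamp-style names (hypothetical examples).
    for name in ("2016-01-01-00-00-01.log", "2016-01-01-00-00-02.log"):
        print(randomized_key(name))
```

Because the prefix is deterministic, a reader that knows the original name can rebuild the full key without a lookup table.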
- In summary, AWS storage options are:-
- Amazon S3: Scalable storage in the cloud
- Amazon Glacier: Low-cost archive storage in the cloud
- Amazon EBS: Persistent block storage volumes for Amazon EC2 virtual machines
- Amazon EC2 Instance Storage: Temporary block storage volumes for Amazon EC2 virtual machines
- AWS Import/Export: Large volume data transfer
- AWS Storage Gateway: Integrates on-premises IT environments with cloud storage
- Amazon CloudFront: Global content delivery network (CDN)
- CF can have reserved capacity purchased up front to save costs
- S3 usage summary:-
- One very common use for Amazon S3 is storage and distribution of static web content and media. This content can be delivered directly from Amazon S3, since each object in Amazon S3 has a unique HTTP URL address, or Amazon S3 can serve as an origin store for a content delivery network (CDN), such as Amazon CloudFront. Because of Amazon S3’s elasticity, it works particularly well for hosting web content with extremely spiky bandwidth demands. Also, because no storage provisioning is required, Amazon S3 works well for fast growing websites hosting data intensive, user-generated content, such as video and photo sharing sites.
- Amazon S3 is also frequently used to host entire static websites. Amazon S3 provides a highly-available and highly scalable solution for websites with only static content, including HTML files, images, videos, and client-side scripts such as JavaScript.
- Amazon S3 is also commonly used as a data store for computation and large-scale analytics, such as analyzing financial transactions, clickstream analytics, and media transcoding. Because of the horizontal scalability of Amazon S3, you can access your data from multiple computing nodes concurrently without being constrained by a single connection.
- Finally, Amazon S3 is often used as a highly durable, scalable, and secure solution for backup and archival of critical data, and to provide disaster recovery solutions for business continuity. Because Amazon S3 stores objects redundantly on multiple devices across multiple facilities, it provides the highly-durable storage infrastructure needed for these scenarios.
- Amazon S3’s versioning capability is available to protect critical data from inadvertent deletion
- To speed access to relevant data, many developers pair Amazon S3 with a database, such as Amazon DynamoDB or Amazon RDS
- Amazon S3 has the following anti-patterns:
- File system—Amazon S3 uses a flat namespace and isn’t meant to serve as a standalone, POSIX-compliant file system. However, by using delimiters (commonly either the ‘/’ or ‘\’ character) you are able to construct your keys to emulate the hierarchical folder structure of a file system within a given bucket.
- Structured data with query—Amazon S3 doesn’t offer query capabilities: to retrieve a specific object you need to already know the bucket name and key. Thus, you can’t use Amazon S3 as a database by itself. Instead, pair Amazon S3 with a database to index and query metadata about Amazon S3 buckets and objects.
- Rapidly changing data—Data that must be updated very frequently might be better served by a storage solution with lower read / write latencies, such as Amazon EBS volumes, Amazon RDS or other relational databases, or Amazon DynamoDB.
- Backup and archival storage—Data that requires long-term encrypted archival storage with infrequent read access may be stored more cost-effectively in Amazon Glacier.
- Dynamic website hosting—While Amazon S3 is ideal for websites with only static content, dynamic websites that depend on database interaction or use server-side scripting should be hosted on Amazon EC2
- Amazon Glacier summary:-
- Organizations are using Amazon Glacier to support a number of use cases. These include archiving offsite enterprise information, media assets, research and scientific data, digital preservation and magnetic tape replacement
- Amazon Glacier has the following anti-patterns:
- Rapidly changing data—Data that must be updated very frequently might be better served by a storage solution with lower read/write latencies, such as Amazon EBS or a database.
- Real time access—Data stored in Amazon Glacier is not available in real time. Retrieval jobs typically require 3-5 hours to complete, so if you need immediate access to your data, Amazon S3 is a better choice
- EBS usage summary:-
- Amazon EBS is meant for data that changes relatively frequently and requires long-term persistence.
- Amazon EBS is particularly well-suited for use as the primary storage for a database or file system, or for any applications that require access to raw block-level storage.
- Amazon EBS Provisioned IOPS volumes are particularly well-suited for use with database applications that require a high and consistent rate of random disk reads and writes.
- Amazon EBS has the following anti-patterns:
- Temporary storage—If you are using Amazon EBS for temporary storage (such as scratch disks, buffers, queues, and caches), consider using local instance store volumes, Amazon SQS, or ElastiCache (Memcached or Redis).
- Highly-durable storage—If you need very highly-durable storage, use Amazon S3 or Amazon Glacier. Amazon S3 standard storage is designed for 99.999999999% annual durability per object. In contrast, Amazon EBS volumes with less than 20 GB of modified data since the last snapshot are designed for between 99.5% and 99.9% annual durability; volumes with more modified data can be expected to have proportionally lower durability.
- Static data or web content—If your data doesn’t change that often, Amazon S3 may represent a more cost effective and scalable solution for storing this fixed information. Also, web content served out of Amazon EBS requires a web server running on Amazon EC2, while you can deliver web content directly out of Amazon S3.
- Instance Store volumes usage:-
- In general, local instance store volumes are ideal for temporary storage of information that is continually changing, such as buffers, caches, scratch data, and other temporary content, or for data that is replicated across a fleet of instances, such as a load-balanced pool of web servers. Amazon EC2 instance storage is well-suited for this purpose. It consists of the virtual machine’s boot device (for instance store AMIs only), plus one or more additional volumes that are dedicated to the Amazon EC2 instance (for both Amazon EBS AMIs and instance store AMIs). This storage is usable only from a single Amazon EC2 instance during its lifetime. Note that, unlike Amazon EBS volumes, instance store volumes cannot be detached or attached to another instance.
- High I/O and high storage provide Amazon EC2 instance storage targeted to specific use cases. High I/O instances provide instance store volumes backed by SSD, and are ideally suited for many high performance database workloads. Example applications include NoSQL databases like Cassandra and MongoDB.
- High storage instances support much higher storage density per Amazon EC2 instance, and are ideally suited for applications that benefit from high sequential I/O performance across very large datasets. Example applications include data warehouses, Hadoop storage nodes, seismic analysis, cluster file systems, etc.
- Note that applications using instance storage for persistent data generally provide data durability through replication, or by periodically copying data to durable storage.
- Amazon EC2 instance store volumes have the following anti-patterns:
- Persistent storage—If you need persistent virtual disk storage similar to a physical disk drive for files or other data that must persist longer than the lifetime of a single Amazon EC2 instance, Amazon EBS volumes or Amazon S3 are more appropriate.
- Relational database storage—In most cases, relational databases require storage that persists beyond the lifetime of a single Amazon EC2 instance, making Amazon EBS volumes the natural choice.
- Shared storage—Instance store volumes are dedicated to a single Amazon EC2 instance, and cannot be shared with other systems or users. If you need storage that can be detached from one instance and attached to a different instance, or if you need the ability to share data easily, Amazon S3 or Amazon EBS volumes are the better choice.
- Snapshots—If you need the convenience, long-term durability, availability, and shareability of point-in-time disk snapshots, Amazon EBS volumes are a better choice
- AWS Import/Export summary:-
- AWS Import/Export is ideal for transferring large amounts of data in and out of the AWS cloud, especially in cases where transferring the data over the Internet would be too slow or too costly. In general, if loading your data over the Internet would take a week or more, you should consider using AWS Import/Export.
- Common use cases include initial data upload to AWS, content distribution or regular data interchange to/from your customers or business associates, transfer to Amazon S3 or Amazon Glacier for off-site backup and archival storage, and quick retrieval of large backups from Amazon S3 or Amazon Glacier for disaster recovery
- AWS Import/Export Anti-Patterns
- AWS Import/Export is optimal for large data that would take too long to load over the Internet, so the anti-pattern is simply data that is more easily transferred over the Internet.
- If your data can be transferred over the Internet in less than one week, AWS Import/Export may not be the ideal solution
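The one-week rule of thumb above is simple arithmetic on data size and link speed. The following is a minimal sketch; the function names and the default 80% link utilization are assumptions chosen for illustration.

```python
SECONDS_PER_DAY = 86400


def transfer_days(data_bytes: float, link_bits_per_sec: float,
                  utilization: float = 0.8) -> float:
    """Estimate days needed to move data over a network link.

    utilization is the fraction of the link that is realistically usable
    (an assumed default of 0.8 here).
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in the range (0, 1]")
    seconds = data_bytes * 8 / (link_bits_per_sec * utilization)
    return seconds / SECONDS_PER_DAY


def consider_import_export(data_bytes: float, link_bits_per_sec: float,
                           utilization: float = 0.8) -> bool:
    """Apply the notes' rule of thumb: a week or more suggests Import/Export."""
    return transfer_days(data_bytes, link_bits_per_sec, utilization) >= 7


if __name__ == "__main__":
    # 10 TB over a 100 Mbit/s link at 80% utilization.
    print(round(transfer_days(10e12, 100e6), 2))
    print(consider_import_export(10e12, 100e6))
```

For example, 10 TB over a 100 Mbit/s link at 80% utilization works out to roughly eleven and a half days, which is past the one-week threshold.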
- AWS Storage Gateway summary
- Organizations are using AWS Storage Gateway to support a number of use cases. These include corporate file sharing, enabling existing on-premises backup applications to store primary backups on Amazon S3, disaster recovery, and data mirroring to cloud-based compute resources.
- AWS Storage Gateway has the following anti-patterns:
- Database storage—Amazon EC2 instances using Amazon EBS volumes are a natural choice for database storage and workloads.
- Amazon CloudFront usage patterns:-
- Amazon CloudFront is ideal for distribution of frequently-accessed static content that benefits from edge delivery—like popular website images, videos, media files or software downloads.
- Amazon CloudFront can also be used to deliver dynamic web applications over HTTP. These applications may include static content, dynamic content, or a whole site with a mixture of the two.
- Amazon CloudFront is also commonly used to stream audio and video files to web browsers and mobile devices.
- Amazon CloudFront has the following anti-patterns:
- Programmatic cache invalidation—While Amazon CloudFront supports cache invalidation, AWS recommends using object versioning rather than programmatic cache invalidation.
- Infrequently requested data—It may be better to serve infrequently-accessed data directly from the origin server, avoiding the additional cost of origin fetches for data that is not likely to be reused at the edge
- Instance store (i.e. host-local) volumes could also be used as a caching mechanism by striping several volumes as RAID 0 and then mirroring or syncing them to EBS volumes for persistence
- S3 also has the ability to use events to push notifications to SNS or to Lambda. For example, you could have a configuration where when a video file is uploaded to S3, the bucket then triggers a create event to add metadata to DynamoDB and add a thumbnail to the bucket. Conversely, a delete event could remove the thumbnail and also clean up the entry in DynamoDB.
- Prefixes can be used to subscribe to events only in certain folders in buckets
- Suffixes can be used to subscribe to events only for a certain file type, e.g. .jpg
- S3 permissions are required for this to work
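The prefix and suffix filters described above can be sketched as a notification configuration. The dictionary below follows the shape that boto3's `put_bucket_notification_configuration` accepts for its `NotificationConfiguration` argument; the Lambda ARN, prefix and suffix are hypothetical placeholders, and `key_matches` is only a local illustration of how the two filter rules combine, not part of any AWS API.

```python
def build_notification_config(function_arn: str, prefix: str, suffix: str) -> dict:
    """Build an S3 notification configuration for object-created events."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": function_arn,
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {
                        "FilterRules": [
                            {"Name": "prefix", "Value": prefix},
                            {"Name": "suffix", "Value": suffix},
                        ]
                    }
                },
            }
        ]
    }


def key_matches(entry: dict, key: str) -> bool:
    """Local illustration: a key must satisfy both the prefix and suffix rule."""
    rules = {r["Name"]: r["Value"] for r in entry["Filter"]["Key"]["FilterRules"]}
    return key.startswith(rules.get("prefix", "")) and key.endswith(rules.get("suffix", ""))


if __name__ == "__main__":
    # Hypothetical ARN, used purely as a placeholder.
    arn = "arn:aws:lambda:us-east-1:123456789012:function:make-thumbnail"
    config = build_notification_config(arn, "uploads/", ".jpg")
    entry = config["LambdaFunctionConfigurations"][0]
    print(key_matches(entry, "uploads/cat.jpg"))
```

As the notes say, the bucket also needs permission to invoke the target, so a configuration like this only works once that permission is in place.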
5.2 Demonstrate ability to make architectural trade off decisions involving database options
- Amazon Relational Database Service (Amazon RDS) is a web service that provides the capabilities of MySQL, Oracle, or Microsoft SQL Server relational database as a managed, cloud-based service. It also eliminates much of the administrative overhead associated with launching, managing, and scaling your own relational database on Amazon EC2 or in another computing environment.
- Amazon RDS Usage Patterns:-
- Amazon RDS is ideal for existing applications that rely on MySQL, Oracle, or SQL Server traditional relational database engines. Since Amazon RDS offers full compatibility and direct access to native database engines, most code, libraries, and tools designed for these databases should work unmodified with Amazon RDS.
- Amazon RDS is also optimal for new applications with structured data that requires more sophisticated querying and joining capabilities than that provided by Amazon’s NoSQL database offering, Amazon DynamoDB.
- When creating a new DB instance using the Amazon RDS Provisioned IOPS storage, you can specify the IOPS your instance needs from 1,000 IOPS to 30,000 IOPS and Amazon RDS provisions that IOPS rate for the lifetime of the instance.
- Amazon RDS leverages Amazon EBS volumes as its data store
- The Amazon RDS Multi-AZ deployment feature enhances both the durability and the availability of your database by synchronously replicating your data between a primary Amazon RDS DB instance and a standby instance in another Availability Zone. In the unlikely event of a DB component failure or an Availability Zone failure, Amazon RDS will automatically failover to the standby (which typically takes about three minutes) and the database transactions can be resumed as soon as the standby is promoted
- Amazon RDS anti-patterns:-
- Index and query-focused data—Many cloud-based solutions don’t require advanced features found in a relational database, such as joins and complex transactions. If your application is more oriented toward indexing and querying data, you may find Amazon DynamoDB to be more appropriate for your needs.
- Numerous BLOBs—While all of the database engines provided by Amazon RDS support binary large objects (BLOBs), if your application makes heavy use of them (audio files, videos, images, and so on), you may find Amazon S3 to be a better choice.
- Automated scalability—As stated previously, Amazon RDS provides pushbutton scaling. If you need fully automated scaling, Amazon DynamoDB may be a better choice.
- Other database platforms—At this time, Amazon RDS provides MySQL, MariaDB (a MySQL fork), PostgreSQL, Oracle, and SQL Server databases. If you need another database platform (such as IBM DB2, Informix or Sybase) you need to deploy a self-managed database on an Amazon EC2 instance by using a relational database AMI, or by installing database software on an Amazon EC2 instance.
- Complete control—If your application requires complete, OS-level control of the database server, with full root or admin login privileges (for example, to install additional third-party software on the same server), a self managed database on Amazon EC2 may be a better match.
- BYOL on RDS is limited to certain engines (for example, Oracle). If you need to use a BYOL database that RDS does not support, you will need to provision an EC2 instance and install your database on this.
- Amazon DynamoDB is a fast, fully-managed NoSQL database service that makes it simple and cost-effective to store and retrieve any amount of data, and serve any level of request traffic. Amazon DynamoDB helps offload the administrative burden of operating and scaling a highly-available distributed database cluster
- Amazon DynamoDB stores structured data in tables, indexed by primary key, and allows low-latency read and write access to items ranging from 1 byte up to 400 KB. Amazon DynamoDB’s scalar data types include number, string, and binary, and each of these can also be stored as a multi-valued set
- The primary key uniquely identifies each item in a table. The primary key can be simple (partition key) or composite (partition key and sort key).
- When it stores data, DynamoDB divides a table’s items into multiple partitions, and distributes the data primarily based upon the partition key value. The provisioned throughput associated with a table is also divided evenly among the partitions, with no sharing of provisioned throughput across partitions.
- Amazon DynamoDB is integrated with other services, such as Amazon Elastic MapReduce (Amazon EMR), Amazon Redshift, Amazon Data Pipeline, and Amazon S3, for analytics, data warehouse, data import/export, backup, and archive.
- DynamoDB supports query and scan operations for searching tables. Query uses indexes that already exist, so it is quicker and lighter on resources. Scan ignores indexes and reads every item in the table, which makes it very slow and resource intensive
- Secondary indexes can be used to create additional indexes for query if the standard primary key search is not appropriate. Global secondary indexes and local secondary indexes are supported
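The difference between query and scan, and the even split of provisioned throughput across partitions, can be illustrated with a toy in-memory model. This is not the DynamoDB API: `ToyTable` and `throughput_per_partition` are hypothetical names, and the model only counts how many items each operation has to examine.

```python
from collections import defaultdict


class ToyTable:
    """Toy table that groups items by partition key (not the real service)."""

    def __init__(self):
        self._by_partition = defaultdict(list)

    def put(self, partition_key, item):
        self._by_partition[partition_key].append(item)

    def query(self, partition_key):
        """Use the key index: only items under that key are examined."""
        items = list(self._by_partition.get(partition_key, []))
        return items, len(items)

    def scan(self, predicate):
        """Ignore the index: every item in the table is examined."""
        examined = 0
        matches = []
        for items in self._by_partition.values():
            for item in items:
                examined += 1
                if predicate(item):
                    matches.append(item)
        return matches, examined


def throughput_per_partition(provisioned_units: float, partitions: int) -> float:
    """Provisioned throughput is divided evenly, with no sharing between partitions."""
    return provisioned_units / partitions


if __name__ == "__main__":
    table = ToyTable()
    for user in ("alice", "bob", "carol"):
        for n in range(4):
            table.put(user, {"user": user, "n": n})
    print(table.query("bob")[1], table.scan(lambda i: i["user"] == "bob")[1])
    print(throughput_per_partition(1000, 4))
```

Both calls return the same items for "bob", but the scan examines the whole table to find them. The even throughput split is also why a hot partition key can be throttled while the table as a whole is under its provisioned limit.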
- DynamoDB Usage Patterns:-
- Amazon DynamoDB is ideal for existing or new applications that need a flexible NoSQL database with low read and write latencies, and the ability to scale storage and throughput up or down as needed without code changes or downtime.
- Common use cases include: mobile apps, gaming, digital ad serving, live voting and audience interaction for live events, sensor networks, log ingestion, access control for web-based content, metadata storage for Amazon S3 objects, e-commerce shopping carts, and web session management. Many of these use cases require a highly available and scalable database because downtime or performance degradation has an immediate negative impact on an organization’s business.
- Amazon DynamoDB has the following anti-patterns:
- Prewritten application tied to a traditional relational database—If you are attempting to port an existing application to the AWS cloud, and need to continue using a relational database, you may elect to use either Amazon RDS (MySQL, Oracle, or SQL Server), or one of the many preconfigured Amazon EC2 database AMIs. You are also free to create your own Amazon EC2 instance, and install your own choice of database software.
- Joins and/or complex transactions—While many solutions are able to leverage Amazon DynamoDB to support their users, it’s possible that your application may require joins, complex transactions, and other relational infrastructure provided by traditional database platforms. If this is the case, you may want to explore Amazon RDS or Amazon EC2 with a self-managed database.
- BLOB data—If you plan on storing large (greater than 400 KB) BLOB data, such as digital video, images, or music, you’ll want to consider Amazon S3. However, Amazon DynamoDB still has a role to play in this scenario, for keeping track of metadata (e.g., item name, size, date created, owner, location, and so on) about your binary objects.
- Large data with low I/O rate—Amazon DynamoDB uses SSD drives and is optimized for workloads with a high I/O rate per GB stored. If you plan to store very large amounts of data that are infrequently accessed, other storage options, such as Amazon S3, may be a better choice
- ElastiCache is a web service that makes it easy to deploy, operate, and scale a distributed, in-memory cache in the cloud.
- ElastiCache improves the performance of web applications by allowing you to retrieve information from a fast, managed, in-memory caching system, instead of relying entirely on slower disk-based databases.
- ElastiCache supports two popular open-source caching engines: Memcached and Redis (master/slave, cross AZ redundancy)
- ElastiCache usage patterns:-
- ElastiCache improves application performance by storing critical pieces of data in memory for low-latency access.
- It is frequently used as a database front end in read-heavy applications, improving performance and reducing the load on the database by caching the results of I/O-intensive queries.
- It is also frequently used to manage web session data, to cache dynamically-generated web pages, and to cache results of computationally-intensive calculations, such as the output of recommendation engines.
- For applications that need more complex data structures than strings, such as lists, sets, hashes, and sorted sets, the Redis engine is often used as an in-memory NoSQL database.
- Sorted sets make Redis a good choice for gaming applications such as leaderboards, top tens, most popular etc
- Redis also supports pub/sub messaging, which suits real-time chat and similar applications
- Amazon ElastiCache has the following anti-patterns:
- Persistent data—If you need very fast access to data, but also need strong data durability (persistence), Amazon DynamoDB is probably a better choice
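The "database front end" usage pattern above is commonly implemented as a cache-aside read path. The sketch below uses a plain dictionary as a stand-in for a Memcached or Redis client, and `load_from_db` is a hypothetical callable representing the slow database query; it illustrates the pattern rather than any ElastiCache API.

```python
class CacheAside:
    """Cache-aside reads: check the cache first, fall back to the database."""

    def __init__(self, load_from_db):
        self._load = load_from_db   # hypothetical slow database lookup
        self._cache = {}            # stand-in for Memcached/Redis
        self.db_reads = 0

    def get(self, key):
        if key in self._cache:
            return self._cache[key]          # cache hit: no database work
        self.db_reads += 1
        value = self._load(key)              # cache miss: query the database
        self._cache[key] = value             # populate the cache for next time
        return value

    def invalidate(self, key):
        """Drop a cached entry, for example after the database row changes."""
        self._cache.pop(key, None)


if __name__ == "__main__":
    store = CacheAside(lambda key: f"row-for-{key}")
    store.get("user:1")
    store.get("user:1")
    print(store.db_reads)
```

Repeated reads of the same key reach the database only once, which is the load reduction the usage pattern describes. Because the cache is not durable, the database remains the source of truth.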
- Amazon Redshift is a fast, fully-managed, petabyte-scale data warehouse service that makes it simple and cost-effective to efficiently analyze all your data using your existing business intelligence tools. It is optimized for datasets that range from a few hundred gigabytes to a petabyte or more.
- Amazon Redshift usage patterns:-
- Amazon Redshift is ideal for analyzing large datasets using your existing business intelligence tools.
- Analyze global sales data for multiple products
- Store historical stock trade data
- Analyze ad impressions and clicks
- Aggregate gaming data
- Analyze social trends
- Measure clinical quality, operation efficiency, and financial performance in the healthcare space
- Amazon Redshift has the following anti-patterns:
- OLTP workloads—Amazon Redshift is a column-oriented database suited to data warehouse and analytics, where queries are typically performed over very large datasets. If your application involves online transaction processing, a traditional row-based database system, such as Amazon RDS, is a better match.
- BLOB data—If you plan on storing binary data (e.g., video, pictures, or music), you’ll want to consider Amazon S3.
- Redshift can be a single node or multi node cluster
- Provision a Redshift node type to select the size of storage included
- Redshift is a single AZ solution
- Redshift clusters have a leader node which co-ordinates queries and spreads them across worker nodes in the cluster. Data is written as a stripe across all nodes to their local storage
- To scale a cluster, add nodes; each node adds compute and its local storage to the cluster. Redshift manages adding the nodes and spreading queries across them
- Redshift is like a massive SQL server
- Encryption can be enabled on Redshift but must be done when the cluster is first spun up
- When a cluster is resized, the cluster is restarted in read only mode, all connections are terminated and the old cluster is used as a data source to re-populate the new cluster. The new cluster is read only until the replication is complete. The end point is then updated and the old cluster terminates connections
- Redshift may be purchased on demand or use reserved instances but spot instances are not possible. Clusters can be scaled up, down, in or out
- When you shut down a cluster, you can take a final manual snapshot from which you can recover later. If you delete a cluster, all automatic snapshots are deleted
- Snapshots can be manually or automatically copied from one region to another, but data charges are incurred
- Redshift snapshot includes cluster size, instance types, cluster data and cluster configuration
- Amazon EC2, together with Amazon EBS volumes, provides an ideal platform for you to operate your own self-managed relational database in the cloud. Many leading database solutions are available as pre built, ready-to-use Amazon EC2 AMIs, including IBM DB2 and Informix, Oracle Database, MySQL, Microsoft SQL Server, PostgreSQL, Sybase, EnterpriseDB, and Vertica.
- Databases on EC2 usage patterns:-
- Running a relational database on Amazon EC2 and Amazon EBS is the ideal scenario for users whose application requires a specific traditional relational database not supported by Amazon RDS, or for those users who require a maximum level of administrative control and configurability.
- Self-managed relational databases on Amazon EC2 have the following anti-patterns:
- Index and query-focused data—Many cloud-based solutions don’t require advanced features found in a relational database, such as joins or complex transactions. If your application is more oriented toward indexing and querying data, you may find Amazon DynamoDB to be more appropriate for your needs, and significantly easier to manage.
- Numerous BLOBs—Many relational databases support BLOBs (audio files, videos, images, and so on). If your application makes heavy use of them, you may find Amazon S3 to be a better choice. You can use a database to manage the metadata.
- Automatic scaling—Users of relational databases on AWS can, in many cases, leverage the scalability and elasticity of the underlying AWS platform, but this requires system administrators or DBAs to perform a manual or scripted task. If you need pushbutton scaling or fully-automated scaling, you may opt for another storage choice such as Amazon DynamoDB or Amazon RDS.
- If you are running a self-managed MySQL, Oracle, or SQL Server database on Amazon EC2, you should consider the automated backup, patching, Provisioned IOPS, replication, and pushbutton scaling features offered by a fully-managed Amazon RDS database
5.3 Demonstrate ability to implement the most appropriate data storage architecture
- S3 for highly available, multi-AZ resilience, BLOB storage, versioning, bucket policies, lifecycling, encryption in transit and at rest
- CloudFront for lots of read requests, caches to geographical edge location
- Glacier for long term storage, infrequent access
- Amazon EBS: Persistent block storage volumes for Amazon EC2 virtual machines, encryption of volumes, snapshotting
- Amazon EC2 Instance Storage: Temporary block storage volumes for Amazon EC2 virtual machines
- AWS Import/Export: Large volume data transfer
- AWS Storage Gateway: Integrates on-premises IT environments with cloud storage
- RDS is most appropriate for applications pre-built to use SQL Server, Oracle, MySQL etc. and for structured data
- If asked about ACID capabilities, use an RDS based solution
5.4 Determine use of synchronous versus asynchronous replication
- RDS multi-AZ uses synchronous replication whereas read replicas use asynchronous replication
- Read the storage white paper – http://media.amazonwebservices.com/AWS_Storage_Options.pdf
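The distinction in 5.4 can be illustrated with a toy model: a synchronous standby is updated before a write is acknowledged, while an asynchronous replica receives changes some time later and can therefore lag. This is a simplified sketch with hypothetical names (`ToyPrimary`, `ship_to_replica`), not how RDS is implemented internally.

```python
from collections import deque


class ToyPrimary:
    """Toy primary with one synchronous standby and one asynchronous replica."""

    def __init__(self):
        self.data = {}
        self.standby = {}          # synchronous copy (like a Multi-AZ standby)
        self.replica = {}          # asynchronous copy (like a read replica)
        self._pending = deque()    # changes not yet applied to the replica

    def write(self, key, value):
        self.data[key] = value
        self.standby[key] = value              # applied before acknowledging
        self._pending.append((key, value))     # shipped to the replica later
        return "ack"

    def ship_to_replica(self, max_changes=None):
        """Apply queued changes to the replica and return how many were applied."""
        applied = 0
        while self._pending and (max_changes is None or applied < max_changes):
            key, value = self._pending.popleft()
            self.replica[key] = value
            applied += 1
        return applied

    def replica_lag(self):
        """Number of acknowledged writes the replica has not seen yet."""
        return len(self._pending)


if __name__ == "__main__":
    db = ToyPrimary()
    db.write("order:1", "paid")
    print(db.standby.get("order:1"), db.replica.get("order:1"), db.replica_lag())
    db.ship_to_replica()
    print(db.replica.get("order:1"), db.replica_lag())
```

In this model a failover to the standby loses no acknowledged writes, whereas a read from the replica may return stale data until the queued changes are applied.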