S3 Service Architecture
Amazon Simple Storage Service (S3) is where you, your applications and AWS services keep their data, providing an inexpensive and reliable storage solution. Typical uses include:
- Maintaining backup archives, log files and images
- Running analytics on big data at rest
- Hosting static websites
S3 provides object storage, unlike the volumes attached to EC2 instances, which are block storage.
S3 Service Architecture
S3 files are organized into buckets. By default, an AWS account is limited to 100 buckets. A bucket and its contents exist within a single AWS region; bucket names, however, must be globally unique.
Prefixes and Delimiters
S3 stores objects within a bucket in a flat namespace; however, you can use prefixes and delimiters to give a bucket the appearance of a folder structure.
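To make the prefix and delimiter idea concrete, here is a minimal local sketch that mimics how a flat list of keys can be presented as folders. It does not call S3; the function name `list_level` and the sample keys are made up for illustration.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Return (objects, common_prefixes) found directly under `prefix`."""
    objects, common_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the first delimiter is shown as a "folder".
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


# Hypothetical keys: the bucket itself holds them all at one flat level.
keys = [
    "readme.txt",
    "logs/2023/jan.log",
    "logs/2023/feb.log",
    "logs/summary.txt",
    "images/cat.png",
]

top_objects, top_folders = list_level(keys)
log_objects, log_folders = list_level(keys, prefix="logs/")
```

Listing with no prefix shows `readme.txt` plus two apparent folders, while listing under `logs/` shows `logs/summary.txt` and the apparent subfolder `logs/2023/`.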
Working with Large Objects
Individual objects may be no larger than 5TB. A single upload can be no larger than 5GB, and it is recommended to use a feature called Multi-part Upload for any object larger than 100MB.
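The size limits above can be sketched as a small planning helper. This is only an illustration: the function name `plan_upload` and the default part size are hypothetical, the units are assumed to be binary (MiB, GiB, TiB), and the 10,000-part cap comes from the S3 documentation rather than from these notes.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

MAX_OBJECT_SIZE = 5 * TIB          # largest object S3 will store
MAX_SINGLE_UPLOAD = 5 * GIB        # largest single (non-multipart) upload
MULTIPART_THRESHOLD = 100 * MIB    # size above which multipart is recommended
MAX_PARTS = 10_000                 # assumption: part-count cap from the S3 docs


def plan_upload(size_bytes, part_size=100 * MIB):
    """Return ("single", 1) or ("multipart", number_of_parts)."""
    if size_bytes > MAX_OBJECT_SIZE:
        raise ValueError("object exceeds the 5TB limit")
    if size_bytes <= MULTIPART_THRESHOLD:
        return ("single", 1)
    # Grow the part size if needed so the part count stays within the cap.
    part_size = max(part_size, -(-size_bytes // MAX_PARTS))
    parts = -(-size_bytes // part_size)  # integer ceiling division
    return ("multipart", parts)


small = plan_upload(50 * MIB)
medium = plan_upload(1 * GIB)
largest = plan_upload(5 * TIB)
```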
Encryption
Unless it's intended to be publicly available, data stored on S3 should always be encrypted. Use encryption keys to protect data while at rest within S3. Data at rest can be protected using either server-side or client-side encryption.
- Server-side encryption with AWS S3-Managed Keys (SSE-S3)
- Server-side encryption with AWS KMS-Managed Keys (SSE-KMS)
- Server-side encryption with Customer-Provided Keys (SSE-C)
It's also possible to encrypt data on the client side before it's transferred to S3, using an AWS KMS-Managed Customer Master Key (CMK).
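As a rough sketch of how the three server-side options listed above differ in practice, the helper below builds the extra request parameters that each one needs. The parameter names follow those accepted by boto3's `put_object` call; the helper `sse_put_args` itself and the example key alias are hypothetical, and nothing here contacts AWS.

```python
def sse_put_args(mode, kms_key_id=None, customer_key=None):
    """Build the encryption-related upload parameters for one SSE mode."""
    if mode == "SSE-S3":
        # S3 manages the keys itself.
        return {"ServerSideEncryption": "AES256"}
    if mode == "SSE-KMS":
        args = {"ServerSideEncryption": "aws:kms"}
        if kms_key_id:
            # Optional: name a specific KMS key instead of the default one.
            args["SSEKMSKeyId"] = kms_key_id
        return args
    if mode == "SSE-C":
        if customer_key is None:
            raise ValueError("SSE-C requires a customer-provided key")
        # The key is supplied with each request and is not stored by S3.
        return {"SSECustomerAlgorithm": "AES256", "SSECustomerKey": customer_key}
    raise ValueError(f"unknown mode: {mode}")


s3_managed = sse_put_args("SSE-S3")
kms_managed = sse_put_args("SSE-KMS", kms_key_id="alias/example-key")
customer = sse_put_args("SSE-C", customer_key=b"0" * 32)
```

With boto3 installed and credentials configured, a dictionary like these could be unpacked into an upload call alongside the bucket, key and body.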
Logging
Tracking S3 events to log files is disabled by default, since S3 buckets can see a lot of activity. When enabled, the logs include:
- Account and IP address of requesters
- Source bucket name
- Action requested (GET, PUT etc.)
- Time of request
- Response status
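The fields listed above can be pulled out of a log record with a short parser. This is a sketch that assumes the space-delimited server access log layout (bucket owner, bucket, time, remote IP, requester, request ID, operation, key, request line, status); the sample record is made up for illustration.

```python
import re

# Leading fields of a log record, in the assumed order.
LOG_PATTERN = re.compile(
    r'(?P<bucket_owner>\S+) (?P<bucket>\S+) \[(?P<time>[^\]]+)\] '
    r'(?P<remote_ip>\S+) (?P<requester>\S+) (?P<request_id>\S+) '
    r'(?P<operation>\S+) (?P<key>\S+) "(?P<request_uri>[^"]*)" '
    r'(?P<status>\d{3}|-)'
)


def parse_record(line):
    """Return the leading fields of one log record as a dict."""
    match = LOG_PATTERN.match(line)
    if match is None:
        raise ValueError("unrecognized log record")
    return match.groupdict()


# Made-up record; the trailing fields are ignored by the pattern.
SAMPLE = (
    'EXAMPLEOWNERID example-bucket [06/Feb/2019:00:00:38 +0000] 192.0.2.3 '
    'arn:aws:iam::123456789012:user/example EXAMPLEREQUESTID REST.GET.OBJECT '
    'photos/cat.png "GET /example-bucket/photos/cat.png HTTP/1.1" '
    '200 - 1024 1024 70 10 "-" "curl/8.0" -'
)

record = parse_record(SAMPLE)
```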
S3 Durability and Availability
S3 offers different storage classes for objects. Which one you choose depends on how critical the data is, how quickly you need access to it, and how much you are willing to pay.
S3 measures durability as a percentage. Most S3 classes and Glacier come with a 99.999999999 percent durability guarantee. These high durability rates are largely because data is replicated across at least three availability zones.
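To get a feel for what eleven nines means, the sketch below treats the durability figure as the yearly chance that any single object survives (an assumption about how to read the percentage) and works out the expected losses for a hypothetical collection of ten million objects.

```python
from decimal import Decimal

DURABILITY = Decimal("0.99999999999")  # 99.999999999 percent


def expected_annual_losses(object_count):
    """Expected number of objects lost per year at the stated durability."""
    return Decimal(object_count) * (Decimal(1) - DURABILITY)


losses = expected_annual_losses(10_000_000)  # 0.0001 objects per year
years_per_loss = 1 / losses                  # about one object every 10,000 years
```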
Amazon S3 One Zone-Infrequent Access (S3 One Zone-IA) and Reduced Redundancy (RRS) are not quite so resilient.
Object availability is also measured as a percentage. The S3 Standard class guarantees that data will be ready whenever you need it for 99.99% of the year. There is almost no chance your data will be lost, even if you might sometimes not have instant access to it.
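An availability percentage translates directly into a yearly downtime budget. The sketch below does that arithmetic for a 365-day year; the function name is made up for illustration.

```python
from decimal import Decimal

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def max_downtime_minutes(availability_percent):
    """Minutes per year during which data may be unavailable."""
    unavailable = (Decimal(100) - Decimal(availability_percent)) / Decimal(100)
    return unavailable * MINUTES_PER_YEAR


standard_downtime = max_downtime_minutes("99.99")  # 52.56 minutes per year
```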
Eventually Consistent Data
S3 replicates data across multiple locations, so there might be brief delays (typically two seconds or less) while updates propagate across the system.
S3 Object Life-cycle
It's often important to maintain previous versions of archives, and to retire or delete them over time to keep a lid on storage costs.
If versioning is enabled at the bucket level, older overwritten copies of objects will be saved and remain accessible indefinitely.
To avoid historical file bloat, you can configure life-cycle rules for a bucket that will automatically transition an object's storage class or delete it after a set number of days.
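A life-cycle rule of this kind might look like the following sketch, which builds a configuration in the shape boto3's `put_bucket_lifecycle_configuration` call accepts. The rule ID, prefix and day counts are hypothetical, and the code only constructs the dictionary without contacting AWS.

```python
def lifecycle_rule(rule_id, prefix, transitions, expire_days, old_version_days=None):
    """Build one life-cycle rule for objects whose keys start with `prefix`."""
    rule = {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        # Each transition moves objects to a cheaper class after N days.
        "Transitions": [
            {"Days": days, "StorageClass": storage_class}
            for days, storage_class in transitions
        ],
        # Current versions are deleted after this many days.
        "Expiration": {"Days": expire_days},
    }
    if old_version_days is not None:
        # With versioning enabled, also clean up overwritten copies.
        rule["NoncurrentVersionExpiration"] = {"NoncurrentDays": old_version_days}
    return rule


config = {
    "Rules": [
        lifecycle_rule(
            "archive-logs",
            "logs/",
            transitions=[(30, "STANDARD_IA"), (90, "GLACIER")],
            expire_days=365,
            old_version_days=60,
        )
    ]
}
```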
Accessing S3 Objects
You'll naturally need to access S3-hosted objects, and also to restrict who else can access them.
By default, S3 buckets and objects are accessible only from your own account. Access can be opened up using access control list (ACL) rules, finer-grained S3 bucket policies or Identity and Access Management (IAM) policies.
Amazon recommends applying S3 bucket policies or IAM policies instead of ACLs.
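For reference, a bucket policy is a JSON document attached to the bucket. The sketch below builds a read-only policy for a single principal; the bucket name, account number and user are placeholders, and the helper `read_only_policy` is made up for illustration.

```python
import json


def read_only_policy(bucket, principal_arn):
    """Build a bucket policy letting one principal read every object."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadOnly",
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "s3:GetObject",
                # The trailing /* applies the rule to objects, not the bucket.
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


policy = read_only_policy(
    "example-bucket", "arn:aws:iam::123456789012:user/example"
)
policy_json = json.dumps(policy, indent=2)
```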
A pre-signed URL provides temporary access to an otherwise private object, specifying a period of time after which the URL becomes invalid.
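The idea behind an expiring signed URL can be shown with a toy example. This is not S3's actual signing scheme (S3 uses AWS Signature Version 4); it is a generic HMAC sketch with a made-up secret, host and query parameter names, meant only to show how a signature and an expiry time travel together in the URL.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"  # hypothetical signing key held by the server


def _signature(path, expires):
    message = f"{path}:{expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def sign_url(path, expires_in, now=None):
    """Return a URL for `path` that stops validating after `expires_in` seconds."""
    now = int(time.time()) if now is None else now
    expires = now + expires_in
    query = urlencode({"expires": expires, "signature": _signature(path, expires)})
    return f"https://example.invalid{path}?{query}"


def is_valid(url, now=None):
    """Check that the signature matches and the expiry has not passed."""
    now = int(time.time()) if now is None else now
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    expires = int(query["expires"][0])
    expected = _signature(parsed.path, expires)
    return hmac.compare_digest(expected, query["signature"][0]) and now <= expires


url = sign_url("/private/report.pdf", expires_in=3600, now=1000)
```

Because the expiry time is part of the signed message, a holder of the URL cannot extend its lifetime by editing the query string.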
Static Website Hosting
S3 buckets can be used to host HTML files for entire static websites.
S3 and Glacier Select
AWS provides a different way to access data stored in either S3 or Glacier. Select lets you apply SQL-like queries to stored objects, so that only the relevant data is retrieved.
Amazon Glacier
Glacier supports archives as large as 40TB. Its archives are encrypted by default and are given machine-generated IDs. Retrieving objects from an existing Glacier archive can take a number of hours. Glacier provides an inexpensive long-term storage solution for data that seldom needs accessing.
|Storage class||Amount stored||Rate per GB||Cost|
|One Zone-IA||65 GB||$0.01||$0.65|
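The pricing row above is simply the amount stored multiplied by a per-GB rate. A minimal sketch of that arithmetic, using `Decimal` to avoid floating-point rounding on currency (the function name is made up for illustration):

```python
from decimal import Decimal


def storage_cost(gigabytes, rate_per_gb):
    """Cost of storing `gigabytes` at `rate_per_gb` dollars per GB."""
    return Decimal(str(gigabytes)) * Decimal(rate_per_gb)


one_zone_ia_cost = storage_cost(65, "0.01")  # 65 GB at $0.01 per GB
```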
Other Storage-Related Services
- Amazon Elastic File System (EFS)
- AWS Storage Gateway
- AWS Snowball
Summary
S3 provides reliable and highly available object-level storage. Objects are stored in buckets in a flat namespace, but by using prefixes they can be made to appear as if they're structured like a normal file system.
It's recommended to encrypt data stored on S3.
There are multiple storage classes within S3 with varying degrees of data replication that enable you to balance durability, availability and cost.
Life-cycle management lets you automate the transition of your data between classes and, finally, its deletion.
You can control access using S3 bucket policies and/or IAM policies.
Costs can be reduced by leveraging the SQL-like Select feature.
Static HTML websites can be hosted directly on S3.
Amazon Glacier stores data archives in vaults. Retrieval might take hours, but storage is cheap.
Information on this page was obtained from source: AWS Certified Solutions Architect Second Edition ISBN 978-1-119-50421-4
Notes taken are kept brief and for personal reference. I urge and highly recommend anyone using this page as a source of information to purchase the source material for the complete information. The original book is fantastic and includes exercises, practice questions, verbose explanations and extra learning resources.