Cloud storage offers amazing features like global accessibility, almost infinite storage capacity, high durability and integrated content distribution.
Cloud storage also presents challenges like managing the costs and tracking the state of the data.
Existing S3 Lifecycle Features
Amazon offers several options for helping manage the costs of object storage, including Reduced Redundancy Storage, Infrequent Access Storage and Amazon Glacier.
Reduced Redundancy Storage keeps fewer copies of an object. It costs less than regular S3 storage but offers lower durability, which increases the risk of data loss. The trade-off is cost vs. durability.
Infrequent Access Storage also costs less than regular S3 storage per GB stored but costs more to retrieve data. The trade-off is cost of storage vs. cost of access.
Glacier is not part of S3 but is used for long-term archival. Glacier costs less than regular S3 storage per GB stored but takes much longer when retrieving data. Much, much longer! The trade-off is cost of storage vs. speed of access.
The user must determine the level of durability required, how frequently the data needs to be accessed, and how quickly it needs to be made available, then select the right storage options based on those requirements.
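As a rough illustration of that decision, the three trade-offs can be written as a small helper. This is only a sketch: the function name, the thresholds and the returned labels are hypothetical and should be tuned against real pricing and access patterns.

```python
# Hypothetical thresholds, chosen only for illustration.
RARE_READS_PER_MONTH = 1.0      # at or below this, data counts as rarely read
GLACIER_MIN_WAIT_HOURS = 5.0    # retrieval delay the user must tolerate for archival


def choose_storage_class(needs_full_durability: bool,
                         reads_per_month: float,
                         max_wait_hours: float) -> str:
    """Map durability, access frequency and retrieval speed to a storage option."""
    rarely_read = reads_per_month <= RARE_READS_PER_MONTH

    # Cost of storage vs. speed of access: archive only if a long wait is acceptable.
    if rarely_read and max_wait_hours >= GLACIER_MIN_WAIT_HOURS:
        return "GLACIER"

    # Cost of storage vs. cost of access: cheaper to store, dearer to retrieve.
    if rarely_read:
        return "STANDARD_IA"

    # Cost vs. durability: only for data that can be recreated if lost.
    if not needs_full_durability:
        return "REDUCED_REDUNDANCY"

    return "STANDARD"


if __name__ == "__main__":
    print(choose_storage_class(True, reads_per_month=0.1, max_wait_hours=24))
```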
Fortunately, S3 Lifecycle Management features help automate both data management and tracking:
- Move data to Infrequent Access Storage
- Move data to Glacier
- Expire versioned objects
- Permanently delete objects
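The four actions above can be expressed as a single lifecycle rule. The sketch below builds the rule as a plain dictionary in the shape accepted by boto3's `put_bucket_lifecycle_configuration`. The function names, rule ID, prefix and day counts are assumptions made for illustration, not values from the blog post.

```python
def build_tiering_rule(prefix: str,
                       ia_days: int = 30,
                       glacier_days: int = 90,
                       noncurrent_days: int = 60,
                       delete_days: int = 365) -> dict:
    """Build one lifecycle rule covering the four actions listed above."""
    return {
        "ID": "tier-and-expire",          # hypothetical rule name
        "Filter": {"Prefix": prefix},     # only objects under this key prefix
        "Status": "Enabled",
        "Transitions": [
            # Move data to Infrequent Access Storage, then on to Glacier.
            {"Days": ia_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_days, "StorageClass": "GLACIER"},
        ],
        # Expire old versions of versioned objects.
        "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_days},
        # Expire current objects: a permanent delete on an unversioned bucket;
        # on a versioned bucket this adds a delete marker instead.
        "Expiration": {"Days": delete_days},
    }


def apply_lifecycle(bucket: str, rules: list) -> None:
    """Send the rules to S3. This replaces the bucket's existing lifecycle configuration."""
    import boto3  # third-party AWS SDK, imported lazily so the builder runs without it

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={"Rules": rules},
    )


if __name__ == "__main__":
    print(build_tiering_rule("logs/"))
```

Calling `apply_lifecycle("my-example-bucket", [build_tiering_rule("logs/")])` would apply the rule; the bucket name here is a placeholder, and the call needs AWS credentials with permission to change the bucket's lifecycle configuration.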
New S3 Lifecycle Features
The blog post covers two new additions to Lifecycle Management, Incomplete Multipart Uploads and Expired Object Delete Markers.
Deleting Incomplete Multipart Uploads reduces costs associated with storing data that failed to upload completely. Incomplete uploads are effectively unusable.
Deleting Expired Object Delete Markers reduces management overhead by automating cleanup processes.
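Both cleanup actions can be sketched as lifecycle rules. The dictionaries below follow the rule shape used by boto3's `put_bucket_lifecycle_configuration`. The function name, rule IDs and the seven-day default are assumptions chosen for illustration.

```python
def build_cleanup_rules(abort_after_days: int = 7) -> list:
    """Build two bucket-wide rules for the cleanup actions described above."""
    whole_bucket = {"Prefix": ""}  # an empty prefix matches every object

    return [
        {
            "ID": "abort-incomplete-multipart-uploads",  # hypothetical rule name
            "Filter": whole_bucket,
            "Status": "Enabled",
            # Abort uploads still unfinished this many days after they started
            # and remove the parts already stored for them.
            "AbortIncompleteMultipartUpload": {
                "DaysAfterInitiation": abort_after_days,
            },
        },
        {
            "ID": "remove-expired-delete-markers",  # hypothetical rule name
            "Filter": whole_bucket,
            "Status": "Enabled",
            # Remove delete markers that have no noncurrent versions behind them.
            # This flag cannot be combined with Days or Date in the same Expiration.
            "Expiration": {"ExpiredObjectDeleteMarker": True},
        },
    ]


if __name__ == "__main__":
    for rule in build_cleanup_rules():
        print(rule["ID"])
```

The returned list would be passed as the `Rules` value of the `LifecycleConfiguration` argument. Keeping the two actions in separate rules is a design choice here so that each one can be disabled on its own.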
The blog entry also discusses some S3 Best Practices which are worth reading. Here’s the summary:
- Use Versioning to prevent accidental data loss
- Use Cross-Region Replication to prevent disaster-related data loss
- For high volume usage, optimize for performance
- Manage costs using lifecycle rules to migrate data to lower storage tiers