Most applications need somewhere to store their files. On AWS, that place is S3, the Simple Storage Service, which is one of the longest-running services in AWS, available since 2006.
S3 provides us with secure, durable, highly scalable object storage. As the name suggests, it's easy to use and has a very simple web interface for storing and retrieving data.
As mentioned, S3 is a safe object-based file storage. Object-based means that files are stored as objects rather than the files and folders we're all used to from the file storage in our operating systems.
A single object stored in S3 can be as small as 0 bytes or as big as 5 TB, and the data is all spread across multiple devices and facilities.
There is no limit to how much data an S3 bucket can hold. Buckets are basically folders that hold our files.
One important thing to note is that S3 uses a universal namespace, which means that all bucket names must be globally unique. For example, you won't be able to name your bucket exanubes because it's already taken by me.
Buckets have to have a unique name because the name later resolves to a DNS URL, e.g. https://exanubes.s3.amazonaws.com.
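As a sketch of why uniqueness matters, here is how a bucket name turns into hostnames. The bucket, region and key below are just example values:

```python
# Sketch: how a globally unique bucket name maps onto S3's URL schemes.
# "exanubes", the region and the key are example values only.
bucket = "exanubes"
region = "eu-central-1"
key = "assets/logo.png"

# Virtual-hosted-style URL: the bucket name becomes a DNS subdomain,
# which is why it must be globally unique and DNS-compatible.
virtual_hosted = f"https://{bucket}.s3.{region}.amazonaws.com/{key}"

# Older path-style URL, where the bucket appears in the path instead.
path_style = f"https://s3.{region}.amazonaws.com/{bucket}/{key}"

print(virtual_hosted)  # https://exanubes.s3.eu-central-1.amazonaws.com/assets/logo.png
```

Because the bucket name is part of the hostname, two buckets with the same name would resolve to the same address, which is exactly what the universal namespace prevents.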
Suitable for storing files, not for operating systems or programmes
To reiterate, S3 is an object-based file storage, meaning every file we upload will be stored as an object. These objects consist of:
Key: this is just the name of the object.
Value: the data of the uploaded file. Whether it's a text file, a video or a picture, it will all be stored as a sequence of bytes.
Version ID: a unique identifier, used in case you choose to version your files.
Metadata: simply put, it's data about data. Some of it is system-defined and some can be defined by the user. Example metadata includes Content-Length, which describes the object's size in bytes, and Content-Type, which tells us what kind of file it is and can also be modified by the user. There's plenty more in the AWS documentation.
Subresources: there are various options for bucket configuration. A bucket can be configured for website hosting, CORS, logging or managing the lifecycles of the objects it holds. You can find many more subresources in the AWS documentation.
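The object components above can be sketched as a plain dictionary. This is only an illustration of an object's parts, not an actual AWS API shape, and every value in it is made up:

```python
# Illustrative model of an S3 object's parts, matching the list above.
# All values are fabricated examples, not real AWS data.
s3_object = {
    "key": "reports/2023/summary.pdf",      # the name of the object
    "value": b"%PDF-1.7 example bytes",     # the file data, as a sequence of bytes
    "version_id": "example-version-id",     # only meaningful when versioning is enabled
    "metadata": {
        "Content-Length": "21",                 # system-defined: size in bytes
        "Content-Type": "application/pdf",      # system-defined, user-modifiable
        "x-amz-meta-department": "finance",     # user-defined metadata uses the x-amz-meta- prefix
    },
}

print(sorted(s3_object))  # ['key', 'metadata', 'value', 'version_id']
```

Subresources are left out of the sketch because they belong to the bucket's configuration rather than to an individual object.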
Read-after-write consistency for new objects means that when writing a new file and then trying to read it right after, you will be able to access that data. Changes are instantaneous.
Eventual consistency for updates and deletes means, however, that when updating or deleting an existing file and then trying to read it right after, you might still get the older version of the file. For example, you might still be able to read a file even though you have just deleted it. Changes take time to propagate.
S3 offers different storage tiers depending on the user's needs; these are covered in more detail below.
Lifecycle management lets you decide which storage tier an object should be in over the course of its life cycle.
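As an illustration, a lifecycle rule could be sketched roughly in the shape that boto3's put_bucket_lifecycle_configuration accepts. The rule ID, prefix and day counts below are arbitrary examples:

```python
# A sketch of a lifecycle configuration; rule name, prefix and
# day counts are arbitrary example values.
lifecycle = {
    "Rules": [
        {
            "ID": "archive-old-logs",
            "Filter": {"Prefix": "logs/"},
            "Status": "Enabled",
            "Transitions": [
                # Move to Infrequent Access after 30 days...
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                # ...then archive to Glacier after 90 days.
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            # Finally, delete the objects after a year.
            "Expiration": {"Days": 365},
        }
    ]
}

transitions = lifecycle["Rules"][0]["Transitions"]
print([t["StorageClass"] for t in transitions])  # ['STANDARD_IA', 'GLACIER']
```

The idea is that objects drift towards cheaper tiers as they age, without anyone having to move them by hand.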
Versioning gives you version control of files. This way you'll know whether the changes to a file have propagated or whether you're still reading the old version, and it also enables you to restore a file to a previous version.
Encryption is self-explanatory: encrypt your files to avoid them getting into the wrong hands and leaking sensitive data.
Access Control Lists (ACLs) specify who can access data on an individual-file basis. When a file holds sensitive employee information, maybe only the HR department should be able to access it; ACLs allow that kind of granular control over file access.
Bucket policies work similarly to ACLs, however, they are bucket-wide. For example, we can deem the entire bucket private and inaccessible to the public, accessible only by internal staff.
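A bucket policy is just a JSON document attached to the bucket. Here is a rough sketch of one that grants read access only to a single AWS account; the account ID and bucket name are placeholders:

```python
import json

# Sketch of a bucket policy allowing reads only from one account.
# The account ID and bucket name are placeholder values.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowOnlyInternalStaff",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::123456789012:root"},
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-bucket/*",
        }
    ],
}

# The document would be attached to the bucket as serialised JSON.
print(json.dumps(policy, indent=2))
```

Because the statement applies to every object under the bucket's ARN, one policy covers the whole bucket, unlike per-object ACLs.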
Transfer Acceleration enables fast, easy and secure transfer of files over long distances between users and buckets. It utilises CloudFront's globally distributed network of edge locations: as data arrives at an edge location, it is routed to S3 over an optimised network path.
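To illustrate, a bucket with Transfer Acceleration enabled is addressed through a dedicated accelerate endpoint rather than the regional one. The bucket, region and key below are example values:

```python
# Example values only: a hypothetical bucket, region and object key.
bucket = "exanubes"
region = "eu-central-1"
key = "videos/intro.mp4"

# Standard regional endpoint vs. the Transfer Acceleration endpoint.
# With acceleration enabled, requests to the accelerate hostname enter
# the nearest edge location and ride AWS's network to the bucket.
standard = f"https://{bucket}.s3.{region}.amazonaws.com/{key}"
accelerated = f"https://{bucket}.s3-accelerate.amazonaws.com/{key}"

print(accelerated)  # https://exanubes.s3-accelerate.amazonaws.com/videos/intro.mp4
```

Note that the accelerate hostname carries no region: the edge network works out the best path to the bucket's home Region for you.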
Cross-region replication is very self-explanatory: our bucket won't only exist in the Region of our choosing but will also be replicated to other Regions, which brings significant benefits. As mentioned previously in this article, all S3 buckets share the same namespace and must be globally unique; however, it's not very prudent to expect our users from the US to download assets from an S3 bucket in Australia, or vice versa.
This model gives us full control over the location of our data without having to juggle multiple buckets. Sometimes there are regulatory reasons that require you to hold copies of your data far away from the original; you can definitely cater to that with S3 cross-region replication.
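As a sketch, a replication setup looks roughly like the following, shaped loosely after what boto3's put_bucket_replication accepts. The ARNs are placeholders, and in practice versioning must be enabled on both the source and destination buckets:

```python
# Sketch of a cross-region replication configuration. The role and
# bucket ARNs are placeholder values, and real configurations may
# require additional fields (e.g. rule priority).
replication = {
    # IAM role S3 assumes to copy objects on our behalf.
    "Role": "arn:aws:iam::123456789012:role/s3-replication-role",
    "Rules": [
        {
            "ID": "replicate-to-sydney",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # empty prefix = replicate the whole bucket
            "Destination": {
                # Replica bucket living in another Region.
                "Bucket": "arn:aws:s3:::example-bucket-replica-sydney",
            },
        }
    ],
}

print(replication["Rules"][0]["Status"])  # Enabled
```

New objects written to the source bucket are then copied asynchronously to the destination, which is how both the latency and the regulatory scenarios above are served from a single logical dataset.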
It's worth remembering that S3 is an object-based storage, meaning it's good for storing files but not operating systems or programmes. Objects are stored within buckets, and all buckets share a global namespace.
Each object has a key, value, version ID, metadata and subresources.
Data inside S3 buckets is immediately consistent for reads of newly written objects and eventually consistent for updates and deletes. This means we can read a file immediately after uploading it to S3, but when updating or deleting a file we could still get the older version back.
Some of the features of S3 include tiered storage, lifecycle management, versioning, encryption, securing data with ACL and securing buckets with policies.
S3 offers many storage options, starting from Standard and Infrequent Access, through archiving solutions using Glacier and Intelligent-Tiering for cost optimisation using machine learning, to on-premises storage with Outposts.
It's important to keep in mind what we pay for with S3: storage, requests, storage management and data transfer. Then we have additional options like Transfer Acceleration, which allows us to utilise Amazon's high-speed, low-latency network for data transfer. Last but not least, we can also pay for cross-region replication in order to create bucket replicas in the Regions most suitable to our needs.