Storage Tier Best Practices
Benefits
There are many benefits to being able to choose your own storage:
- Complete control
- Improved Security
- Improved Performance
- Better Value
For example, your use case may call for reading 100 TB of data in a few minutes, or for cheaply storing petabytes for 7+ years, or maybe both!
Costs
There is a very wide gap in cost between different storage tiers.
As a basic rule of thumb:
- Set lifecycle rules. For example, move data to cold storage after 90 days and to archive after 360 days. This gives you the best of both worlds: hot storage for day-to-day work, and inexpensive storage for rarely used data.
- Be aware of the storage type at setup. It is often hard to move data after the fact.
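To make the lifecycle rule concrete, here is a minimal sketch of the age-based decision it describes. The function name `target_tier` and the tier labels are illustrative assumptions, not any provider's API. In practice you would normally express the same thresholds in your storage provider's lifecycle policy configuration rather than run code like this yourself.

```python
from datetime import date

# Thresholds taken from the example rule above; tune them to your own access patterns.
COLD_AFTER_DAYS = 90
ARCHIVE_AFTER_DAYS = 360


def target_tier(last_modified: date, today: date) -> str:
    """Return the tier an object should live in, based on its age in days.

    Hypothetical helper for illustration only.
    """
    age_days = (today - last_modified).days
    if age_days >= ARCHIVE_AFTER_DAYS:
        return "archive"
    if age_days >= COLD_AFTER_DAYS:
        return "cold"
    return "hot"


if __name__ == "__main__":
    today = date(2024, 1, 1)
    for modified in (date(2023, 12, 1), date(2023, 9, 1), date(2022, 6, 1)):
        print(modified, "->", target_tier(modified, today))
```

Checking the archive threshold first keeps the rule unambiguous for data that is older than both cutoffs.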
Cost Example
This is a snapshot at a moment in time; these prices will change. The intent is only to show why you need to be aware of the storage type you choose. Of course, it is not realistic for 100% of your storage to be in archive, and you may have rules that delete data entirely after some period, but it is good to be aware of the gap.
For example, 100 TB in reserved archive storage costs only $84/month.
In standard storage it is $2,600/month, and in the ultra premium "P" series it is $11,250/month.
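The monthly figures above follow directly from the per-GB prices. The sketch below reproduces that arithmetic, assuming decimal units (1 TB = 1,000 GB), which is the convention that yields exactly $84 for 100 TB, and assuming that "standard" refers to the $.02600 rate. The helper name `monthly_cost` is hypothetical.

```python
GB_PER_TB = 1000  # assumption: decimal units, which matches the $84 figure


def monthly_cost(terabytes: float, price_per_gb_month: float) -> float:
    """At-rest cost per month in dollars, rounded to cents."""
    return round(terabytes * GB_PER_TB * price_per_gb_month, 2)


# Per-GB monthly prices from this page (a snapshot in time; they will change).
prices = {
    "Azure Archive Reserved": 0.00084,
    "GCP Standard Multi": 0.02600,
    'Azure "P" series (SSD)': 0.11250,
}

if __name__ == "__main__":
    for name, price in prices.items():
        print(f"{name}: ${monthly_cost(100, price):,.2f}/month")
```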
Product | Per GB Per Month | Percent More vs Lowest |
---|---|---|
S3 Glacier Deep Archive | $.00099 | 18% |
Azure Archive | $.00099 | 18% |
Azure RAGZRS HOT | $.06100 | 7,162% |
GCP Standard Multi | $.02600 | 2,995% |
Azure Archive Reserved | $.00084 | Lowest |
Azure "P" series (SSD) | $.11250 | 13,293% |
MinIO Enterprise With Storage Cost Estimate | $.02500 | 2,876% |
MinIO Enterprise with "buy for peak" Tax | $.04500 | 5,257% |
(At-rest costs only; excludes network, operations, and other costs.)
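The "Percent More vs Lowest" column can be recomputed whenever prices change. This is a small sketch of one way to do it, using a few of the prices listed above. The function name `percent_more_than_lowest` is an assumption made for illustration.

```python
def percent_more_than_lowest(prices: dict[str, float]) -> dict[str, int]:
    """Map each product to how much more it costs than the cheapest one, in percent."""
    lowest = min(prices.values())
    return {
        name: round((price - lowest) / lowest * 100)
        for name, price in prices.items()
    }


# Per-GB monthly prices taken from the table (a snapshot in time).
prices = {
    "Azure Archive Reserved": 0.00084,
    "Azure Archive": 0.00099,
    "GCP Standard Multi": 0.02600,
    "Azure RAGZRS HOT": 0.06100,
    'Azure "P" series (SSD)': 0.11250,
}

if __name__ == "__main__":
    for name, pct in percent_more_than_lowest(prices).items():
        print(f"{name}: {pct:,}%")
```

The cheapest product comes out as 0%, which corresponds to the "Lowest" entry in the table.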
Security
At a minimum, keys should be rotated every 90 days, or more often if your security posture requires it.
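A rotation policy is easier to keep when it is checked automatically. The sketch below flags keys that have reached the 90-day limit. The key names and the `keys_to_rotate` helper are hypothetical, and how you list key creation dates depends on your storage provider, so that part is left out.

```python
from datetime import datetime, timedelta, timezone

# 90 days is the minimum stated above; use a shorter window if your policy requires it.
MAX_KEY_AGE = timedelta(days=90)


def needs_rotation(created_at: datetime, now: datetime,
                   max_age: timedelta = MAX_KEY_AGE) -> bool:
    """True once a key has reached or passed the maximum allowed age."""
    return now - created_at >= max_age


def keys_to_rotate(keys: dict[str, datetime], now: datetime) -> list[str]:
    """Return the names of all keys that are due for rotation, sorted by name."""
    return sorted(name for name, created in keys.items()
                  if needs_rotation(created, now))


if __name__ == "__main__":
    now = datetime(2024, 6, 1, tzinfo=timezone.utc)
    keys = {  # hypothetical key names and creation dates
        "backup-writer": datetime(2024, 1, 15, tzinfo=timezone.utc),
        "analytics-reader": datetime(2024, 5, 1, tzinfo=timezone.utc),
    }
    print(keys_to_rotate(keys, now))
```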
Performance
In general we have found that the baseline performance tiers (e.g. GCP Standard, Azure LRS) are normally good enough for average use.
Performance ranked from best to worst:
- MinIO Enterprise, on Local Area Network ~ 1-3 ms response time
- Performance series on public cloud ~ 50-200 ms depending on network
- Archive series on public cloud ~ 2-12 hours
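One way to use the ranking above is to filter out tiers that cannot meet a given response-time requirement. The sketch below takes the upper end of each listed range as a worst case, which is a simplifying assumption. The function name `tiers_meeting` is hypothetical, and real latency depends on your network and workload.

```python
# Worst-case response times in seconds, taken from the upper end of each range above.
WORST_CASE_LATENCY_S = {
    "MinIO Enterprise on LAN": 0.003,              # ~1-3 ms
    "Performance series on public cloud": 0.2,     # ~50-200 ms, network dependent
    "Archive series on public cloud": 12 * 3600,   # ~2-12 hours
}


def tiers_meeting(max_latency_s: float) -> list[str]:
    """Return the tiers whose worst-case latency fits the requirement, best first."""
    return [name for name, latency in WORST_CASE_LATENCY_S.items()
            if latency <= max_latency_s]


if __name__ == "__main__":
    print(tiers_meeting(0.01))       # interactive, low-latency reads
    print(tiers_meeting(1))          # typical online access
    print(tiers_meeting(24 * 3600))  # next-day retrieval is acceptable
```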
Disclaimer
There are entire books, applications, and experts that cover the myriad trade-offs in depth. This is only meant to surface a few of the most relevant ones. You must do your own due diligence on storage configuration, including backup, retention, etc.