Storing and Serving Tens of Millions of Images
I have a collection of ~30 million images that I need for AI and statistical analysis for a thesis I'm working on. The images are quite small and currently take up about 1.5 TB in total.
Currently I use a Hetzner NVMe Ryzen server with 2x 1 TB NVMe in RAID 0 (I do have a daily external backup).
All images are named after their SHA-1 hash and placed into subfolders like "ab/cd/ef/gh/abcdefgh.jpg" to prevent too many files in one directory.
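For reference, the sharding scheme described above can be sketched in a few lines of Python. This is a minimal sketch under an assumption: the first four 2-character pairs of the hex digest become directory levels (matching the "ab/cd/ef/gh" example), and the full digest is the filename; the `shard_path` helper and the `root`/`ext` parameters are illustrative, not part of the original setup.

```python
import hashlib
from pathlib import Path

def shard_path(data: bytes, root: str = "images", ext: str = ".jpg") -> Path:
    """Build the sharded storage path for an image.

    Assumed layout: four 2-character directory levels taken from the
    SHA-1 hex digest prefix, full digest as the filename, e.g.
    images/ab/cd/ef/gh/abcdef...<40 hex chars>.jpg
    """
    digest = hashlib.sha1(data).hexdigest()            # 40 hex characters
    levels = [digest[i:i + 2] for i in range(0, 8, 2)]  # "ab", "cd", "ef", "gh"
    return Path(root, *levels, digest + ext)

# Deterministic: the same image bytes always map to the same path
print(shard_path(b"example image bytes"))
```

With 2 hex characters per level, each directory fans out into at most 256 subdirectories, so ~30 million files spread over four levels leaves only a handful of files per leaf directory.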
All metadata I generate is stored in a MySQL database on the same server.
I access the files both locally (some analysis is done with Python on the server itself) and externally: I've set up an nginx server to serve images over HTTPS for some applications.
I back up the server using incremental rsync.
So much for the background. Now to my question:
Since I'm running out of space, and having everything on RAID 0 is a little worrying, I want to offload all images to a separate server.
I'd like either something like RAID 10 for higher IOPS, or an SSD cache in front of RAID 1 or RAID 5. With just a single hard drive, reading many small images is simply too slow.
Any ideas on improving image storage?
Is there something that does SSD caching automatically?
What should I look out for? Do you know of any offers?
Requirements: storage 3 TB+, location EU, bandwidth ~5-10 TB at 1 Gbit/s peak. CPU and RAM are not that important.
Budget: < $150 (the cheaper the better).
This server would only be needed until October.
Thanks a lot.