Let's say I want to start an image hosting service.
Should I go for SSD or HDD option?
It don't matter
None of this matters
Use the cloud
explain
hdd if you need the space, popular images will be cached in ram, so will be served quickly
if the ssd is big enough, then it'd be better
The problem I have here is that I don't know how many people will use my service and I'm not sure how to predict it. Therefore I have no idea how much space I really need.
I'm considering buying a 20 GB SSD vs a 50 GB HDD, but I'm not sure if the first option is enough.
I can only estimate that the average user will send ~60 KB files (it's a very specific image hosting service) and might do it around 5-10 times.
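For what it's worth, those numbers are enough for a rough estimate. A quick sketch, assuming decimal units (1 KB = 1000 bytes, 1 GB = 10**9 bytes) and pretending the whole disk is free for images, which it won't be in practice. The function name is made up for illustration.

```python
# Back-of-the-envelope capacity estimate from the figures in the post above.
# Assumptions (not from the thread): decimal units, and the entire disk is
# available for images (the OS and web stack will take some of it).

AVG_FILE_BYTES = 60 * 1000  # ~60 KB per image


def users_supported(disk_gb: int, uploads_per_user: int) -> int:
    """How many users fit if each one uploads `uploads_per_user` images."""
    per_user_bytes = AVG_FILE_BYTES * uploads_per_user
    return (disk_gb * 10**9) // per_user_bytes


if __name__ == "__main__":
    for disk_gb in (20, 50):
        low = users_supported(disk_gb, 10)   # everyone uploads 10 images
        high = users_supported(disk_gb, 5)   # everyone uploads 5 images
        print(f"{disk_gb} GB: roughly {low:,} to {high:,} users")
```

Under those assumptions the 20 GB option holds somewhere around 33,000 to 66,000 users' worth of uploads, and the 50 GB option around 83,000 to 166,000.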
I have always wondered if image hosting sites hash their images to prevent hosting duplicate files and save space.
It might be a little more overhead but the disk space saved would probably be worth it.
i'd probably go for the hdd, consider an ssd later if it turns out to be too slow
commonly-accessed images will be cached by your OS in memory, so will be served without fetching from disk
unless you have a lot of visitors at any one time, it's better to have more space than worry about IO latency
you can always change it later on if need be
OP will host porn?
Use SSD as cheap "ramdisk" (store most popular images) and HDD as major backbone.
This solves the issue of space and speed.
even Sup Forums does this
there's many ways to do it, like-
- purely frontend, you have a list of images and hashes in a database, and only store unique files, pointing identical images to the same file
- filesystem-level, some filesystems like btrfs and zfs support deduplicating entire volumes, separate from userspace (if you have two identical files on disk, they only use the space of one copy)
- something as simple as a shell script that periodically hashes files and replaces copies with symlinks
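The last option (periodically hash files and swap copies for symlinks) could look something like this. It's written in Python rather than shell, and it's only a sketch: it assumes all images sit in one flat directory, uses SHA-256, and the function names are invented here.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hash a file in chunks so large files don't have to fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def dedupe_with_symlinks(directory: Path) -> int:
    """Replace duplicate regular files with symlinks to the first copy seen.

    Returns the number of files that were replaced.
    """
    seen = {}      # digest -> Path of the copy we keep
    replaced = 0
    for path in sorted(directory.iterdir()):
        # Skip earlier symlinks, subdirectories and anything that isn't a file.
        if path.is_symlink() or not path.is_file():
            continue
        digest = file_sha256(path)
        original = seen.get(digest)
        if original is None:
            seen[digest] = path
            continue
        path.unlink()
        # Relative link, so the directory can be moved as a whole.
        path.symlink_to(original.name)
        replaced += 1
    return replaced
```

You'd run `dedupe_with_symlinks(Path("/srv/images"))` from cron or similar (the path is just an example). Note there's a short window between `unlink` and `symlink_to` where the duplicate doesn't exist, so a real version would want to create the link under a temp name and rename it over the file.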
i imagine op is getting this choice from a VPS vendor, he might not have the option to pick both
though yes, if you're physically making a server, using an ssd as a cache to a hdd/raid backend is a smart way to make the most of both technologies
OP here, you are correct. It's VPS SSD vs VPS HDD
The way i would do it is have a temp upload directory and have a program that hashes the files in that directory and deletes duplicates from it. If the hash already exists i would have the cgi program redirect the user's browser to the existing image url seamlessly.
That would be much cleaner than fooling with symlinks.
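A rough sketch of that temp-directory idea, assuming stored files are named after their SHA-256 digest so "does this hash exist" is just a file existence check. `ingest` and the directory layout are hypothetical, and the actual HTTP redirect is left to whatever serves the site.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Tuple


def ingest(upload: Path, store: Path) -> Tuple[str, bool]:
    """Move a freshly uploaded file from the temp dir into the store.

    Returns (stored_name, is_new). If identical content is already stored,
    the upload is deleted and the existing name is returned, so the caller
    can redirect the browser to the existing image URL.
    """
    digest = hashlib.sha256(upload.read_bytes()).hexdigest()
    target = store / digest          # content-addressed: name = hash
    if target.exists():
        upload.unlink()              # duplicate, drop the temp copy
        return target.name, False
    shutil.move(str(upload), str(target))
    return target.name, True
```

Either way the caller ends up with one name to build the image URL from, and duplicates never reach the store, so there are no symlinks to manage.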
Or even better have js hash the file on the client side then the server only has to verify the hash. If it exists no upload is needed.
>Or even better have js hash the file on the client side then the server only has to verify the hash. If it exists no upload is needed.
some services do this as well, it's an advantage for both parties bandwidth-wise, but uses more cpu/memory resources on the client browser
this would also imply you're using a database mapping 'uploads' to stored files, unless you outright refuse identical files (which might not be a good idea, there's several cases where one might want to upload the same file with different metadata, such as upload date, filename, ID, or if it's in a particular collection, if that's to be a feature)
to give you an example of what i'm thinking with a database;
hastebin.com
(Sup Forums spam filter)
(oh, you should probably use something better than md5 if your server has a good cpu)
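The paste isn't reproduced here, so this is not its contents, just one possible way the 'uploads' to stored files mapping could be laid out. It assumes SQLite and SHA-256 (instead of md5), and the table, column and function names are all made up for illustration.

```python
import hashlib
import sqlite3

# One row per unique file, one row per upload; many uploads can share a file.
SCHEMA = """
CREATE TABLE IF NOT EXISTS files (
    sha256 TEXT PRIMARY KEY,
    size   INTEGER NOT NULL
);
CREATE TABLE IF NOT EXISTS uploads (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    sha256      TEXT NOT NULL REFERENCES files(sha256),
    filename    TEXT NOT NULL,
    uploaded_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def record_upload(db: sqlite3.Connection, data: bytes, filename: str):
    """Register an upload. Returns (upload_id, is_new_file)."""
    digest = hashlib.sha256(data).hexdigest()
    known = db.execute(
        "SELECT 1 FROM files WHERE sha256 = ?", (digest,)
    ).fetchone() is not None
    if not known:
        # First time we see this content: this is where you'd write it to disk.
        db.execute(
            "INSERT INTO files (sha256, size) VALUES (?, ?)", (digest, len(data))
        )
    cur = db.execute(
        "INSERT INTO uploads (sha256, filename) VALUES (?, ?)", (digest, filename)
    )
    db.commit()
    return cur.lastrowid, not known
```

Two people uploading the same bytes as `a.png` and `b.png` get two rows in `uploads` with their own filename and date, but only one row in `files`, so the image is stored once.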
Wouldn't altering the metadata change the hash?
This, use HDD. It's better to have a slightly slower service that works than have your blazing fast service spit out "no space" errors.
depends on which metadata you're talking about
metadata relating to the filesystem or your web service won't affect the file contents (and therefore, the file hash)
only metadata relating to the file format itself will, such as gif comments, jpeg EXIF, mkv titles, etc, etc
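Easy to check for yourself. A small sketch, using fake bytes instead of a real image and appending extra bytes to stand in for an embedded comment or EXIF edit (that part is a simplification, real formats put metadata inside the file structure):

```python
import hashlib
import os
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def demo():
    """Returns (same_after_fs_changes, same_after_content_change)."""
    with tempfile.TemporaryDirectory() as tmp:
        p = Path(tmp) / "cat.jpg"
        p.write_bytes(b"fake image bytes")
        before = sha256_of(p)

        # Filesystem-level metadata: rename the file and reset its timestamps.
        renamed = p.rename(p.with_name("renamed.jpg"))
        os.utime(renamed, (0, 0))
        after_fs = sha256_of(renamed)

        # Metadata stored inside the file: the bytes themselves change.
        with renamed.open("ab") as f:
            f.write(b"comment: hello")
        after_embed = sha256_of(renamed)

        return before == after_fs, before == after_embed


if __name__ == "__main__":
    print(demo())
```

The rename and timestamp change leave the hash alone, while touching the bytes inside the file gives a different hash.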