Difference between Object Storage And File Storage

Could someone please explain the difference between Object Storage and File Storage?

I have read about Object Storage on Wikipedia, in http://www.dell.com/downloads/global/products/pvaul/en/object-storage-overview.pdf, and in the Amazon S3 and OpenStack Swift docs, among others. But could someone give me an example to help me understand it better?

Is the only difference that in 'object storage' we attach more metadata to each object?

For example, how would I store an image as an object using some programming language (for example Python)?

Thanks.

Solution

IMO, object storage has nothing to do with scale, because someone could build a filesystem capable of storing a huge number of files, even in a single directory.

It is also not about the access method. HTTP access to data in filesystems has been available in many well-known NAS systems.

Storage/Access by OID is a way to handle data without bothering about naming it. It could be done on files too. I believe there is an NFS protocol extension that allows this.
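To make access by OID concrete, and to touch on the question's request for a Python example of storing an image as an object, here is a minimal in-memory sketch. ToyObjectStore, put and get are hypothetical names invented for illustration, not the API of S3, Swift or any real product, and the image bytes are only a stand-in for a real file.

```python
import uuid


class ToyObjectStore:
    """Flat, in-memory store: no directories, objects are addressed by OID only.

    Hypothetical class for illustration; not a real storage API.
    """

    def __init__(self):
        self._objects = {}  # OID -> (data bytes, metadata dict)

    def put(self, data, metadata=None):
        # The store, not the caller, picks the identifier, so nothing has to be named.
        oid = uuid.uuid4().hex
        self._objects[oid] = (bytes(data), dict(metadata or {}))
        return oid

    def get(self, oid):
        # Returns the (data, metadata) pair stored under this OID.
        return self._objects[oid]


store = ToyObjectStore()
image_bytes = b"\x89PNG\r\n\x1a\n"  # stand-in for the bytes of a real image file
oid = store.put(image_bytes, {"content-type": "image/png", "owner": "alice"})
data, meta = store.get(oid)
```

The caller keeps only the returned OID. There is no path or file name to choose, and the metadata travels with the object rather than with a directory or volume.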

I would sum it up this way: object storage is a (new/different) "object-centric" way of thinking about data, its access and its management.

Think about these points:

What are snapshots today? They are point-in-time copies of a volume. When a snapshot is taken, all files in the volume are snapped too, whether they like it or not and whether they need it or not. A lot of space can get used (wasted?) on a complete volume snapshot when only a few files needed to be snapped.

In an object storage system, you will rarely see snapshots of volumes. Instead, individual objects are snapshotted, perhaps automatically. This is object versioning. Not all objects need to be versioned; each individual object can say whether it is versioned.
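As a rough sketch of per-object versioning, assuming a toy in-memory store (VersionedStore and its methods are hypothetical names, not taken from any real system), each object carries its own "versioned" flag, and only flagged objects keep their history:

```python
class VersionedStore:
    """Toy store where versioning is decided per object, not per volume."""

    def __init__(self):
        self._versions = {}   # oid -> list of bytes, oldest first
        self._versioned = {}  # oid -> bool, the object's own versioning flag

    def put(self, oid, data, versioned=None):
        # The flag is only changed when the caller passes it explicitly.
        if versioned is not None:
            self._versioned[oid] = versioned
        history = self._versions.setdefault(oid, [])
        if self._versioned.get(oid, False) or not history:
            history.append(data)   # keep the old version alongside the new one
        else:
            history[-1] = data     # unversioned object: overwrite in place

    def get(self, oid, version=-1):
        # version=-1 is the latest, 0 is the oldest retained version.
        return self._versions[oid][version]

    def version_count(self, oid):
        return len(self._versions[oid])


store = VersionedStore()
store.put("report", b"v1", versioned=True)
store.put("report", b"v2")
store.put("scratch", b"a")
store.put("scratch", b"b")
```

Here "report" retains both versions while "scratch" holds only its latest contents, so space is spent only on the objects that asked for history.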

How are files/volumes protected from a disaster? Typically, in a Disaster Recovery (DR) setup, entire volumes or volume sets are set up for replication to a DR site. Again, this takes no account of whether individual files want to be replicated or not. The unit of disaster protection is the volume. Files are small fry.

In an object storage system, DR is not volume-centric. Object metadata can decide how many copies should exist and where (geo locations/fault domains).

Similarly for other features:

Tiering - Objects are placed in storage tiers/classes based on their own metadata, independent of other unrelated objects.
Lifecycle - Objects move between tiers, change their number of copies, etc., individually instead of as a group.
Authentication - Individual objects can be authenticated against different authentication domains if required.
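
The per-object replication, tiering and lifecycle ideas above can be sketched as policy metadata attached to each object. This is only an illustration under assumed names: ObjectPolicy, its fields, the "hot"/"cold" tier labels and the plan function are all hypothetical, not part of any real object store.

```python
from dataclasses import dataclass


@dataclass
class ObjectPolicy:
    """Hypothetical per-object policy metadata."""
    copies: int = 1              # how many replicas this object wants
    regions: tuple = ("local",)  # where replicas may live (geo / fault domains)
    tier: str = "hot"            # initial storage tier
    cold_after_days: int = 0     # 0 means: never move to the cold tier


def plan(policy, age_days):
    """Decide tier and replica placement for ONE object from its own policy."""
    tier = policy.tier
    if policy.cold_after_days and age_days >= policy.cold_after_days:
        tier = "cold"
    # Spread the requested copies round-robin over the allowed regions.
    placement = [policy.regions[i % len(policy.regions)]
                 for i in range(policy.copies)]
    return tier, placement


invoice = ObjectPolicy(copies=3, regions=("eu", "us"), cold_after_days=30)
thumbnail = ObjectPolicy()  # defaults: one local copy, stays hot

invoice_plan = plan(invoice, age_days=45)
thumbnail_plan = plan(thumbnail, age_days=45)
```

Two objects of the same age end up with different tiers and replica counts because each decision reads only that object's metadata, never a setting on a shared volume.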

As you can see, the change in thinking is that in an object store, everything is about an object.

Contrast this with the traditional way of thinking, in which management and access revolve around larger containers such as volumes (containing files). That is not object storage.

The features above, and their object-centric nature, fit well with the requirements of unstructured data, hence the interest.

If a storage system is object (or file) centric instead of volume-centric in its thinking, irrespective of the access protocol or the scale, it is an object storage system.