In its latest Global DataSphere Forecast, IDC predicts that more data will be created over the next three years than was created over the past thirty. In 2020 alone, the analyst house estimates that more than 59 zettabytes of data will be created, captured, copied and consumed.
To put this into perspective, a single zettabyte is equivalent to 250 billion DVDs, and the issue is likely to be compounded by the fact that many enterprise IT organizations plan to keep up to ten copies of the data they create.
But with 90 percent of the world’s data having been created in the last two years alone, very few businesses have planned for the sheer scale of this explosion in data. This is particularly the case where unstructured data is concerned.
With 80 percent of the world’s data being unstructured, many businesses are strategically planning to turn their own data into information they can monetize.
But rather than being doubled to match the data explosion, IT budgets have largely stayed flat. As a result, many businesses are struggling to mobilize and manage this astounding amount of unstructured data in the enterprise.
The traditional approach to managing unstructured data has always been storage-centric – you move data to a storage system, the storage system then manages your data and gives you the tools to search and report on it.
This approach worked when data volumes were small or moderate and all of an enterprise’s data could fit within a single storage solution.
Storage architectures have since become more sophisticated and flexible, and cloud storage options have emerged. Most technology-based organizations today use a mix of expensive, high-performance flash storage, along with the mainstay of disk-based storage and cost-efficient object storage for less used “cold data.”
The heavy cost of silos
Cloud storage now also supports these different options, but it’s all too often treated as a cheap storage locker, which typically becomes just another disconnected data silo.
As enterprises shift to a multi-cloud architecture, they can no longer afford to manage data within each storage silo, search for data within each and pay a heavy cost to move data from one silo to another. Whether it is file or object data – user-generated data such as home directories and file shares, or machine and application data such as genomics, PACS imaging, seismic data, electronic design data and IoT – traditional storage systems were not designed to cope with the modern explosion of unstructured data and multi-cloud architectures.
This is why a new aggregator style of unstructured data management across on-premises and cloud environments is needed.
We can think of this as the equivalent of an Airbnb-type model for enterprise data.
Moving from a single-vendor controlled ecosystem to a vendor-independent aggregator model is not a new concept. Many industries have already gone through this transformation. A good example lies with the hospitality industry.
For decades, hotel chains relied upon loyal customers who were willing to drive extra miles to stay at their preferred hotel if they were a rewards member, even if a similar hotel was closer.
But Airbnb created a new model. It provided an easy, trustworthy and cost-effective way to find thousands of properties, including home rentals. In doing so, it expanded the available choices for guests.
This in turn has greatly expanded the market. And this might explain why Airbnb’s debut market cap was, at one point, more than the combined market cap of the nation’s three largest hotel chains – Marriott International, Hilton Worldwide and Hyatt Hotels. Of course, Airbnb is still learning some lessons about governance and compliance as the model matures.
An aggregator approach to storage and unstructured data management would address three major challenges in today’s hybrid cloud era:
- Visibility – A cross-storage, cross-cloud view into all data owned by an enterprise to ensure ‘cold data’ that is worth less is using cheaper resources than ‘hot data’ that is worth more.
- Mobility – Ensuring correct data placement across different storage architectures and clouds – moving the right data to the right place, and at the right time across different storage silos.
- Value – An understanding of the value of data in its different forms, no matter where data lives – this requires a global approach to unstructured data management that is not storage-centric.
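As a rough illustration of the visibility point, the sketch below rolls up one inventory spanning several silos and reports how much of each silo's data is cold. It is a minimal sketch under stated assumptions, not a description of any product: the `FileRecord` fields, the silo names, the sample paths and the 365-day cold threshold are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FileRecord:
    silo: str            # hypothetical silo name, e.g. "nas-flash"
    path: str
    size_gb: float
    last_accessed: date


# Assumed threshold: data untouched for more than a year counts as cold.
COLD_AFTER_DAYS = 365


def is_cold(record, today, cold_after_days=COLD_AFTER_DAYS):
    """True if the record has not been accessed within the threshold."""
    return (today - record.last_accessed).days > cold_after_days


def cold_data_by_silo(records, today):
    """Return {silo: (cold_gb, total_gb)} across every silo in the inventory."""
    summary = {}
    for record in records:
        cold_gb, total_gb = summary.get(record.silo, (0.0, 0.0))
        if is_cold(record, today):
            cold_gb += record.size_gb
        summary[record.silo] = (cold_gb, total_gb + record.size_gb)
    return summary


# Illustrative inventory gathered from two different silos.
inventory = [
    FileRecord("nas-flash", "/projects/genomics/run1.bam", 400.0, date(2018, 3, 1)),
    FileRecord("nas-flash", "/home/alice/report.docx", 1.0, date(2020, 11, 20)),
    FileRecord("cloud-object", "/archive/seismic/survey.segy", 900.0, date(2017, 6, 15)),
]

report = cold_data_by_silo(inventory, today=date(2020, 12, 1))
for silo, (cold_gb, total_gb) in report.items():
    print(f"{silo}: {cold_gb:.0f} GB cold of {total_gb:.0f} GB")
```

In this made-up inventory, almost all of the data sitting on the expensive flash silo is cold, which is the kind of finding a per-silo view tends to hide.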
For this aggregator approach to unstructured data management to emerge successfully in any industry, several core principles need to be in place.
Best practice for such an approach includes the following:
- It must be storage-agnostic and data-centric. It should work across silos by interoperating with various storage vendors and clouds using open standards, rather than proprietary interfaces.
- It cannot be tied to any storage architecture or vendor. Instead, it should provide global analytics and data management across silos, regardless of where the data lives. The customer must always be in control of their data.
- It should focus on data mobility. Enterprise IT organizations will need to move data by policy across different storage and cloud options to optimize costs and performance.
- It should not lock the metadata into a proprietary format. Instead, it should move data using open standards so that data can be used natively wherever it lives. It must keep the metadata intact along with the data itself, and provide an easy way to search, find and build virtual data lakes and deeper analytics that will help extract greater value from the data.
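To make the idea of moving data by policy a little more concrete, here is a minimal sketch of a planner that turns age-based policies into move actions and carries each object's metadata along unchanged. Everything in it is an assumption made for illustration: the `Policy`, `DataObject` and `MoveAction` types, the tier names and the idle-day thresholds are hypothetical, and a real system would also have to perform the moves and verify them.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    path: str
    location: str               # where the data lives today
    days_since_access: int
    metadata: dict = field(default_factory=dict)


@dataclass
class Policy:
    name: str
    min_idle_days: int          # applies to data idle at least this long
    target: str                 # hypothetical destination tier


@dataclass
class MoveAction:
    path: str
    source: str
    target: str
    metadata: dict              # travels with the data, unchanged


def plan_moves(objects, policies):
    """Return the moves needed so each object sits where policy says it should."""
    # Check the longest idle threshold first so the coldest matching tier wins.
    ordered = sorted(policies, key=lambda p: p.min_idle_days, reverse=True)
    actions = []
    for obj in objects:
        for policy in ordered:
            if obj.days_since_access >= policy.min_idle_days:
                if obj.location != policy.target:
                    actions.append(
                        MoveAction(obj.path, obj.location, policy.target, dict(obj.metadata))
                    )
                break  # first matching policy decides placement
    return actions


policies = [
    Policy("cool", min_idle_days=180, target="object-store"),
    Policy("archive", min_idle_days=730, target="cloud-archive"),
]
objects = [
    DataObject("/pacs/scan-001.dcm", "flash", 900, {"owner": "radiology"}),
    DataObject("/eda/chip-v2.gds", "flash", 200, {"owner": "design"}),
    DataObject("/home/bob/notes.txt", "flash", 10),
    DataObject("/iot/2019/sensors.csv", "object-store", 300),
]

actions = plan_moves(objects, policies)
for action in actions:
    print(f"{action.path}: {action.source} -> {action.target} {action.metadata}")
```

Only the first two objects produce moves here. The recently used file stays on flash, and the IoT data is already on the tier its policy calls for.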
Enterprise IT leaders are beginning to recognize that a real and urgent need exists for a new data-centric, rather than storage-centric, approach to unstructured data management.
The growth we’ve seen in data accumulation can only continue to accelerate with new and upcoming digitalization initiatives and the majority of organizations adopting hybrid, multi-cloud strategies.
The race towards a new aggregator style of unstructured data management across clouds has begun in full force – and the time is right for an Airbnb-style model for unstructured data management.
Krishna Subramanian, President & COO, Komprise