Wrangling Unstructured Data Will Require New Thinking
Managing the mass of unstructured data that people create every day is placing a growing burden on IT organizations. While some storage vendors have tried to address the problem from a performance and accessibility perspective, little has been done to help manage this massive data footprint.
The growth of unstructured data has continued unchecked and unmanaged largely because:
- Companies usually can’t control or manage the intellectual property generated on desktops.
- Organizations lack formal processes or policies for checking in work.
- Companies don’t have a document management system to enforce content control policies.
In general, companies don’t fully realize how much intellectual property their employees are storing on their desktops or in home directories. Much of this is sensitive data that might contain proprietary and confidential information about the organization, yet there are no specific controls, aside from perhaps a footer in the document that says "confidential and proprietary."
When we have a customer that plans to retire its desktops during a data center relocation, we typically double the amount of storage in the centralized data center. That’s because humans have an inconvenient habit of wanting to have their data close by. In many cases, they’ll keep this data on their laptops or on USB drives that they carry in their pockets. Unfortunately, these devices are notoriously difficult to manage and the data on them is almost impossible to tag, track, and monitor.
Now that public cloud storage services, such as Google or Dropbox, offer easy-to-use global access to data, people are only too willing to sacrifice security for convenience. This creates a complex problem for IT departments that are already short-staffed.
What’s needed are strong alternatives that provide the services end users demand, but in a way that aligns with business policy. Because cloud computing did not exist when most of those policies were written, there are often no policies that address this new technology. In addition, the public clouds have gained traction through free client software that has now spread like a virus through most enterprises.
IT departments will have to become savvy about data content. Even the term ‘unstructured data’ implies that there’s no value to it and no way to manage it. New thinking about distributed intellectual property, both its innate value and the ways to tap that value through content control systems, will be paramount in the next decade.