If you knew nothing about your data, how would you know the best way to classify it?
Your best bet? Build a data archive. This will quickly identify your “working set” of data. By using a file virtualization solution across the storage infrastructure, we’ve seen disk and backup-infrastructure maintenance costs drop by up to 50%, along with shorter backup windows. (In one case the window shrank from 14 hours to 3 hours.) That makes disaster recovery vastly simpler and a great deal faster!
We’ve deployed other solutions, like Synthetic Fulls, to accomplish similar results: backups impose far less I/O load on production systems, and full server recovery improves. A Synthetic Full creates a new backup set offline, without touching the source data or communicating with the original host server. That makes it easy to create new full backup tapes for other uses, such as vaulting, standing up a new site, or building a test system. Just replicate and replay.
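The idea behind a synthetic full can be sketched in a few lines: take the last full backup you already have, replay the incrementals that followed it, and you get a new full set without ever contacting the source host. This is a minimal illustration, not any vendor’s implementation; the dict-based backup format and file names are assumptions for the example.

```python
# Hypothetical sketch of a synthetic full: merge the last full backup
# with subsequent incrementals, entirely on the backup side -- the
# original host server is never contacted.

def synthesize_full(last_full: dict, incrementals: list) -> dict:
    """Build a new full backup set from existing backup data.

    last_full maps file path -> content (or a reference to it);
    each incremental maps changed paths -> new content, with None
    marking a deletion.
    """
    new_full = dict(last_full)            # start from the previous full
    for inc in incrementals:              # replay changes in order
        for path, content in inc.items():
            if content is None:
                new_full.pop(path, None)  # file was deleted
            else:
                new_full[path] = content  # file was added or modified
    return new_full

full = {"/etc/hosts": "v1", "/var/log/app.log": "v1"}
incs = [
    {"/var/log/app.log": "v2"},                  # Monday's changes
    {"/etc/hosts": None, "/opt/new.cfg": "v1"},  # Tuesday's changes
]
print(synthesize_full(full, incs))
# {'/var/log/app.log': 'v2', '/opt/new.cfg': 'v1'}
```

Because the merge runs against copies of the backup data, the resulting full set can be written to fresh tapes for vaulting or seeding a new site.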
We’ve worked on this concept with many of our clients. Why not move backup-candidate files to a separate storage tier, keep them as files in their native format, and organize them in time-coherent views? Originally the objection was cost, but not anymore. Solutions now cost about the same as LTO4 media!
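To make the idea concrete, here is a minimal sketch of moving aging files into such a tier, assuming a local directory stands in for the archive tier. Files past a cutoff age land in date-stamped folders (the time-coherent view), still in their native format so they can be browsed without backup software. The function name and 90-day cutoff are illustrative assumptions.

```python
# Minimal sketch: move files older than a cutoff into a separate tier,
# organized by modification date, keeping each file in native format.

import shutil
import time
from pathlib import Path

def archive_old_files(source: Path, tier: Path, max_age_days: int = 90) -> list:
    """Move aging files from `source` into date-stamped folders under `tier`."""
    moved = []
    cutoff = time.time() - max_age_days * 86400
    for f in source.rglob("*"):
        if f.is_file() and f.stat().st_mtime < cutoff:
            # Date-stamped folder = the "time-coherent view" of the archive.
            day = time.strftime("%Y-%m-%d", time.localtime(f.stat().st_mtime))
            dest = tier / day / f.relative_to(source)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(f), str(dest))
            moved.append(dest)
    return moved
```

Since the archived copies remain ordinary files, any downstream process, a search engine, a dedupe pass, a replication job, can work on them directly.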
Users can restore files themselves, from any point in time, using a search engine. You don’t need backup software to do this, which makes it simpler to:
- Deduplicate and compress
- Apply compliance and regulatory rules as policies
- Use it as an archive
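Self-service, point-in-time restore boils down to an index over file versions: given a path and a moment in time, return the newest archived version at or before that moment. The following is a hypothetical sketch of that lookup; the class and method names are illustrative, not a real product’s API.

```python
# Hypothetical sketch of point-in-time self-service restore: an index
# maps each file path to its archived versions, and a query returns the
# newest version at or before the requested moment.

import bisect

class ArchiveIndex:
    def __init__(self):
        self._times = {}  # path -> sorted list of version timestamps
        self._data = {}   # (path, timestamp) -> archived content

    def add(self, path: str, ts: float, content: str) -> None:
        """Record an archived version of `path` captured at time `ts`."""
        bisect.insort(self._times.setdefault(path, []), ts)
        self._data[(path, ts)] = content

    def restore(self, path: str, as_of: float):
        """Return the file as it existed at time `as_of`, or None."""
        times = self._times.get(path, [])
        i = bisect.bisect_right(times, as_of)  # versions at or before as_of
        return self._data[(path, times[i - 1])] if i else None

idx = ArchiveIndex()
idx.add("/docs/plan.txt", 100, "draft")
idx.add("/docs/plan.txt", 200, "final")
print(idx.restore("/docs/plan.txt", 150))  # the version live at t=150
# draft
```

A real deployment would index content and metadata for full-text search as well, but the timestamped lookup above is the core of “restore from any point in time” without backup software.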
Indexing is easier, and data mining, replication, and other data-movement requirements are more simply met. Copy-based backup is now a default paradigm for consumers, e.g. Apple’s Time Machine and EMC’s Mozy.