Data de-duplication is one of the most important aspects of modern data storage technology. It allows you to reduce the amount of data you need to physically store by examining all incoming data and eliminating repetitive or “duplicated” data from the backup/storage stream. This is especially important today because business data, both structured (such as databases) and unstructured (such as emails, documents and online content), is growing at astounding rates. IT departments are now faced with the burden of managing, storing and protecting the vast amount of information that is critical to the operation of the business. In the past, when data growth was slower and economic forces were less of a factor, companies had more flexibility in how they implemented their data management policies. Today, however, businesses face new challenges and regulatory constraints that make data management a critical IT responsibility.
But how do you manage and process this rising tide of data? Do you invest in new server hardware with fast processors that can filter, sort and categorize this data more quickly, or in bigger data storage arrays that can hold more of it? At Data Storage Corporation, we believe that the answer is not just throwing more processing power or more storage capacity at the problem. You need intelligent design of the de-duplication logic itself. While more processing power helps, de-duplication also relies on the ability of the system to locate and access the device that holds the data – in other words, the device’s speed becomes more important than processing power alone.
Resist the temptation to throw processing power at the problem. Instead, out-think the problem with better and more logical system design. Here are some simple tips on how you can increase the efficiency of your data management process – from data capture and ingestion to storage and archiving – without investing more capital or processing power:
- De-duplication – With well-designed logic, de-duplication technology can drastically reduce the volume of information your company needs to store. Whether it is configured to support local/client-side de-duplication at the Local Area Network (LAN) level or globally (known as Common File Elimination) across multiple locations, analyzing your data at multiple points across your business is critical. Is your system configured to identify duplicate data only once per day or is it continuous? With a continuous process, you can catch any “common data” at any time and from any location, greatly reducing the volume that needs to be stored.
- Incremental Backup and the Power of Change – After your initial full backup is complete, make sure your incremental backup is configured correctly. With properly configured incremental backup, only new or changed data is transmitted and recorded, resulting in significant bandwidth and storage savings.
- Data Compression – Compressing data is at the heart of data storage because the less you have to store, the lower the cost in terms of processing and final storage. This is one area where practical experience can make a real difference because the compression ratio achieved is highly dependent on the data type being handled. For example, higher compression ratios can be achieved for databases than for image or audio files. Also, while it might seem counterintuitive, compressing already compressed data can actually increase the file size! That is why you need to make sure your data management solution has the intelligence to detect compressed files and skip re-compressing them.
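To make the de-duplication and compression tips above more concrete, here is a minimal Python sketch. It is an illustration only, not a description of any particular product: the `DedupStore` class and its method names are hypothetical, and it uses "compress, then keep whichever form is smaller" as a simple stand-in for detecting already-compressed files. Each unique piece of content is identified by its SHA-256 hash, so a duplicate is recognized and never stored a second time.

```python
import hashlib
import os
import zlib


class DedupStore:
    """Toy content-addressed store (hypothetical): each unique block is kept once."""

    def __init__(self):
        # Maps SHA-256 hex digest -> (is_compressed, payload bytes)
        self.blocks = {}

    def put(self, data: bytes) -> str:
        """Store data and return its digest; duplicate content is not stored again."""
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.blocks:
            packed = zlib.compress(data)
            # Keep the compressed form only when it is actually smaller.
            # Already-compressed input usually grows, so it is stored as-is.
            if len(packed) < len(data):
                self.blocks[digest] = (True, packed)
            else:
                self.blocks[digest] = (False, data)
        return digest

    def get(self, digest: str) -> bytes:
        """Return the original bytes for a previously stored digest."""
        is_compressed, payload = self.blocks[digest]
        return zlib.decompress(payload) if is_compressed else payload

    def stored_bytes(self) -> int:
        """Total bytes physically held by the store."""
        return sum(len(payload) for _, payload in self.blocks.values())


store = DedupStore()
report = b"quarterly report " * 1000  # repetitive data, compresses well
noise = os.urandom(4096)              # stands in for already-compressed data

id_a = store.put(report)
id_b = store.put(report)  # same content: recognized as a duplicate
id_c = store.put(noise)   # stored without compression

print("unique blocks:", len(store.blocks))
print("bytes received:", 2 * len(report) + len(noise))
print("bytes stored:", store.stored_bytes())
```

In this sketch the second copy of the report adds nothing to the store, and the random data is kept uncompressed because compressing it would not make it smaller. A production system would more likely work on smaller chunks of each file and recognize compressed formats by file type, but the principle is the same.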
With careful attention to data storage and management system design, your business can meet its regulatory and operational data management requirements without having to invest lots of new capital. For some additional thinking on how hardware and software impact de-duplication performance, please see this recent story from @InformationWeek – Deduplication Performance: More Than Processing Power. If you have questions about how you can cut your data storage costs while ensuring regulatory compliance and without negatively impacting accessibility and usability, contact our team of experts at @DataStorageCorp at 212.564.4922.