Question

Write a short note on Data reduction

written 6.6 years ago by

aartisahitya • 160

modified 2.6 years ago by

sagarkolekar ★ 11k

data mining and business intelligence

Data reduction is the process of minimizing the amount of data that needs to be stored in a data storage environment. Data reduction can increase storage efficiency and reduce costs.

Data reduction can be achieved using several different types of technologies. The best-known data reduction technique is data deduplication, which eliminates redundant data on storage systems. The deduplication process typically occurs at the storage block level. The system analyzes the storage to see if duplicate blocks exist, and gets rid of any redundant blocks. The remaining block is shared by any file that requires a copy of the block. If an application attempts to modify this block, the block is copied prior to modification so that other files that depend on the block can continue to use the unmodified version, thereby avoiding file corruption.
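The block-sharing and copy-before-modify behaviour described above can be sketched in a few lines of Python. This is only an illustration, not how any particular storage product works: the class name DedupStore, its methods, the 4-byte block size and the use of SHA-256 to detect identical blocks are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small, chosen only to keep the example readable


class DedupStore:
    """Toy block store (hypothetical): identical blocks are kept only once."""

    def __init__(self):
        self.blocks = {}  # block digest -> block bytes (one copy per unique block)
        self.files = {}   # file name -> ordered list of block digests

    def write_file(self, name, data):
        digests = []
        for start in range(0, len(data), BLOCK_SIZE):
            block = data[start:start + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            # Store the block only if an identical one is not already present.
            self.blocks.setdefault(digest, block)
            digests.append(digest)
        self.files[name] = digests

    def modify_block(self, name, index, new_block):
        # Copy-on-write: the shared block is never changed in place. The new
        # content is stored as its own block and only this file is repointed.
        digest = hashlib.sha256(new_block).hexdigest()
        self.blocks.setdefault(digest, new_block)
        self.files[name][index] = digest

    def read_file(self, name):
        return b"".join(self.blocks[d] for d in self.files[name])


store = DedupStore()
store.write_file("a.bin", b"AAAABBBB")
store.write_file("b.bin", b"AAAACCCC")   # first block is shared with a.bin
store.modify_block("a.bin", 0, b"ZZZZ")  # b.bin still sees the original block
print(len(store.blocks), store.read_file("a.bin"), store.read_file("b.bin"))
```

The sketch never frees blocks that no file references any more; a real system would need reference counting or garbage collection for that.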

While data deduplication is probably the most common data reduction technique, it is not the only viable one. Data archiving and data compression can also reduce the amount of data that has to be stored on primary storage systems.

Data compression reduces the size of a file by encoding redundant information more compactly, so that less disk space is required. Many storage systems do this natively, using algorithms designed to find repeated patterns in the data and represent them with fewer bits.
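As a small illustration of lossless compression, the sketch below uses zlib from the Python standard library. The sample data is made up and deliberately repetitive; storage systems may use different algorithms, so this only shows the general idea.

```python
import zlib

# Highly repetitive sample data (an assumption for the demo); real data
# usually compresses less well than this.
original = b"data reduction " * 200

compressed = zlib.compress(original, 9)  # 9 = strongest compression level
restored = zlib.decompress(compressed)

print("original size:  ", len(original))
print("compressed size:", len(compressed))
print("restored intact:", restored == original)
```

Because the compression is lossless, decompressing gives back exactly the original bytes; only the stored representation is smaller.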

Archiving data also reduces data on storage systems, but the approach is quite different. Rather than reducing data within files or databases, archiving removes older, infrequently accessed data from expensive storage and moves it to low-cost, high-capacity storage. Archive storage can be disk, tape or cloud based.
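A very simple archiving policy could look like the sketch below. The function name archive_old_files, the directory names and the one-year threshold are hypothetical, and the last-modified time is used as a stand-in for "infrequently accessed", which is an assumption (access times are often not tracked reliably). A plain directory plays the role of the low-cost archive tier.

```python
import os
import shutil
import tempfile
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def archive_old_files(primary_dir, archive_dir, max_age_days):
    """Move files not modified within max_age_days to archive_dir (hypothetical helper)."""
    cutoff = time.time() - max_age_days * SECONDS_PER_DAY
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(Path(primary_dir).iterdir()):
        # Modification time is used here as a proxy for "infrequently accessed".
        if path.is_file() and path.stat().st_mtime < cutoff:
            shutil.move(str(path), str(archive / path.name))
            moved.append(path.name)
    return moved


# Demo in a temporary directory so nothing on the real system is touched.
with tempfile.TemporaryDirectory() as root:
    primary = Path(root) / "primary"
    archive = Path(root) / "archive"
    primary.mkdir()
    (primary / "old_report.txt").write_text("stale")
    (primary / "new_report.txt").write_text("fresh")

    # Backdate one file by two years so that it qualifies for archiving.
    two_years_ago = time.time() - 2 * 365 * SECONDS_PER_DAY
    os.utime(primary / "old_report.txt", (two_years_ago, two_years_ago))

    moved = archive_old_files(primary, archive, max_age_days=365)
    remaining = sorted(p.name for p in primary.iterdir())
    archived = sorted(p.name for p in archive.iterdir())
    print("moved:", moved, "remaining on primary:", remaining)
```

Unlike deduplication and compression, nothing inside the files changes here; the data is simply relocated to cheaper storage.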

6.3 years ago by she290795 • 10