Alike Data Deduplication Performance
Alike features global data deduplication: it scans each newly downloaded backup for duplicate data and stores only the data it has not seen before. Because deduplication is global, backing up VMs that run similar operating systems yields large savings in storage costs.
Within Alike, the deduplication process is referred to as “munging”, and it occurs after the XVA download is complete. Deduplicated data is housed in an internal file format called BDB. The BDB files make up Alike's data store, whose location can be configured in the UI. You'll see a set of files with the ".bdb" extension under your data store directory. By default, each file is 2 GB in size, and Alike creates more of them as it needs more storage.
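To make the idea concrete, here is a minimal sketch of block-level deduplication in Python. It is not Alike's munging code or the BDB format, neither of which is described in this article. The block size, the use of SHA-256, and the names BlockStore, add_image, and restore are all assumptions chosen for illustration.

```python
import hashlib

# Hypothetical block size; the value Alike actually uses is not documented here.
BLOCK_SIZE = 4096


class BlockStore:
    """Toy content-addressed store that keeps each unique block only once."""

    def __init__(self):
        self.blocks = {}  # SHA-256 hex digest -> block bytes

    def add_image(self, data):
        """Store an image; return (list of block digests, number of new blocks)."""
        recipe = []
        new_blocks = 0
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:
                # Only blocks the store has never seen consume new space.
                self.blocks[digest] = block
                new_blocks += 1
            recipe.append(digest)
        return recipe, new_blocks

    def restore(self, recipe):
        """Rebuild an image from its list of block digests."""
        return b"".join(self.blocks[digest] for digest in recipe)


if __name__ == "__main__":
    store = BlockStore()
    first = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE
    second = b"A" * BLOCK_SIZE + b"C" * BLOCK_SIZE  # shares one block with first
    _, new_first = store.add_image(first)
    recipe, new_second = store.add_image(second)
    print(new_first, new_second, store.restore(recipe) == second)
```

In this toy model, the second image adds only the one block the store has not seen, which is the same effect that makes backups of VMs running similar OSes cheap to store.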
Munge time depends on many factors, and may vary widely for the same VM over time, depending on the amount of changed data. The biggest factors for munge time are discussed below.
Image Size.
A good ballpark figure is that munging new data typically takes a minute per gigabyte, and munging mostly redundant data typically goes several times faster.
Changed Data.
As mentioned above, the effort required to munge depends on how much of the data is new to Alike. Processing a backup where little has changed is several times faster than processing it the first time through. It's best to benchmark your backup window after Alike has completed your jobs at least once.
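The two ballpark figures above can be combined into a rough planning estimate. The sketch below is only an illustration: the one-minute-per-gigabyte rate comes from this article, but the article says only "several times faster" for redundant data, so the 4x speedup is an assumed placeholder, and the function name estimate_munge_minutes is hypothetical. Measure your own jobs before relying on any such number.

```python
# Ballpark from this article: about one minute per gigabyte of new data.
MINUTES_PER_GB_NEW = 1.0
# Assumption: "several times faster" for redundant data is modeled as 4x.
REDUNDANT_SPEEDUP = 4.0


def estimate_munge_minutes(image_gb, new_fraction, speedup=REDUNDANT_SPEEDUP):
    """Rough munge-time estimate for an image of image_gb gigabytes.

    new_fraction is the share of the image (0.0 to 1.0) that Alike has
    not stored before; the remainder is treated as redundant data.
    """
    if image_gb < 0:
        raise ValueError("image_gb must not be negative")
    if not 0.0 <= new_fraction <= 1.0:
        raise ValueError("new_fraction must be between 0.0 and 1.0")
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    new_gb = image_gb * new_fraction
    redundant_gb = image_gb - new_gb
    return new_gb * MINUTES_PER_GB_NEW + redundant_gb * MINUTES_PER_GB_NEW / speedup


if __name__ == "__main__":
    # First backup of a 100 GB image versus a later run with 25% changed data.
    print(estimate_munge_minutes(100, 1.0))
    print(estimate_munge_minutes(100, 0.25))
```

Under these assumptions, a later run with little changed data takes well under half the time of the first run, which is why the first pass is a poor benchmark for your backup window.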
System I/O.
Munging depends on the speed of your disks, both the temporary storage that holds your XVA file and the directory that houses your BDB files. Generally, local disk has the fastest performance characteristics and CIFS the slowest, with iSCSI being much closer to local disk than to CIFS. If performance is important, avoid CIFS.
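One way to compare candidate locations for temporary storage or the data store is to time a sequential write in each directory. The sketch below is a generic Python check, not an Alike tool. The function name write_throughput_mb_s and the 64 MB test size are assumptions, and a single sequential write is only a coarse proxy for the I/O pattern munging actually produces.

```python
import os
import tempfile
import time


def write_throughput_mb_s(directory, size_mb=64, chunk_mb=1):
    """Write a temporary file into directory and return the write rate in MB/s."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    chunks = max(1, size_mb // chunk_mb)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as handle:
            for _ in range(chunks):
                handle.write(chunk)
            handle.flush()
            # Force the data to the device so the OS cache does not hide slow storage.
            os.fsync(handle.fileno())
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    return (chunks * chunk_mb) / elapsed


if __name__ == "__main__":
    # Replace with the directories you are considering, e.g. a local disk
    # path and a mounted network share.
    for candidate in [tempfile.gettempdir()]:
        print(candidate, round(write_throughput_mb_s(candidate), 1), "MB/s")
```

Run it a few times per location and compare the typical figures; a share that is several times slower than local disk here is likely to lengthen munge time as well.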
Processor.
Munging must determine whether each piece of data is new, and may also optionally compress or encrypt your data. This takes processing power, especially if the data is new. Multicore systems will deliver better performance than single-core systems.
3rd Party Software.
If the machine running Alike is taxed by other software competing for disk, memory, and CPU, munging will take longer. Certain software, such as antivirus, may seriously lengthen munge time by competing for disk access when Alike needs it most.
You can view this article online at:
http://www.quorumsoft.com/kb/index.php/article/alike-data-deduplication-performance