QuorumSoft Knowledge Base

Understanding Data Reclaim in Alike

Date Added February 17, 2011


Overview

Alike stores backup data in a deduplicated data store, which holds the unique data across all backups. The data store is written to disk as a series of 2 GB files known as BDBs. When information for a VM is deleted from Alike, the amount of data ultimately reclaimed from the BDBs will almost certainly be much less than the production size of the VM, because deduplication means any data the VM shares with other backups must be kept.
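
To illustrate why so little may be reclaimed, here is a minimal sketch of a deduplicated store in Python. It is a toy model only: the block identifiers, the VM names, and the `reclaimable_blocks` function are assumptions made for this example and do not reflect Alike's actual on-disk format or internals.

```python
def reclaimable_blocks(backups, deleted_vm):
    """Return the set of block IDs referenced only by deleted_vm.

    backups maps a VM name to the set of unique block IDs its backup uses
    (a hypothetical layout chosen for illustration).
    """
    still_needed = set()
    for vm, blocks in backups.items():
        if vm != deleted_vm:
            still_needed |= blocks
    # Only blocks that no other backup references can be freed.
    return backups[deleted_vm] - still_needed


backups = {
    "web01": {"a", "b", "c", "d"},
    "web02": {"a", "b", "c", "e"},
}
freed = reclaimable_blocks(backups, "web01")
print(sorted(freed))
```

In this example, deleting web01 frees only the one block that web02 does not also use, even though web01 references four blocks.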

Understanding how and when data is removed from Alike is important for planning maintenance operations and for understanding your storage footprint.

VM Purge
Data reclamation starts either when you manually delete a VM version from the UI, or when an old version is purged after a backup job runs in order to enforce a version retention limit. For example, if you have set the maximum versions retained to 5 and Alike already holds 5 versions of a VM, the next time the VM is backed up the oldest version will be purged, and the job log will indicate this. No BDB space is reclaimed at this time; the VM version has simply been marked for cleanup.
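
The retention rule in the example above can be sketched as follows. This is an illustration under stated assumptions: `versions_to_purge` and the version labels are hypothetical names for this example, not part of Alike.

```python
def versions_to_purge(versions, max_versions):
    """Return the oldest versions that exceed the retention limit.

    versions is a list of version labels ordered oldest first
    (hypothetical labels, for illustration only).
    """
    excess = len(versions) - max_versions
    return versions[:excess] if excess > 0 else []


existing = ["v1", "v2", "v3", "v4", "v5"]
after_backup = existing + ["v6"]  # a sixth version arrives with the next backup
print(versions_to_purge(after_backup, 5))
```

With a limit of 5, the sixth backup causes only the oldest version to be selected for purging. As described above, selecting a version marks it for cleanup and frees no BDB space by itself.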
   
Alike runs a background thread, called Purge Manager, which looks for these purged VM versions and identifies which data in the BDBs is orphaned and can be marked as no longer needed. This thread runs whenever Alike is idle, so if you have many jobs running overnight, Purge Manager will perform much of its work during the day. Depending on how frequently versions are deleted, the size of your data store, and the speed of your disks, Purge Manager will take varying lengths of time to complete its survey. Typically, Purge Manager will finish a survey in several hours of total run time, even if that time is interspersed between jobs. If services are stopped or Purge Manager is interrupted, it will pick up where it left off.

Purge Manager marks data as deleted in your data store, but this will not affect the size of your BDBs, since the deletes are likely to occur inside the BDB file and not at the end of it. BDB files are only shrunk when an Optimize job is run, which moves blocks between BDBs into empty spaces to consolidate space on disk.
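
The difference between marking data as deleted and shrinking files can be sketched with a toy model. Everything here is an assumption for illustration: each BDB is modeled as a fixed-capacity list of block slots, and `mark_deleted` and `consolidate` are hypothetical helpers. The repacking step is a simplification and is not Alike's actual Optimize algorithm.

```python
def mark_deleted(bdb, orphaned):
    """Blank out orphaned blocks; the number of slots (file size) is unchanged."""
    return [None if block in orphaned else block for block in bdb]


def consolidate(bdbs, capacity):
    """Repack live blocks into as few fixed-capacity BDBs as possible."""
    live = [block for bdb in bdbs for block in bdb if block is not None]
    return [live[i:i + capacity] for i in range(0, len(live), capacity)]


bdbs = [["a", "b", "c", "d"], ["e", "f", "g", "h"]]
marked = [mark_deleted(bdb, {"b", "c", "f"}) for bdb in bdbs]
print([len(bdb) for bdb in marked])   # slot counts after marking

packed = consolidate(marked, capacity=4)
print([len(bdb) for bdb in packed])   # slot counts after consolidation
```

After marking, both lists still occupy four slots each because the holes sit in the middle. Only the consolidation step moves live blocks into those holes and lets the second list shrink.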

Optimize
Optimize uses the Fragmentation Threshold, which is configured under Settings->Advanced. Setting this threshold to a higher percentage will cause Optimize to complete faster, at the cost of reclaiming less disk space on average. There are diminishing returns to lowering the threshold, however, as Optimize will work harder to find smaller amounts of data to reclaim.
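
One way to picture this trade-off is a filter that only processes BDBs whose fragmented fraction meets the threshold. This article does not document how Alike applies the threshold internally, so the sketch below is an assumption: `select_for_optimize`, the BDB names, and the fragmentation figures are all hypothetical.

```python
def select_for_optimize(fragmentation, threshold):
    """Return names of BDBs whose fragmented fraction meets the threshold.

    fragmentation maps a BDB name to the fraction of its space marked
    as deleted (hypothetical values, for illustration only).
    """
    return [name for name, frac in fragmentation.items() if frac >= threshold]


fragmentation = {"bdb0": 0.10, "bdb1": 0.35, "bdb2": 0.60}
print(select_for_optimize(fragmentation, 0.50))  # high threshold: less work
print(select_for_optimize(fragmentation, 0.25))  # lower threshold: more work
```

Under this assumption, a 50% threshold touches one file and skips the space held in the others, while a 25% threshold touches two files to recover a smaller additional amount per file.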

Optimize may be run as a standalone job, or at the end of backup jobs. If you intend to run Optimize after most jobs, it is recommended that you raise the fragmentation threshold to as high as 50% so that other jobs are not held up by the process. If you run Optimize on its own, it will run concurrently with other jobs, so you can afford to set the threshold lower.

If Optimize does not complete, either because services are stopped or because the job is canceled, BDBs will not be shrunk, because the file sizes are not changed until the very end of the Optimize process. However, Optimize has probably still made considerable progress in freeing space within the BDB files by reorganizing their contents. The next time Optimize is run, it should complete more quickly and reclaim the space from both runs at the end.

Data Integrity
If Optimize finds data corruption, it will stop optimizing and switch to a Consistency Check. Consistency Check will cause all active jobs to fail, and while it is running, new jobs will also fail. Consistency Check will determine whether the BDBs contain corruption (likely due to physical problems on disk), or whether Alike has been restored from tape or archive and needs to be recalibrated. If the BDBs contain corruption, Consistency Check will isolate the corrupted blocks. If a subsequent backup that contains these blocks is run, they will be repaired, and the backups associated with them will be completely restorable. For more information on Consistency Checks, see the article listed under "See Also" below.

At the end of an Optimize job, Alike copies and archives job, VM, and licensing information to your BDB directory. In case you must restore Alike from tape or other archive, this data may be required in order to restore VMs. More information about this process is available under the separate KB article referenced below.

See Also:
Disaster Recovery in Alike Standard
Understanding Consistency Checks

Category: Alike Administration

Last updated on February 27, 2011 with 585 views

 


