Some Thoughts on Backups
There are many software options for backing up your system. The bundled STORE command works well for the majority of sites. Alternatively, there are TurboStore, Orbit Backup, Roadrunner, and HIBack. These products offer additional features such as online backup, appending multiple backups to the same tape, data compression, and more. Not all features are available in all products.
Bear in mind that these features are not without cost. Generally speaking, features designed to increase backup throughput or reduce downtime make recovering files harder, and sometimes impossible. Case in point: the “;INTER” option of TurboStore is designed to store “data” from chunks of disk space rather than from files, so that the disks do not have to jump around to retrieve all of the extents of a file. This feature was instituted when disks were much slower than they are today. The serious drawback is that the data is written to a serial device (a tape drive) in a random fashion. Restoring a file becomes a complex operation that requires reading portions of the file from many different spots on the tape, potentially even from different tapes. Moreover, if one tape of a multi-tape backup is damaged or otherwise unusable, there is a high probability that some of the files will not be recoverable.
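The multi-tape exposure can be illustrated with a toy model. The Python sketch below does not reproduce how TurboStore actually lays out a tape; it simply assumes every file is split into equal-sized chunks, writes those chunks either file by file or in a shuffled order, and counts how many files have at least one chunk on a lost tape. The file names, chunk counts, and tape capacity are invented for illustration.

```python
import random

def chunk_order(files, chunks_per_file):
    """List one entry per chunk, grouped file by file."""
    return [name for name in files for _ in range(chunks_per_file)]

def assign_tapes(order, chunks_per_tape):
    """Pair each chunk with the tape it lands on: (tape number, file name)."""
    return [(pos // chunks_per_tape, name) for pos, name in enumerate(order)]

def sequential_layout(files, chunks_per_file, chunks_per_tape):
    """Each file's chunks are written contiguously (file-by-file store)."""
    return assign_tapes(chunk_order(files, chunks_per_file), chunks_per_tape)

def scattered_layout(files, chunks_per_file, chunks_per_tape, seed=0):
    """Chunks are written in a shuffled order (assumed stand-in for a
    store that follows disk layout rather than file boundaries)."""
    order = chunk_order(files, chunks_per_file)
    random.Random(seed).shuffle(order)
    return assign_tapes(order, chunks_per_tape)

def files_touching_tape(layout, tape):
    """Files that have at least one chunk on the given tape."""
    return {name for t, name in layout if t == tape}

if __name__ == "__main__":
    files = [f"FILE{n:02d}" for n in range(12)]   # hypothetical file names
    seq = sequential_layout(files, 10, 40)        # 120 chunks over 3 tapes
    sca = scattered_layout(files, 10, 40)
    print("file by file:", len(files_touching_tape(seq, 1)), "files need tape 1")
    print("scattered:   ", len(files_touching_tape(sca, 1)), "files need tape 1")
```

In the file-by-file layout, losing the middle tape affects only the four files written to it. In the scattered layout, the same loss can never affect fewer files and typically affects many more, because a file is unrecoverable as soon as any one of its chunks is missing.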
The ONLINE feature of these backup products presents another challenge. A typical backup strategy is to log all users off the system, run a backup, run the nightly batch process, then let users back on again. This provides a clear demarcation point: you always know the state of the files on tape. With the ONLINE options, however, a file is only recoverable if the backup completes. Having the SYNC point occur at the end of a backup rather than the beginning further blurs the line. One reason is that you don’t really know when the SYNC point is going to occur, because it is highly dependent upon the system load. Furthermore, because changes to files that are stored at the beginning of the backup are posted to log files that are written at the end, the last tape is required for restoring files.
One of our customers recently uncovered what we consider a bug in the TurboStore 24×7 Online product that you should be aware of. Files that are purged and then rebuilt during the backup are stored in their pre-backup state, even with the “;ONLINE=END” parameter. In this particular case, a dataset capacity management task was performed during an online backup, creating new root and dataset files. When the database was restored from this backup, it was discovered to be in its pre-backup state, even though the “;ONLINE=END” option was used and the backup completed successfully. This could have disastrous effects if some of the files from a backup are in their pre-backup state and others are in their post-backup state. There are several lessons to be learned here. First, don’t perform maintenance tasks during a backup. Second, be sure to understand all ramifications of every backup option that you use. Third, don’t let your backup become a “back-burner” job. If the SYNC point is at the end, then the backup is useless until it completes successfully.
Strategies:
There are also strategies you can employ to speed up your backups.
Don’t back up work files. Often hundreds of thousands of sectors of disk space are consumed by work files or extracts, frequently duplicates of an entire Image dataset, that are created as part of a batch process. Put all of your work files into their own groups, separate from your production data and possibly on a separate volume set. Better yet, make them job temporary files, which are automatically purged when the job logs off.
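To get a feel for how much a backup shrinks once work files live in their own groups, the saving can be estimated from a file listing. The Python sketch below assumes you already have a list of fully qualified file names (FILE.GROUP.ACCOUNT) with their sizes in sectors; the sample names, the sizes, and the WORK group are hypothetical, and how you produce the listing on your system is left open.

```python
def group_of(qualified_name):
    """Return the group part of a FILE.GROUP.ACCOUNT name, upper-cased."""
    parts = qualified_name.upper().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected FILE.GROUP.ACCOUNT, got {qualified_name!r}")
    return parts[1]

def split_work_files(listing, work_groups):
    """Split (name, sectors) entries into files to back up and files to skip."""
    wanted = {g.upper() for g in work_groups}
    keep, skip = [], []
    for name, sectors in listing:
        (skip if group_of(name) in wanted else keep).append((name, sectors))
    return keep, skip

def total_sectors(entries):
    """Sum the sector counts of a list of (name, sectors) entries."""
    return sum(sectors for _, sectors in entries)

if __name__ == "__main__":
    # Hypothetical listing: production data plus batch work files.
    listing = [
        ("CUSTDB.DATA.PROD", 400_000),
        ("CUSTDB01.DATA.PROD", 250_000),
        ("EXTRACT1.WORK.PROD", 250_000),
        ("SORTWK.WORK.PROD", 120_000),
    ]
    keep, skip = split_work_files(listing, {"WORK"})
    print("sectors to back up:", total_sectors(keep))
    print("sectors skipped:   ", total_sectors(skip))
```

Keeping work files in dedicated groups is what makes this kind of split trivial: the exclusion is a matter of naming the group rather than enumerating individual files.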
Don’t back up test copies of databases. Don’t store off memory dumps. Don’t back up data that can be easily recreated, such as subfiles, extracts, and work files (see above). Write to multiple tape drives simultaneously if your backup software supports it.
Contact a Beechglen Support Representative for an analysis of your backup strategy and to review your exposure to recoverability problems.