In another embodiment, some portion or percentage of the additions can be counted toward the threshold. In yet another embodiment, modifications, deletions, and additions can be weighted differently such that all three types of changes count toward the threshold. For example, modifications and deletions can have higher weights than additions. Moreover, modifications can be weighted differently from deletions. Deletions might be weighted higher than modifications because, for example, a new full backup resulting from mainly deletions could be significantly smaller than a new full backup resulting from mainly modifications.
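By way of a non-limiting sketch, such weighting might be expressed as follows; the weight values, names, and threshold below are illustrative assumptions rather than values from the disclosure:

```python
# Illustrative sketch only: weights and threshold are assumptions.
def weighted_change_score(modified, deleted, added,
                          w_mod=1.0, w_del=1.2, w_add=0.5):
    """Count all three change types toward the threshold, each with
    its own weight (here deletions weigh more than modifications)."""
    return w_mod * modified + w_del * deleted + w_add * added

def should_escalate(modified, deleted, added, threshold=1000.0):
    """Escalate to a full backup when the weighted score meets the
    data change threshold."""
    return weighted_change_score(modified, deleted, added) >= threshold

# Example: heavy deletions push the score over the threshold.
print(should_escalate(modified=300, deleted=600, added=100))  # True (score 1070)
```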
In other embodiments, modifications are weighted higher than deletions. The processes A, B can be implemented by the intelligent backup system. The processes A, B can advantageously auto-tune one or more escalation parameters in certain embodiments. This block can be implemented using any of the features described above with respect to the corresponding block of FIG. At block , backups are performed according to the escalation parameters.
This block can be implemented using the backup module. Continuing, at block , a backup history is analyzed to determine when full backup escalation occurred. At block , the escalation parameters can be automatically adjusted based at least in part on the analysis performed at the preceding block. These blocks can be implemented by the escalation module. The escalation module can then adjust either the data change threshold or the maximum backup interval based on a desired outcome.
If a user desires to reduce the number of times a full backup occurs, for instance, the escalation module might increase the data change threshold. At block , information is obtained regarding characteristics of a data store. This information can be input by a user or can be obtained programmatically by the escalation module. In one embodiment, the characteristics of the data store include the rate at which the data store changes and the types of changes typically made to the data store (e.g., modifications, deletions, or additions).
The remainder of the process B can be implemented by the escalation module. At block , the characteristics obtained about the data store can be used to determine initial backup escalation parameters. Thus, instead of a user determining the escalation parameters, the escalation parameters can be determined programmatically. In one embodiment, the user inputs a desired maximum backup interval but not a data change threshold.
Based on the user's desired maximum backup interval, the escalation module can select a data change threshold that may, for example, trigger full backups prior to or around the maximum backup interval.
The user can instead input a desired data change threshold but not a maximum backup interval. The escalation module can then set the maximum backup interval based on an analysis of the logs or other backup history. For example, if the escalation module determines that the user's desired data change threshold has previously been met approximately every 7 days, the escalation module can set the maximum backup interval to 7 days.
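A hedged sketch of this derivation, assuming a simple list of dates on which the threshold was previously met (the data layout and dates are assumptions):

```python
from datetime import date

# Hypothetical history: dates on which the user's data change
# threshold was met, triggering full backup escalation.
escalation_dates = [date(2011, 1, 1), date(2011, 1, 8), date(2011, 1, 15)]

def derive_max_backup_interval(dates):
    """Set the maximum backup interval to the typical spacing between
    past threshold-triggered full backups (7 days in this example)."""
    gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
    return round(sum(gaps) / len(gaps)) if gaps else None

print(derive_max_backup_interval(escalation_dates))  # 7
```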
The escalation module can also use the optimization criteria described above to derive escalation parameters. The escalation module can select escalation parameters according to the user's desired optimization.
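One way such goal-driven selection might look, sketched with assumed goal names and adjustment factors:

```python
# Illustrative only: the goal names and factors are assumptions.
def tune_for_goal(goal, threshold, max_interval_days):
    """Adjust escalation parameters toward a user's optimization goal."""
    if goal == "fewer_full_backups":
        return threshold * 1.25, max_interval_days              # escalate less often
    if goal == "fresher_full_backups":
        return threshold * 0.8, max(1, max_interval_days - 1)   # escalate sooner
    return threshold, max_interval_days

print(tune_for_goal("fewer_full_backups", 1000.0, 7))  # (1250.0, 7)
```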
Further, in some data stores, portions of the data store change at a slower rate than other portions. The escalation module can detect the rate at which different portions of the data store change by analyzing logs or other backup history. A user can also supply this rate-of-change information to the escalation module. The escalation module or a user can therefore establish different escalation parameters for different portions of a data store.
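A minimal sketch of per-portion parameters, with portion names and values assumed for illustration:

```python
# Sketch only: portion names and parameter values are assumptions.
portion_params = {
    "orders":  {"change_threshold": 500.0,  "max_interval_days": 3},
    "archive": {"change_threshold": 5000.0, "max_interval_days": 30},
}

def params_for(portion):
    """Slower-changing portions (e.g., 'archive') get looser parameters
    so they escalate to full backups less often."""
    return portion_params[portion]

print(params_for("archive")["max_interval_days"])  # 30
```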
The remaining blocks of the process B can proceed in the same or similar manner as the corresponding blocks of the process A. Each new backup file can pose a potential risk in that any damage to a backup file, particularly the full backup file, can limit one's ability to restore data. Thus, as mentioned above, the backup validator can validate backup sets to ensure or attempt to ensure backup set integrity. Techniques for validating a backup set will now be described in greater detail.
The process can be implemented by the backup validator. Advantageously, in certain embodiments, the process implements multi-leveled validation to ensure or to attempt to ensure backup set integrity. However, although multi-leveled validation can be used, fewer than all of the validation features shown can be used in some implementations.
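As a rough sketch of how such multi-leveled validation could be chained (the function names, check order, and escalation hook are assumptions, not the disclosure's API):

```python
import os

# Hedged sketch: run validation levels in order and fall back to
# full backup escalation on the first failure.
def validate_backup_set(paths, related, corrupted, escalate):
    if not paths or not all(os.path.exists(p) for p in paths):
        return escalate("missing backup file")      # level 1: files exist
    if not related(paths):
        return escalate("unrelated backup files")   # level 2: physically related
    if corrupted(paths):
        return escalate("corrupted backup file")    # level 3: corruption check
    return "backup set valid"

# Example with stub checks and a path that presumably does not exist:
print(validate_backup_set(["/backups/full.bak"],
                          related=lambda p: True,
                          corrupted=lambda p: False,
                          escalate=lambda why: f"escalating: {why}"))
```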
At block , it is determined whether the physical backup files exist. This block can include searching in a last known location for a file. The last known location might include a pathname, directory or file folder, physical drive, logical drive, combinations of the same, or the like. If the physical backup file does not exist, full backup escalation is performed at block to attempt to ensure that a valid backup will exist.
A system administrator can also be notified of the problem in addition to or instead of escalating to a full backup. If a physical backup file does exist, it is further determined at block whether the backup files in the backup are physically related. The system can determine whether partial backups are related to full backups by examining metadata in the backup files for a reference, such as a log sequence number (LSN).
If the LSN of a differential backup is equal to the first LSN of a full backup, then the two backups are likely physically related. The backup validator can also check LSNs to verify that incremental backup files are physically related with each other and with a full backup file. In another embodiment, the backup validator can use native data store tools to verify that the backups are related.
For instance, some data stores include a metadata file that includes information that links backup files. The backup validator can access or query this metadata file to determine relatedness between files in addition to or instead of examining LSNs in the backup file headers.
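A minimal sketch of the LSN comparison described above, with backup headers reduced to simple dictionaries (the field names are assumed stand-ins for real backup metadata):

```python
# Sketch only: header fields are simplified assumptions.
full_header = {"first_lsn": 5000, "last_lsn": 8000}
diff_header = {"base_lsn": 5000}

def physically_related(diff, full):
    """A differential whose base LSN equals the full backup's first
    LSN is likely physically related to that full backup."""
    return diff["base_lsn"] == full["first_lsn"]

print(physically_related(diff_header, full_header))  # True
```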
It is further determined at decision block whether any of the backup files are corrupted. At block , the backup files are locked to keep them from being accessed via external methods. Locking of files is optional in certain implementations. Another optional feature of the intelligent backup system is intelligent cleanup, which can be performed by the cleanup module. The process A advantageously provides a mechanism that in certain embodiments allows removal of old backups without disrupting backup set integrity.
At decision block , it is determined whether a backup file has existed longer than a threshold time. This threshold can be user-defined and can specify a point in time after which old backup files are to be removed.
In an embodiment, removing old backup files can include removing a portion of a backup set file if a backup set is contained in a single file. In such embodiments, when creating the backup set file, the backup module of FIG. can create an index of the backup set file. This index can include start and end points of separate full and partial backup portions of the file. The cleanup module can access the index to determine the start and end points so as to selectively remove portions of the backup set file.
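Such an index might be sketched as follows, with the record layout (kind plus start and end byte offsets) assumed for illustration:

```python
# Sketch only: the index layout and offsets are assumptions.
index = [
    {"kind": "full",         "start": 0,          "end": 10_000_000},
    {"kind": "differential", "start": 10_000_000, "end": 12_000_000},
]

def removable_ranges(index, keep=("full",)):
    """Byte ranges the cleanup module could excise from the backup
    set file while leaving retained portions untouched."""
    return [(e["start"], e["end"]) for e in index if e["kind"] not in keep]

print(removable_ranges(index))  # [(10000000, 12000000)]
```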
If the threshold is not met, the process A ends. The process A can be repeated for each backup file. If the threshold is met, it is further determined at decision block whether the backup file is a full backup file, an incremental backup file, a log, or the like.
If not, the file is a differential file. It can therefore be safe to remove the file because differential files are not physically linked to one another in certain implementations. The file is then removed at block . Otherwise, at decision block , it is determined whether a later backup file not being deleted depends on the full or incremental backup file or log. The dependency can be determined by analyzing whether the files are physically related, using any of the techniques described above with respect to FIG.
If the files are physically related, the full backup file is not removed at block . If it were removed, the later backup file could no longer be restored, as it would lack its corresponding full backup file. However, if there is no dependency on another backup file, the file can safely be removed at block . In some systems, a data retention policy may be in effect.
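A hedged sketch of this cleanup decision; the record shape and the dependency test are illustrative assumptions:

```python
from datetime import date

def safe_to_remove(backup, all_backups, max_age_days, today):
    """Remove only files past the age threshold on which no retained
    later backup depends; differentials are never depended upon here."""
    if (today - backup["date"]).days <= max_age_days:
        return False                      # age threshold not met
    if backup["kind"] == "differential":
        return True                       # nothing physically links to it
    # Full/incremental/log: keep it if any other file depends on it.
    return not any(b["depends_on"] == backup["id"]
                   for b in all_backups if b is not backup)

backups = [
    {"id": 1, "kind": "full",        "date": date(2011, 1, 2), "depends_on": None},
    {"id": 2, "kind": "incremental", "date": date(2011, 1, 5), "depends_on": 1},
]
# The full backup is old enough, but the incremental depends on it.
print(safe_to_remove(backups[0], backups, 3, date(2011, 1, 8)))  # False
```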
The data retention policy might state, for example, that restores should be possible to any point in time within a certain number of days. In the illustrated example, a retention window of four days is shown. Thus, the retention window extends back from the current day, Saturday, through the previous Wednesday.
Backup files should therefore be retained to enable restores on any of the days in this window. A cleanup window is also shown. The cleanup window includes days for which backup files may be removed. However, in this example, incremental backups were performed on Monday and Tuesday of the cleanup window. An incremental backup was also performed on Wednesday, the last day of the retention window. If either of the incremental backups in the cleanup window were removed, the incremental backup on Wednesday of the retention window could not be restored.
Thus, in certain embodiments, the cleanup module would retain the incremental backup files on Monday and Tuesday even though they are in the cleanup window. Similarly, a full backup file on Sunday of the cleanup window should be retained to enable restoring of the incremental backup file on Wednesday. The timeline C is similar to the timeline B of FIG. However, in this timeline C, a full backup was performed on Sunday and differential backups were performed thereafter. Because none of the differential backups depend on each other and only depend on the full backup, any of the differential backups in the cleanup window can be removed safely.
Thus, even if the differential backups on Monday and Tuesday were removed, any of the differential backups in the retention window can still be recovered so long as the full backup on Sunday is not removed. Each of the user interfaces can be generated by the control console. Alternatively, the user interfaces can be generated on the client device. The backup management interface includes options for managing data store backups.
In the embodiment shown, these options include the option to store full and partial backup files in a single self-contained file or in separate backup files. In addition, a validation option is shown for checking whether a full backup file exists prior to performing a partial (e.g., differential or incremental) backup.
This option can determine whether the decision block of the process is performed (see FIG. ). Various escalation parameters are shown in a second backup management interface in FIG. In the interface, escalation parameters include a maximum backup interval and a data change threshold. Further escalation parameters are also provided for selecting the manner in which changes are calculated. These parameters include the option to query the actual data pages in the data store that have been changed since the last full backup and the option to compare the size of the last differential backup to the last full backup.
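A minimal sketch of the second option, comparing differential size to full backup size (the 30% cutoff is an assumed example value):

```python
# Sketch only: the 0.30 cutoff is an assumption, not from the disclosure.
def change_ratio(last_diff_bytes, last_full_bytes):
    """Estimate data change as the differential's fraction of the full."""
    return last_diff_bytes / last_full_bytes

print(change_ratio(3_000_000, 10_000_000) >= 0.30)  # True: escalate
```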
These escalation parameters allow selection of the access or comparison methods for determining data changes described above with respect to FIG. Further escalation parameters are shown for selecting blackout days. Checking the checkbox next to a given day, for instance, prevents a full backup from running on that day. Validation parameters specify ways to validate backups after they are performed.
These parameters can include any of the validation options described above. In one embodiment, these parameters can also include the option to validate or verify the last backup, to validate both the last full and latest partial backup, and to validate the last full and all associated partial backups. Cleanup options provide functionality for a user to enable backup cleanup, along with time thresholds for cleaning up full and partial backup files and logs. Notification options allow users to specify what type of notification to receive for different backup events.
Some possible notification options include not using notification, notifying every time a backup occurs regardless of success or failure, or notifying only in the case of failure. Although not shown in the user interfaces, in some embodiments, qualitative options can be provided instead of quantitative options for the escalation parameters.
Upon a user selecting one of these options, the escalation module can automatically select the escalation parameters. In one embodiment, the user interface might also provide options for the user to perform backups in a manner that improves or optimizes data store availability. In such embodiments, the escalation module might select to perform a full or partial backup depending on an analysis of the impact to computing resources that full or partial backups can have.
The escalation module can analyze computing resource usage using the techniques described above with respect to FIG. In addition to determining whether to run a full or partial backup, the escalation module can also increase data store availability by throttling a backup process.
For instance, the escalation module can reduce the computing resource usage of a backup process by allocating fewer processors or processor cores to the backup process.

As another example, the escalation module can analyze a backup history with respect to the user's escalation parameters. The analysis can determine, for example, when different escalation parameters triggered a full backup. This analysis can help the user determine whether the chosen parameters were never effective, whether they were effective but now should be updated, or whether they are still effective.
Since the escalation module can know the size and type of the backups and the thresholds provided by the user, the escalation module can not only analyze historical backups but also project what escalation parameters may be useful in the future.
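One way such projection might be sketched, replaying historical change ratios against candidate thresholds (the ratios and candidates are assumed examples):

```python
# Sketch only: history values and candidate thresholds are assumptions.
history = [0.10, 0.22, 0.35, 0.41, 0.18]   # past differential/full size ratios

def projected_escalations(history, candidate_threshold):
    """Estimate how often a candidate threshold would have escalated."""
    return sum(1 for r in history if r >= candidate_threshold)

for t in (0.20, 0.30, 0.40):
    print(t, projected_escalations(history, t))  # 0.2→3, 0.3→2, 0.4→1
```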
This analysis can be used by the escalation module to auto-tune the escalation parameters, e.g., by adjusting the data change threshold and/or the maximum backup interval. Depending on the embodiment, certain acts, events, or functions of any of the algorithms described herein can be performed in a different sequence, can be added, merged, or left out altogether (e.g., not all described acts or events are necessary for the practice of the algorithms). Moreover, in certain embodiments, acts or events can be performed concurrently rather than sequentially, e.g., through multi-threaded processing, interrupt processing, or multiple processors or processor cores. The various illustrative logical blocks, modules, and algorithm steps described in connection with the embodiments disclosed herein can be implemented as electronic hardware, computer software, or combinations of both.
To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, and steps have been described above generally in terms of their functionality.
Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. The described functionality can be implemented in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosure.
The various illustrative logical blocks and modules described in connection with the embodiments disclosed herein can be implemented or performed by a machine, such as a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein.
A general purpose processor can be a microprocessor, but in the alternative, the processor can be a controller, microcontroller, or state machine, combinations of the same, or the like.