Source Side Dedup
|The components used are still in the development stage. To get the required components, send a request to firstname.lastname@example.org.|
SEP sesam applies deduplication at block level and offers a hybrid of target-based (Si3T) and source-based (Si3S, introduced in SEP sesam v. 4.4.3) deduplication. Both methods require a configured Si3 deduplication store, for which a special license is needed.
Source-side deduplication means that during backup only changed blocks are transferred to the backup server. On the client, the backup process calculates hashes of the data to be backed up, and only blocks that are changed or not yet known to the target Si3 deduplication store are sent to the backup server. It can be used to minimize the data transferred during backup in situations where bandwidth is limited and SEP sesam RDS cannot be used. See Deduplication for more details on the recommended use of the deduplication methods.
|Using source-side deduplication does not necessarily mean that the backup window will be reduced. This depends on your data structure: hashing chunks of data is very CPU-intensive, so such backups might take even longer. Consider which clients could be overloaded in this way. Typically, source-based deduplication is a good fit for environments with a low daily data change rate and low bandwidth between the backup server and the backed-up client.|
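The client-side hashing described above can be illustrated with a minimal Python sketch. It is not the Si3S implementation: the fixed block size, the SHA-256 hash, and the names `BLOCK_SIZE`, `split_blocks` and `plan_transfer` are assumptions made purely for illustration.

```python
import hashlib

# Assumption: a fixed 4 KiB block size; the real chunking scheme may differ.
BLOCK_SIZE = 4096


def split_blocks(data, block_size=BLOCK_SIZE):
    """Yield consecutive fixed-size blocks of the data (the last one may be shorter)."""
    for offset in range(0, len(data), block_size):
        yield data[offset:offset + block_size]


def plan_transfer(data, known_hashes, block_size=BLOCK_SIZE):
    """Decide on the client which blocks must be sent to the store.

    known_hashes stands in for the hashes the deduplication store already holds.
    Returns (recipe, to_send): the ordered list of block hashes needed to
    rebuild the data, and a dict of hash -> block for blocks the store lacks.
    """
    recipe = []
    to_send = {}
    for block in split_blocks(data, block_size):
        digest = hashlib.sha256(block).hexdigest()
        recipe.append(digest)
        # Send a block only if the store does not know it and it is not
        # already queued from earlier in this backup.
        if digest not in known_hashes and digest not in to_send:
            to_send[digest] = block
    return recipe, to_send
```

In this sketch, backing up three blocks of which two are identical and already known to the store results in a single block being transferred. Every block is still hashed on the client, which is where the CPU cost comes from.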
Source-side deduplication is easily configured and has the following advantages:
- Only new and unique data is backed up, directly at the source.
- Because less data is sent over the network, bandwidth usage is reduced.
- The amount of required data storage is reduced.
Source-side deduplication may have the following disadvantages:
- The backup client might get overloaded, which lengthens the backup window.
- If it is used for virtual data centers where resources are shared among the virtual machines, it can impact production workloads.
Make sure that the following conditions are met before using deduplication:
- Check that the required license is installed.
- Si3S is supported on all available Linux (additional RDS required) and Windows operating systems. Si3S is already part of the SEP sesam Windows client package, but is not included in the Linux client package. To use it on Linux, you need to install SEP sesam RDS/Server on the Linux backup client. For details on supported OS, see SEP sesam OS and Database Support Matrix.
- At least one Si3 deduplication store has to be configured on either a SEP sesam Server or SEP sesam Remote Device Server. For details on how to set it up, see Configuring Si3 Deduplication Store.
- Si3S increases CPU overhead in the production environment because it calculates hashes. Therefore, the minimum requirements for the system that is going to be backed up are:
- Minimum of 2 CPU cores
- 2 GB RAM
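As a rough aid, the minimum requirements above can be expressed as a small Python check. This is only a sketch: the function names are hypothetical, "2 GB" is interpreted as 2 GiB, and the amount of RAM has to be supplied by the caller because the Python standard library has no portable call for it.

```python
import os

MIN_CPU_CORES = 2
# Assumption: "2 GB RAM" is interpreted as 2 GiB.
MIN_RAM_BYTES = 2 * 1024**3


def meets_si3s_minimum(cpu_cores, ram_bytes):
    """Return True if the client meets the documented minimum for Si3S."""
    return cpu_cores >= MIN_CPU_CORES and ram_bytes >= MIN_RAM_BYTES


def detect_cpu_cores():
    """Return the number of logical CPUs, or 1 if it cannot be determined."""
    return os.cpu_count() or 1
```

For example, `meets_si3s_minimum(detect_cpu_cores(), 4 * 1024**3)` evaluates a client with 4 GiB of RAM. Meeting the minimum does not guarantee that the client will not be overloaded during backup.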
Configuring source-side deduplication
Configuring Si3S consists of three main steps:
- Creating a required backup environment with a deduplication store. Check the Si3 Deduplication Hardware Requirements and follow the step-by-step procedure as described in Configuring Si3 Deduplication Store.
- Once the Si3 deduplication store is created, configure the media pools.
- Set up your backup strategy by following the standard backup procedure: first, create a backup task by selecting the data to be backed up; then specify when you want to back up your data by creating a backup schedule; finally, create a backup event. When creating the backup event, you also enable SEP Si3 source-side deduplication (see below).
|You can also use the Immediate Start button to enable Si3S and start your backup instantly.|
Creating a backup event with enabled Si3S
When creating a backup event, you can also enable source-side deduplication.
- From Main Selection -> Scheduling -> Schedules, right-click the schedule for which you want to create a new event, then click New Backup Event.
- Under Sequence control, you can set up the Priority of your backup event. For details, see Setting Event Priorities.
- Under Object, select the task or task group to which you want to link this event.
- Under Parameter, specify the Backup level.
- From the Media pool drop-down list, select the target media pool to which the data will be backed up. Note that you have to select the media pool which is combined with an Si3 deduplication store backend.
- Select the check box SEP Si3 Source Side Deduplication.
- Click OK to save the event.
Enabling and starting Si3S instantly
- From the menu bar, select Activities -> Immediate Start -> Backup.
- In the Immediate Start: Backup dialog, select a deduplication media pool as your backup target.
- The check box SEP Si3 Source Side Deduplication is shown: select it and click Start.
Verifying if Si3S is used
You can verify whether source-side deduplication is applied by selecting Job State -> Backups in the Main Selection window. The job state overview provides detailed information on the backup status and shows a selected check box in the column Source Side Deduplication if source-side deduplication is executed for a task. The Si3S status overview also provides information on the job status, deduplication ratio, start and stop time of the Si3S, data size and throughput, assigned media pool, etc.
As of 4.4.3 Beefalo V2, you can also check the details of your backups online by using the new Web UI. Information about the execution of source-side deduplication is shown in the Main Log. For details, see SEP sesam Web UI.
What network port is used for backup?
The client connects to the RDS or backup server on the following destination port: 11701 + the number of the first dedup drive. For example, when the first dedup drive is 9, the client will use port 11710. Make sure that the respective port is open in the firewall on the RDS or SEP sesam Server. You may need to manually determine and open the relevant port. The source port is random.
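The port rule above can be written as a short Python helper, together with a simple TCP reachability test that may help when checking a firewall. This is a sketch only: the function names are hypothetical, and a successful TCP connection shows that the port is open, not that Si3S is working.

```python
import socket

# Base value taken from the rule above: destination port = 11701 + first dedup drive.
BASE_PORT = 11701


def si3s_destination_port(first_dedup_drive):
    """Return the destination port for the given first dedup drive number."""
    if first_dedup_drive < 0:
        raise ValueError("drive number must not be negative")
    port = BASE_PORT + first_dedup_drive
    if port > 65535:
        raise ValueError("resulting port is outside the valid TCP port range")
    return port


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For the example in the text, `si3s_destination_port(9)` yields 11710. Running `port_is_reachable("rds.example.org", 11710)` from the backup client (the host name is a placeholder) indicates whether the firewall lets the connection through.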