Source:Source Side Deduplication: Difference between revisions

From SEPsesam
(Marked this version for translation)
(removed limitation for oracle and hana (#20346))
 
(15 intermediate revisions by 2 users not shown)
Line 1: Line 1:
<translate><!--T:48-->
<noinclude><div class="noprint"><languages />
<div class="noprint"><languages/>
<br />


<!--T:49-->
<translate>== Overview == <!--T:5--> </translate>
{{Copyright SEP AG|en}}


<!--T:2-->
</div></noinclude><translate><!--T:13-->
{{Navigation_latest|release=[[Special:MyLanguage/SEP_sesam_Release_Versions|4.4.3/4.4.3 ''Beefalo V2'']]|link=[[Special:MyLanguage/SEP_sesam_Documentation#previous|documentation archive]]}}</div></translate><br />
SEP sesam Si3 applies deduplication at the ''block level''. In this deduplication technique, data is divided into blocks, which are then checked and duplicates are skipped. Only unique blocks are sent to storage. By eliminating redundant blocks, the size of the backed up data is reduced as no duplicate data is backed up. Storing the identical data only once results in reduced storage space requirements and network load as no duplicates are transferred over the network.
{{<translate><!--T:3-->
note</translate>|<translate><!--T:4-->
The components used are '''still in developmental stage'''! To get the required components, send a request to [mailto:support@sep.de support@sep.de].</translate>}}<br />
<translate>==Overview== <!--T:5--></translate>
<div class="boilerplate metadata" id="Additional resources" style="background-color: #f0f0f0; color:#636f73; border: 1px ridge #cdd3db; margin: 0.5em; padding: 0.5em; float: right; width: 35%; "><center><b><translate><!--T:6-->
Additional resources</translate></b></center>


{|style="margin: auto; margin-bottom:1em; width:100%; border:0px solid grey;"
<!--T:54-->
| rowspan="2" style="padding:0px 10px 0px;" | <translate><!--T:7-->
To enable the best possible scenarios for efficient data backup in different environments, SEP sesam offers a '''hybrid''' of both:
[[File:SEP_next.png|45px|link=Special:MyLanguage/4_4_3_Grolar:Configuring_Si3_Deduplication_Store]]</translate>
*[[Special:MyLanguage/Deduplication#target|''target-based'' (Si3T)]] and
| style="padding:0px 40px 0px 10px; color: grey; font-size: 90%; text-align:left;" |<translate><!--T:8-->
*''source-based '' (Si3S) deduplication
Also relevant: [[Special:MyLanguage/SEP_sesam_Requirements#Si3_deduplication|Si3 Deduplication Hardware Requirements]] – [[Special:MyLanguage/4_4_3_Grolar:Configuring_Si3_Deduplication_Store|Configuring Si3 Deduplication Store]] – [[Special:MyLanguage/Deduplication|Deduplication]] – [[Special:MyLanguage/Replication|Replication]]</translate>
Both methods use a configured ''Si3 deduplication data store'' that requires a special licence. See [[Special:MyLanguage/Licensing|Licensing]] for details.
|}
 
===Deduplication store types=== <!--T:60-->


{|style="margin: auto; margin-bottom:1em; width:100%; border:0px solid grey;"
<!--T:63-->
| rowspan="2" style="padding:0px 10px 0px;" | <translate><!--T:9-->
;Deprecated Si3 V1 deduplication store</translate>
[[File:SEP_Tip.png|45px|link=https://www.sepsoftware.com/sep-sesam/si3-tachometer-analysis/ SEP Tachometer]]</translate>
:<translate><!--T:61--> As of SEP sesam v. [[SEP_sesam_Release_Versions|5.0.0 ''Jaglion'']], two Si3 deduplication store types are available. It is strongly recommended to use the new type SEP Si3 deduplication store as the old generation Si3 V1 deduplication store is deprecated. This means that the old generation Si3 V1 is no longer being enhanced, but is still supported until further notice.</translate>
| style="padding:0px 40px 0px 10px; color: grey; font-size: 90%; text-align:left;" | <translate><!--T:10-->
;<translate><!--T:64--> Use the new Si3 deduplication store if the data is to be stored to S3 Cloud</translate>
See also: [https://www.sepsoftware.com/sep-sesam/si3-tachometer-analysis/ SEP Tachometer] [[Special:MyLanguage/Licensing|Licensing]]</translate>
*<translate><!--T:65--> If you are using an old generation Si3 V1 deduplication store with S3, you cannot restore from S3 using the GUI! See [[Special:MyLanguage/Configuring_Si3_NG_Deduplication_Store#key|Enable Si3 setup on the same host]] to learn how to configure a new Si3 and an old Si3 V1 on the same backup server or RDS to make the upgrade from Si3 V1 to Si3 smoother.</translate>
|}
;<translate><!--T:66--> Advantages of the new generation Si3 data store</translate>
:<translate><!--T:67-->
Si3 is advantageous over the old Si3 V1 store type as it offers better performance and resource savings. You can back up your data [[Special:MyLanguage/Si3_NG_Direct_to_S3|directly to S3 cloud storage]] and [[Special:MyLanguage/Backup_to_Azure_Storage|Azure storage]] and restore the items you want directly from there. It also provides a new [[Special:MyLanguage/SEP_Immutable_Storage_–_SiS|immutable storage feature – SiS]]. For more details, see [[Special:MyLanguage/Configuring_Si3_NG_Deduplication_Store|Configuring '''''Si3''''' Deduplication Store]].


{|style="margin: auto; margin-bottom:1em; width:100%; border:0px solid grey;"
<!--T:62-->
| rowspan="2" style="padding:0px 10px 0px;" | [[File:SEP_Video.png|45px|link=Video Tutorials & Screencasts]]
''Note that the instructions for source-side deduplication are the same for both types of deduplication store. Si3 is therefore not explicitly mentioned, but the term Si3 store is used for both types of deduplication store.''
| style="padding:0px 40px 0px 10px; color: grey; font-size: 90%; text-align:left;" |<translate><!--T:53--> Watch SEP sesam video [https://www.youtube.com/watch?v=sSkfmufQkXU Why and how to use Deduplication with SEP sesam].</translate>
|}


{|style="margin: auto; margin-bottom:1em; width:100%; border:0px solid grey;"
=== What is Si3 source deduplication (Si3S) === <!--T:55-->
| rowspan="2" style="padding:0px 10px 0px;" | <translate><!--T:11-->
[[File:SEP Troubleshooting.png|45px|link=Special:MyLanguage/Troubleshooting_Guide]]</translate>
| style="padding:0px 40px 0px 10px; color: grey; font-size: 90%; text-align:left;" |<translate><!--T:12-->
Problems? Check the [[Special:MyLanguage/Troubleshooting_Guide|Troubleshooting Guide]].</translate>
|}</div>
<translate><!--T:13-->
SEP sesam applies deduplication technique at block level, and offers a '''hybrid''' of both, '''target-based''' (Si3T) and '''source-based deduplication''' (Si3S, introduced in SEP sesam v. ''4.4.3''). Both methods require a [[Special:MyLanguage/4_4_3_Grolar:Configuring_Si3_Deduplication_Store|configured Si3 deduplication store]], for which a special [[Special:MyLanguage/Licensing|license]] is needed.


<!--T:14-->
<!--T:14-->
Source-side deduplication means that during backup only changed blocks are transferred to the backup server. On the client itself the backup process calculates hashes of data to be backed up and only changed or unknown blocks of the target Si3 deduplication store <!-- is this a target or source dedup store?--> are sent to the backup server. It can be used to minimize the data transferred during backup in situations where bandwidth is a problem and [[Special:MyLanguage/SEP_sesam_Glossary#RDS|SEP sesam RDS]] cannot be used. See [[Special:MyLanguage/Deduplication|Deduplication]] for more details on recommended utilization of dedupe methods.</translate>
Si3 source deduplication means that data is deduplicated before it is sent over the network, making the backup extremely bandwidth efficient. During the backup, SEP sesam calculates the hash values of the data to be backed up on the client and queries the storage to determine whether the hash value of the block is already stored there. If it is, SEP sesam sends only the hash value; if not, it sends only changed or unknown blocks of the target Si3 dedup store to the backup server.
 
<!--T:68-->
The advantage of Si3S deduplication is that only new or changed data is transferred to the backup server during the backup. This optimises bandwidth usage and requires less storage capacity. It can be used to minimize the data transferred during backup in situations where bandwidth is a problem and [[Special:MyLanguage/SEP_sesam_Glossary#RDS|SEP sesam RDS]] cannot be used. See [[Special:MyLanguage/Deduplication|Deduplication]] for more details on recommended utilization of dedupe methods.
 
<!--T:56-->
Not all data is suitable for deduplication: encrypted files, disk blocks with a non-standard size, etc. cannot be deduplicated. See [[Special:MyLanguage/Deduplication#use_cases|Data Deduplication Use Cases]] for more information.</translate>


{{<translate><!--T:15-->
{{note|<translate><!--T:16-->
note</translate>|<translate><!--T:16-->
Using source-side deduplication does not necessarily mean that the backup windows will be reduced. This actually depends on your data structure – note that hashing chunks of data is very CPU intensive and such backups might take even longer. You should consider which clients can be overloaded in this way. In general, source-based deduplication can be an excellent solution for environments with a low daily data change rate and low bandwidth between the backup server and the backed up client.</translate>}}
Using source-side deduplication does not necessarily mean that the backup windows will be reduced. This actually depends on your data structure – note that hashing chunks of data is very CPU intensive and such backups might take even longer. You should consider which clients can be overloaded in this way. Typically, source-based deduplication is a great solution for environments with a low daily data change rate and low bandwidth between the backup server and backed up client.</translate>}}


<translate>=== Key features === <!--T:17-->
<translate><!--T:17-->
Source-side deduplication is easily configured and has the following advantages:
=== Key features ===
*Only new and unique data is being backed up directly at the source.
Source-side deduplication is easy to configure and has the following advantages:
*Because less data is sent over the network the bandwidth is reduced.
*Only new and unique data is backed up directly at the source.
*As less data is sent over the network, bandwidth is reduced.
*Reduced amount of required data storage.
*Reduced amount of required data storage.


<!--T:18-->
<!--T:18-->
Source-side deduplication may have the following disadvantages:
Source-side deduplication can have the following disadvantages:
*Backup client might get overloaded and the backup window is lengthened.
*The backup client can become overloaded and the backup window lengthens
*If it is used for virtual data centers where resources are shared among the virtual machines, it can impact production workloads.
*When used for virtual data centers where resources are shared between virtual machines, it can affect production workloads.
See [[Special:MyLanguage/Deduplication#use_cases|Data Deduplication Use Cases]] for more information.</translate>


== {{anchor|prerequisites}}Prerequisites == <!--T:19-->
== {{anchor|prerequisites}}<translate><!--T:19-->
Prerequisites ==
Make sure that the following conditions are met before using deduplication:
Make sure that the following conditions are met before using deduplication:
*Check that the required [[Special:MyLanguage/Licensing|license]] is installed.</translate>
*Check that the required [[Special:MyLanguage/Licensing|license]] is installed.</translate>
<translate><!--T:20-->
<translate><!--T:20-->
*Si3S is supported on all available Linux (additional RDS required) and Windows operating systems. Si3S is already a part of a SEP sesam Windows client package, but is not included in the Linux client package. To use it on Linux, you need to install SEP sesam RDS/Server to the Linux backup client. For details on supported OS, see [[Special:MyLanguage/SEP_sesam_OS_and_Database_Support_Matrix|SEP sesam OS and Database Support Matrix]].</translate>
*Si3S is supported on all available Linux (additional RDS required) and Windows operating systems. Si3S is already part of a SEP sesam Windows client package, but is not included in the Linux client package. To use it on Linux, you need to install SEP sesam RDS/Server to the Linux backup client. For details on the supported OS, see [[Special:MyLanguage/SEP_sesam_OS_and_Database_Support_Matrix|SEP sesam OS and Database Support Matrix]].</translate>
<translate><!--T:21-->
<translate><!--T:21-->
*At least one Si3 deduplication store has to be configured on either a SEP sesam Server or SEP sesam Remote Device Server. For details on how to set it up, see [[Special:MyLanguage/4_4_3_Grolar:Configuring_Si3_Deduplication_Store|Configuring Si3 Deduplication Store]].</translate>
*At least one Si3 deduplication store has to be configured on either a SEP sesam Server or SEP sesam Remote Device Server. For setup details, see [[Special:MyLanguage/Configuring_Si3_Deduplication_Store|Configuring Si3 Deduplication Store]].</translate>
<translate><!--T:22-->
<translate><!--T:22-->
*Si3S increases CPU overhead in the production environment to calculate hashes. Therefore the minimum requirements for the system which is going to be backed are:
*Si3S increases the CPU overhead in the production environment to calculate hashes. The minimum requirements for the system which is going to be backed are:
** Minimum of 2 CPU cores
** Minimum of 2 CPU cores
** 2 GB RAM</translate>
** 2 GB RAM</translate>
Line 78: Line 73:
| style="padding:0px 40px 0px 10px; color: black; font-size: 100%; text-align:left; width:90%" |  
| style="padding:0px 40px 0px 10px; color: black; font-size: 100%; text-align:left; width:90%" |  
<translate><!--T:23-->
<translate><!--T:23-->
*Currently external backup jobs such as Oracle, SAP or DB2, cannot use source-side deduplication.
*If source-side deduplication is set up for a group backup, it will be performed on the clients with the supported version. If source-side deduplication is not supported, a regular backup is started instead.
*If source-side deduplication is set up for a group backup, it will perform a source-side dedup on the clients with the supported version. If source-side dedup is not supported, a regular backup is started instead.
*Source-side deduplication will not work if the STPD service TCP port on the client side (in <tt>sm.ini</tt> and/or <tt>stpd.ini</tt>) is changed from the default port. Make sure you use the default STPD TCP port on the client side to be able to perform Si3S backups.</translate>
*In SEP sesam v. ≥ [[SEP_sesam_Release_Versions|4.4.3]], SEP Si3 source-side deduplication (Si3S) backup does not work, if the STPD service TCP port on the client side (in <tt>sm.ini</tt> and/or <tt>stpd.ini</tt>) is changed from the default port. Make sure that you use the default STPD TCP port on the client side to be able to perform Si3S backup.<!-- In v. ≥ 5.0.0 ''Jaglion'', you can avoid this issue by setting the STPD service TCP port on the client (client properties -> ''Options'' tab -> ''Listen port'') to the new TCP port.--></translate>
{{note|<translate><!--T:58--> In v. [[SEP sesam Release Versions|≥ 5.0.0 ''Jaglion'']], you can avoid this issue by setting the STPD service TCP port on the client (client properties -> ''Options'' tab -> ''Listen port'') to the new TCP port.</translate>}}
|}
|}


<translate>== {{anchor|configuration}}Configuring source-side deduplication== <!--T:24-->
== {{anchor|configuration}}<translate><!--T:24-->
Configuring source-side deduplication==
Configuring Si3S consists of 3 main steps:
Configuring Si3S consists of 3 main steps:
#Creating a required backup environment with a deduplication store. Check the [[Special:MyLanguage/SEP_sesam_Requirements#Si3_deduplication|Si3 Deduplication Hardware Requirements]] and follow the step-by-step procedure as described in [[Special:MyLanguage/4_4_3_Grolar:Configuring_Si3_Deduplication_Store|Configuring Si3 Deduplication Store]].</translate>
#Creating a required backup environment with a deduplication store. Check the [[Special:MyLanguage/SEP_sesam_Requirements#Si3_deduplication|Si3 Deduplication Hardware Requirements]] and follow the step-by-step procedure as described in [[Special:MyLanguage/5_0_0:Configuring_Si3_NG_Deduplication_Store|Configuring '''''Si3''''' Deduplication Store in v. ≥ 5.0.0 Jaglion]]. For older, deprecated version see [[Special:MyLanguage/4_4_3_Grolar:Configuring_Si3_Deduplication_Store|Configuring Si3 V1 Deduplication Store]].</translate>
<translate><!--T:25-->
<translate><!--T:25-->
#Once the Si3 deduplication store is created, configure the [[Special:MyLanguage/Configuring_a_Media_Pool|media pools]].</translate>
#Once the Si3 deduplication store is created, configure the [[Special:MyLanguage/Configuring_a_Media_Pool|media pools]].</translate>
<translate><!--T:26-->
<translate><!--T:26-->
#Set up your [[Special:MyLanguage/4_4_3_Beefalo:Backup|backup]] strategy by following the [[Special:MyLanguage/Standard_Backup_Procedure|standard backup procedure]]: First, you will [[Special:MyLanguage/Creating_a_Backup_Task|create a backup task]] by selecting the data to be backed up, then you will specify when you want to back up your data by [[Special:MyLanguage/Creating_a_Schedule|creating a backup schedule]], and afterwards you will [[Special:MyLanguage/Creating_a_Backup_Event|create a backup event]]. In this step you will also enable the SEP Si3 source-side deduplication (see below).</translate>
#Set up your [[Special:MyLanguage/Backup_Strategy_Best_Practices|backup strategy]] by following the [[Special:MyLanguage/Standard_Backup_Procedure|standard backup procedure]]: First [[Special:MyLanguage/Creating_a_Backup_Task|create a backup task]] by selecting the data to be backed up, then determine when you want to back up your data and [[Special:MyLanguage/Creating_a_Schedule|create a backup schedule]], and then [[Special:MyLanguage/Creating_a_Backup_Event|create a backup event]]. In this step, you also activate SEP Si3 source-side deduplication (see below).</translate>
{{<translate><!--T:27-->
{{tip|<translate><!--T:28--> You can use the '''Immediate Start''' button to [[#enable|enable Si3S and start your backup immediately]].</translate>}}
tip</translate>|<translate><!--T:28-->
You can also use the '''Immediate Start''' button to [[Special:MyLanguage/Source_Side_Dedup#enable|enable the Si3S and start your backup instantly]].</translate>}}


<translate>=== {{anchor|bck_event}}Creating a backup event with enabled Si3S === <!--T:29-->
=== {{anchor|bck_event}}<translate><!--T:29-->
When creating a backup event, you can also enable source-side deduplication.</translate>
Creating a backup event with enabled Si3S ===  
When you create a backup event, you also activate source-side deduplication.</translate>
<ol><li><translate><!--T:30-->
<ol><li><translate><!--T:30-->
From '''Main Selection -> Scheduling -> Schedules''', right-click the schedule for which you want to create a new event then click '''New Backup Event'''.</translate></li>
From '''Main Selection -> Scheduling -> Schedules''', right-click the schedule for which you want to create a new event, then click '''New Backup Event'''.</translate></li>
<li><translate><!--T:31-->
<li><translate><!--T:31--> Under ''Sequence control'',  you can set the '''Priority''' of your backup event. For details, see [[Special:MyLanguage/SEPuler_-_an_event_calendar#event_priority|Setting Event Priorities]].</translate></li>
Under ''Sequence control'',  you can set up the '''Priority''' of your backup event. For details, see [[Special:MyLanguage/4_4_3_Beefalo:SEPuler_-_an_event_calendar#event_priority|Setting Event Priorities]].</translate></li>
<li><translate><!--T:51--> Under ''Object'', select the ''task'' or ''task group'' you want to link this event.</translate></li>
<li><translate><!--T:51--> Under the ''Object'', select the ''task'' or ''task group'' to which you want to link this event.</translate></li>
<li><translate><!--T:32-->
<li><translate><!--T:32-->
Under ''Parameter'', specify the '''[[Special:MyLanguage/SEP_sesam_Glossary#backup_level|Backup level]]'''.</translate></li>
Under ''Parameter'', specify the '''[[Special:MyLanguage/SEP_sesam_Glossary#backup_level|Backup level]]'''.</translate></li>
<li><translate><!--T:33-->
<li><translate><!--T:33-->
From the '''Media pool''' drop-down list, select the target media pool to which the data will be backed up. Note that you have to select the '''media pool which is combined with an Si3 deduplication store backend'''.</translate></li>
From the '''Media pool''' drop-down list, select the target media pool to which the data will be backed up. Note that you have to select the '''media pool that is combined with an Si3 deduplication store backend'''.</translate></li>
<li><translate><!--T:34-->
<li><translate><!--T:34-->
Select the check box '''SEP Si3 Source Side Deduplication'''.</translate></li>
Select the '''SEP Si3 Source Side Deduplication''' check box.</translate></li>


<translate><!--T:35-->
<translate><!--T:35-->
[[image:SSDD_enable__Beefalo_V2.jpg|link=]]</translate>
[[image:SSDD_enable.jpg|link=]]</translate>
<br clear=all>
<br clear=all>
<li><translate><!--T:36-->
<li><translate><!--T:36-->
Click '''OK''' to save the event.</translate></li></ol>
Click '''OK''' to save the event.</translate></li></ol>


<translate>=== {{anchor|enable}}Enabling and starting Si3S instantly === <!--T:37--></translate>
=== {{anchor|enable}}<translate><!--T:37--> Enabling and starting Si3S instantly</translate> ===
<ol><li><translate><!--T:38-->
<ol><li><translate><!--T:38-->
From the menu bar, select '''Activities''' -> '''Immediate Start''' -> '''Backup'''.</translate></li>
From the menu bar, select '''Activities''' -> '''Immediate Start''' -> '''Backup'''.</translate></li>
<li><translate><!--T:39-->
<li><translate><!--T:39-->
In the ''Immediate Start: Backup'' dialog, select a '''deduplication media pool''' as your backup target.</translate></li>
In the ''Immediate Start: Backup'' dialog, select a '''deduplication media pool''' as the backup target.</translate></li>
<li><translate><!--T:40-->
<li><translate><!--T:40-->
The check box '''SEP Si3 Source Side Deduplication''' is shown: select it and click '''Start'''.</translate></li>
The check box '''SEP Si3 Source Side Deduplication''' is shown: select it and click '''Start'''.</translate></li>


<translate><!--T:41-->
<translate><!--T:41-->
[[image:SSDD_enable-immediate_start_Beefalo_V2.jpg|link=]]</translate>
[[image:SSDD_enable-immediate_start.jpg|link=]]</translate>
<br clear=all></ol>
<br clear=all></ol>


<translate>=== {{anchor|verify}}Verifying if Si3S is used === <!--T:42-->
=== {{anchor|verify}}<translate><!--T:42-->
You can verify if source-side deduplication is applied by selecting '''Job State''' -> '''Backups''' in the ''Main Selection'' window. The job state overview provides detailed information on backup status and shows a selected check box in the column '''Source Side Deduplication''' if source-side deduplication will be executed for a task. The Si3S status overview also provides information on the job status, deduplication ratio, start and stop time of the Si3S, data size and throughput, assigned media pool, etc. </translate>
Verifying if Si3S is used ===
You can verify if source-side deduplication is being applied by selecting '''Job State''' -> '''Backups''' in the ''Main Selection'' window. The job state overview provides detailed information on the backup status and shows a ticked check box in the column '''Source Side Deduplication''' if source-side deduplication is being applied to a job. The Si3S status overview also provides information on the job status, deduplication rate, Si3S start and stop time, data size and throughput, assigned media pool, etc. </translate>
 
<translate><!--T:43-->
<translate><!--T:43-->
[[File:Ssdd-status_Beefalo_V2.jpg|link=]]</translate>
[[File:SSDD-status.jpg|link=]]</translate>


<translate><!--T:52-->
{{tip|<translate><!--T:52--> You can check the details of your backups online as well as start your backups immediately, restart failed backups, restore backups online and more by using Web UI. For details, see [[Special:MyLanguage/SEP_sesam_Web_UI|SEP sesam Web UI]].</translate>}}
As of [[Special:MyLanguage/SEP_sesam_Release_Versions|4.4.3 ''Beefalo V2'']], you can also check the details of your backups online by using new Web UI. The information about the source-side deduplication execution is shown in the ''Main Log''. For details, see [[Special:MyLanguage/4_4_3_Beefalo:SEP_sesam_Web_UI|SEP sesam Web UI]].


==={{anchor|network}}What network port is used for backup?=== <!--T:44--></translate><!-- Is this Si3S specific? Should it be included above in the overview? Or should it be added to general FAQ article? -->
==={{anchor|network}}<translate><!--T:44--> Which network port is used for backups?</translate>===


<translate><!--T:45-->
<translate><!--T:45-->
The client will connect the RDS or backup server on the following destination port: '''11701 + the first dedup drive'''. For example, when the first dedup drive is 9, client will use the port 11710. Make sure that the respective port is opened in the firewall on RDS or SEP sesam Server. You may need to manually detect and open the relevant port. The source port will be random.
The client connects to the RDS or backup server using the following destination port: '''11701 + the first dedup drive'''. For example, if the first dedup drive is 9, the client uses port 11710. Make sure the respective port is open in the firewall on the RDS or SEP sesam Server. You may need to manually detect and open the corresponding port. The source port is chosen randomly.<br />For more information, see also [[Special:MyLanguage/List_of_Ports_Used_by_SEP_sesam|List of ports used by SEP sesam]]. </translate>
 
<div class="noprint">
==See also== <!--T:46-->


<!--T:47-->
<noinclude><div class="noprint">{{Copyright}}</div></noinclude>
[[Special:MyLanguage/SEP_sesam_Requirements#Si3_deduplication|Si3 Deduplication Hardware Requirements]] – [[Special:MyLanguage/4_4_3_Grolar:Configuring_Si3_Deduplication_Store|Configuring Si3 Deduplication Store]] – [[Special:MyLanguage/Deduplication|Deduplication]] – [[Special:MyLanguage/Replication|Replication]]</div></translate>

Latest revision as of 14:29, 17 April 2024

SEP sesam Si3 applies deduplication at the block level. With this technique, data is divided into blocks, each block is checked against the blocks that are already stored, and duplicates are skipped, so only unique blocks are sent to storage. Because identical data is stored only once, less storage space is required, and because duplicates are not transferred, the network load is reduced as well.
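
The block-level idea described above can be illustrated with a minimal sketch. It is not SEP sesam code: the fixed 4096-byte block size, the use of SHA-256 and the in-memory dictionary standing in for the store are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def split_blocks(data: bytes, size: int = BLOCK_SIZE):
    """Yield consecutive fixed-size blocks (the last one may be shorter)."""
    for offset in range(0, len(data), size):
        yield data[offset:offset + size]


def deduplicate(data: bytes, store: dict) -> list:
    """Store each unique block once; return the ordered list of block hashes."""
    recipe = []
    for block in split_blocks(data):
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:   # unique block: keep the data
            store[digest] = block
        recipe.append(digest)     # a duplicate only adds a reference
    return recipe


def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original data from the hash list and the stored blocks."""
    return b"".join(store[digest] for digest in recipe)


store = {}
data = b"A" * 8192 + b"B" * 4096 + b"A" * 4096  # four blocks, two distinct
recipe = deduplicate(data, store)
print(len(recipe), "blocks referenced,", len(store), "blocks stored")
```

In this toy run, four blocks are referenced but only two are stored, and `restore(recipe, store)` returns the original data.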

To support efficient data backup in different environments, SEP sesam offers a hybrid of both methods:

  • target-based (Si3T) and
  • source-based (Si3S) deduplication

Both methods use a configured Si3 deduplication data store that requires a special license. See Licensing for details.

Deduplication store types

Deprecated Si3 V1 deduplication store
As of SEP sesam v. 5.0.0 Jaglion, two Si3 deduplication store types are available. It is strongly recommended to use the new type SEP Si3 deduplication store as the old generation Si3 V1 deduplication store is deprecated. This means that the old generation Si3 V1 is no longer being enhanced, but is still supported until further notice.
Use the new Si3 deduplication store if the data is to be stored to S3 Cloud
  • If you are using an old generation Si3 V1 deduplication store with S3, you cannot restore from S3 using the GUI! See Enable Si3 setup on the same host to learn how to configure a new Si3 and an old Si3 V1 on the same backup server or RDS to make the upgrade from Si3 V1 to Si3 smoother.
Advantages of the new generation Si3 data store
Si3 is advantageous over the old Si3 V1 store type as it offers better performance and resource savings. You can back up your data directly to S3 cloud storage and Azure storage and restore the items you want directly from there. It also provides a new immutable storage feature – SiS. For more details, see Configuring Si3 Deduplication Store.

Note that the instructions for source-side deduplication are the same for both types of deduplication store. The store type is therefore not mentioned explicitly; the term Si3 store refers to both types of deduplication store.

What is Si3 source deduplication (Si3S)

Si3 source deduplication means that data is deduplicated before it is sent over the network, making the backup extremely bandwidth efficient. During the backup, SEP sesam calculates the hash values of the data to be backed up on the client and queries the storage to determine whether the hash value of each block is already stored there. If it is, SEP sesam sends only the hash value; if not, it sends the block itself. As a result, only blocks that are changed or unknown to the target Si3 deduplication store are sent to the backup server.
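
The exchange between client and store can be sketched as follows. This is a simplified model, not the SEP sesam protocol: the `DedupStoreStub` class, the 4096-byte block size and SHA-256 are hypothetical choices, and the "network" is just a byte counter.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed; not the real Si3 block size


class DedupStoreStub:
    """Hypothetical stand-in for the deduplication store on the server or RDS."""

    def __init__(self):
        self.blocks = {}

    def has(self, digest: bytes) -> bool:
        return digest in self.blocks

    def put(self, digest: bytes, block: bytes) -> None:
        self.blocks[digest] = block


def source_side_backup(data: bytes, store: DedupStoreStub) -> int:
    """Back up data and return the number of bytes counted as sent."""
    sent = 0
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).digest()  # hash is computed on the client
        sent += len(digest)                      # the hash is sent for every block
        if not store.has(digest):                # query the store
            store.put(digest, block)             # unknown block: send the data too
            sent += len(block)
    return sent


store = DedupStoreStub()
first = b"".join(bytes([i]) * BLOCK_SIZE for i in range(4))  # four distinct blocks
second = first[:BLOCK_SIZE] + bytes([9]) * BLOCK_SIZE + first[2 * BLOCK_SIZE:]

first_sent = source_side_backup(first, store)    # every block is new
second_sent = source_side_backup(second, store)  # only one block changed
print(first_sent, second_sent)
```

In this model the second run transfers four hashes plus a single changed block, which is why a low change rate translates into little network traffic.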

The advantage of Si3S deduplication is that only new or changed data is transferred to the backup server during the backup. This optimises bandwidth usage and requires less storage capacity. It can be used to minimize the data transferred during backup in situations where bandwidth is a problem and SEP sesam RDS cannot be used. See Deduplication for more details on recommended utilization of dedupe methods.

Not all data is suitable for deduplication. For example, encrypted files and disk blocks with a non-standard size cannot be deduplicated. See Data Deduplication Use Cases for more information.

Note
Using source-side deduplication does not necessarily shorten the backup window. Whether it does depends on your data structure: hashing chunks of data is very CPU intensive, so such backups might even take longer. Consider which clients could be overloaded in this way. In general, source-based deduplication can be an excellent solution for environments with a low daily data change rate and low bandwidth between the backup server and the backed up client.
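
The trade-off in this note can be made concrete with a rough back-of-the-envelope model. It is a deliberately crude sketch: it assumes hashing and transfer run one after the other, ignores the traffic for the hashes themselves, and all rates used are made-up illustration values rather than measurements of SEP sesam.

```python
def estimate_backup_seconds(data_bytes, change_rate, link_bytes_per_s, hash_bytes_per_s):
    """Return (regular, source_side) backup time estimates in seconds.

    regular:     all data is sent over the link.
    source_side: all data is hashed on the client, then only the changed
                 fraction is sent (hashing and transfer are not overlapped).
    """
    if not 0.0 <= change_rate <= 1.0:
        raise ValueError("change_rate must be between 0 and 1")
    if link_bytes_per_s <= 0 or hash_bytes_per_s <= 0:
        raise ValueError("rates must be positive")
    regular = data_bytes / link_bytes_per_s
    hashing = data_bytes / hash_bytes_per_s
    transfer = data_bytes * change_rate / link_bytes_per_s
    return regular, hashing + transfer


DATA = 100_000_000_000  # 100 GB of source data (illustrative)

# Slow link, low change rate: 10 MB/s link, 200 MB/s hashing, 2 % changed.
slow_link = estimate_backup_seconds(DATA, 0.02, 10_000_000, 200_000_000)

# Fast link, high change rate: 1 GB/s link, 200 MB/s hashing, 50 % changed.
fast_link = estimate_backup_seconds(DATA, 0.5, 1_000_000_000, 200_000_000)

print("slow link:", slow_link)
print("fast link:", fast_link)
```

With these illustrative numbers, source-side deduplication is much faster over the slow link, while over the fast link the hashing cost makes it slower than a regular backup.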

Key features

Source-side deduplication is easy to configure and has the following advantages:

  • Only new and unique data is backed up directly at the source.
  • As less data is sent over the network, bandwidth usage is reduced.
  • Less storage capacity is required.

Source-side deduplication can have the following disadvantages:

  • The backup client can become overloaded and the backup window can lengthen.
  • When used for virtual data centers where resources are shared between virtual machines, it can affect production workloads.

See Data Deduplication Use Cases for more information.

Prerequisites

Make sure that the following conditions are met before using deduplication:

  • Check that the required license is installed.
  • Si3S is supported on all available Linux (additional RDS required) and Windows operating systems. Si3S is already part of a SEP sesam Windows client package, but is not included in the Linux client package. To use it on Linux, you need to install SEP sesam RDS/Server to the Linux backup client. For details on the supported OS, see SEP sesam OS and Database Support Matrix.
  • At least one Si3 deduplication store has to be configured on either a SEP sesam Server or SEP sesam Remote Device Server. For setup details, see Configuring Si3 Deduplication Store.
  • Si3S increases the CPU overhead in the production environment because hashes have to be calculated. The minimum requirements for the system that is going to be backed up are:
    • Minimum of 2 CPU cores
    • 2 GB RAM
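
A quick pre-check of a client against these minimum values could look like the following sketch. The function name is hypothetical, "2 GB" is read as 2 GiB (an assumption), and the RAM value has to be supplied by the caller because the Python standard library has no portable way to query it.

```python
import os

MIN_CPU_CORES = 2
MIN_RAM_BYTES = 2 * 1024**3  # "2 GB" interpreted as 2 GiB; an assumption


def check_si3s_minimum(cpu_cores: int, ram_bytes: int) -> list:
    """Return a list of problems; an empty list means the minimum is met."""
    problems = []
    if cpu_cores < MIN_CPU_CORES:
        problems.append(
            f"need at least {MIN_CPU_CORES} CPU cores, found {cpu_cores}"
        )
    if ram_bytes < MIN_RAM_BYTES:
        problems.append(
            f"need at least {MIN_RAM_BYTES} bytes of RAM, found {ram_bytes}"
        )
    return problems


cores = os.cpu_count() or 1       # cpu_count() can return None
ram = 4 * 1024**3                 # example value; determine the real one per platform
for problem in check_si3s_minimum(cores, ram):
    print("Si3S prerequisite not met:", problem)
```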
Limitations
  • If source-side deduplication is set up for a group backup, it will be performed on the clients with the supported version. If source-side deduplication is not supported, a regular backup is started instead.
  • Source-side deduplication will not work if the STPD service TCP port on the client side (in sm.ini and/or stpd.ini) is changed from the default port. Make sure you use the default STPD TCP port on the client side to be able to perform Si3S backups.
Note
In v. ≥ 5.0.0 Jaglion, you can avoid this issue by setting the STPD service TCP port on the client (client properties -> Options tab -> Listen port) to the new TCP port.

Configuring source-side deduplication

Configuring Si3S consists of 3 main steps:

  1. Create the required backup environment with a deduplication store. Check the Si3 Deduplication Hardware Requirements and follow the step-by-step procedure described in Configuring Si3 Deduplication Store in v. ≥ 5.0.0 Jaglion. For the older, deprecated version, see Configuring Si3 V1 Deduplication Store.
  2. Once the Si3 deduplication store is created, configure the media pools.
  3. Set up your backup strategy by following the standard backup procedure: First create a backup task by selecting the data to be backed up, then determine when you want to back up your data and create a backup schedule, and then create a backup event. In this step, you also activate SEP Si3 source-side deduplication (see below).
Tip
You can use the Immediate Start button to enable Si3S and start your backup immediately.

Creating a backup event with enabled Si3S

When you create a backup event, you can also activate source-side deduplication.

  1. From Main Selection -> Scheduling -> Schedules, right-click the schedule for which you want to create a new event, then click New Backup Event.
  2. Under Sequence control, you can set the Priority of your backup event. For details, see Setting Event Priorities.
  3. Under Object, select the task or task group to which you want to link this event.
  4. Under Parameter, specify the Backup level.
  5. From the Media pool drop-down list, select the target media pool to which the data will be backed up. Note that you have to select the media pool that is combined with an Si3 deduplication store backend.
  6. Select the SEP Si3 Source Side Deduplication check box.
     (Screenshot: SSDD enable.jpg)
  7. Click OK to save the event.

Enabling and starting Si3S instantly

  1. From the menu bar, select Activities -> Immediate Start -> Backup.
  2. In the Immediate Start: Backup dialog, select a deduplication media pool as the backup target.
  3. The check box SEP Si3 Source Side Deduplication is shown: select it and click Start.
     (Screenshot: SSDD enable-immediate start.jpg)

Verifying if Si3S is used

You can verify whether source-side deduplication is being applied by selecting Job State -> Backups in the Main Selection window. The job state overview provides detailed information on the backup status and shows a ticked check box in the column Source Side Deduplication if source-side deduplication is being applied to a job. The Si3S status overview also provides information on the job status, deduplication rate, Si3S start and stop time, data size and throughput, assigned media pool, etc.

(Screenshot: SSDD-status.jpg)

Tip
You can check the details of your backups online as well as start your backups immediately, restart failed backups, restore backups online and more by using the Web UI. For details, see SEP sesam Web UI.

Which network port is used for backups?

The client connects to the RDS or backup server using the following destination port: 11701 + the number of the first dedup drive. For example, if the first dedup drive is 9, the client uses port 11710. Make sure the respective port is open in the firewall on the RDS or SEP sesam Server. You may need to determine and open the corresponding port manually. The source port is chosen randomly.
For more information, see also List of ports used by SEP sesam.
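
The port rule above is simple arithmetic and can be combined with a basic reachability test. The following is a minimal sketch: the function names are hypothetical, the host name in the usage note is a placeholder, and a successful TCP connect only shows that the port is reachable, not that the Si3 service behind it is working.

```python
import socket

BASE_PORT = 11701  # from the text: destination port = 11701 + first dedup drive


def si3s_port(first_dedup_drive: int) -> int:
    """Return the destination port for the given first dedup drive number."""
    if first_dedup_drive < 0:
        raise ValueError("drive number must not be negative")
    return BASE_PORT + first_dedup_drive


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


port = si3s_port(9)
print(port)  # 11710, matching the example in the text
```

To test the firewall from a client, you could call `port_is_open("rds.example.com", si3s_port(9))`, replacing the placeholder host with the name of your RDS or SEP sesam Server.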

Copyright © SEP AG 1999-2024. All rights reserved.
Any form of reproduction of the contents or parts of this manual is allowed only with the express written permission from SEP AG. When compiling and designing user documentation SEP AG uses great diligence and attempts to deliver accurate and correct information. However, SEP AG cannot issue a guarantee for the contents of this manual.