Replication Best Practices
Overview
Replication is the process of copying data from one host to another (between backup-to-disk devices capable of replication) in a block-level, incremental fashion and is an important subset of the larger disaster recovery (DR) effort.
SEP sesam provides different replication types. SEP Si3 replication enables you to replicate data between a SEP sesam Server and a SEP sesam Remote Device Server (RDS), or between two RDSs. You can also use HPE Catalyst stores and HPE Cloud Bank Storage, or S3 cloud storage, as a replication target. For details, see About Replication.
Below are some important best practices to keep in mind to make sure your data is replicated efficiently and effectively.
Establishing a data replication strategy
General considerations
How much disk space is needed on the target server?
Replication sizing must consider the current amount of existing data in order to determine the disk space needed on the source and the target server, which should be at least the same. Since the full data set is replicated only once, the volume of new data created on a daily basis is even more important, because new data always has to be replicated. Ensure that the target media pool has at least as much space available as the source media pool.
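As a rough illustration, the space needed on the target can be estimated from the initial data volume, the daily change rate, and the retention time. The following is a minimal sketch with illustrative figures and a hypothetical deduplication ratio; none of these values are SEP sesam requirements.

```python
# Rough sizing sketch for the target media pool (all figures are
# illustrative assumptions, not SEP sesam requirements).
def required_pool_gb(initial_data_gb, daily_change_gb, retention_days,
                     dedup_ratio=1.0):
    """Estimate disk space needed on the replication target.

    The full data set is replicated once; afterwards only the daily
    change (churn) is transferred, and each increment is kept until
    its retention expires. dedup_ratio > 1 shrinks the on-disk size.
    """
    logical = initial_data_gb + daily_change_gb * retention_days
    return logical / dedup_ratio

# Example: 2 TB initial data, 50 GB/day churn, 30-day retention,
# and an assumed 3:1 deduplication ratio.
print(round(required_pool_gb(2000, 50, 30, dedup_ratio=3.0)))  # 1167
```

Adjust the deduplication ratio to what you actually observe on your Si3 store; a too-optimistic ratio undersizes the target pool.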
Can the retention times of the source and target media pool be different?
Different retention times for replication media pools are supported. The target media pool should be the same size as or larger than the source media pool, and it should have the same or a longer retention time set, so that the replicated savesets have at least the same EOL (end-of-life) as the original backup, or a longer one. For details, see Managing EOL.
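The rule above reduces to a simple comparison, sketched here as an illustrative helper (the function name and day-based units are assumptions, not part of SEP sesam):

```python
# Minimal sketch: the target pool's retention must not be shorter than
# the source pool's, so replicated savesets never expire before the
# original backups. Units are days; names are illustrative.
def retention_is_valid(source_days: int, target_days: int) -> bool:
    """True if replicated savesets are kept at least as long as the
    original backups."""
    return target_days >= source_days

print(retention_is_valid(source_days=14, target_days=30))  # True
print(retention_is_valid(source_days=30, target_days=14))  # False
```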
Determine available network bandwidth between locations
Data replication can place a huge strain on a network's bandwidth, especially if large amounts of data are being replicated to multiple servers. The rate of change of your applications affects the bandwidth requirements of your replication solution as well as your RPO (Recovery Point Objective) requirements. Ideally, you should have a dedicated connection between servers. The amount of data to be copied also determines the network bandwidth required to move it.
Test your data replication and disaster recovery plan
Test your DR environment to make sure you have addressed any possible infrastructure changes. In addition, test the order of operations to be certain that all systems communicate properly, and access replicated files regularly to verify that they have not become corrupted.
Replication checklist
1. Replication licenses
Depending on your environment (Si3R, HPE StoreOnce VSA, Cloud Storage), a valid replication license is required. For details, see List of Licenses.
2. Use a high-performance disk
Ensure that enough disk space is available in the media pool on the target (at least as much as on the source server) and that your storage can be extended to meet the needs of deduplication. Keep in mind that horizontal scaling might become necessary.
The disk you use for replication should have a minimum of 1 TB free hard disk space.
3. Processor cores and memory
If replicating to an Si3 deduplication store, ensure a sufficient amount of memory and CPU cores. The minimum processor core and memory requirements for Si3 are:
For TEST environments:
- 8 GB RAM
- 2 CPU cores
For PRODUCTIVE environments:
- 16 GB RAM
- 4 CPU cores for one Si3 deduplication store
See Configuring and Administering Si3 Deduplication Store by using CLI for information on the amount of additional RAM required for one Si3 data store.
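A quick preflight check against the minimums listed above can be sketched as follows; the function name and the dictionary layout are illustrative, and detecting the host's RAM is platform-specific, so it is left out here.

```python
# Hedged preflight sketch using the minimum figures from the text
# above (names and structure are illustrative, not a SEP sesam API).
import os

SI3_MINIMUMS = {
    "test": {"ram_gb": 8, "cpu_cores": 2},
    "productive": {"ram_gb": 16, "cpu_cores": 4},
}

def meets_si3_minimum(ram_gb: float, cpu_cores: int,
                      environment: str = "productive") -> bool:
    """Check a host's RAM and core count against the Si3 minimums."""
    m = SI3_MINIMUMS[environment]
    return ram_gb >= m["ram_gb"] and cpu_cores >= m["cpu_cores"]

# The local core count is available via os.cpu_count(); here we pass
# fixed example values so the result is deterministic.
print(meets_si3_minimum(ram_gb=8, cpu_cores=2, environment="test"))  # True
```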
4. Network connection
Ensure that there is a reliable network connection between servers.
Note
NAT (Network Address Translation) infrastructure is not supported.
5. Rate of data change (churn)
The rate of data change (the volume of new data created each day) is an important consideration, since new data will always have to be replicated. Also take into account the changes that users make to existing files.
The rate of change of your data will impact the bandwidth requirements of your replication solution and your recovery point objective (RPO) requirements. A high rate of change refers to data that is constantly changing. If you have a low rate of change, your RPO can be longer.
6. Bandwidth requirements based on the amount of replicated data
Replication, rate of change, and bandwidth are related: the amount of data that must travel across the network to the target site varies with the rate of change. Determine the available network bandwidth between locations, as it directly affects replication performance, and test SEP sesam replication processing to determine how much workload your network can handle.
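As a back-of-the-envelope check, the time needed to replicate the daily churn over a given link can be estimated as below. The efficiency factor is an assumption standing in for protocol overhead and deduplication savings, which vary per environment.

```python
# Back-of-the-envelope sketch: how long does replicating the daily
# churn take over a given link? Figures are illustrative; real
# throughput depends on dedup savings and protocol overhead.
def replication_hours(daily_change_gb, link_mbit_s, efficiency=0.7):
    """Hours needed to push daily_change_gb over a link rated at
    link_mbit_s megabits/second with the given effective utilization."""
    gbits = daily_change_gb * 8              # GB -> gigabits
    seconds = gbits * 1000 / (link_mbit_s * efficiency)
    return seconds / 3600

# Example: 50 GB/day of churn over a 100 Mbit/s link at 70% efficiency.
print(round(replication_hours(50, 100), 2))  # 1.59
```

If the result approaches or exceeds your replication window, you need either more bandwidth, a lower rate of change in the replicated pool, or a relaxed RPO.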
7. Determining the replication source and scheduling
Specify the data (that is, the media pool) to be replicated. You can reduce network load with well-planned scheduling and replication scenarios. To automate your replication, add your replication task to one or more schedules:
- In the Main Selection -> Scheduling -> Schedules, click New Schedule. The New Schedule window appears.
- Configure your schedule and click OK.
- Right-click the schedule you have just created and select New Replication Event. The New Replication Event window appears.
- From the Task name drop-down list, select the replication task you want to link to the schedule.
- Optionally, check the parameters, then click OK to link the event to the schedule.
For a step-by-step procedure, see Scheduling Replication.