Source:Troubleshooting Backup

Revision as of 11:47, 12 February 2021 by Sta (talk | contribs) (Rewritten and updated according to existing DE article network connection problem (error 10054).)



Welcome to the latest SEP sesam documentation version 4.4.3/4.4.3 Beefalo V2. For previous documentation version(s), check documentation archive.


Backup problems

Client backup error

Problem

  • A client backup did not function properly. How can I determine where the problem is?

Solution

  • Run the following commands to determine the error:
Warning
The following commands produce a high network load.

BACKUP SERVER UNIX, CLIENT WINDOWS

 sm_ctrlc -l system {client name} sbc -b -s - f:/test >/dev/null

Data from the f:/test directory on the Windows client is sent over the network to the Unix server and discarded in /dev/null. To display the data being read, add -v 1 to the command above:

 sm_ctrlc -l system {client name} sbc -b -s - -v 1 f:/test >/dev/null

BACKUP SERVER UNIX, CLIENT UNIX

 sm_ctrlc -l root {client name} sbc -b -s - /usr >/dev/null

To display the read data:

 sm_ctrlc -l root {client name} sbc -b -s - -v 1 /usr >/dev/null

BACKUP SERVER WINDOWS, CLIENT UNIX

 sm_ctrlc -l root {client name} sbc -b -s - /usr > NUL

With backup data logging:

 sm_ctrlc -l root {client name} sbc -b -s - -v 1 /usr > NUL

If the test backup is to be run on the target backup client only, execute the following command:

In the Unix directory <sesam>/bin/sesam/:

 sbc -b -s - /usr >/dev/null

In the Windows directory <SESAM_ROOT>\bin\sesam\:

 sbc -b -s - f:/test > NUL

Add -v 1 to display the backed-up data on screen.

Failed backups will be automatically deleted

Problem

  • By default, SEP sesam deletes all failed backups after 3 days automatically to release the storage space. How can I keep such backups for a longer time?

Solution

  • To keep failed backups for more than 3 days, manually extend the EOL (expiration date) of the saveset in question. For details, see Manually extending EOL. Note: As of v. 4.4.3 Tigon V2, SEP sesam automatically retains the last successful backup saveset when the next backup fails. This means that SEP sesam extends the EOL of the previous 'successful' or 'with warnings' backup, thus ensuring that at least one successful backup is retained.

Failed to write the data to media during backup

Problem

  • During backup, operating system error "23 (ERROR_CRC)" is displayed.

Possible causes

  • The tape drive cannot write proper blocks onto the backup media.

Solution

  • Check the tape drive and backup media.

Incorrect login or password

Problem

  • During backup, the message "Login incorrect. Password incorrect" is displayed.

Solution

  • Check your name resolution (DNS or the /etc/hosts file). The SEP sesam Server and SEP sesam Client must be able to reach each other with or without the FQDN and must resolve each other correctly, including the reverse lookup. If the resolution is correct, do the following:
  1. In the SEP sesam GUI, go to Main Selection -> Tasks -> By clients, and select the client with the backup problem.
  2. Open the backup properties and click the Options tab.
  3. Type -v 4 at Save options.
  4. Start the backup again, then go to Main Selection -> Job state -> Backups and double-click the backup task to open its properties.
  5. Go to Logging -> Day log and search for the line "Login incorrect. Password incorrect", then correct the name resolution.
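The forward and reverse lookup mentioned above can be checked with a small script on Linux/Unix systems. The following is a minimal sketch, assuming that getent is available; the function name check_name_resolution and the host name in the usage comment are hypothetical. Run it on both the SEP sesam Server and the client, each time with the name of the other system.

```shell
# Sketch: verify forward and reverse name resolution for one host.
# Assumes getent(1) is available; the function name is hypothetical.
check_name_resolution() {
    host=${1:-}
    if [ -z "$host" ]; then
        echo "usage: check_name_resolution <host name>" >&2
        return 2
    fi
    # Forward lookup: first address returned for the host name.
    ip=$(getent hosts "$host" | awk '{ print $1; exit }')
    if [ -z "$ip" ]; then
        echo "forward lookup failed for $host" >&2
        return 1
    fi
    # Reverse lookup: first name returned for that address.
    name=$(getent hosts "$ip" | awk '{ print $2; exit }')
    if [ -z "$name" ]; then
        echo "reverse lookup failed for $ip" >&2
        return 1
    fi
    echo "$host -> $ip -> $name"
}

# Example (hypothetical host name):
#   check_name_resolution client.example.com
```

If the name printed at the end differs from the name you started with, the forward and reverse entries do not agree and should be corrected in DNS or in the hosts file.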

Network connection failure on physical or virtual systems

Problem

  • SEP sesam backups of physical or virtual systems fail with the network connection error 10054. The log files contain one of the following error messages:
    • 10054 An existing connection was forcibly closed by the remote host
    • Error: Network communication problem: SOCKET error: 10054 - The virtual circuit was reset by the remote side. recv() call failed.
  • You can find additional error codes in the Microsoft Developer Network documentation.

Possible causes

Cause 1: Virtualization solutions

Installing Citrix XenTools or VMware Tools may install a vendor-specific network card driver in the virtual machine, which is then used instead of the regular Windows system drivers.

The paravirtual network drivers may sometimes cause the following problems to occur:

  • General CPU load of approx. 30% within a VM without any actions
  • Very poor throughput, even in isolated networks or gigabit LANs
  • Broken connections

According to Microsoft documentation, the above error message shows that the connection was reset by the client, i.e., the system to be backed up.

At the same time, the problem is visible on the SEP sesam Server:

  • The backup remains in the active status with a 0 GB/h performance.
  • There are active sm_sms_backup processes.
  • There are active sm_stpd processes.

SEP sesam receives no feedback from the client that the backup was aborted, because the connection was hard reset.
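On a Linux SEP sesam Server, the process check described above could be sketched as follows. The helper name count_sesam_procs is hypothetical; the process names are the ones listed above, and the input is expected to be one command name per line, for example from ps -e -o comm=.

```shell
# Sketch: count sm_sms_backup and sm_stpd processes in a process listing.
# Reads one command name per line on stdin; the function name is hypothetical.
count_sesam_procs() {
    # grep -c prints 0 and returns 1 when nothing matches, so mask the status.
    grep -c -E '^(sm_sms_backup|sm_stpd)' || true
}

# Example:
#   ps -e -o comm= | count_sesam_procs
```

A non-zero count while the backup shows 0 GB/h matches the symptoms listed above.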

Cause 2: Network/Port trunking

In several customer environments, the error was traced to port trunking, either on the system to be backed up or on the backup server side. After the trunking issues were resolved, the backups ran without problems. Ordinarily, this type of disconnection happens in less than 5 minutes, but it can also happen randomly. It is almost always related to a trunking problem but could also be the result of TSO corruption on virtual machines or even on physical machines with first-revision 10/100 network cards.

To resolve trunking issues, see Trunking configuration guide.

Cause 3: Firewall

Firewalls actively intervene in the transport of data through stateful inspection measures and can thus be a cause of the problem. If the disconnection occurs in less than 5 minutes, it is most likely the TSO/trunking issue described under Cause 2. If the disconnection occurs almost exactly after 5, 60, or 120 minutes, there is a KeepAlive problem with the NAT/PAT router or a firewall.
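The timing rule above can be written down as a small helper that takes the time until disconnection in seconds. This is only an illustrative sketch: the function name classify_disconnect and the 30-second tolerance used for "almost exactly" are assumptions, not values documented by SEP sesam.

```shell
# Sketch: map the time until disconnection (in seconds) to the likely cause.
# The 30-second tolerance for "almost exactly" is an assumption.
classify_disconnect() {
    secs=$1
    tolerance=30
    # 5, 60 and 120 minutes, expressed in seconds.
    for mark in 300 3600 7200; do
        diff=$((secs - mark))
        if [ "$diff" -lt 0 ]; then
            diff=$((0 - diff))
        fi
        if [ "$diff" -le "$tolerance" ]; then
            echo "keepalive"          # NAT/PAT router or firewall
            return 0
        fi
    done
    if [ "$secs" -lt 300 ]; then
        echo "tso_or_trunking"        # see Cause 2 and Cause 6
    else
        echo "undetermined"
    fi
}

# Example:
#   classify_disconnect 3610
```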

Cause 4: Virus scanner

Virus scanners intervene actively in the data transmission and related processes and may thus be the cause of the problem. A false positive from an antivirus application will cause intermittent and random failures. It is recommended to configure the antivirus exclusions for SEP sesam, because antivirus scanners can have adverse effects on backup and restore operations, as described in FAQ: What effect does an antivirus scanner have on SEP sesam?

Cause 5: Error on SEP sesam Server side

If a problem with the process sm_stpd.exe or sm_stpd is reported in the event viewer or in the dmesg output of a SEP sesam Server at the time the backup is aborted, the problem must be analysed on the SEP sesam Server side. Note that incorrect routing can also be the cause of this problem.

For example, the SEP sesam Server and SEP sesam Client are on two different networks. Routing is not controlled by one central router but via static routes on the other network. A ping from the SEP sesam Server to the client works, but not vice versa. In such a case, the permanent route to the client's network was set on the SEP sesam Server, but it was not part of the active routes.

To solve this problem, delete and re-add the route as follows:

route delete <client_network>
route add -p <client_network> mask <network_mask_of_the_client_network> <SEP_sesam_Server_IP>

Cause 6: Disabling task offloading

For VMs

Disable the TSO (TCP Segmentation Offload) feature in the VM.

Both Citrix XenServer and VMware ESXi use the TSO feature of the network cards. TSO offloads operations that must be performed when network packets are segmented, such as checksum calculation, to the network interface card (NIC). The purpose is to reduce the CPU load, as the actual calculation is then performed on the network card. Note that the XenServer NIC drivers are more affected by this problem.

On Windows

To disable this feature on Windows, the key DisableTaskOffload has to be set in the registry to prevent offloading to the network card, see Microsoft documentation. After setting this option on the backup client and then restarting, further backups run without any problems.

On Linux

On Linux, you can activate or deactivate TSO at runtime with ethtool as follows (eth0 is an example interface name):

ethtool -K eth0 tso off
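To confirm that the setting took effect, the feature list printed by ethtool -k (lowercase k) can be inspected. The following sketch extracts the TSO state from that output; the function name tso_state is hypothetical, and eth0 is again only an example interface name.

```shell
# Sketch: print the TSO state ("on" or "off") found in `ethtool -k` output.
# Reads the feature list on stdin; the function name is hypothetical.
tso_state() {
    awk '$1 == "tcp-segmentation-offload:" { print $2; exit }'
}

# Example:
#   ethtool -k eth0 | tso_state
```

Settings changed with ethtool -K usually apply only to the running system, so they may need to be added to the network configuration of the distribution to survive a reboot.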

TCP retransmission

Fixing TCP retransmission errors at the network level fixes the root cause of the problem. To find TCP retransmission errors, the network analysis tool Wireshark is required.

You have to start a network analysis with Wireshark on the affected system (Capture -> Interfaces).

Note
Retransmission errors appear as black lines in the Wireshark capture.

ON WINDOWS

Note
SEP sesam uses Microsoft’s Volume Shadow Copy Service (VSS) to perform backups for various task types. VSS failures are typically caused by the system configuration and not by SEP sesam; this section provides instructions on how to troubleshoot such issues. If you cannot find your issue here, also check VSS Troubleshooting or refer to Microsoft's article Volume Shadow Copy Service for more detailed information on VSS.

Path backup on Windows is not working

Problem

  • SEP sesam Server failed to execute a path backup of a Windows client (however, a path backup without VSS is working). The following error is displayed:
Problem while loading dynamic link library: [WIN32 API error: 1114 - A dynamic link library (DLL) 
initialization routine failed. LoadLibrary() call failed for: [vss.dll]].

Possible causes

  • Typically, when a DLL cannot be initialized, a running antivirus solution is preventing it.
  • The client's VSS configuration is incorrect.

Solution

  • Disable the antivirus software during backup.
  • Check the client's VSS configuration by running the following commands:
    • vssadmin list writers - check the status of all writers on a daily basis.
    • vssadmin list shadows - check existing VSS copies.
    • vssadmin list shadowstorage - check how much space is reserved and how much is available for VSS.
    • vssadmin resize shadowstorage - change the space reserved for VSS snapshots, for example to unlimited (/MaxSize=UNBOUNDED).

WIN32 API error: "1450"

Problem

  • What should I do when a client backup fails with a WIN32 API error: "1450 - Not enough system resources to execute the requested service"?
  • The backup of a client may end with the following error message in the backup log:
sbc-1148: Error:   W2KSS Error: [WIN32 API error: 1450 - Not enough system resources to execute 
the requested service. Cannot store registry key: [SOFTWARE]. RegSaveKey() call failed in BackupRegistry().].

Possible causes

  • Insufficient size of the registry/paged memory area. This problem affects SEP sesam as well as other backup tools, such as NTBackup.

Solution

System state backup is backing up large amount of data due to DFS

Problem

  • When performing a Windows system state backup on a server that runs DFS (Distributed File System), a large amount of data is backed up and the backup is slow.

Possible cause

  • If DFS replication is enabled, the system state backup may also include all data from the DFS Replication service.

Solution

Note
On a Domain Controller (DC), DFS is used to replicate SYSVOL; therefore, it should be included in the system state backup. Use the following procedure only as a workaround if your system state backups are slow due to a large amount of DFSR data.
  • Exclude the DFS writer from the system state backup task in the SEP sesam GUI: From Main Selection -> Tasks -> By clients, select your Windows client, then click New backup task. In the New backup task window, switch to the Options tab and under Additional call arguments -> Backup options exclude the DFS writer as follows:
    • -x "VSS:/DFS Replication service writer"
  • Next, back up the file system data served by DFS separately by performing a regular Path backup (create a Path task type instead of a System_state backup task). For details, see Backing up System State.

System_State backup (RegLoadKey) error

Problem

  • What does the warning "The system cannot find path. RegLoadKey()..." during System_State backup mean?
  • You may see the following output in NOT-log:
C:\Program Files\SEPsesam\var\tmp\usr_wf_S-1-5-21-220523388-1123561945-839522115-1003].
2010-04-13 02:04:20: sbc-2074: Warning: W2KSS Warning: [WIN32 API error: 3 -
The system cannot find path. RegLoadKey() call failed for
file: [C:\Documents and Settings\nn\ntuser.dat] in BackupUserProfiles().].

Possible causes

  • There are inconsistencies in the OS configuration. The reason is that a user profile has been deleted but the user account still exists. System_State backup is looking for files corresponding to the user in the file system but the files no longer exist.

Solution

  • Delete the user in question or restore the profile data in the file system.
  • Check the following in your registry to see whether it still includes references to usernames which no longer exist:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\ProfileList

Access denied during Microsoft Windows backup via VSS

Problem

  • Why does a Microsoft Windows backup via VSS stop with the message "[CVssBaseObject::CreateVssBackupComponents] - Access denied"?

Possible causes

  • SEP sesam is not allowed to create a snapshot with the current user.

Solution

  • Check the user running the SEP sesam daemon and make sure that the user has all permissions to access the volume(s).

Stream data length exceeds buffer capacity

Problem

  • What does the warning "Stream data length bigger than buffer can accept. Input buffer length = [65536], Stream data size = (High part)[0] (Low part)[65564]" mean?

Possible causes

  • SEP sesam uses a 64 kB buffer to back up the Windows ACLs of files and folders, and the ACL of one object exceeds this buffer. You can use the Windows command icacls to display the ACL of a file or folder. The output looks like this:
C:\>icacls "C:\Documents and Settings\LocalService\Local Settings\Temp"
C:\Documents and Settings\LocalService\Local Settings\Temp NT AUTHORITY\LOCAL SERVICE:(I)(F)
                                                           NT AUTHORITY\LOCAL SERVICE:(I)(OI)(CI)(IO)(F)
                                                           NT AUTHORITY\SYSTEM:(I)(F)
                                                           NT AUTHORITY\SYSTEM:(I)(OI)(CI)(IO)(F)
                                                           BUILTIN\Administrators:(I)(F)
                                                           BUILTIN\Administrators:(I)(OI)(CI)(IO)(F)
Successfully processed 1 files; Failed processing 0 files

If you get several hundred or thousand lines, there is something wrong with the ACL.

Solution

  • Reset the permissions of the respective file or folder by using the command:
C:\>icacls "C:\Documents and Settings\LocalService\Local Settings\Temp" /reset

This command replaces the ACL of the object with the permissions inherited from the parent object. You may have to adjust the permissions after running this command if manual settings had been applied to this object.

Backup on Windows 7 did not complete successfully

Problem

  • When you perform a Windows 7 backup, the backup might fail with one of the following errors.
sbc-1146: Error:   DB Module: [WIN32 API error: 55 - The specified network resource or device is no longer available.
sbc-2040: Warning: Cannot read item [\\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy730\Windows\winsxs\x86_microsoft-windows-sort_31bf3856ad364e35_6.1.7600.16385_none_ab9479767ad67fd7\sort.exe: (2) WIN32 API error]:[ 2 - The system cannot find the file specified. ]. Padding remaining bytes...
smk-3506: Info:     Backup finished. Status: ERROR Error: Item generator returns [WIN32 API error: 2 - The system cannot find the file specified. ] for item [\\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy136\Windows\winsxs\FileMaps\]
smk-3506: Info:     Backup finished. Status: ERROR Error:   DB Module: [Not all items have been processed]
  • To see a more detailed error message, view the Event Log on your Windows client: Event Viewer -> Windows Logs -> System. Event ID 33 (source: volsnap) shows the following message:
The oldest shadow copy of volume C: was deleted to keep disk space usage for shadow copies of volume C: below the user defined limit.

Possible causes

  • There was not enough disk space for the Volume Shadow Copy Service (VSS) to keep the snapshot; consequently, the operating system automatically deleted the snapshot due to the lack of space.
  • This error occurs if the snapshot size exceeds the snapshot Maximum size: Use limit configured for the volume.

Solution

  • Check the Maximum size: Use limit on your Windows 7 client: Open a Windows Explorer on the client and right-click the drive letter to open the drive properties. Select the Shadow copies tab -> Settings -> check the Maximum size: Use limit:
    • If you select No limit, Windows will not delete the snapshot regardless of how much space it consumes. Note, however, that creating a large snapshot might still fail if there is not enough disk space left on the respective drive.
    • If you specify Use limit, change the maximum size limit to a value large enough to contain the snapshot with room to spare. According to Microsoft, it is recommended to set the maximum size limit to 10% of the volume size. For example, if the C:\ volume is 100 GB in size, the limit should be set to 10 GB. Alternatively, you can store the shadow copy data on another disk. For details, see Microsoft forum answers Shadow Copy.

ON LINUX

SEP sesam Linux Client error during backup

Problem

  • A SEP sesam Linux Client (SBC) issues an error or warning during backup.

Possible causes

Backup on Linux may finish with an error or warning if:

  • the size of a file has changed during backup
  • a file is deleted during the backup (between 'find' and data processing)
  • the 'find' function encountered an error

Solution

  • To avoid these warnings and resolve the above errors, double-click the backup task to open its properties and, under the Options tab, enter the following option in the Backup options field:
    • -o ignore_finderr=<regex>|ALL
  • If you want to avoid all such errors/warnings, specify:
    • -o ignore_finderr=ALL

GVFS error during Linux backup

Problem

  • If a user is logged on to a GNOME or KDE session and makes use of the GVFS layer, the directory ~/.gvfs is created. This directory cannot be entered by any other user (not even root).
  • Additionally, the system call stat() fails on this directory. The directory cannot be excluded, because sbc_find calls stat() on it once while creating the file list and evaluating the excludes, and receives an error.

Solution

  • Create a new file named /etc/profile.local with the following contents:
GVFS_DISABLE_FUSE=1 
export GVFS_DISABLE_FUSE
  • Run the following command for each affected folder:
    • fusermount -u /home/$USER/.gvfs
    • Test the result with stat /home/$USER/.gvfs
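To find the affected folders on a system with several logged-on users, the mount table can be filtered for GVFS FUSE mounts. The sketch below assumes that these mounts appear with a file system type beginning with fuse.gvfs (for example fuse.gvfs-fuse-daemon), which may differ between distributions; the function name gvfs_mount_points is hypothetical.

```shell
# Sketch: list GVFS FUSE mount points from mount-table lines on stdin.
# Expected format (as in /proc/mounts): device mountpoint fstype options ...
# Assumes the file system type starts with "fuse.gvfs"; name is hypothetical.
gvfs_mount_points() {
    awk '$3 ~ /^fuse\.gvfs/ { print $2 }'
}

# Example: show the folders that still need fusermount -u
#   gvfs_mount_points < /proc/mounts
```

Each path printed is a candidate for the fusermount -u command shown above.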

See also

VSS Troubleshooting
Backup