Troubleshooting Disk Space and Partition-Related Alerts

Problem Summary In some scenarios, the Common partition fills up and several alerts related to disk space usage are raised. This article explains how to understand and troubleshoot such scenarios.
Error Message None.
Possible Cause This can happen when the application fills the debug logs, log file growth is not capped, and the LPM (Log Partition Monitoring) tool is unable to monitor or delete the logs.
Recommended Action The partition-related alerts include:
  • LogPartitionLowWaterMarkExceeded : This alert indicates that the available disk space is low on the log (Common) partition.
  • LogPartitionHighWaterMarkExceeded : This alert indicates that the available disk space is low on the log partition. The LPM generates it when disk usage is above the configured threshold, and then starts deleting files to bring log partition disk usage below the low water mark.
    • Once the alert is generated, use TLC (Trace and Log Central) to download the files and then delete them from the server manually. Alternatively, use the CLI "file get" and "file delete" commands to transfer or delete the files.
    • Log (Common) partition usage can be monitored on the RTMT "Disk Usage" precanned page. Compare the alert settings for LogPartitionLowWaterMarkExceeded and LogPartitionHighWaterMarkExceeded against the current disk usage of the Common partition.
    • Check the following to find what is contributing to the disk usage:
      • Check whether a large number of core dump files have been generated.
      • Check whether Troubleshooting Trace Settings are turned on in the Cisco Unified Serviceability interface; if so, turn them off.
      • Check that no service trace settings use the Detailed level. If any do, change them to the default Significant level.
      • If remote access to the machine is enabled, run the command "du -H" under /common and capture the command output.
    • Check whether the LPM service is up and running. Collect the LPM logs using RTMT Trace and Log Central.
  • LowActivePartitionAvailableDiskSpace : This alert indicates that the available disk space is low on the active partition. It is generated when active partition disk usage is above the threshold.
  • LowInactivePartitionAvailableDiskSpace : This alert indicates that the available disk space is low on the inactive partition. It is generated when inactive partition disk usage is above the threshold.
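
The water mark comparison and the per-directory usage check described above can be sketched in Python. This is only an illustration, not part of the product: the threshold values (90 and 95 percent used) are assumptions and should be replaced with the values shown in the RTMT alert settings, the function names are hypothetical, and the script assumes shell access to a host where Python is available and /common is mounted. It sums apparent file sizes, so its totals can differ from what "du" reports.

```python
import os
import shutil

# Assumed thresholds in percent used; take the real values from the RTMT alert settings.
LOW_WATER_MARK = 90
HIGH_WATER_MARK = 95


def classify_usage(used_percent, low=LOW_WATER_MARK, high=HIGH_WATER_MARK):
    """Return the log partition alerts whose threshold the given usage exceeds."""
    alerts = []
    if used_percent > low:
        alerts.append("LogPartitionLowWaterMarkExceeded")
    if used_percent > high:
        alerts.append("LogPartitionHighWaterMarkExceeded")
    return alerts


def dir_sizes(root):
    """Return (bytes, path) for each immediate subdirectory of root, largest first."""
    results = []
    for entry in os.scandir(root):
        if not entry.is_dir(follow_symlinks=False):
            continue
        total = 0
        # os.walk does not follow symbolic links to directories by default.
        for dirpath, _dirnames, filenames in os.walk(entry.path):
            for name in filenames:
                try:
                    total += os.lstat(os.path.join(dirpath, name)).st_size
                except OSError:
                    pass  # file disappeared or is unreadable; skip it
        results.append((total, entry.path))
    return sorted(results, reverse=True)


def used_percent(path):
    """Return the percentage of the file system containing path that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


if __name__ == "__main__":
    root = "/common"
    if os.path.isdir(root):
        pct = used_percent(root)
        print(f"{root}: {pct:.1f}% used, alerts: {classify_usage(pct)}")
        # Show the ten largest top-level directories, for example trace or core folders.
        for size, path in dir_sizes(root)[:10]:
            print(f"{size / 2**20:10.1f} MiB  {path}")
```

A large total under a trace or core dump directory points to the corresponding check in the list above (core files, Troubleshooting Trace Settings, or Detailed trace levels).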
Release Release 8.0(1)
Associated CDETS # None

