RDKB-63419 : High Memory fragmentation#126
RDKB-63419 : High Memory fragmentation#126abhishek-kumaracee2 wants to merge 4 commits intodevelopfrom
Conversation
Reason for change : Run memory compaction to reduce fragmentation Test Procedure : Check "/proc/buddyinfo" after running compaction Risks : Medium Priority: P0 Signed-off-by: abhishek_kumaracee2@comcast.com
There was a problem hiding this comment.
Pull request overview
This PR addresses high memory fragmentation (RDKB-63419) by adding automated memory compaction to the check_memory_health.sh script. The solution runs the log_buddyinfo.sh script to calculate fragmentation percentage and triggers memory compaction when fragmentation exceeds 50%.
Changes:
- Added call to log_buddyinfo.sh to calculate memory fragmentation percentage
- Implemented conditional logic to trigger cache clearing and memory compaction when fragmentation exceeds 50%
- Added logging for compaction actions and threshold status
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
scripts/check_memory_health.sh
Outdated
| echo_t "Memory fragmentation is more than 50 percent" | ||
| echo_t "clear cache and run memory compaction" | ||
| sync | ||
| echo 1 > /proc/sys/vm/drop_caches | ||
| echo 1 > /proc/sys/vm/compact_memory | ||
| else | ||
| echo_t "Memory fragmentation is $MemFragPercentage, no compaction needed" |
There was a problem hiding this comment.
The log message prefix is inconsistent with the rest of the file. This file uses "RDKB_MEM_HEALTH :" prefix for memory-related logging (lines 28, 29, 37, 38, 40), but the new code doesn't include this prefix (lines 52, 53, 58). For consistency, consider adding the "RDKB_MEM_HEALTH :" prefix to all log messages.
| /bin/sh /usr/ccsp/tad/log_buddyinfo.sh | ||
| MemFragPercentage=`syscfg get CpuMemFrag_Host_Percentage` | ||
| if [ "$MemFragPercentage" -gt 50 ] | ||
| then | ||
| echo_t "Memory fragmentation is more than 50 percent" | ||
| echo_t "clear cache and run memory compaction" | ||
| sync | ||
| echo 1 > /proc/sys/vm/drop_caches | ||
| echo 1 > /proc/sys/vm/compact_memory |
There was a problem hiding this comment.
Memory compaction via /proc/sys/vm/compact_memory can be a resource-intensive operation that temporarily impacts system performance. On production systems experiencing high load, this could exacerbate issues rather than resolve them. Consider checking system load before initiating compaction, or implementing rate limiting to prevent frequent compactions. Additionally, the action should be logged with sufficient detail to monitor its operational impact.
scripts/check_memory_health.sh
Outdated
| if [ "$MemFragPercentage" -gt 50 ] | ||
| then | ||
| echo_t "Memory fragmentation is more than 50 percent" | ||
| echo_t "clear cache and run memory compaction" | ||
| sync | ||
| echo 1 > /proc/sys/vm/drop_caches | ||
| echo 1 > /proc/sys/vm/compact_memory | ||
| else | ||
| echo_t "Memory fragmentation is $MemFragPercentage, no compaction needed" | ||
| fi |
There was a problem hiding this comment.
The numeric comparison will fail if MemFragPercentage is empty or not a number. Consider adding validation to check if the variable is set and is numeric before performing the comparison. Following the pattern used for Min_Mem_Value on line 31, add a check such as: if [ -n "$MemFragPercentage" ] && [ "$MemFragPercentage" -gt 50 ]. If the log_buddyinfo.sh script fails or the syscfg value is not set, this could cause the script to fail.
| if [ "$MemFragPercentage" -gt 50 ] | |
| then | |
| echo_t "Memory fragmentation is more than 50 percent" | |
| echo_t "clear cache and run memory compaction" | |
| sync | |
| echo 1 > /proc/sys/vm/drop_caches | |
| echo 1 > /proc/sys/vm/compact_memory | |
| else | |
| echo_t "Memory fragmentation is $MemFragPercentage, no compaction needed" | |
| fi | |
| case "$MemFragPercentage" in | |
| ''|*[!0-9]*) | |
| echo_t "MemFragPercentage is not set or not numeric, skipping memory fragmentation check" | |
| ;; | |
| *) | |
| if [ "$MemFragPercentage" -gt 50 ] | |
| then | |
| echo_t "Memory fragmentation is more than 50 percent" | |
| echo_t "clear cache and run memory compaction" | |
| sync | |
| echo 1 > /proc/sys/vm/drop_caches | |
| echo 1 > /proc/sys/vm/compact_memory | |
| else | |
| echo_t "Memory fragmentation is $MemFragPercentage, no compaction needed" | |
| fi | |
| ;; | |
| esac |
scripts/check_memory_health.sh
Outdated
| echo_t "Memory fragmentation is more than 50 percent" | ||
| echo_t "clear cache and run memory compaction" | ||
| sync | ||
| echo 1 > /proc/sys/vm/drop_caches | ||
| echo 1 > /proc/sys/vm/compact_memory |
There was a problem hiding this comment.
The new memory fragmentation logic does not check for DiagnosticMode, unlike the existing low memory check (lines 33-44). When the box is in diagnostic mode, clearing cache and running compaction could interfere with diagnostic processes. Consider adding a DiagnosticMode check before performing memory compaction, similar to how it's handled for low memory situations.
| echo_t "Memory fragmentation is more than 50 percent" | |
| echo_t "clear cache and run memory compaction" | |
| sync | |
| echo 1 > /proc/sys/vm/drop_caches | |
| echo 1 > /proc/sys/vm/compact_memory | |
| # No need to do corrective action during DiagnosticMode state | |
| DiagnosticMode=`syscfg get Selfheal_DiagnosticMode` | |
| if [ "$DiagnosticMode" == "true" ] | |
| then | |
| echo_t "RDKB_MEM_HEALTH : Memory fragmentation is more than 50 percent" | |
| echo_t "RDKB_MEM_HEALTH : Box is in diagnostic mode, so system will not clear cache or run memory compaction" | |
| else | |
| echo_t "Memory fragmentation is more than 50 percent" | |
| echo_t "clear cache and run memory compaction" | |
| sync | |
| echo 1 > /proc/sys/vm/drop_caches | |
| echo 1 > /proc/sys/vm/compact_memory | |
| fi |
| if [ "$MemFragPercentage" -gt 50 ] | ||
| then | ||
| echo_t "Memory fragmentation is more than 50 percent" | ||
| echo_t "clear cache and run memory compaction" |
There was a problem hiding this comment.
Consider adding telemetry notification using t2CountNotify when memory compaction is performed, similar to how "SYS_ERROR_Drop_cache" is logged on line 41 for low memory situations. This would help track how often memory compaction occurs and aid in monitoring system health.
| echo_t "clear cache and run memory compaction" | |
| echo_t "clear cache and run memory compaction" | |
| t2CountNotify "SYS_INFO_Memory_Compaction" |
scripts/check_memory_health.sh
Outdated
| if [ "$MemFragPercentage" -gt 50 ] | ||
| then | ||
| echo_t "Memory fragmentation is more than 50 percent" |
There was a problem hiding this comment.
The hardcoded threshold of 50 percent should be configurable rather than hardcoded. Consider using a syscfg parameter similar to how MinMemoryThreshold_Value is used on line 24. This would allow runtime adjustment without code changes and align with the existing pattern in the codebase.
| if [ "$MemFragPercentage" -gt 50 ] | |
| then | |
| echo_t "Memory fragmentation is more than 50 percent" | |
| MemFragThreshold=`syscfg get CpuMemFrag_Threshold_Percentage` | |
| if [ -z "$MemFragThreshold" ] || [ "$MemFragThreshold" -eq 0 ] | |
| then | |
| MemFragThreshold=50 | |
| fi | |
| if [ "$MemFragPercentage" -gt "$MemFragThreshold" ] | |
| then | |
| echo_t "Memory fragmentation is more than $MemFragThreshold percent" |
| echo 1 > /proc/sys/vm/drop_caches | ||
| echo 1 > /proc/sys/vm/compact_memory | ||
| else | ||
| echo_t "Memory fragmentation is $MemFragPercentage, no compaction needed" |
There was a problem hiding this comment.
Inconsistent indentation: line 58 uses spaces while the rest of the file uses tabs (e.g., lines 52-56). This violates the consistent indentation pattern seen throughout the file.
| echo_t "Memory fragmentation is $MemFragPercentage, no compaction needed" | |
| echo_t "Memory fragmentation is $MemFragPercentage, no compaction needed" |
| /bin/sh /usr/ccsp/tad/log_buddyinfo.sh | ||
| MemFragPercentage=`syscfg get CpuMemFrag_Host_Percentage` | ||
| if [ "$MemFragPercentage" -gt 50 ] | ||
| then | ||
| echo_t "Memory fragmentation is more than 50 percent" | ||
| echo_t "clear cache and run memory compaction" | ||
| sync | ||
| echo 1 > /proc/sys/vm/drop_caches | ||
| echo 1 > /proc/sys/vm/compact_memory | ||
| else | ||
| echo_t "Memory fragmentation is $MemFragPercentage, no compaction needed" |
There was a problem hiding this comment.
No error handling is implemented if the log_buddyinfo.sh script fails or is not present. The script should verify that log_buddyinfo.sh executed successfully and that CpuMemFrag_Host_Percentage was properly set before attempting to use the value. Consider adding checks like testing if the script exists and checking the return code.
| /bin/sh /usr/ccsp/tad/log_buddyinfo.sh | |
| MemFragPercentage=`syscfg get CpuMemFrag_Host_Percentage` | |
| if [ "$MemFragPercentage" -gt 50 ] | |
| then | |
| echo_t "Memory fragmentation is more than 50 percent" | |
| echo_t "clear cache and run memory compaction" | |
| sync | |
| echo 1 > /proc/sys/vm/drop_caches | |
| echo 1 > /proc/sys/vm/compact_memory | |
| else | |
| echo_t "Memory fragmentation is $MemFragPercentage, no compaction needed" | |
| if [ -x /usr/ccsp/tad/log_buddyinfo.sh ]; then | |
| /bin/sh /usr/ccsp/tad/log_buddyinfo.sh | |
| rc=$? | |
| if [ $rc -ne 0 ]; then | |
| echo_t "RDKB_MEM_HEALTH : log_buddyinfo.sh execution failed with return code $rc, skipping memory fragmentation check" | |
| else | |
| MemFragPercentage=`syscfg get CpuMemFrag_Host_Percentage` | |
| case "$MemFragPercentage" in | |
| ''|*[!0-9]*) | |
| echo_t "RDKB_MEM_HEALTH : Invalid or unset CpuMemFrag_Host_Percentage ('$MemFragPercentage'), skipping memory fragmentation check" | |
| ;; | |
| *) | |
| if [ "$MemFragPercentage" -gt 50 ] | |
| then | |
| echo_t "Memory fragmentation is more than 50 percent" | |
| echo_t "clear cache and run memory compaction" | |
| sync | |
| echo 1 > /proc/sys/vm/drop_caches | |
| echo 1 > /proc/sys/vm/compact_memory | |
| else | |
| echo_t "Memory fragmentation is $MemFragPercentage, no compaction needed" | |
| fi | |
| ;; | |
| esac | |
| fi | |
| else | |
| echo_t "RDKB_MEM_HEALTH : /usr/ccsp/tad/log_buddyinfo.sh not found or not executable, skipping memory fragmentation check" |
Reason for change : Reduce memory fragmentation - update2 Test Procedure : Check "/proc/buddyinfo" after running compaction Risks : Medium Priority: P0 Signed-off-by: abhishek_kumaracee2@comcast.com
Reason for change : Reduce memory fragmentation - update3 Test Procedure : Check "/proc/buddyinfo" after running compaction Risks : Medium Priority: P0 Signed-off-by: abhishek_kumaracee2@comcast.com
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| MemFragPercentage=`syscfg get CpuMemFrag_Host_Percentage` | ||
|
|
||
| MemFragThreshold=`syscfg get MemFragThreshold_Value` | ||
| if [ -z "$MemFragThreshold" ] || [ "$MemFragThreshold" -eq 0 ] | ||
| then | ||
| MemFragThreshold=50 | ||
| fi | ||
|
|
||
| echo_t "Memory fragmentation $MemFragPercentage MemFragThreshold $MemFragThreshold" | ||
| if [ "$MemFragPercentage" -gt "$MemFragThreshold" ] |
There was a problem hiding this comment.
The variable MemFragPercentage is not validated before being used in a numeric comparison. If syscfg get CpuMemFrag_Host_Percentage returns an empty string or non-numeric value, the comparison on line 62 will fail with a shell error. This variable should be validated similar to how MemFragThreshold is validated, checking if it's empty or non-numeric and providing a default value or skipping the comparison.
scripts/check_memory_health.sh
Outdated
| echo_t "Memory fragmentation $MemFragPercentage MemFragThreshold $MemFragThreshold" | ||
| if [ "$MemFragPercentage" -gt "$MemFragThreshold" ] | ||
| then | ||
| echo_t "Memory fragmentation is more than 50 percent" |
There was a problem hiding this comment.
The hardcoded message "Memory fragmentation is more than 50 percent" does not reflect the actual threshold being used. The threshold is configurable via MemFragThreshold and defaults to 50, but could be set to a different value. The message should use the actual threshold value instead of the hardcoded "50 percent" to accurately reflect the condition that triggered the compaction.
| echo_t "Memory fragmentation is more than 50 percent" | |
| echo_t "Memory fragmentation is more than $MemFragThreshold percent" |
| echo_t "Memory fragmentation $MemFragPercentage MemFragThreshold $MemFragThreshold" | ||
| if [ "$MemFragPercentage" -gt "$MemFragThreshold" ] | ||
| then | ||
| echo_t "Memory fragmentation is more than 50 percent" | ||
| echo_t "clear cache and run memory compaction" | ||
| sync | ||
| echo 1 > /proc/sys/vm/drop_caches | ||
| echo 1 > /proc/sys/vm/compact_memory | ||
| else | ||
| echo_t "Memory fragmentation is $MemFragPercentage, no compaction needed" |
There was a problem hiding this comment.
The log messages on lines 61, 64, 65, and 70 are missing the consistent "RDKB_MEM_HEALTH :" prefix that is used elsewhere in this file (see lines 30, 31, 39, 40, 42). For consistency with the existing code and to maintain log message uniformity, these messages should include the same prefix.
| echo_t "Memory fragmentation is more than 50 percent" | ||
| echo_t "clear cache and run memory compaction" | ||
| sync | ||
| echo 1 > /proc/sys/vm/drop_caches |
There was a problem hiding this comment.
No telemetry notification (t2CountNotify) is added for the memory compaction operation. Looking at the codebase, similar events like cache drops (line 43: "SYS_ERROR_Drop_cache") have telemetry notifications. Memory compaction is a significant self-healing action that should be tracked for monitoring and troubleshooting purposes. Consider adding a t2CountNotify call similar to the existing pattern, for example: t2CountNotify "SYS_INFO_Memory_Compaction".
| echo 1 > /proc/sys/vm/drop_caches | |
| echo 1 > /proc/sys/vm/drop_caches | |
| t2CountNotify "SYS_INFO_Memory_Compaction" |
| check_min_mem | ||
| check_frag_mem | ||
| else | ||
| "$@" |
There was a problem hiding this comment.
The argument handling on line 78 uses "$@" which passes all arguments to the specified function. However, there is no validation to ensure that the function name passed is one of the intended functions (check_min_mem or check_frag_mem). This allows arbitrary command execution, which could be a security risk if this script is called with untrusted input. Consider validating the function name against an allowlist before executing it.
| "$@" | |
| cmd="$1" | |
| shift | |
| case "$cmd" in | |
| check_min_mem|check_frag_mem) | |
| "$cmd" "$@" | |
| ;; | |
| *) | |
| echo "Usage: $0 [check_min_mem|check_frag_mem]" >&2 | |
| exit 1 | |
| ;; | |
| esac |
scripts/check_memory_health.sh
Outdated
| echo_t "Memory fragmentation $MemFragPercentage MemFragThreshold $MemFragThreshold" | ||
| if [ "$MemFragPercentage" -gt "$MemFragThreshold" ] | ||
| then | ||
| echo_t "Memory fragmentation is more than 50 percent" |
There was a problem hiding this comment.
change it as echo_t "Memory fragmentation is more than $MemFragThreshold percent"
Reason for change : Reduce memory fragmentation - update4 Test Procedure : Check "/proc/buddyinfo" after running compaction Risks : Medium Priority: P0 Signed-off-by: abhishek_kumaracee2@comcast.com
Reason for change : Run memory compaction to reduce fragmentation
Test Procedure : Check "/proc/buddyinfo" after running compaction
Risks : Medium
Priority: P0
Signed-off-by: abhishek_kumaracee2@comcast.com