Cst jack check for extended apic #210
A customer saw interrupt vector space exhaustion, which made NVMe devices unusable. This was caused by x2apic not being available (which can itself be caused by the IOMMU being disabled). Therefore, check that x2apic (Intel) or extapic (AMD) is available, which should result in plenty of IRQ space. https://weka-support.slack.com/archives/C066DNGSAE5/p1764947984669029
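As a rough illustration of the check described above, the sketch below tests a cpuinfo-style file for either flag. The function name `has_extended_apic` and its optional file argument are hypothetical, added so the logic can be tried against a sample file. The pattern is a POSIX-character-class variant of the one in the PR, not the PR's exact expression.

```shell
# Hypothetical helper (not part of the PR): succeed if the x2apic (Intel) or
# extapic (AMD) CPU flag appears on a "flags" line of a cpuinfo-style file.
has_extended_apic() {
    # Assumption: default to the real /proc/cpuinfo, allow an override for testing.
    local cpuinfo="${1:-/proc/cpuinfo}"
    # Require whitespace before the flag and whitespace or end-of-line after it,
    # so a plain "apic" flag or a longer flag name does not count as a match.
    grep -q -E '^flags[[:space:]]*:.*[[:space:]](extapic|x2apic)([[:space:]]|$)' \
        "$cpuinfo" 2>/dev/null
}
```

Called as `has_extended_apic || echo "no extended APIC"`, it reads `/proc/cpuinfo`; passing a path lets the same logic run against a saved copy from a customer system.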
#check that extended APIC (or x2apic) is available, because it's required for more
# space for IRQs
if (grep -m1 -q -E '^flags.*(\<extapic|\<x2apic)' /proc/cpuinfo) ; then
This errors out for me, unless there's a no-op instruction:
if (grep -m1 -q -E '^flags.*(\<extapic|\<x2apic)' /proc/cpuinfo) ; then
:
else
Also, do you intend for this to run in a subshell?
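For context on both points in this comment: shell grammar requires at least one command between `then` and `else`, which is why the `:` no-op is needed, and the parentheses around `grep` run it in a subshell. One way to avoid both is to negate the test. The sketch below assumes that approach; the function name `check_extended_apic`, the file argument, and the warning text are hypothetical and not taken from the PR.

```shell
# Hypothetical rewrite of the check: negate the test instead of leaving an
# empty "then" branch, and drop the parentheses so no subshell is spawned.
check_extended_apic() {
    # Assumption: file argument added only so the function can be tested.
    local cpuinfo="${1:-/proc/cpuinfo}"
    # Same pattern as in the PR; \< is a GNU grep word-start anchor.
    if ! grep -q -E '^flags.*(\<extapic|\<x2apic)' "$cpuinfo" 2>/dev/null; then
        echo "WARN: neither x2apic nor extapic found in $cpuinfo"  # placeholder text
        return 1
    fi
    return 0
}
```

The subshell in the original form is harmless for a single `grep`, since only the exit status is used, but it is unnecessary.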
Sorry @vrragosta, no idea which version made it into the commit. Pushed a slightly less weird version.
vrragosta left a comment:
A-OK
echo "by a Weka process."
echo "This can be caused by the presence of an enabled APIC device. Review your hardware,"
echo "firmware, and linux kernel settings if this is causing a problem"
echo "This can sometimes prevent a WEKA Process from receiving interrupts from the NVME"
Weka doesn't care about the interrupts. The issue was that the kernel couldn't allocate interrupts, which caused the KERNEL to fail to use the device. That in turn made Weka unable to use it, since the device couldn't be scanned to determine that it is a Weka-signed device.
#check that extended APIC (or x2apic) is available, because it's required for more
# space for IRQs
grep -m1 -q -E '^flags.*(\<extapic|\<x2apic)' /proc/cpuinfo 2>/dev/null
Do we know for certain that all relevant platforms have x2apic, down to the oldest supported server platforms?
I never paid enough attention to that to know myself.
Apparently it came out in 2008 with Nehalem... I think that's older than Weka.