Description
I am using an AMD Ryzen 9 5950X to run tests against an FPGA-based NTP server and see large jitter in the timestamps measured by ntpperf. (The network interface is an Intel 10G X550T. The FPGA has a 100 Mbps interface.)
Example:
./ntpperf -i enp36s0f0 -m be:14:88:16:24:c0 -d 192.168.1.192 -s 192.168.1.128/28 -B -o 1e-9 -r 10-100
             |            responses            | TX timestamp offset (ns)
rate clients |    lost invalid   basic  xleave |     min    mean     max stddev
  10       1     0.00%   0.00% 100.00%   0.00%    +77758 +130190 +456198 127530
  15       1     0.00%   0.00% 100.00%   0.00%    +81435 +255213 +487911 172371
  22       2     0.00%   0.00% 100.00%   0.00%    +85306 +156121 +582343 145000
  33       3     0.00%   0.00% 100.00%   0.00%    +73671  +84167  +89560   5785
  49       4     0.00%   0.00% 100.00%   0.00%    +75069 +112980 +624443 100200
  73       7     0.00%   0.00% 100.00%   0.00%    +66149  +96425 +437758  76665
Pinning ntpperf to two cores helps (one for the main thread, a second for the sender):
taskset -c 8,9 ./ntpperf -i enp36s0f0 -m be:14:88:16:24:c0 -d 192.168.1.192 -s 192.168.1.128/28 -B -o 1e-9 -r 10-100
             |            responses            | TX timestamp offset (ns)
rate clients |    lost invalid   basic  xleave |     min    mean     max stddev
  10       1     0.00%   0.00% 100.00%   0.00%    +56562  +57392  +59037    789
  15       1     0.00%   0.00% 100.00%   0.00%    +57221  +59203 +103132   8159
  22       2     0.00%   0.00% 100.00%   0.00%    +49642  +65937 +433359  56032
  33       3     0.00%   0.00% 100.00%   0.00%    +55492  +61278 +334949  33957
  49       4     0.00%   0.00% 100.00%   0.00%    +53880  +55375  +69783   2138
  73       7     0.00%   0.00% 100.00%   0.00%    +44691  +58269 +428358  37826
Additionally running a CPU-loading program (e.g. stress) on the other 30 cores helps further:
stress --cpu 30 &
taskset -c 8,9 ./ntpperf -i enp36s0f0 -m be:14:88:16:24:c0 -d 192.168.1.192 -s 192.168.1.128/28 -B -o 1e-9 -r 10-100
             |            responses            | TX timestamp offset (ns)
rate clients |    lost invalid   basic  xleave |     min    mean     max stddev
  10       1     0.00%   0.00% 100.00%   0.00%    +35208  +35980  +37017    557
  15       1     0.00%   0.00% 100.00%   0.00%    +32751  +34233  +35429    686
  22       2     0.00%   0.00% 100.00%   0.00%    +30436  +31957  +33692    810
  33       3     0.00%   0.00% 100.00%   0.00%    +26606  +28820  +30661   1115
  49       4     0.00%   0.00% 100.00%   0.00%    +22584  +24726  +27272   1132
  73       7     0.00%   0.00% 100.00%   0.00%    +19266  +21228  +23399    924
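To compare the three software-timestamped runs at a glance, the stddev column can be summarized per configuration. This is a minimal Python sketch that uses only the values from the tables above; the dictionary labels and the `summarize` helper are illustrative names and not part of ntpperf.

```python
import statistics

# stddev column (ns) copied from the three runs above, for rates 10..73
STDDEV_NS = {
    "unpinned": [127530, 172371, 145000, 5785, 100200, 76665],
    "taskset -c 8,9": [789, 8159, 56032, 33957, 2138, 37826],
    "taskset + stress": [557, 686, 810, 1115, 1132, 924],
}


def summarize(values):
    """Return the median and the worst (largest) stddev of one run."""
    return {"median": statistics.median(values), "worst": max(values)}


for name, values in STDDEV_NS.items():
    s = summarize(values)
    print(f"{name:18s} median={s['median']:>9.1f} ns  worst={s['worst']:>7d} ns")
```

The median is used rather than the mean because single outlier rows (such as the 56032 ns row in the pinned run) would otherwise dominate the comparison.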
I was not able to force the CPU frequencies to fixed values using the CPU scaling governors.
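For anyone who wants to check whether a governor change actually took effect, the sketch below reads the per-CPU governor and current frequency from the Linux cpufreq sysfs files. It assumes the usual `/sys/devices/system/cpu/cpuN/cpufreq/` layout with `scaling_governor` and `scaling_cur_freq`; `read_cpufreq` is a hypothetical helper name, and this has not been run on the machine described here.

```python
from pathlib import Path


def read_cpufreq(sysfs_root="/sys/devices/system/cpu"):
    """Return {cpu_name: (governor, current_khz)} for CPUs that expose cpufreq."""
    result = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for gov_file in sorted(Path(sysfs_root).glob(pattern)):
        cpufreq_dir = gov_file.parent
        try:
            governor = gov_file.read_text().strip()
        except OSError:
            continue  # CPU may be offline or the file unreadable
        try:
            # scaling_cur_freq reports the current frequency in kHz
            cur_khz = int((cpufreq_dir / "scaling_cur_freq").read_text())
        except (OSError, ValueError):
            cur_khz = None
        result[cpufreq_dir.parent.name] = (governor, cur_khz)
    return result


if __name__ == "__main__":
    for cpu, (governor, khz) in read_cpufreq().items():
        print(cpu, governor, khz)
```

Sampling this a few times while ntpperf is running would show whether the frequencies of the pinned cores still move despite the governor setting.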
Hardware timestamping (-H) also showed large offsets:
./ntpperf -i enp36s0f0 -m be:14:88:16:24:c0 -d 192.168.1.192 -s 192.168.1.128/28 -B -o 1e-9 -r 10-100 -H
             |            responses            | TX timestamp offset (ns)
rate clients |    lost invalid   basic  xleave |         min        mean         max stddev
  10       1     0.00%   0.00% 100.00%   0.00%   -8567055486 -8567043601 -8567031680   7405
  15       1     0.00%   0.00% 100.00%   0.00%   -8567082188 -8567070248 -8567058299   7234
and also large standard deviations (here with the offset correction adjusted via -o):
./ntpperf -i enp36s0f0 -m be:14:88:16:24:c0 -d 192.168.1.192 -s 192.168.1.128/28 -B -o -8567000000e-9 -r 10-100 -H
             |            responses            | TX timestamp offset (ns)
rate clients |    lost invalid   basic  xleave |     min    mean     max stddev
  10       1     0.00%   0.00% 100.00%   0.00%   -765192 -753275 -741408   7409
  15       1     0.00%   0.00% 100.00%   0.00%   -791828 -779881 -768131   7236
  22       2     0.00%   0.00% 100.00%   0.00%   -818608 -806721 -794791   7126
  33       3     0.00%   0.00% 100.00%   0.00%   -845815 -833865 -821858   7059
These results, however, are not affected by whether stress or taskset is used.
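One pattern in the hardware-timestamped numbers may be worth noting: the mean moves by a similar amount from one rate step to the next, and the min-to-max span within each step is of the same size. The Python sketch below only does arithmetic on the values from the last table; the helper names are illustrative. Reading this as a steady drift of the measured offset during the run (for example, a NIC hardware clock that is not disciplined to the server) is an assumption, not something the measurements confirm.

```python
import math

# (min, mean, max) in ns from the run with -o -8567000000e-9 -H, rates 10..33
ROWS = [
    (-765192, -753275, -741408),
    (-791828, -779881, -768131),
    (-818608, -806721, -794791),
    (-845815, -833865, -821858),
]


def step_changes(rows):
    """Change of the mean offset between consecutive rate steps (ns)."""
    means = [mean for _, mean, _ in rows]
    return [b - a for a, b in zip(means, means[1:])]


def ramp_stddev(lo, hi):
    """Stddev of samples spread uniformly over [lo, hi], i.e. a linear ramp."""
    return (hi - lo) / math.sqrt(12)


print("mean change per step (ns):", step_changes(ROWS))
for lo, _, hi in ROWS:
    print(f"span={hi - lo} ns  ramp stddev={ramp_stddev(lo, hi):.0f} ns")
```

The per-step change is roughly -27 µs each time, and a linear ramp over the observed span gives a stddev of about 6.9 µs, which is close to the reported 7.1 to 7.4 µs. If the drift assumption holds, the reported stddev would mostly reflect that ramp rather than per-packet jitter.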
The machine's clock is managed by chronyd and synchronized against the same NTP server. Here, hardware timestamps are used (no offsets applied):
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^? gbg2.ntp.netnod.se            1    6    77      4    +16us[  +16us] +/-  267us
^* 192.168.1.192                 1    2   377      3  +1505ns[+1973ns] +/-   17us
Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
==============================================================================
gbg2.ntp.netnod.se          6   5   136     +0.064      0.691  +6413ns  8560ns
192.168.1.192              32   3   127     +0.001      0.016     +4ns   935ns
Reference ID : C0A801C0 (192.168.1.192)
Stratum : 2
Ref time (UTC) : Fri Feb 10 10:30:24 2023
System time : 0.000000166 seconds fast of NTP time
Last offset : +0.000000468 seconds
RMS offset : 0.000000384 seconds
Frequency : 0.499 ppm slow
Residual freq : +0.001 ppm
Skew : 0.020 ppm
Root delay : 0.000032893 seconds
Root dispersion : 0.000004203 seconds
Update interval : 4.6 seconds
Leap status : Normal
Remote address : 192.168.1.192 (C0A801C0)
Remote port : 123
Local address : 192.168.1.75 (C0A8014B)
Leap status : Normal
Version : 4
Mode : Server
Stratum : 1
Poll interval : 4 (16 seconds)
Precision : -26 (0.000000015 seconds)
Root delay : 0.000000 seconds
Root dispersion : 0.000000 seconds
Reference ID : 47505300 (GPS)
Reference time : Fri Feb 10 10:30:24 2023
Offset : -0.000001973 seconds
Peer delay : 0.000032893 seconds
Peer dispersion : 0.000000910 seconds
Response time : 0.000003520 seconds
Jitter asymmetry: +0.00
NTP tests : 111 111 1111
Interleaved : No
Authenticated : No
TX timestamping : Hardware
RX timestamping : Hardware
Total TX : 36
Total RX : 36
Total valid RX : 36
Perhaps this is useful for someone else.