Use Prosody 0.11 with the new epoll backend on Linux, and have thousands of connections open (c2s, s2s).
The CPU load rises significantly (roughly 6x) compared to lua-socket with libevent, and there are strong peaks in the daily CPU graphs, probably at times of high stanza throughput.
Switching back to lua-socket immediately brings the CPU load back down to small values.
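For reference, the backend is selected in prosody.cfg.lua. A minimal sketch of the relevant setting (option names as documented for 0.11; check the documentation for your version):

    -- prosody.cfg.lua, global section (sketch, not a complete config)
    -- network_backend = "select" -- the lua-socket based default
    -- network_backend = "event"  -- libevent via luaevent
    network_backend = "epoll"     -- the new backend showing the high CPU usage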
An inefficiency in timer handling was identified and fixed in trunk in https://hg.prosody.im/trunk/rev/6c2370f17027. Don't know if there's anything else, but will backport that.

Backported in https://hg.prosody.im/trunk/rev/c8c3f2eba898.
A brief test showed a significant improvement in CPU usage with lots of timers.
Since each connection has at least one read timeout active at any time, lots of connections mean lots of timers. When I found this, I believe the cost came from spending a lot of time sorting the timers, since they move around each time some incoming data pushes their read timeout further into the future.
The sorting compares fields in tables, which would explain the time spent on table indexing that you saw.
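To make that concrete, here is a small self-contained Lua sketch of the pattern described above (an illustration only, not Prosody's actual timer code): every reschedule re-sorts the whole array of timer records, and the comparator indexes table fields on each comparison.

    -- Illustration of the described cost, not Prosody's implementation.
    local timers = {}
    for i = 1, 20000 do
        -- one pending 60s read timeout per "connection"
        timers[i] = { time = os.clock() + 60, id = i }
    end

    local function bump(timer)
        -- incoming data pushes the read timeout further into the future...
        timer.time = os.clock() + 60
        -- ...and the whole list is re-sorted, indexing .time on every comparison
        table.sort(timers, function(a, b) return a.time < b.time end)
    end

    -- simulate a burst of stanzas touching random connections
    local start = os.clock()
    for _ = 1, 1000 do
        bump(timers[math.random(#timers)])
    end
    print(("1000 reschedules over %d timers took %.2fs"):format(#timers, os.clock() - start))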
On my not-too-bad laptop, with 20000 x 60s timers, it hovered around 50% CPU usage before this change and under 1% after.
For comparison, server_select seemed to hover around 7%, and server_event seemed to handle many times as many no-op timers without touching the CPU.
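If anyone wants to run the same kind of no-op timer load against the different backends, here is a hedged sketch of a throwaway module (the module name is made up, and the util.timer.add_task usage is from memory, so treat it as a sketch rather than a recipe):

    -- mod_timer_load.lua: hypothetical load-test module, not part of this fix
    local add_task = require "util.timer".add_task;

    local n = 20000;     -- number of timers, matching the test above
    local interval = 60; -- seconds, mirroring a 60s read timeout

    for _ = 1, n do
        add_task(interval, function()
            return interval; -- no-op: just ask to run again in 60 seconds
        end);
    end

    module:log("info", "scheduled %d no-op timers", n);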
Yes, testing would be appreciated.