Use Prosody 0.11 with the new epoll backend on Linux, and have thousands of connections open (c2s, s2s).
The CPU load rises significantly (roughly 6x) compared to lua-socket with libevent, and there are strong peaks in the daily CPU graphs, probably at times of high stanza throughput.
Switching back to lua-socket immediately brings the CPU load back down to small values.
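For reference, the backend is selected in prosody.cfg.lua. A minimal sketch of the relevant setting (option names as documented for 0.11; check the documentation for your version):

    -- prosody.cfg.lua, global section (sketch, not a complete config)
    -- network_backend = "select" -- the lua-socket based default
    -- network_backend = "event"  -- libevent via luaevent
    network_backend = "epoll"     -- the new backend showing the high CPU usage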
An inefficiency in timer handling was identified and fixed in trunk in https://hg.prosody.im/trunk/rev/6c2370f17027. Don't know if there's anything else, but will backport that.

Backported in https://hg.prosody.im/trunk/rev/c8c3f2eba898.
A brief test showed a significant improvement in CPU usage with lots of timers.
Since each connection has at least one read timeout active at any time, lots of connections mean lots of timers. When I found this, I believe the cost came from spending a lot of time sorting the timers, since they move around each time some incoming data pushes their read timeout further into the future.
The sorting compares fields in tables, which would explain the time spent on table indexing that you saw.
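To make that concrete, here is a small self-contained Lua sketch of the pattern described above (an illustration only, not Prosody's actual timer code): every reschedule re-sorts the whole array of timer records, and the comparator indexes table fields on each comparison.

    -- Illustration of the described cost, not Prosody's implementation.
    local timers = {}
    for i = 1, 20000 do
        -- one pending 60s read timeout per "connection"
        timers[i] = { time = os.clock() + 60, id = i }
    end

    local function bump(timer)
        -- incoming data pushes the read timeout further into the future...
        timer.time = os.clock() + 60
        -- ...and the whole list is re-sorted, indexing .time on every comparison
        table.sort(timers, function(a, b) return a.time < b.time end)
    end

    -- simulate a burst of stanzas touching random connections
    local start = os.clock()
    for _ = 1, 1000 do
        bump(timers[math.random(#timers)])
    end
    print(("1000 reschedules over %d timers took %.2fs"):format(#timers, os.clock() - start))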
On my not-too-bad laptop, with 20000 x 60s timers, it hovered around 50% CPU usage before this change and under 1% after.
For comparison, server_select seemed to hover around 7%, and server_event seemed to handle many times as many no-op timers without touching the CPU.
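If anyone wants to run the same kind of no-op timer load against the different backends, here is a hedged sketch of a throwaway module (the module name is made up, and the util.timer.add_task usage is from memory, so treat it as a sketch rather than a recipe):

    -- mod_timer_load.lua: hypothetical load-test module, not part of this fix
    local add_task = require "util.timer".add_task;

    local n = 20000;     -- number of timers, matching the test above
    local interval = 60; -- seconds, mirroring a 60s read timeout

    for _ = 1, n do
        add_task(interval, function()
            return interval; -- no-op: just ask to run again in 60 seconds
        end);
    end

    module:log("info", "scheduled %d no-op timers", n);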
Yes, testing would be appreciated.