by anonymous

We have a production environment with around 100 RUT955 devices. A couple of months ago we updated several devices to firmware version RUT9_R_00.07.01.4.

After about 50 days of uptime, ping response times from those devices become high. Remote SSH is not available (connection refused or timeout), and WebGUI login fails with "device busy".

Device logs show out-of-memory problems with the process "port_eventsd" as the source:

[4268791.398774] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),task=port_eventsd,pid=2632,uid=0 
[4268791.407894] Out of memory: Killed process 2632 (port_eventsd) total-vm:67244kB, anon-rss:65532kB, file-rss:4kB, shmem-rss:0kB, UID:0 pgtables:80kB oom_score_adj:0 
[4268792.032643] oom_reaper: reaped process 2632 (port_eventsd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[4586670.162124] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),task=port_eventsd,pid=2632,uid=0
[4586670.171269] Out of memory: Killed process 2632 (port_eventsd) total-vm:71784kB, anon-rss:70076kB, file-rss:4kB, shmem-rss:0kB, UID:0 pgtables:88kB oom_score_adj:0
[4586670.965407] oom_reaper: reaped process 2632 (port_eventsd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

After the oom-kill, device response time returns to normal. Before it, however, we see 24-72 hours of reduced performance and lost connectivity.
Is this a firmware-related bug? Is there a workaround?
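
To see how often this happens on a given device, the kernel log can be filtered for OOM kills. The following is only a minimal sketch, assuming the log lines have the same format as the ones quoted above; oom_summary is a hypothetical helper name, not a RutOS tool.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log lines on stdin and prints one short
# summary line per "Out of memory: Killed process" entry.
# Assumes the line format shown in the log excerpt above.
oom_summary() {
    sed -n 's/^\[ *\([0-9.]*\)] Out of memory: Killed process \([0-9]*\) (\([^)]*\)).* anon-rss:\([0-9]*\)kB.*/\1 \3 pid=\2 anon-rss=\4kB/p'
}
```

On a device it could be used as `dmesg | oom_summary`, which prints the kernel timestamp, process name, PID and anonymous RSS of each killed process.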
by anonymous

As far as we can see, this is connected to firmware v7 and occurs on both RUT955 and RUT956 devices. The memory usage of port_eventsd increases over time.

Firmware v7, RUT955:

root@Teltonika-RUT955:~# cat /etc/version && uptime && free && top -n 1 |grep eventsd
RUT9_R_00.07.01.4
 10:09:29 up 48 days, 16:13,  load average: 0.03, 0.04, 0.00
              total        used        free      shared  buff/cache   available
Mem:         124832       93832       21396         280        9604        1484
Swap:             0           0           0
 4599     1 root     S    66844  53%   0% /usr/bin/port_eventsd --suppress-topol
root@Teltonika-RUT955:~#

Firmware v7, RUT956:

root@Teltonika-RUT956:~# cat /etc/version && uptime && free && top -n 1 |grep eventsd
RUT9M_R_00.07.01.7
 12:08:34 up 32 days, 22:11,  load average: 0.25, 0.30, 0.34
              total        used        free      shared  buff/cache   available
Mem:         123268       73828       25900         220       23540       12988
Swap:             0           0           0
 2575     1 root     S    45924  37%   9% /usr/bin/port_eventsd --suppress-topol
root@Teltonika-RUT956:~#

Firmware v6, RUT955:

root@Teltonika-RUT955:~# cat /etc/version && uptime && free && top -n 1 |grep eventsd
RUT9XX_R_00.06.06.1
 10:10:48 up 27 days,  5:01,  load average: 0.27, 0.06, 0.02
              total        used        free      shared  buff/cache   available
Mem:         125984       25728       74764         492       25492       99040
Swap:             0           0           0
25175 25148 root     S     1536   1%   0% grep eventsd
root@Teltonika-RUT955:~#

1 Answer

by anonymous
Hello,

Could you provide a full troubleshoot file in which these logs, or more of them, are visible? That would allow me to create a better case for our RnD department to look into this more deeply.

Thank you
by anonymous
Any updates on this issue? During the last week the situation got a lot worse, since we are now around 50 days past the rollout of RUTOS v7. We have implemented a temporary workaround: we check the free memory and, if the value is below 30000 kB, we run killall port_eventsd.
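
For anyone who wants to try the same workaround, a rough sketch of such a check could look like the following. It assumes that "free memory" means the MemFree value in /proc/meminfo and that logger is available on the device. THRESHOLD_KB, MEMINFO and the function names are made up for this example and are not part of RutOS.

```shell
#!/bin/sh
# Sketch of a low-memory check that kills port_eventsd.
# Assumption: "free memory" is taken as MemFree from /proc/meminfo.
# THRESHOLD_KB and MEMINFO are illustrative variables, not RutOS settings.
THRESHOLD_KB="${THRESHOLD_KB:-30000}"
MEMINFO="${MEMINFO:-/proc/meminfo}"

# Print the MemFree value (in kB) from a meminfo-style file.
mem_free_kb() {
    awk '$1 == "MemFree:" { print $2; exit }' "$1"
}

# Succeed when the first value is below the second.
below_threshold() {
    [ "$1" -lt "$2" ]
}

# Perform one check and kill port_eventsd if free memory is too low.
check_once() {
    free_kb="$(mem_free_kb "$MEMINFO")"
    if [ -n "$free_kb" ] && below_threshold "$free_kb" "$THRESHOLD_KB"; then
        logger -t memwatch "MemFree ${free_kb} kB below ${THRESHOLD_KB} kB, killing port_eventsd"
        killall port_eventsd
    fi
}

# Only act when called with "run", so the functions can be sourced safely.
if [ "${1:-}" = "run" ]; then
    check_once
fi
```

Saved under a hypothetical path such as /root/memwatch.sh, it could be run periodically from cron with the argument "run". This only masks the symptom; it does not fix the leak itself.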

What is the purpose of the port_eventsd process?

Could it be disabled?
by anonymous
Could you post the output of cat /proc/(pid of port_eventsd)/status?
by anonymous
root@Teltonika-RUT956:~# cat /proc/2558/status
Name:   port_eventsd
Umask:  0022
State:  S (sleeping)
Tgid:   2558
Ngid:   0
Pid:    2558
PPid:   1
TracerPid:      0
Uid:    0       0       0       0
Gid:    0       0       0       0
FDSize: 32
Groups:
NStgid: 2558
NSpid:  2558
NSpgid: 1
NSsid:  1
VmPeak:    37844 kB
VmSize:    37844 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:     37020 kB
VmRSS:     37020 kB
RssAnon:           36128 kB
RssFile:             892 kB
RssShmem:              0 kB
VmData:    36156 kB
VmStk:       132 kB
VmExe:        16 kB
VmLib:      1536 kB
VmPTE:        52 kB
VmSwap:        0 kB
CoreDumping:    0
THP_enabled:    0
Threads:        1
SigQ:   0/953
SigPnd: 00000000000000000000000000000000
ShdPnd: 00000000000000000000000000000000
SigBlk: 00000000000000000000000000000000
SigIgn: 00000000000000000000000000001000
SigCgt: 00000000000000000000000000024002
CapInh: 0000000000000000
CapPrm: 0000003fffffffff
CapEff: 0000003fffffffff
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000
NoNewPrivs:     0
Seccomp:        0
Speculation_Store_Bypass:       unknown
Cpus_allowed:   1
Cpus_allowed_list:      0
voluntary_ctxt_switches:        2305029
nonvoluntary_ctxt_switches:     1480045
root@Teltonika-RUT956:~#
by anonymous

The most interesting field to monitor is VmRSS. It looks high, which probably indicates a memory leak. Re-check in a few hours and post the new cat output. The value of cat /proc/2558/oom_score could also be of interest.
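
Instead of re-checking by hand, the values could be sampled periodically. This is just a sketch, assuming pidof is available on the device; vmrss_kb and sample are hypothetical helper names.

```shell
#!/bin/sh
# Print the VmRSS value (in kB) from a /proc/<pid>/status-style file.
vmrss_kb() {
    awk '$1 == "VmRSS:" { print $2; exit }' "$1"
}

# Print one timestamped sample (VmRSS and oom_score) for the named process.
# Returns non-zero if the process is not running.
sample() {
    pid="$(pidof "$1" | awk '{ print $1 }')"
    [ -n "$pid" ] || return 1
    echo "$(date '+%Y-%m-%d %H:%M:%S') pid=$pid VmRSS=$(vmrss_kb "/proc/$pid/status")kB oom_score=$(cat "/proc/$pid/oom_score")"
}

# Only sample when called with "run", so the functions can be sourced safely.
if [ "${1:-}" = "run" ]; then
    sample port_eventsd
fi
```

Running it with the argument "run" every hour or so and appending the output to a file should show whether VmRSS keeps growing.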

htop gives a better view of the memory usage than top:

opkg update; opkg install htop

by anonymous
Hello,

We found the memory leak on our side. It will be fixed in the 7.2.6 firmware release (as currently planned).