Skip to content

Mappings are repeatedly parsed #1219

@bobrik

Description

@bobrik

We have a V8 based piece of software (edgeworker, approximately workerd) that has a ton of mappings:

ivan@107m94:~$ sudo wc -l /proc/*/maps | sort -n | tail -n5
6922 /proc/557384/maps
22106 /proc/3927685/maps
52283 /proc/2264585/maps
128198 /proc/2264341/maps
351702 total

All top3 instances are of that software.

A big chunk of time is spent parsing mappings:

Image

I can see in strace that the most map-heavy process gets re-parsed a lot:

ivan@107m94:~$ sudo strace -t -f -p $(pidof ebpf-profiler) 2>&1 | grep --line-buffered /proc/2264341
[pid 910473] 05:45:25 openat(AT_FDCWD, "/proc/2264341/maps", O_RDONLY|O_CLOEXEC <unfinished ...>
[pid 910237] 05:45:27 readlinkat(AT_FDCWD, "/proc/2264341/exe", "/usr/local/bin/edgeworker", 128) = 25
[pid 910465] 05:45:41 openat(AT_FDCWD, "/proc/2264341/maps", O_RDONLY|O_CLOEXEC <unfinished ...>
[pid 910243] 05:45:44 readlinkat(AT_FDCWD, "/proc/2264341/exe",  <unfinished ...>
[pid 910465] 05:45:46 openat(AT_FDCWD, "/proc/2264341/maps", O_RDONLY|O_CLOEXEC <unfinished ...>
[pid 910472] 05:45:48 readlinkat(AT_FDCWD, "/proc/2264341/exe",  <unfinished ...>
[pid 910472] 05:45:49 openat(AT_FDCWD, "/proc/2264341/maps", O_RDONLY|O_CLOEXEC <unfinished ...>
[pid 910465] 05:45:51 readlinkat(AT_FDCWD, "/proc/2264341/exe", "/usr/local/bin/edgeworker", 128) = 25
[pid 909883] 05:45:55 openat(AT_FDCWD, "/proc/2264341/maps", O_RDONLY|O_CLOEXEC <unfinished ...>
[pid 910472] 05:45:59 readlinkat(AT_FDCWD, "/proc/2264341/exe",  <unfinished ...>
[pid 910243] 05:46:05 openat(AT_FDCWD, "/proc/2264341/cgroup", O_RDONLY|O_CLOEXEC <unfinished ...>
[pid 910472] 05:46:05 openat(AT_FDCWD, "/proc/2264341/maps", O_RDONLY|O_CLOEXEC <unfinished ...>
[pid 910472] 05:46:08 readlinkat(AT_FDCWD, "/proc/2264341/exe", "/usr/local/bin/edgeworker", 128) = 25
[pid 910472] 05:46:08 openat(AT_FDCWD, "/proc/2264341/maps", O_RDONLY|O_CLOEXEC <unfinished ...>
[pid 910473] 05:46:11 readlinkat(AT_FDCWD, "/proc/2264341/exe",  <unfinished ...>
[pid 910473] 05:46:11 openat(AT_FDCWD, "/proc/2264341/maps", O_RDONLY|O_CLOEXEC <unfinished ...>
[pid 910473] 05:46:13 readlinkat(AT_FDCWD, "/proc/2264341/exe",  <unfinished ...>
[pid 910243] 05:46:25 openat(AT_FDCWD, "/proc/2264341/maps", O_RDONLY|O_CLOEXEC <unfinished ...>
[pid 910237] 05:46:27 readlinkat(AT_FDCWD, "/proc/2264341/exe",  <unfinished ...>
[pid 910237] 05:46:27 openat(AT_FDCWD, "/proc/2264341/maps", O_RDONLY|O_CLOEXEC <unfinished ...>
[pid 910465] 05:46:30 readlinkat(AT_FDCWD, "/proc/2264341/exe",  <unfinished ...>
[pid 909925] 05:46:30 openat(AT_FDCWD, "/proc/2264341/maps", O_RDONLY|O_CLOEXEC <unfinished ...>
[pid 910465] 05:46:33 readlinkat(AT_FDCWD, "/proc/2264341/exe",  <unfinished ...>
[pid 910465] 05:46:33 openat(AT_FDCWD, "/proc/2264341/maps", O_RDONLY|O_CLOEXEC <unfinished ...>
[pid 910465] 05:46:35 readlinkat(AT_FDCWD, "/proc/2264341/exe",  <unfinished ...>
[pid 910237] 05:46:37 openat(AT_FDCWD, "/proc/2264341/maps", O_RDONLY|O_CLOEXEC <unfinished ...>
[pid 909885] 05:46:39 readlinkat(AT_FDCWD, "/proc/2264341/exe", "/usr/local/bin/edgeworker", 128) = 25
[pid 910243] 05:46:40 openat(AT_FDCWD, "/proc/2264341/maps", O_RDONLY|O_CLOEXEC <unfinished ...>
[pid 910473] 05:46:43 readlinkat(AT_FDCWD, "/proc/2264341/exe",  <unfinished ...>
[pid 910473] 05:46:43 openat(AT_FDCWD, "/proc/2264341/maps", O_RDONLY|O_CLOEXEC <unfinished ...>
[pid 910473] 05:46:45 readlinkat(AT_FDCWD, "/proc/2264341/exe",  <unfinished ...>
[pid 910465] 05:46:50 openat(AT_FDCWD, "/proc/2264341/maps", O_RDONLY|O_CLOEXEC <unfinished ...>
[pid 910243] 05:46:53 readlinkat(AT_FDCWD, "/proc/2264341/exe",  <unfinished ...>
[pid 910243] 05:47:05 openat(AT_FDCWD, "/proc/2264341/cgroup", O_RDONLY|O_CLOEXEC <unfinished ...>
[pid 910243] 05:47:06 openat(AT_FDCWD, "/proc/2264341/maps", O_RDONLY|O_CLOEXEC <unfinished ...>
[pid 909925] 05:47:10 readlinkat(AT_FDCWD, "/proc/2264341/exe",  <unfinished ...>
[pid 910472] 05:47:10 openat(AT_FDCWD, "/proc/2264341/maps", O_RDONLY|O_CLOEXEC <unfinished ...>
[pid 910472] 05:47:14 readlinkat(AT_FDCWD, "/proc/2264341/exe",  <unfinished ...>
[pid 910473] 05:47:16 openat(AT_FDCWD, "/proc/2264341/maps", O_RDONLY|O_CLOEXEC <unfinished ...>

Sprinkling some debug printk in bpf code reveals that this is the most called codepath:

case metricID_UnwindNativeErrWrongTextSection:;
u64 pid_tgid = (u64)trace->pid << 32 | trace->tid;
if (report_pid(ctx, pid_tgid, record->ratelimitAction)) {
increment_metric(metricID_NumUnknownPC);
}

There are indeed many unknown frames. We have v8 interpreter disabled (and it has never worked for us here).

Looking at top offenders in terms of time grouped by exe path, I can also see nginx there, which has a bunch of lua and many worker processes, which I assume is responsible here too.

The actual top of GetMappings timing:

<even smaller nginx with lua>: 1025.23 ms
<smaller nginx with lua>: 1562.27 ms
<big nginx with lua>: 3600.90 ms
<v8 based thingy>: 14815.62 ms

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions