(In reply to Thomas Blume from comment #15)
> (In reply to Franck Bui from comment #14)
> > After some more debugging I finally found that the bug is essentially the
> > same as the one reported here:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1705641
> >
> > A more interesting discussion can be found at
> > https://github.com/systemd/systemd/pull/13359, but unfortunately it led
> > nowhere.
> >
> > Adding Thomas in Cc as he's the libtirpc maintainer.
>
> Indeed:
>
>     return (tbsize = (int)rl.rlim_max);
>
> seems to be a bit excessive.
> But I'm not sure how else to determine a good value.
> Thorsten, would changing this to the soft limit be appropriate, e.g.:
>
>     return (tbsize = (int)rl.rlim_cur);
>
> ?

As far as I understand the code, no, that will not work. To be fast, the code trades memory for speed: it keeps a table with a slot for every possible file descriptor, so data can be looked up by direct indexing instead of walking a list of the descriptors that actually exist. If you size that table lower than the highest descriptor the process can hold, you can get an out-of-bounds array access. The code was written at a time when 1024 file handles was the maximum possible, not for numbers as big as today's. I would discuss this on the tirpc mailing list; the whole code needs to be rewritten to be less memory consuming.

> Btw. the RedHat bugzilla indicates that enabling nscd would also fix the
> excessive memory consumption.

No, I already tried that yesterday: if the nscd cache is full (somebody ran "getent group" before), it works. If the nscd cache is empty, I saw the same crash.
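To make the concern concrete, here is a minimal sketch of the pattern under discussion: a lookup table sized from getrlimit(RLIMIT_NOFILE) and indexed directly by file descriptor. The names (rpc_dtbsize, fd_table, register_fd) are hypothetical illustrations, not libtirpc's actual code; only the quoted "return (tbsize = (int)rl.rlim_max);" line is from the real source.

    /* Sketch of the "one table slot per possible fd" pattern.
     * Hypothetical names; only the rlim_max cast mirrors the
     * libtirpc line quoted above. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/resource.h>

    static int dtbsize;       /* table size, derived from RLIMIT_NOFILE */
    static void **fd_table;   /* one slot per possible file descriptor  */

    static int rpc_dtbsize(void)
    {
        struct rlimit rl;

        if (dtbsize == 0 && getrlimit(RLIMIT_NOFILE, &rl) == 0) {
            /* rlim_max guarantees every fd the process can ever hold
             * fits in the table, but with a hard limit in the millions
             * the table itself becomes huge.  rlim_cur would shrink it,
             * yet an fd opened before the soft limit was lowered (or
             * inherited from a parent) could land past the end.
             * Note: like the original code, this cast misbehaves if
             * rlim_max is RLIM_INFINITY. */
            dtbsize = (int)rl.rlim_max;
        }
        return dtbsize;
    }

    static int register_fd(int fd, void *data)
    {
        if (fd_table == NULL)
            fd_table = calloc((size_t)rpc_dtbsize(), sizeof(*fd_table));
        if (fd_table == NULL)
            return -1;

        if (fd < 0 || fd >= dtbsize)
            return -1;        /* without this check: out-of-bounds write */

        fd_table[fd] = data;  /* O(1) lookup is the whole point */
        return 0;
    }

    int main(void)
    {
        printf("table would have %d slots\n", rpc_dtbsize());
        return 0;
    }

This shows both halves of the trade-off: direct indexing only stays safe if the table covers the full hard limit, and on systems where that hard limit is very large (the situation in the linked systemd reports) even a table of plain pointers can consume a large amount of memory up front.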