Comment # 6 on bug 1165502 from
(In reply to Franck Bui from comment #5)
> (In reply to Archie Cobbs from comment #4)
> > 
> > The warning only happens occasionally. I haven't been able to get it to
> > happen by trying manually on the command line yet.
> 
> So what did you try to show in comment #2 ?

Sorry for the confusion.

What happens is that Nagios is running on another machine and it
periodically logs into this machine via SSH to perform various checks.

The only check that seems to be causing the error is the JVM check
that runs sudo (see below for entire script).

All of the actual error cases I've shown are just pulled from /var/log/warn.
These all occurred when I wasn't watching. Most of the time the Nagios
checks do not generate the warning, but occasionally they do and these
instances are captured in /var/log/warn.

I've not yet been able to make the problem occur in "real-time" from the
console.
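
For anyone who wants to try reproducing this, here is a rough sketch of a
stress loop. It assumes the trigger is a command being run twice back to back,
as the check does with sudo. "repeat_pair" is a hypothetical helper name, not
part of the existing scripts. It only counts non-zero exit statuses, so
/var/log/warn would still have to be compared before and after a run.

```shell
#!/bin/sh
# Hypothetical helper (not part of check_jvm): run the given command twice
# in rapid succession, COUNT times, and print how many invocations failed.
repeat_pair() {
    count="$1"
    shift
    failures=0
    i=0
    while [ "$i" -lt "$count" ]; do
        # Two back-to-back invocations, mirroring the two jstat calls.
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}
```

Something like `repeat_pair 200 sudo /usr/bin/jstat -gc -t "$(pidof java)"`,
run as pexpnagios, would be one way to use it.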

> > Note each time 'pexpnagios' logs in it may be doing a different Nagios
> > check. The Nagios check that generates the errors seems to always be one
> 
> Can you tell us more about how 'pexpnagios' was created ? Can you share the
> relevant entry in /etc/passwd for this user for example ?

From /etc/passwd:

    pexpnagios:x:1000:100::/home/pexpnagios:/bin/bash

From /etc/shadow:

    pexpnagios:*:18183:0:99999:7:::

More info:

    $ cd /home/pexpnagios/
    $ tree -a
    .
    ├── .bash_history
    ├── .bashrc
    ├── bin
    ├── .config
    ├── .fonts
    ├── .local
    │   └── share
    │       └── systemd
    │           └── user -> ../../../.config/systemd/user
    ├── logwarn
    │   ├── pcom.logwarn
    │   └── syslog.logwarn
    ├── .profile
    └── .ssh
        └── authorized_keys

    8 directories, 7 files
    $ cat .local/share/systemd/user 
    cat: .local/share/systemd/user: No such file or directory
    $ cat .ssh/authorized_keys 
    # $Id: authorized_keys 8797 2018-09-13 19:02:55Z archie $

    no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty,no-user-rc,from="127.0.0.1,::1" ssh-rsa AAAA[...public key elided...] pexpnagios@ops


I have no idea what that ".local/share/systemd/user" file is or how it got
there.
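
Going by the tree output, that entry is a symlink to
../../../.config/systemd/user, and .config is shown with nothing under it, so
the "No such file or directory" from cat looks like a dangling link rather
than a missing entry. A small sketch of how to check that ("describe_link" is
a made-up helper name, and it assumes readlink is available, as it is on
openSUSE):

```shell
#!/bin/sh
# Hypothetical helper: report whether PATH is a symlink and whether its
# target exists. [ -L ] tests the link itself, [ -e ] follows the link.
describe_link() {
    if [ -L "$1" ] && [ ! -e "$1" ]; then
        echo "dangling symlink -> $(readlink "$1")"
    elif [ -L "$1" ]; then
        echo "symlink -> $(readlink "$1")"
    else
        echo "not a symlink"
    fi
}
```

Running `describe_link /home/pexpnagios/.local/share/systemd/user` should tell
the two cases apart.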

> > This JVM Nagios check script runs "sudo /usr/bin/jstat" twice in rapid
> > succession:
> 
> Can you tell us more about the Nagios plugin you're using and also share (by
> attaching) the check script ?

See below.

> BTW did you make any modification in the pam stuff ?

I hope not. But to digress a bit, often when I do a "zypper dup" to upgrade to
a newer version of openSUSE, there are RPM config file conflicts in /etc/pam.d.
This is really annoying and it's never clear how to resolve these.

It appears that two things that don't know about each other are conflicting:
(a) the /etc/pam.d/common-foo files are normal files owned by the "pam" RPM,
but (b) the "pam-config" RPM turns them into symlinks. So any time the "pam"
RPM is upgraded and changes any of the common-foo files, you get an RPM config
file conflict. 
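
To make the symlink-versus-regular-file situation easier to see at a glance,
here is a minimal sketch that classifies the common-* entries in a PAM
directory. "classify_pam_common" is a hypothetical name, and it assumes
readlink is available. It only reports what is on disk and says nothing about
which RPM owns what.

```shell
#!/bin/sh
# Hypothetical helper: for each common-* entry in DIR (default /etc/pam.d),
# print whether it is a symlink (with its target) or a regular entry.
classify_pam_common() {
    dir="${1:-/etc/pam.d}"
    for f in "$dir"/common-*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$f" ] || [ -L "$f" ] || continue
        if [ -L "$f" ]; then
            echo "symlink ${f##*/} -> $(readlink "$f")"
        else
            echo "file ${f##*/}"
        fi
    done
}
```

On this machine I would expect it to show common-account, common-auth,
common-password and common-session as symlinks to their -pc counterparts,
matching the listing.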

Anyway, I don't know if that has anything to do with this. Here are the
contents of those files:

$ cd /etc/pam.d/
$ ls -l
total 148
-rw-r--r-- 1 root root 167 Sep  5  2019 chage
-rw-r--r-- 1 root root 199 Sep  5  2019 chfn
-rw-r--r-- 1 root root 199 Sep  5  2019 chpasswd
-rw-r--r-- 1 root root 199 Sep  5  2019 chsh
lrwxrwxrwx 1 root root  17 May 23  2016 common-account -> common-account-pc
-rw-r--r-- 1 root root 392 Oct 25  2015 common-account.pam-config-backup
-rw-r--r-- 1 root root 451 Mar  2 16:53 common-account-pc
lrwxrwxrwx 1 root root  14 May 23  2016 common-auth -> common-auth-pc
-rw-r--r-- 1 root root 462 Oct 25  2015 common-auth.pam-config-backup
-rw-r--r-- 1 root root 536 Mar  2 16:53 common-auth-pc
lrwxrwxrwx 1 root root  18 May 23  2016 common-password -> common-password-pc
-rw-r--r-- 1 root root 510 Oct 25  2015 common-password.pam-config-backup
-rw-r--r-- 1 root root 422 Mar  2 16:53 common-password-pc
lrwxrwxrwx 1 root root  17 May 23  2016 common-session -> common-session-pc
-rw-r--r-- 1 root root 482 Oct 25  2015 common-session.pam-config-backup
-rw-r--r-- 1 root root 547 Mar  2 16:53 common-session-pc
-rw-r--r-- 1 root root 481 Dec  4 05:03 crond
-rw-r--r-- 1 root root 172 Sep  5  2019 groupadd
-rw-r--r-- 1 root root 172 Sep  5  2019 groupdel
-rw-r--r-- 1 root root 172 Sep  5  2019 groupmod
-rw-r--r-- 1 root root 370 Sep  5  2019 login
-rw-r--r-- 1 root root 172 Sep  5  2019 newusers
-rw-r--r-- 1 root root 251 Apr 27  2019 other
-rw-r--r-- 1 root root 133 Sep  5  2019 passwd
-rw-r--r-- 1 root root 165 Jul 30  2019 polkit-1
-rw-r--r-- 1 root root 336 Jul 10  2019 pure-ftpd
-rw-r--r-- 1 root root 492 Sep  5  2019 remote
-rw-r--r-- 1 root root 274 Sep  5  2019 runuser
-rw-r--r-- 1 root root 280 Sep  5  2019 runuser-l
-rw-r--r-- 1 root root 137 Apr 27  2019 screen
-rw-r--r-- 1 root root 165 Oct 22 05:13 smtp
-rw-r--r-- 1 root root 404 Dec  9 11:05 sshd
-rw-r--r-- 1 root root 277 Sep  5  2019 su
-rw-r--r-- 1 root root 249 Feb 19 03:10 sudo
-rw-r--r-- 1 root root 255 Feb 19 03:10 sudo-i
-rw-r--r-- 1 root root 329 Sep  5  2019 su-l
-rw-r--r-- 1 root root 220 Feb 25 10:01 systemd-user
-rw-r--r-- 1 root root 172 Sep  5  2019 useradd
-rw-r--r-- 1 root root 172 Sep  5  2019 userdel
-rw-r--r-- 1 root root 172 Sep  5  2019 usermod
-rw-r--r-- 1 root root 164 Mar  5  2019 vlock
[root@test.stv.pexp] 834 cat common-account
#%PAM-1.0
#
# This file is autogenerated by pam-config. All changes
# will be overwritten.
#
# Account-related modules common to all services
#
# This file is included from other service-specific PAM config files,
# and should contain a list of the account modules that define
# the central access policy for use on the system.  The default is to
# only deny service to users whose accounts are expired.
#
account    required    pam_unix.so    try_first_pass 
[root@test.stv.pexp] 835 cat common-auth
#%PAM-1.0
#
# This file is autogenerated by pam-config. All changes
# will be overwritten.
#
# Authentication-related modules common to all services
#
# This file is included from other service-specific PAM config files,
# and should contain a list of the authentication modules that define
# the central authentication scheme for use on the system
# (e.g., /etc/shadow, LDAP, Kerberos, etc.). The default is to use the
# traditional Unix authentication mechanisms.
#
auth    required    pam_env.so    
auth    required    pam_unix.so    try_first_pass 
[root@test.stv.pexp] 836 cat common-password
#%PAM-1.0
#
# This file is autogenerated by pam-config. All changes
# will be overwritten.
#
# Password-related modules common to all services
#
# This file is included from other service-specific PAM config files,
# and should contain a list of modules that define  the services to be
# used to change user passwords.
#
password    requisite    pam_cracklib.so    
password    required    pam_unix.so    use_authtok nullok try_first_pass 
[root@test.stv.pexp] 837 cat common-session
#%PAM-1.0
#
# This file is autogenerated by pam-config. All changes
# will be overwritten.
#
# Session-related modules common to all services
#
# This file is included from other service-specific PAM config files,
# and should contain a list of modules that define tasks to be performed
# at the start and end of sessions of *any* kind (both interactive and
# non-interactive
#
session    optional    pam_systemd.so
session    required    pam_limits.so    
session    required    pam_unix.so    try_first_pass 
session    optional    pam_umask.so    
session    optional    pam_env.so    


This is our homebrew check_jvm Nagios check:

======== cut here ============

cat /usr/lib/nagios/plugins/check_jvm 
#!/bin/sh

#
# Checks JVM stats
#

# Set constants
SUWRAPPER="/usr/bin/sudo"
JSTAT="${SUWRAPPER} /usr/bin/jstat"

# Slurp in common functions
. /usr/lib/nagios/plugins/check_functions

# Usage message
usage()
{
    echo "Usage:" 1>&2
    echo "    ${NAME} [-e] [-o] [-p]" 1>&2
    echo "Options:" 1>&2
    echo "    -e    Warn if eden space > this percentage" 1>&2
    echo "    -o    Warn when old gen space > this percentage" 1>&2
    echo "    -p    Warn when perm gen space > this percentage" 1>&2
    echo "    -g    Warn when GC time exceeds threshold" 1>&2
    echo "    -u    Warn when heap utilization > this percentage" 1>&2
    echo "    -h    Display this help message" 1>&2
}

# Check variables
EDEN_CHECK=
OLDG_CHECK=
PRMG_CHECK=
GCTH_CHECK=
HPTH_CHECK=
PID=`pidof java`

# Parse flags passed in on the command line
while [ ${#} -gt 0 ]; do
    case "$1" in
        -e|--eden)
            shift
            EDEN_CHECK="$1"
            shift
            ;;
        -o|--old)
            shift
            OLDG_CHECK="$1"
            shift
            ;;
        -p|--perm)
            shift
            PRMG_CHECK="$1"
            shift
            ;;
        -g|--gc)
            shift
            GCTH_CHECK="$1"
            shift
            ;;
        -u|--heap)
            shift
            HPTH_CHECK="$1"
            shift
            ;;
        -h|--help)
            usage
            exit
            ;;
        --)
            shift
            break
            ;;
        *)
            break
            ;;
    esac
done
case "${#}" in
    0)
        break
        ;;
    *)
        usage
        exit 1
        ;;
esac

# Sanity check
if [ -z "${PID}" ]; then
    nagios_exit OK "No java process found"
fi

# Check heap utilization, heap capacity, and GC time
JSTAT_GC=`${JSTAT} -gc -t ${PID} \
 | tail -1 | sed "s/^\s*//g"`
JSTAT_HP=`${JSTAT} -gccapacity -t ${PID} \
 | tail -1 | sed "s/^\s*//g"`

# Check Eden
EC=`echo "${JSTAT_GC}" | awk '{ printf("%.0f", $6) }'`
EU=`echo "${JSTAT_GC}" | awk '{ printf("%.0f", $7) }'`
EP=$(printf '%i %i' $EU $EC | awk '{ pc=100*$1/$2; i=int(pc); print (pc-i<0.5)?i:i+1 }')
if [ -n "${EDEN_CHECK}" ] && [ "${EP}" -gt ${EDEN_CHECK} ] ; then
    nagios_exit WARNING "Eden space at ${EP}%"
fi

# Check Old
OC=`echo "${JSTAT_GC}" | awk '{ printf("%.0f", $8) }'`
OU=`echo "${JSTAT_GC}" | awk '{ printf("%.0f", $9) }'`
OP=$(printf '%i %i' $OU $OC | awk '{ pc=100*$1/$2; i=int(pc); print (pc-i<0.5)?i:i+1 }')
if [ -n "${OLDG_CHECK}" ] && [ "${OP}" -gt "${OLDG_CHECK}" ] ; then
    nagios_exit WARNING "Old generation space at ${OP}%"
fi

# Check Perm
PC=`echo "${JSTAT_GC}" | awk '{ printf("%.0f", $10) }'`
PU=`echo "${JSTAT_GC}" | awk '{ printf("%.0f", $11) }'`
PP=$(printf '%i %i' $PU $PC | awk '{ pc=100*$1/$2; i=int(pc); print (pc-i<0.5)?i:i+1 }')
if [ -n "${PRMG_CHECK}" ] && [ "${PP}" -gt "${PRMG_CHECK}" ] ; then
    nagios_exit WARNING "Perm generation space at ${PP}%"
fi

# Check GC time
JVMT=`echo "${JSTAT_GC}"  | awk '{ print $1 }'`
GCTM=`echo "${JSTAT_GC}"  | awk '{ print $16 }'`
GCTS=`echo "scale=20; ${GCTM} / 1000" | bc`
DIFF=`echo "scale=20; ${GCTS} / ${JVMT}" | bc`
if [ -n "${GCTH_CHECK}" ] && [ `echo "scale=20; ${DIFF} > ${GCTH_CHECK}" | bc` -eq 1 ]; then
    nagios_exit WARNING "Time spent in GC has exceeded $(echo "scale=3; ${GCTH_CHECK} * 100" | bc)% of time since JVM started"
fi

# Check heap utilization
EGUT=`echo "${JSTAT_GC}"  | awk '{ print $7 }'`
EGCP=`echo "${JSTAT_HP}"  | awk '{ print $3 }'`
OGUT=`echo "${JSTAT_GC}"  | awk '{ print $9 }'`
OGCP=`echo "${JSTAT_HP}"  | awk '{ print $9 }'`
HPTU=`echo "scale=20; ${EGUT} + ${OGUT}" | bc`
HPTC=`echo "scale=20; ${EGCP} + ${OGCP}" | bc`
DIFF=`echo "scale=20; ${HPTU} / ${HPTC}" | bc`
if [ -n "${HPTH_CHECK}" ] && [ `echo "scale=20; ${DIFF} > ${HPTH_CHECK}" | bc` -eq 1 ]; then
    nagios_exit WARNING "Heap utilization at $(echo "scale=3; ${DIFF} * 100" | bc | awk '{printf "%f", $0}')% of max capacity"
fi

# Done
nagios_exit OK
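
In case the percentage arithmetic above is hard to follow, here is the same
round-half-up calculation pulled out into a standalone sketch. "pct_round" is
a hypothetical name, and the zero-capacity guard is my addition for
illustration; the script itself has no such guard.

```shell
#!/bin/sh
# Hypothetical helper: print USED as a percentage of CAPACITY, rounded to
# the nearest integer with halves rounded up, as the script's awk does.
pct_round() {
    printf '%s %s\n' "$1" "$2" | awk '{
        # Guard added for this sketch only: avoid dividing by zero.
        if ($2 == 0) { print "n/a"; exit }
        pc = 100 * $1 / $2
        i = int(pc)
        print ((pc - i < 0.5) ? i : i + 1)
    }'
}
```

For example, `pct_round 1 8` works out to 12.5 and is rounded up to 13, while
`pct_round 1 3` gives 33.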

