https://bugzilla.novell.com/show_bug.cgi?id=680113
https://bugzilla.novell.com/show_bug.cgi?id=680113#c3
Ingo Schwarze changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEEDINFO |NEW
InfoProvider|ischwarze@astaro.com |
--- Comment #3 from Ingo Schwarze 2011-04-18 16:12:58 UTC ---
(In reply to comment #2)
Oh indeed, at-3.1.8-massive_batch.patch, which didn't exist last time
I looked in Debian, hides the bug rather than fixing it.
The variable nothing_to_do is meant to avoid iterating the spool
directory each time CHECK_INTERVAL expires when it is known that there
were no waiting jobs last time and the directory has not changed in the
meantime. That does not work, because a directory access time that
goes backwards is misinterpreted as "no change".
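To illustrate the misinterpretation, here is a minimal sketch of such a
check. It is a simplified model, not the actual atd source: the function
name skip_scan_buggy and the parameters dir_time and last_seen are made up
for illustration; only nothing_to_do and the "<=" comparison come from
this report.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical model of the check described above; not the real atd code.
 * Returns true when the daemon would skip iterating the spool directory. */
bool skip_scan_buggy(bool nothing_to_do, time_t dir_time, time_t last_seen)
{
    /* "<=" treats a timestamp that moved backwards (dir_time < last_seen)
     * exactly like an unchanged one, so the scan is skipped. */
    return nothing_to_do && dir_time <= last_seen;
}
```

With a remembered time of 200 and a current directory time of 100,
skip_scan_buggy(true, 100, 200) returns true, i.e. the directory is
wrongly treated as unchanged.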
Now, with at-3.1.8-massive_batch.patch, each time at(1) adds a job to the
spool directory, nothing_to_do is cleared by the signal. As long as at(1)
is used and signals don't get lost, the patch thus renders the check of
the spool directory access time irrelevant.
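The effect of the patch can be pictured roughly as below. This is only a
sketch under assumptions, not the patch itself: I assume the signal in
question is SIGHUP, and the names on_new_job and install_new_job_handler
are hypothetical.

```c
#include <signal.h>

/* Set while the last scan found no waiting jobs (modelled here as a flag
 * that is safe to write from a signal handler). */
volatile sig_atomic_t nothing_to_do = 1;

/* Hypothetical handler: a notification from at(1) clears the flag, so the
 * next cycle iterates the spool dir regardless of the directory time. */
void on_new_job(int signo)
{
    (void)signo;
    nothing_to_do = 0;
}

/* Assumption: SIGHUP is the signal at(1) sends. Returns 0 on success. */
int install_new_job_handler(void)
{
    return signal(SIGHUP, on_new_job) == SIG_ERR ? -1 : 0;
}
```

Any other signal, or a job file that appears without at(1) being involved,
leaves nothing_to_do set, so the directory time check still decides.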
However, with a bit of manipulation, it is still possible to
reproduce the bug:
- schedule a job in 5 minutes
- move the spool file to /tmp
# required such that the spool dir becomes empty
- set the spool dir time to the future, e.g. by one day
- restart the atd
# it finds the spool dir empty, sets the flag nothing_to_do,
# and goes to sleep for CHECK_INTERVAL
- move the spool file back to the spool dir
# that sets the spool dir change time back to the correct time
- wait for the scheduled job time to come and send a SIGCONT signal -
NOT a SIGHUP signal, of course! - to the atd (or equivalently, just
wait for the CHECK_INTERVAL to expire)
# during the upcoming cycle, the atd sees that the spool dir time
# is LESS than the one it remembers, hence does *not* iterate the
# spool dir and goes right back to sleep, even though a job is due
Here is a transcript of that test:
ischwarze04:/root # rpm -q at
at-3.1.8-1069.15.53
ischwarze04:/root # date
Mon Apr 18 17:52:23 CEST 2011
ischwarze04:/root # echo "command" | at +5min
warning: commands will be executed using /bin/sh
job 10 at 2011-04-18 17:57
ischwarze04:/root # atq
10 2011-04-18 17:57 a root
ischwarze04:/root # mv /var/spool/atjobs/a0000a014b67fd /tmp/
ischwarze04:/root # touch -t 04191800 /var/spool/atjobs
ischwarze04:/root # ls -ld /var/spool/atjobs
drwx------ 2 at at 4096 2011-04-19 18:00 /var/spool/atjobs
ischwarze04:/root # /etc/init.d/atd restart
Shutting down service at daemon done
Starting service at daemon done
ischwarze04:/root # ls -ld /var/spool/atjobs
drwx------ 2 at at 4096 2011-04-19 18:00 /var/spool/atjobs
ischwarze04:/root # mv /tmp/a0000a014b67fd /var/spool/atjobs/
ischwarze04:/root # ls -ld /var/spool/atjobs
drwx------ 2 at at 4096 2011-04-18 17:54 /var/spool/atjobs
ischwarze04:/root # atq
10 2011-04-18 17:57 a root
ischwarze04:/root # date
Mon Apr 18 17:54:40 CEST 2011
ischwarze04:/root # date
Mon Apr 18 17:58:21 CEST 2011
ischwarze04:/root # atq
10 2011-04-18 17:57 a root
ischwarze04:/root # kill -CONT 23684
ischwarze04:/root # atq
10 2011-04-18 17:57 a root
Here is the associated strace snippet, showing that a run_loop is
triggered by the SIGCONT, but the atd does not even try to look
into the spool directory:
ischwarze04:/root # strace -p 23684
Process 23684 attached - interrupt to quit
restart_syscall(<... resuming interrupted call ...>
) = ? ERESTART_RESTARTBLOCK (To be restarted)
--- SIGCONT (Continued) @ 0 (0) ---
restart_syscall(<... resuming interrupted call ...>
The bug is now hidden so well that it is no longer of much
practical relevance, but it is still a bug.
Changing the comparison operator from "<=" to "==" as suggested
in my old patch makes sure that the atd correctly identifies a
directory whose atime goes backwards as a changing directory and
does not wrongly consider it as unchanged.
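A minimal sketch of the check with "==" follows. Again this is a
simplified model rather than the actual patch: skip_scan_fixed, dir_time
and last_seen are names made up for illustration.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical model of the corrected check; not the real atd code.
 * Returns true when the daemon would skip iterating the spool directory. */
bool skip_scan_fixed(bool nothing_to_do, time_t dir_time, time_t last_seen)
{
    /* "==" skips the scan only when the timestamp is identical; a time
     * that moved in either direction counts as a change. */
    return nothing_to_do && dir_time == last_seen;
}
```

With a remembered time of 200 and a current directory time of 100,
skip_scan_fixed(true, 100, 200) returns false, so the spool dir gets
iterated and the due job is found.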
Thanks for checking,
Ingo