https://bugzilla.novell.com/show_bug.cgi?id=679857

https://bugzilla.novell.com/show_bug.cgi?id=679857#c0

           Summary: atd: race condition of atrm against job execution
    Classification: openSUSE
           Product: openSUSE 11.4
           Version: Factory
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Basesystem
        AssignedTo: bnc-team-screening@forge.provo.novell.com
        ReportedBy: ischwarze@astaro.com
         QAContact: qa@suse.de
          Found By: ---
           Blocker: ---

Created an attachment (id=419474)
 --> (http://bugzilla.novell.com/attachment.cgi?id=419474)
fix: instead of perr(), call syslog(), free() and return

User-Agent: Mozilla/5.0 (X11; U; OpenBSD i386; en-US; rv:1.9.2.13) Gecko/20110216 Firefox/3.6.13

If atrm(1) cancels a job after atd(8) has used readdir(3) and stat(2) and
decided to run it, but before atd(8) gets around to locking it with link(2),
the link(2) call fails with ENOENT and the main atd(8) daemon process calls
exit(3) via perr().

In this case, atd should not exit. It should just report the race condition
via syslog(3) and return to the main loop.

Reproducible: Sometimes

Steps to Reproduce:
1. Compile atd with a long sleep inserted at the beginning of run_file() in
   atd.c, to make the race condition easier to reproduce.
2. Start the modified atd.
3. Schedule a job for immediate execution; atd will go into the sleep.
4. Remove the job with atrm.

Actual Results:
The link() syscall in run_file() fails with ENOENT, and atd terminates itself
through the call
  perr("Can't link execution file");
After that, no jobs will ever be started again, because atd has exited.

Expected Results:
Log an error and return to run_loop().

Problem found with at-3.1.8 on the Astaro Security Gateway 8.100 version of
SLES11 on the i386 platform, but it is obviously independent of platform and
OS.

Will report upstream to Debian as well.
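
For illustration, the expected behaviour can be sketched roughly as follows. This is not the attached patch and not the actual code of run_file(): the helper name lock_job() and its return codes are hypothetical, and the sketch only shows the idea of treating ENOENT from link(2) as a lost race that is logged and skipped instead of a fatal error. The attached fix also calls free() on buffers that run_file() allocates; the sketch has no such allocations.

```c
#include <errno.h>
#include <string.h>
#include <syslog.h>
#include <unistd.h>

/*
 * Hypothetical helper (name and return codes are assumptions, not taken
 * from the at sources): try to lock a job by hard-linking it to a lock
 * name, the way the report describes run_file() doing with link(2).
 *
 * Returns  0 if the lock link was created,
 *          1 if the job file vanished first (for example, removed by atrm),
 *         -1 on any other link(2) failure.
 * It never terminates the calling process.
 */
static int lock_job(const char *job_path, const char *lock_path)
{
    if (link(job_path, lock_path) == 0)
        return 0;

    if (errno == ENOENT) {
        /* Lost the race against job removal: log it and let the caller
         * go back to its main loop. */
        syslog(LOG_WARNING,
               "job %s disappeared before it could be locked; skipping",
               job_path);
        return 1;
    }

    /* Some other failure: report it, but still leave the decision to
     * the caller instead of exiting here. */
    syslog(LOG_ERR, "cannot link %s to %s: %s",
           job_path, lock_path, strerror(errno));
    return -1;
}
```

A caller in the daemon's main loop would then simply skip the job whenever lock_job() returns a nonzero value and continue scanning the spool directory.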