[opensuse-buildservice] OBS 2.5.x + clockwork = ActiveRecord::ConnectionTimeoutError
Hi, A couple of months back, we updated our local OBS server to OBS 2.5.4. Ever since this update, we've had a lot of issues with clockwork and delayed_job. As soon as we updated OBS, /srv/www/obs/api/tmp/clockworkd.clock.output said: I, [2014-09-12T20:30:55.163170 #29041] INFO -- : Triggering 'fetch notifications' E, [2014-09-12T20:31:00.163931 #29041] ERROR -- : could not obtain a database connection within 5.000 seconds (waited 5.000 seconds) (ActiveRecord::ConnectionTimeoutError) /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:190:in `block in wait_poll' /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:181:in `loop' /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:181:in `wait_poll' /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:136:in `block in poll' /usr/lib64/ruby/2.0.0/monitor.rb:211:in `mon_synchronize' /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:146:in `synchronize' /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:134:in `poll' /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:423:in `acquire_connection' /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:356:in `block in checkout' /usr/lib64/ruby/2.0.0/monitor.rb:211:in `mon_synchronize' /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:355:in `checkout' /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:265:in `block in connection' /usr/lib64/ruby/2.0.0/monitor.rb:211:in `mon_synchronize' /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:264:in `connection' /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:294:in `with_connection' /srv/www/obs/api/config/clock.rb:39:in `block in module:Clockwork' /usr/lib64/ruby/gems/2.0.0/gems/clockwork-0.7.0/lib/clockwork/event.rb:51:in `call' /usr/lib64/ruby/gems/2.0.0/gems/clockwork-0.7.0/lib/clockwork/event.rb:51:in `execute' /usr/lib64/ruby/gems/2.0.0/gems/clockwork-0.7.0/lib/clockwork/event.rb:41:in `block in run' This continued for a few days until I realized we weren't receiving any build failure notifications. Since I couldn't debug the issue - and I wasn't aware of the fact that clockwork log file was in "tmp" directory instead of "log" - I simply restarted /etc/init.d/obsapidelayed service. Immediately, our users were bombarded with a build failure mails! Running "mysqladmin processlist", showed that there were more than 34-36 database connections open - almost each and every one of them had "Sleep" mentioned in the "Command" column. The same story repeated a couple of weeks back which triggered me to make the following changes in database.yml: production: .. - pool: 30 + pool: 50 + checkout_timeout: 15 After this change, I managed to keep delayed_job happy for almost 2 weeks. Last week, things once again went to bad state: I, [2014-11-12T16:09:31.940219 #6089] INFO -- : Triggering 'status.refresh' E, [2014-11-12T16:09:36.938688 #6089] ERROR -- : could not obtain a database connection within 15.000 seconds (waited 15.000 seconds) (ActiveRecord::ConnectionTimeoutError) /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:190:in `block in wait_poll' /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:181:in `loop' /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:181:in `wait_poll' /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:136:in `block in poll' /usr/lib64/ruby/2.0.0/monitor.rb:211:in `mon_synchronize' /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:146:in `synchronize' /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:134:in `poll' /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:423:in `acquire_connection' /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:356:in `block in checkout' /usr/lib64/ruby/2.0.0/monitor.rb:211:in `mon_synchronize' /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:355:in `checkout' /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:265:in `block in connection' /usr/lib64/ruby/2.0.0/monitor.rb:211:in `mon_synchronize' /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:264:in `connection' /usr/lib64/ruby/gems/2.0.0/gems/activerecord-4.0.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:294:in `with_connection' /srv/www/obs/api/config/clock.rb:39:in `block in module:Clockwork' /usr/lib64/ruby/gems/2.0.0/gems/clockwork-0.7.0/lib/clockwork/event.rb:51:in `call' /usr/lib64/ruby/gems/2.0.0/gems/clockwork-0.7.0/lib/clockwork/event.rb:51:in `execute' /usr/lib64/ruby/gems/2.0.0/gems/clockwork-0.7.0/lib/clockwork/event.rb:41:in `block in run' As of now, I've 57 connections to the database: # mysqladmin processlist | grep frontend | wc -l 57 (Ignore the "grep", I'm just being lazy for skipping the table borders and the table header. There is just one database served by the MySQL server) Regarding mysql's max connections: # mysqladmin -p variables | grep max_conn Enter password: | max_connect_errors | 10 | | max_connections | 151 | Apparently, we keep filling up ActiveRecord's connection pool without freeing any of the connections. Not sure if the issue occurs because we have set "fetch_notifications" job to be run in a different thread: every(27.seconds, 'fetch notifications', thread: true) do ActiveRecord::Base.connection_pool.with_connection do |sql| # this will return if there is already a thread running UpdateNotificationEvents.new.perform end end (Yes, I changed the duration from 17 to 27 seconds after I increased the timeout in database.yml from 5 to 15) Has someone else found this issue? I can't find any solution other than setting a daily cron job to restart obsapidelayed :-/ Regards, Srinidhi. -- To unsubscribe, e-mail: opensuse-buildservice+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-buildservice+owner@opensuse.org
participants (1)
-
Srinidhi B