[Bug 1167575] New: systemd crashes and performs "Freezing execution."
http://bugzilla.suse.com/show_bug.cgi?id=1167575

Bug ID: 1167575
Summary: systemd crashes and performs "Freezing execution."
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: Other
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Basesystem
Assignee: systemd-maintainers@suse.de
Reporter: jslaby@suse.com
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---

systemd started crashing recently on one of my systems.

2020-03-11T07:29:37.523798+01:00 anemoi systemd[1]: Freezing execution.
2020-03-21T00:00:01.961504+01:00 anemoi systemd[1]: Freezing execution.

This must be a long-standing problem, perhaps some unit started triggering this recently. From /var/log/zypp/history, even systemd-243 crashed:
2019-12-20 19:56:33|install|systemd|243-5.1|x86_64||repo-oss|50511a4d1f63297e3f93f4f038304d424a83e2fece8b1f294074240ef5f24146|
2020-03-11 08:04:13|install|systemd|244-3.1|x86_64||repo-oss|0847cf87b5bd749d9af214a379b9a9a608c78a140d6548f086878558bf6800ec|
The log around the second crash:
2020-03-20T23:59:01.674547+01:00 anemoi cron[21067]: pam_unix(crond:session): session opened for user xslaby(uid=500) by (uid=0)
2020-03-20T23:59:01.684756+01:00 anemoi CRON[21096]: (xslaby) CMD (/home/xslaby/teco_scripts/rrdfiller_barak)
2020-03-20T23:59:41.991688+01:00 anemoi CRON[21067]: pam_unix(crond:session): session closed for user xslaby
2020-03-20T23:59:42.003397+01:00 anemoi systemd[1]: session-c13913.scope: Succeeded.
2020-03-20T23:59:52.094168+01:00 anemoi systemd[1]: Stopping User Manager for UID 500...
2020-03-20T23:59:52.096021+01:00 anemoi systemd[21086]: Stopped target Main User Target.
2020-03-20T23:59:52.096309+01:00 anemoi systemd[21086]: Stopped target Basic System.
2020-03-20T23:59:52.096531+01:00 anemoi systemd[21086]: Stopped target Paths.
2020-03-20T23:59:52.096743+01:00 anemoi systemd[21086]: Stopped target Sockets.
2020-03-20T23:59:52.097002+01:00 anemoi systemd[21086]: Stopped target Timers.
2020-03-20T23:59:52.097217+01:00 anemoi systemd[21086]: dbus.socket: Succeeded.
2020-03-20T23:59:52.097418+01:00 anemoi systemd[21086]: Closed D-Bus User Message Bus Socket.
2020-03-20T23:59:52.097625+01:00 anemoi systemd[21086]: pulseaudio.socket: Succeeded.
2020-03-20T23:59:52.097807+01:00 anemoi systemd[21086]: Closed Sound System.
2020-03-20T23:59:52.098002+01:00 anemoi systemd[21086]: Reached target Shutdown.
2020-03-20T23:59:52.098192+01:00 anemoi systemd[21086]: systemd-exit.service: Succeeded.
2020-03-20T23:59:52.098370+01:00 anemoi systemd[21086]: Started Exit the Session.
2020-03-20T23:59:52.098556+01:00 anemoi systemd[21086]: Reached target Exit the Session.
2020-03-20T23:59:52.103206+01:00 anemoi systemd: pam_unix(systemd-user:session): session closed for user xslaby
2020-03-20T23:59:52.103444+01:00 anemoi systemd[1]: user@500.service: Succeeded.
2020-03-20T23:59:52.104118+01:00 anemoi systemd[1]: Stopped User Manager for UID 500.
2020-03-20T23:59:52.106812+01:00 anemoi systemd[1]: Stopping User Runtime Directory /run/user/500...
2020-03-20T23:59:52.124967+01:00 anemoi systemd[1]: run-user-500.mount: Succeeded.
2020-03-20T23:59:52.125090+01:00 anemoi systemd[1]: user-runtime-dir@500.service: Succeeded.
2020-03-20T23:59:52.125162+01:00 anemoi systemd[1]: Stopped User Runtime Directory /run/user/500.
2020-03-20T23:59:52.125246+01:00 anemoi systemd[1]: Removed slice User Slice of UID 500.
2020-03-21T00:00:01.943075+01:00 anemoi systemd[1]: Starting Log analyzer and reporter...
2020-03-21T00:00:01.944874+01:00 anemoi kernel: [834281.709961] traps: systemd[1] general protection fault ip:561c9988d324 sp:7fff065993d8 error:0 in systemd[561c997f5000+b4000]
2020-03-21T00:00:01.961209+01:00 anemoi systemd[1]: Caught <SEGV>, dumped core as pid 21138.
2020-03-21T00:00:01.961504+01:00 anemoi systemd[1]: Freezing execution.
2020-03-21T00:00:02.102072+01:00 anemoi systemd-logind[879]: Failed to start user service 'user@500.service', ignoring: Message recipient disconnected from message bus without replying
2020-03-21T00:00:27.126686+01:00 anemoi systemd-logind[879]: Failed to start session scope session-c13914.scope: Connection timed out
Some gdb output from the core file:
(gdb) where
#0 0x00007f6c35ade247 in kill () at ../sysdeps/unix/syscall-template.S:78
#1 0x0000561c998a898f in crash (sig=11) at ../src/core/main.c:212
#2 <signal handler called>
#3 0x0000561c9988d324 in exec_status_reset (s=0x200000000000010) at ../src/core/execute.c:4953
#4 exec_command_reset_status_list_array (c=<optimized out>, n=7) at ../src/core/execute.c:4143
#5 0x0000561c99863755 in service_start (u=0x561c9a4b1b40) at ../src/core/service.c:2493
#6 0x0000561c99888b44 in unit_start (u=0x561c9a4b1b40) at ../src/core/unit.h:613
#7 job_perform_on_unit (j=0x7fff06599498) at ../src/core/job.c:615
#8 0x0000561c99870e3a in job_run_and_invalidate (j=<optimized out>) at ../src/core/job.c:682
#9 manager_dispatch_run_queue (source=<optimized out>, userdata=<optimized out>) at ../src/core/manager.c:2155
#10 manager_dispatch_run_queue (source=<optimized out>, userdata=0x561c9a24ee30) at ../src/core/manager.c:2144
#11 0x00007f6c358e8f42 in source_dispatch (s=s@entry=0x561c9a24f730) at ../src/libsystemd/sd-event/sd-event.c:2868
#12 0x00007f6c358e9361 in sd_event_dispatch (e=e@entry=0x561c9a24f480) at ../src/libsystemd/sd-event/sd-event.c:3243
#13 0x00007f6c358e9528 in sd_event_run (e=0x561c9a24f480, timeout=18446744073709551615) at ../src/libsystemd/sd-event/sd-event.c:3301
#14 0x0000561c998a77e0 in manager_loop (m=0x561c9a24ee30) at ../src/core/manager.c:2921
#15 invoke_main_loop (m=0x561c9a24ee30, saved_rlimit_nofile=<optimized out>, saved_rlimit_memlock=<optimized out>, ret_reexecute=<optimized out>, ret_retval=<optimized out>, ret_shutdown_verb=<optimized out>, ret_fds=<optimized out>, ret_switch_root_dir=<optimized out>, ret_switch_root_init=<optimized out>, ret_error_message=<optimized out>) at ../src/core/main.c:1721
#16 0x0000561c997fb715 in main (argc=<optimized out>, argv=<optimized out>) at ../src/core/main.c:2684
(gdb) up #2 <signal handler called> ... (gdb) #5 0x0000561c99863755 in service_start (u=0x561c9a4b1b40) at ../src/core/service.c:2493 2493 exec_command_reset_status_list_array(s->exec_command, _SERVICE_EXEC_COMMAND_MAX); (gdb) p *s $8 = {meta = {manager = 0x561c9a24ee30, type = UNIT_SERVICE, load_state = UNIT_LOADED, merged_into = 0x0, id = 0x561c9a252790 "unbound-anchor.service", instance = 0x0, names = 0x561c9a337248, dependencies = {0x561c9a24dc30, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x561c9a2d06c8, 0x0, 0x561c9a2d1df8, 0x561c9a33cf08, 0x0, 0x0, 0x561c9a2a0458, 0x0, 0x0, 0x0, 0x561c9a3337e0, 0x561c9a2d40c0}, requires_mounts_for = 0x0, description = 0x561c9a357ae0 "update of the root trust anchor for DNSSEC validation in unbound", documentation = 0x561c9a357c90, fragment_path = 0x561c9a485100 "/usr/lib/systemd/system/unbound-anchor.service", source_path = 0x0, dropin_paths = 0x0, fragment_mtime = 1543944129000000, source_mtime = 0, dropin_mtime = 0, transient_file = 0x0, job = 0x561c9a41bed0, nop_job = 0x0, match_bus_slot = 0x0, get_name_owner_slot = 0x0, bus_track = 0x0, deserialized_refs = 0x0, job_timeout = 18446744073709551615, job_running_timeout = 18446744073709551615, job_running_timeout_set = false, job_timeout_action = EMERGENCY_ACTION_NONE, job_timeout_reboot_arg = 0x0, refs_by_target = 0x0, conditions = 0x0, asserts = 0x0, condition_timestamp = {realtime = 1584745201942002, monotonic = 834281445987}, assert_timestamp = {realtime = 1584745201942010, monotonic = 834281445988}, state_change_timestamp = {realtime = 1584658801280390, monotonic = 747880784368}, inactive_exit_timestamp = {realtime = 1584658801154723, monotonic = 747880658701}, active_enter_timestamp = { realtime = 0, monotonic = 0}, active_exit_timestamp = {realtime = 0, monotonic = 0}, inactive_enter_timestamp = {realtime = 1584658801280390, monotonic = 747880784368}, slice = {source = 0x561c9a4b1b40, target = 0x561c9a256ed0, refs_by_target_next = 0x561c9a356718, refs_by_target_prev 
= 0x561c9a281668}, units_by_type_next = 0x561c9a356510, units_by_type_prev = 0x561c9a281460, load_queue_next = 0x0, load_queue_prev = 0x0, dbus_queue_next = 0x561c9a30a2e0, dbus_queue_prev = 0x0, cleanup_queue_next = 0x0, cleanup_queue_prev = 0x0, gc_queue_next = 0x0, gc_queue_prev = 0x0, cgroup_realize_queue_next = 0x0, cgroup_realize_queue_prev = 0x0, cgroup_empty_queue_next = 0x0, cgroup_empty_queue_prev = 0x0, cgroup_oom_queue_next = 0x0, cgroup_oom_queue_prev = 0x0, target_deps_queue_next = 0x0, target_deps_queue_prev = 0x0, stop_when_unneeded_queue_next = 0x0, stop_when_unneeded_queue_prev = 0x0, pids = 0x0, sigchldgen = 53009, notifygen = 0, gc_marker = 7933050, load_error = 0, start_ratelimit = {interval = 10000000, burst = 5, num = 1, begin = 834281445990}, start_limit_action = EMERGENCY_ACTION_NONE, success_action = EMERGENCY_ACTION_NONE, failure_action = EMERGENCY_ACTION_NONE, success_action_exit_status = -1, failure_action_exit_status = -1, reboot_arg = 0x0, auto_stop_ratelimit = {interval = 10000000, burst = 16, num = 0, begin = 0}, ref_uid = 4294967295, ref_gid = 4294967295, unit_file_state = _UNIT_FILE_STATE_INVALID, unit_file_preset = -1, cpu_usage_base = 0, cpu_usage_last = 18446744073709551615, oom_kill_last = 0, io_accounting_base = {0, 0, 0, 0}, io_accounting_last = {18446744073709551615, 18446744073709551615, 18446744073709551615, 18446744073709551615}, cgroup_path = 0x0, cgroup_realized_mask = 0, cgroup_enabled_mask = 0, cgroup_invalidated_mask = 0, cgroup_members_mask = 0, cgroup_control_inotify_wd = -1, cgroup_memory_inotify_wd = -1, bpf_device_control_installed = 0x0, ip_accounting_ingress_map_fd = -1, ip_accounting_egress_map_fd = -1, ipv4_allow_map_fd = -1, ipv6_allow_map_fd = -1, ipv4_deny_map_fd = -1, ipv6_deny_map_fd = -1, ip_bpf_ingress = 0x0, ip_bpf_ingress_installed = 0x0, ip_bpf_egress = 0x0, ip_bpf_egress_installed = 0x0, ip_bpf_custom_ingress = 0x0, ip_bpf_custom_ingress_installed = 0x0, ip_bpf_custom_egress = 0x0, 
ip_bpf_custom_egress_installed = 0x0, ip_accounting_extra = {0, 0, 0, 0}, rewatch_pids_event_source = 0x0, on_failure_job_mode = JOB_REPLACE, collect_mode = COLLECT_INACTIVE, invocation_id = {bytes = "\257P~\220\066?G\037\227\226\250\312hV\247\217", qwords = {2253839642107203759, 10351337276611008151}}, invocation_id_string = "af507e90363f471f9796a8ca6856a78f", stop_when_unneeded = false, default_dependencies = true, refuse_manual_start = false, refuse_manual_stop = false, allow_isolate = false, ignore_on_isolate = false, condition_result = true, assert_result = true, transient = false, perpetual = false, in_load_queue = false, in_dbus_queue = true, in_cleanup_queue = false, in_gc_queue = false, in_cgroup_realize_queue = false, in_cgroup_empty_queue = false, in_cgroup_oom_queue = false, in_target_deps_queue = false, in_stop_when_unneeded_queue = false, sent_dbus_new_signal = true, in_audit = false, on_console = false, cgroup_realized = false, cgroup_members_mask_valid = true, reset_accounting = false, start_limit_hit = false, coldplugged = true, bus_track_add = false, exported_invocation_id = false, exported_log_level_max = false, exported_log_extra_fields = false, exported_log_ratelimit_interval = false, exported_log_ratelimit_burst = false, warned_clamping_cpu_quota_period = false, last_section_private = -1}, type = SERVICE_ONESHOT, restart = SERVICE_RESTART_NO, restart_prevent_status = {status = {bitmaps = 0x0, n_bitmaps = 0, bitmaps_allocated = 0}, signal = {bitmaps = 0x0, n_bitmaps = 0, bitmaps_allocated = 0}}, restart_force_status = {status = {bitmaps = 0x0, n_bitmaps = 0, bitmaps_allocated = 0}, signal = {bitmaps = 0x0, n_bitmaps = 0, bitmaps_allocated = 0}}, success_status = {status = {bitmaps = 0x561c9a3ae250, n_bitmaps = 1, bitmaps_allocated = 9}, signal = {bitmaps = 0x0, n_bitmaps = 0, bitmaps_allocated = 0}}, pid_file = 0x0, restart_usec = 100000, timeout_start_usec = 18446744073709551615, timeout_stop_usec = 90000000, timeout_abort_usec = 90000000, 
timeout_abort_set = false, runtime_max_usec = 18446744073709551615, watchdog_timestamp = {realtime = 0, monotonic = 0}, watchdog_usec = 0, watchdog_original_usec = 0, watchdog_override_usec = 18446744073709551615, watchdog_override_enable = false, watchdog_event_source = 0x0, exec_command = {0x200000000000000, 0x0, 0x561c9a282960, 0x0, 0x0, 0x0, 0x0}, exec_context = {environment = 0x0, environment_files = 0x0, pass_environment = 0x0, unset_environment = 0x0, rlimit = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x561c9a2cdf00, 0x561c9a357d30, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, working_directory = 0x0, root_directory = 0x0, root_image = 0x0, working_directory_missing_ok = false, working_directory_home = false, oom_score_adjust_set = false, nice_set = false, ioprio_set = false, cpu_sched_set = false, same_pgrp = false, cpu_sched_reset_on_fork = false, non_blocking = false, umask = 18, oom_score_adjust = 0, nice = 0, ioprio = 16384, cpu_sched_policy = 0, cpu_sched_priority = 0, cpu_set = {set = 0x0, allocated = 0}, numa_policy = {type = -1, nodes = {set = 0x0, allocated = 0}}, std_input = EXEC_INPUT_NULL, std_output = EXEC_OUTPUT_JOURNAL, std_error = EXEC_OUTPUT_INHERIT, stdio_as_fds = false, stdio_fdname = {0x0, 0x0, 0x0}, stdio_file = {0x0, 0x0, 0x0}, stdin_data = 0x0, stdin_data_size = 0, timer_slack_nsec = 18446744073709551615, tty_path = 0x0, tty_reset = false, tty_vhangup = false, tty_vt_disallocate = false, ignore_sigpipe = true, keyring_mode = EXEC_KEYRING_PRIVATE, user = 0x561c9a3ae0c0 "unbound", group = 0x0, supplementary_groups = 0x0, pam_name = 0x0, utmp_id = 0x0, utmp_mode = EXEC_UTMP_INIT, no_new_privileges = false, selinux_context_ignore = false, apparmor_profile_ignore = false, smack_process_label_ignore = false, selinux_context = 0x0, apparmor_profile = 0x0, smack_process_label = 0x0, read_write_paths = 0x0, read_only_paths = 0x0, inaccessible_paths = 0x0, mount_flags = 0, bind_mounts = 0x0, n_bind_mounts = 0, temporary_filesystems = 0x0, 
n_temporary_filesystems = 0, capability_bounding_set = 18446744073709551615, capability_ambient_set = 0, secure_bits = 0, syslog_priority = 30, syslog_level_prefix = true, syslog_identifier = 0x0, log_extra_fields = 0x0, n_log_extra_fields = 0, log_ratelimit_interval_usec = 0, log_ratelimit_burst = 0, log_level_max = -1, private_tmp = false, private_network = false, private_devices = false, private_users = false, private_mounts = false, protect_kernel_tunables = false, protect_kernel_modules = false, protect_kernel_logs = false, protect_control_groups = false, protect_system = PROTECT_SYSTEM_NO, protect_home = PROTECT_HOME_NO, protect_hostname = false, mount_apivfs = false, dynamic_user = false, remove_ipc = false, memory_deny_write_execute = false, restrict_realtime = false, restrict_suid_sgid = false, lock_personality = false, personality = 4294967295, restrict_namespaces = 18446744073709551615, syscall_filter = 0x0, syscall_archs = 0x0, syscall_errno = 0, syscall_whitelist = false, address_families_whitelist = false, address_families = 0x0, network_namespace_path = 0x0, directories = {{paths = 0x0, mode = 493}, {paths = 0x0, mode = 493}, {paths = 0x0, mode = 493}, {paths = 0x0, mode = 493}, {paths = 0x0, mode = 493}}, runtime_directory_preserve_mode = EXEC_PRESERVE_NO, timeout_clean_usec = 18446744073709551615}, kill_context = {kill_mode = KILL_CONTROL_GROUP, kill_signal = 15, restart_kill_signal = 0, final_kill_signal = 9, watchdog_signal = 6, send_sigkill = true, send_sighup = false}, cgroup_context = {cpu_accounting = false, io_accounting = false, blockio_accounting = false, memory_accounting = true, tasks_accounting = true, ip_accounting = false, memory_oom_group = false, delegate = false, delegate_controllers = 0, disable_controllers = 0, cpu_weight = 18446744073709551615, startup_cpu_weight = 18446744073709551615, cpu_quota_per_sec_usec = 18446744073709551615, cpu_quota_period_usec = 18446744073709551615, cpuset_cpus = {set = 0x0, allocated = 0}, 
cpuset_mems = {set = 0x0, allocated = 0}, io_weight = 18446744073709551615, startup_io_weight = 18446744073709551615, io_device_weights = 0x0, io_device_limits = 0x0, io_device_latencies = 0x0, default_memory_min = 0, default_memory_low = 0, memory_min = 0, memory_low = 0, memory_high = 18446744073709551615, memory_max = 18446744073709551615, memory_swap_max = 18446744073709551615, default_memory_min_set = false, default_memory_low_set = false, memory_min_set = false, memory_low_set = false, ip_address_allow = 0x0, ip_address_deny = 0x0, ip_filters_ingress = 0x0, ip_filters_egress = 0x0, cpu_shares = 18446744073709551615, startup_cpu_shares = 18446744073709551615, blockio_weight = 18446744073709551615, startup_blockio_weight = 18446744073709551615, blockio_device_weights = 0x0, blockio_device_bandwidths = 0x0, memory_limit = 18446744073709551615, device_policy = CGROUP_DEVICE_POLICY_AUTO, device_allow = 0x0, tasks_max = { value = 15, scale = 100}}, state = SERVICE_DEAD, deserialized_state = SERVICE_DEAD, main_exec_status = {start_timestamp = {realtime = 1584658801154309, monotonic = 747880658287}, exit_timestamp = { realtime = 1584658801280220, monotonic = 747880784197}, pid = 29072, code = 1, status = 0}, control_command = 0x0, main_command = 0x0, control_command_id = _SERVICE_EXEC_COMMAND_INVALID, exec_runtime = 0x0, dynamic_creds = { user = 0x0, group = 0x0}, main_pid = 0, control_pid = 0, socket_fd = -1, peer = 0x0, socket_fd_selinux_context_net = false, permissions_start_only = false, root_directory_start_only = false, remain_after_exit = false, guess_main_pid = true, result = SERVICE_SUCCESS, reload_result = SERVICE_SUCCESS, clean_result = SERVICE_SUCCESS, main_pid_known = false, main_pid_alien = false, bus_name_good = false, forbid_restart = false, will_auto_restart = false, start_timeout_defined = false, exec_fd_hot = false, bus_name = 0x0, bus_name_owner = 0x561c9a493250 "yes", status_text = 0x0, status_errno = 0, accept_socket = {source = 0x0, target = 
0x0, refs_by_target_next = 0x0, refs_by_target_prev = 0x0}, timer_event_source = 0x0, pid_file_pathspec = 0x0, notify_access = NOTIFY_NONE, notify_state = NOTIFY_UNKNOWN, exec_fd_event_source = 0x0, fd_store = 0x0, n_fd_store = 0, n_fd_store_max = 0, n_keep_fd_store = 0, usb_function_descriptors = 0x0, usb_function_strings = 0x0, stdin_fd = -1, stdout_fd = -1, stderr_fd = -1, n_restarts = 0, flush_n_restarts = true, oom_policy = OOM_STOP} (gdb) p *u $9 = {manager = 0x561c9a24ee30, type = UNIT_SERVICE, load_state = UNIT_LOADED, merged_into = 0x0, id = 0x561c9a252790 "unbound-anchor.service", instance = 0x0, names = 0x561c9a337248, dependencies = {0x561c9a24dc30, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x561c9a2d06c8, 0x0, 0x561c9a2d1df8, 0x561c9a33cf08, 0x0, 0x0, 0x561c9a2a0458, 0x0, 0x0, 0x0, 0x561c9a3337e0, 0x561c9a2d40c0}, requires_mounts_for = 0x0, description = 0x561c9a357ae0 "update of the root trust anchor for DNSSEC validation in unbound", documentation = 0x561c9a357c90, fragment_path = 0x561c9a485100 "/usr/lib/systemd/system/unbound-anchor.service", source_path = 0x0, dropin_paths = 0x0, fragment_mtime = 1543944129000000, source_mtime = 0, dropin_mtime = 0, transient_file = 0x0, job = 0x561c9a41bed0, nop_job = 0x0, match_bus_slot = 0x0, get_name_owner_slot = 0x0, bus_track = 0x0, deserialized_refs = 0x0, job_timeout = 18446744073709551615, job_running_timeout = 18446744073709551615, job_running_timeout_set = false, job_timeout_action = EMERGENCY_ACTION_NONE, job_timeout_reboot_arg = 0x0, refs_by_target = 0x0, conditions = 0x0, asserts = 0x0, condition_timestamp = {realtime = 1584745201942002, monotonic = 834281445987}, assert_timestamp = {realtime = 1584745201942010, monotonic = 834281445988}, state_change_timestamp = {realtime = 1584658801280390, monotonic = 747880784368}, inactive_exit_timestamp = {realtime = 1584658801154723, monotonic = 747880658701}, active_enter_timestamp = { realtime = 0, monotonic = 0}, active_exit_timestamp = {realtime = 0, 
monotonic = 0}, inactive_enter_timestamp = {realtime = 1584658801280390, monotonic = 747880784368}, slice = {source = 0x561c9a4b1b40, target = 0x561c9a256ed0, refs_by_target_next = 0x561c9a356718, refs_by_target_prev = 0x561c9a281668}, units_by_type_next = 0x561c9a356510, units_by_type_prev = 0x561c9a281460, load_queue_next = 0x0, load_queue_prev = 0x0, dbus_queue_next = 0x561c9a30a2e0, dbus_queue_prev = 0x0, cleanup_queue_next = 0x0, cleanup_queue_prev = 0x0, gc_queue_next = 0x0, gc_queue_prev = 0x0, cgroup_realize_queue_next = 0x0, cgroup_realize_queue_prev = 0x0, cgroup_empty_queue_next = 0x0, cgroup_empty_queue_prev = 0x0, cgroup_oom_queue_next = 0x0, cgroup_oom_queue_prev = 0x0, target_deps_queue_next = 0x0, target_deps_queue_prev = 0x0, stop_when_unneeded_queue_next = 0x0, stop_when_unneeded_queue_prev = 0x0, pids = 0x0, sigchldgen = 53009, notifygen = 0, gc_marker = 7933050, load_error = 0, start_ratelimit = {interval = 10000000, burst = 5, num = 1, begin = 834281445990}, start_limit_action = EMERGENCY_ACTION_NONE, success_action = EMERGENCY_ACTION_NONE, failure_action = EMERGENCY_ACTION_NONE, success_action_exit_status = -1, failure_action_exit_status = -1, reboot_arg = 0x0, auto_stop_ratelimit = {interval = 10000000, burst = 16, num = 0, begin = 0}, ref_uid = 4294967295, ref_gid = 4294967295, unit_file_state = _UNIT_FILE_STATE_INVALID, unit_file_preset = -1, cpu_usage_base = 0, cpu_usage_last = 18446744073709551615, oom_kill_last = 0, io_accounting_base = {0, 0, 0, 0}, io_accounting_last = {18446744073709551615, 18446744073709551615, 18446744073709551615, 18446744073709551615}, cgroup_path = 0x0, cgroup_realized_mask = 0, cgroup_enabled_mask = 0, cgroup_invalidated_mask = 0, cgroup_members_mask = 0, cgroup_control_inotify_wd = -1, cgroup_memory_inotify_wd = -1, bpf_device_control_installed = 0x0, ip_accounting_ingress_map_fd = -1, ip_accounting_egress_map_fd = -1, ipv4_allow_map_fd = -1, ipv6_allow_map_fd = -1, ipv4_deny_map_fd = -1, ipv6_deny_map_fd = -1, 
ip_bpf_ingress = 0x0, ip_bpf_ingress_installed = 0x0, ip_bpf_egress = 0x0, ip_bpf_egress_installed = 0x0, ip_bpf_custom_ingress = 0x0, ip_bpf_custom_ingress_installed = 0x0, ip_bpf_custom_egress = 0x0, ip_bpf_custom_egress_installed = 0x0, ip_accounting_extra = {0, 0, 0, 0}, rewatch_pids_event_source = 0x0, on_failure_job_mode = JOB_REPLACE, collect_mode = COLLECT_INACTIVE, invocation_id = {bytes = "\257P~\220\066?G\037\227\226\250\312hV\247\217", qwords = {2253839642107203759, 10351337276611008151}}, invocation_id_string = "af507e90363f471f9796a8ca6856a78f", stop_when_unneeded = false, default_dependencies = true, refuse_manual_start = false, refuse_manual_stop = false, allow_isolate = false, ignore_on_isolate = false, condition_result = true, assert_result = true, transient = false, perpetual = false, in_load_queue = false, in_dbus_queue = true, in_cleanup_queue = false, in_gc_queue = false, in_cgroup_realize_queue = false, in_cgroup_empty_queue = false, in_cgroup_oom_queue = false, in_target_deps_queue = false, in_stop_when_unneeded_queue = false, sent_dbus_new_signal = true, in_audit = false, on_console = false, cgroup_realized = false, cgroup_members_mask_valid = true, reset_accounting = false, start_limit_hit = false, coldplugged = true, bus_track_add = false, exported_invocation_id = false, exported_log_level_max = false, exported_log_extra_fields = false, exported_log_ratelimit_interval = false, exported_log_ratelimit_burst = false, warned_clamping_cpu_quota_period = false, last_section_private = -1}
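The realtime fields in the dump above are microseconds since the Unix epoch, so they can be lined up with the syslog timestamps. The following is only a reading aid, not part of the original report: the constant names and the helper `usec_to_local` are made up for illustration, and the fixed +01:00 offset is taken from the log lines.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the gdb dump of the unit (microseconds since the Unix epoch).
CONDITION_REALTIME_USEC = 1584745201942002       # condition_timestamp.realtime
INACTIVE_ENTER_REALTIME_USEC = 1584658801280390  # inactive_enter_timestamp.realtime

# Assumption: a fixed +01:00 offset, as printed in the syslog lines.
CET = timezone(timedelta(hours=1))


def usec_to_local(usec: int, tz: timezone = CET) -> datetime:
    """Convert a microsecond epoch timestamp to a timezone-aware datetime."""
    seconds, micros = divmod(usec, 1_000_000)
    return datetime.fromtimestamp(seconds, tz).replace(microsecond=micros)


if __name__ == "__main__":
    # Time at which the start conditions were checked for the crashing start job.
    print(usec_to_local(CONDITION_REALTIME_USEC).isoformat())
    # Time at which the unit last went inactive.
    print(usec_to_local(INACTIVE_ENTER_REALTIME_USEC).isoformat())
```

The first value is 2020-03-21T00:00:01.942002+01:00, right before the "general protection fault" line in the log. The second is 2020-03-20T00:00:01.280390+01:00, so the unit last went inactive about 24 hours before the crash.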
-- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.suse.com/show_bug.cgi?id=1167575#c1

Franck Bui <fbui@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fbui@suse.com

--- Comment #1 from Franck Bui <fbui@suse.com> ---
Is this reproducible on your side?
http://bugzilla.suse.com/show_bug.cgi?id=1167575#c2

--- Comment #2 from Jiri Slaby <jslaby@suse.com> ---
(In reply to Franck Bui from comment #1)
> Is this reproducible on your side ?

I don't know how. It happened at midnight, so maybe some cron job or timer is involved. But it happens irregularly (twice so far), as you can see. So how can I try to reproduce it?

BTW, the first crash happened during an update, but I have no crash file from that one:
2020-03-11T07:29:37.229543+01:00 anemoi [RPM][26567]: Transaction ID 5e688551 started
2020-03-11T07:29:37.248027+01:00 anemoi [RPM][26567]: erase qemu-ipxe-1.0.0+-1.1.noarch: success
2020-03-11T07:29:37.398190+01:00 anemoi [RPM][26567]: install qemu-ipxe-1.0.0+-5.1.noarch: success
2020-03-11T07:29:37.403529+01:00 anemoi [RPM][26567]: erase qemu-ipxe-1.0.0+-1.1.noarch: success
2020-03-11T07:29:37.409587+01:00 anemoi [RPM][26567]: install qemu-ipxe-1.0.0+-5.1.noarch: success
2020-03-11T07:29:37.409771+01:00 anemoi [RPM][26567]: Transaction ID 5e688551 finished: 0
2020-03-11T07:29:37.448399+01:00 anemoi [RPM][26568]: Transaction ID 5e688551 started
2020-03-11T07:29:37.450739+01:00 anemoi [RPM][26568]: erase qemu-ksm-4.2.0-1.1.x86_64: success
2020-03-11T07:29:37.485830+01:00 anemoi systemd[1]: Reloading.
2020-03-11T07:29:37.499113+01:00 anemoi kernel: [7039425.326536] traps: systemd[1] general protection fault ip:55d50417777b sp:7fff9396e590 error:0 in systemd[55d504118000+b2000]
2020-03-11T07:29:37.522945+01:00 anemoi systemd[1]: Caught <SEGV>, dumped core as pid 26573.
2020-03-11T07:29:37.523153+01:00 anemoi kernel: [7039425.353028] printk: systemd: 61 output lines suppressed due to ratelimiting
2020-03-11T07:29:37.523798+01:00 anemoi systemd[1]: Freezing execution.
2020-03-11T07:29:37.607147+01:00 anemoi [RPM][26568]: install qemu-ksm-4.2.0-5.1.x86_64: success
qemu-ksm postuninstall does:
/usr/bin/systemctl daemon-reload
and then
/usr/bin/systemctl try-restart ksm.service
http://bugzilla.suse.com/show_bug.cgi?id=1167575#c3

--- Comment #3 from Franck Bui <fbui@suse.com> ---
(In reply to Jiri Slaby from comment #2)
> qemu-ksm postuninstall does:
> /usr/bin/systemctl daemon-reload
> and then
> /usr/bin/systemctl try-restart ksm.service

Did you try to run these 2 commands manually?
http://bugzilla.suse.com/show_bug.cgi?id=1167575#c4

--- Comment #4 from Jiri Slaby <jslaby@suse.com> ---
(In reply to Franck Bui from comment #3)
> (In reply to Jiri Slaby from comment #2)
> > qemu-ksm postuninstall does:
> > /usr/bin/systemctl daemon-reload
> > and then
> > /usr/bin/systemctl try-restart ksm.service
>
> Did you try to run these 2 commands manually ?

Yes, nothing bad happened. The log showed "systemd[1]: Reloading" as above, but no crash.
http://bugzilla.suse.com/show_bug.cgi?id=1167575#c5

--- Comment #5 from Franck Bui <fbui@suse.com> ---
(In reply to Jiri Slaby from comment #0)
> (gdb) p *s
> [...]
> exec_command = {0x200000000000000, 0x0, 0x561c9a282960, 0x0, 0x0, 0x0, 0x0},

I checked the code, but I currently cannot see where the bogus value "0x200000000000000" could come from.

It doesn't look random, though: it looks like the value was set to 0 but one bit was not cleared.

It might be interesting to see whether the other crashes show the same wrong value at the same location.

Maybe you could test the system RAM. Otherwise, without a reproducer, I'm running out of ideas.
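The "one bit was not cleared" observation can be checked mechanically. The sketch below is only an illustration with made-up helper names (`is_single_bit`, `is_canonical_x86_64`), and it assumes 48-bit virtual addresses (4-level paging) on x86-64. It tests whether the value has exactly one bit set and whether it is a canonical address. Dereferencing a non-canonical address on x86-64 raises a general protection fault rather than a page fault, which would be consistent with the "general protection fault" line in the kernel log.

```python
# Values copied from the gdb output in this report.
BOGUS = 0x200000000000000          # exec_command[0]
FAULTING_ARG = 0x200000000000010   # s= argument of exec_status_reset() in frame #3
GOOD_POINTER = 0x561c9a282960      # exec_command[2], a plausible heap pointer


def is_single_bit(value: int) -> bool:
    """True if exactly one bit is set (the value is a power of two)."""
    return value != 0 and value & (value - 1) == 0


def is_canonical_x86_64(addr: int, va_bits: int = 48) -> bool:
    """True if bits 63..(va_bits - 1) are all equal.

    Assumption: va_bits=48, i.e. 4-level paging.
    """
    top = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


if __name__ == "__main__":
    print("single bit set:", is_single_bit(BOGUS))
    print("bit index:", BOGUS.bit_length() - 1)
    print("bogus value canonical:", is_canonical_x86_64(BOGUS))
    print("faulting argument canonical:", is_canonical_x86_64(FAULTING_ARG))
    print("good pointer canonical:", is_canonical_x86_64(GOOD_POINTER))
```

With the values from the dump, the bogus entry has exactly one bit set (bit 57), and neither it nor the faulting argument is canonical, whereas the neighbouring heap pointer is.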
http://bugzilla.suse.com/show_bug.cgi?id=1167575#c6

Jiri Slaby <jslaby@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|Basesystem                  |Kernel
           Assignee|systemd-maintainers@suse.de |kernel-maintainers@forge.provo.novell.com

--- Comment #6 from Jiri Slaby <jslaby@suse.com> ---
(In reply to Franck Bui from comment #5)
> (In reply to Jiri Slaby from comment #0)
> > (gdb) p *s
> > [...]
> > exec_command = {0x200000000000000, 0x0, 0x561c9a282960, 0x0, 0x0, 0x0, 0x0},
>
> I checked the code but I cannot see currently where the bogus value
> "0x200000000000000" could come from.
>
> It doesn't look random though, it looks like the value was set to 0 but one
> bit was not cleared.
>
> It might be interesting to see if the other crashes show the same wrong value
> at the same location.
>
> Maybe you could try to test the system RAM... otherwise without a reproducer
> I'm running out of ideas.
Yesterday, udev crashed while parsing a udev rules file.

#0 0x000055e5889a3bf4 in udev_rules_apply_to_event (rules=0x55e58a2d92e0, event=0x55e58a22c8d0, timeout_usec=180000000, properties_list=0x0) at ../src/udev/udev-rules.c:2268
2268 LIST_FOREACH_SAFE(rule_lines, file->current_line, next_line, file->rule_lines) {

Iterating through the file->rule_lines list in gdb:

{line = 0x55e58a139c60 "SUBSYSTEM", line_number = 65, type = LINE_UPDATE_SOMETHING, ... rule_lines_next = 0x55e58a139e90, rule_lines_prev = 0x55e58a139ae0}
{line = 0x55e58a139e00 "SUBSYSTEM", line_number = 66, type = LINE_UPDATE_SOMETHING, ... rule_lines_next = 0x55e58a13a090, rule_lines_prev = 0x55e58a139ce0}
{line = 0x55e58a13a010 "SUBSYSTEM", line_number = 67, type = LINE_UPDATE_SOMETHING, ... rule_lines_next = 0x20055e58a13a240, rule_lines_prev = 0x55e58a139e90}

Look at the last rule_lines_next: 0x20055e58a13a240. It's 0x55e58a13a240 ORed with 0x200000000000000 again (the very same bit is flipped). So this is either bad RAM or the kernel corrupting memory.

Note that when I fix the address, it contains the next line:

(gdb) p *(UdevRuleLine *)0x55e58a13a240
{line = 0x55e58a13a1b0 "SUBSYSTEM", line_number = 68, type = LINE_UPDATE_SOMETHING, ... rule_lines_next = 0x55e58a13a440, rule_lines_prev = 0x55e58a13a090}

Let's leave this open for a while, until I run memtest to confirm or exclude a RAM failure.
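The comparison between the corrupted and the repaired pointer can be reproduced with a simple XOR. This is a minimal sketch, not something that was run as part of this report; `flipped_bits` and the constant names are hypothetical, and the values are copied from the gdb output.

```python
from typing import List

# Values copied from the gdb output in this report.
OBSERVED_NEXT = 0x20055e58a13a240   # rule_lines_next of line 67 as found in the core
EXPECTED_NEXT = 0x55e58a13a240      # address that actually holds line 68
EXEC_COMMAND_0 = 0x200000000000000  # bogus exec_command[0] from the systemd core


def flipped_bits(observed: int, expected: int) -> List[int]:
    """Return the indices of all bits that differ between the two values."""
    diff = observed ^ expected
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


if __name__ == "__main__":
    udev_bits = flipped_bits(OBSERVED_NEXT, EXPECTED_NEXT)
    # Assumption: the intended exec_command[0] value was NULL (0).
    systemd_bits = flipped_bits(EXEC_COMMAND_0, 0)
    print("udev pointer differs in bits:", udev_bits)
    print("systemd pointer differs in bits:", systemd_bits)
    # Clearing the flipped bit yields the address that was dereferenced
    # successfully in gdb.
    print(hex(OBSERVED_NEXT & ~(1 << udev_bits[0])))
```

Both crashes differ from the expected value in bit 57 only, and clearing that bit in the udev pointer gives 0x55e58a13a240.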
http://bugzilla.suse.com/show_bug.cgi?id=1167575#c7

Jiri Slaby <jslaby@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID

--- Comment #7 from Jiri Slaby <jslaby@suse.com> ---
The RAM is really failing. Sorry for the noise.