[Bug 1231756] New: xe driver unexpectedly does not support Arc A750 (56a1)
https://bugzilla.suse.com/show_bug.cgi?id=1231756

Bug ID: 1231756
Summary: xe driver unexpectedly does not support Arc A750 (56a1)
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: aarch64
OS: openSUSE Tumbleweed
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel:Drivers
Assignee: kernel-bugs@suse.de
Reporter: afaerber@suse.com
QA Contact: qa-bugs@suse.de
CC: jcheung@suse.com, marc.ruehrschneck@suse.com, mbrugger@suse.com, patrik.jakobsson@suse.com
Target Milestone: ---
Found By: ---
Blocker: ---

According to Intel's https://dgpu-docs.intel.com/devices/hardware-table.html the 8086:56A1 Arc A750 dGPU should be enabled since 6.2 and not need force_probe.

However, in Tumbleweed aarch64 with 6.11.3 kernel I got:

[ 12.232648] [ T661] xe 000d:03:00.0: Your graphics device 56a1 is not officially supported by xe driver in this kernel version. To force Xe probe, use xe.force_probe='56a1' and i915.force_probe='!56a1' module parameters or CONFIG_DRM_XE_FORCE_PROBE='56a1' and CONFIG_DRM_I915_FORCE_PROBE='!56a1' configuration options.

and when following those instructions,

[ 11.765712][ T577] Setting dangerous option force_probe - tainting kernel
[ 11.772912][ T577] xe 000d:03:00.0: Adding to iommu group 18
...
[ 11.968981][ T660] xe 000d:03:00.0: enabling device (0000 -> 0002)
...
[ 11.990262][ T660] xe 000d:03:00.0: [drm] Found DG2/G10 (device ID 56a1) display version 13.00
...
[ 12.062198][ T660] xe 000d:03:00.0: [drm] Using GuC firmware from i915/dg2_guc_70.bin version 70.29.2
...
[ 12.100456][ T660] xe 000d:03:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none
...
[ 12.100456][ T660] xe 000d:03:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none
[ 12.102321][ T705] BTRFS info (device nvme0n1p2): using free-space-tree
[ 12.111966][ T660] Unable to handle kernel paging request at virtual address ffffffffc08003cc
[ 12.127252][ T660] Mem abort info:
[ 12.130732][ T660] ESR = 0x0000000096000006
[ 12.135168][ T660] EC = 0x25: DABT (current EL), IL = 32 bits
[ 12.141167][ T660] SET = 0, FnV = 0
[ 12.144909][ T660] EA = 0, S1PTW = 0
[ 12.148739][ T660] FSC = 0x06: level 2 translation fault
[ 12.154304][ T660] Data abort info:
[ 12.157872][ T660] ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
[ 12.164045][ T660] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 12.169783][ T660] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 12.175782][ T660] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000080d66e08000
[ 12.183173][ T660] [ffffffffc08003cc] pgd=0000080d67c78003, p4d=0000080d67c78003, pud=0000080d67c79003, pmd=0000000000000000
[ 12.194477][ T660] Internal error: Oops: 0000000096000006 [#1] SMP
[ 12.200733][ T660] Modules linked in: xe(+) drm_ttm_helper ttm i2c_algo_bit aes_ce_blk gpu_sched aes_ce_cipher drm_buddy crct10dif_ce polyval_ce video polyval_generic drm_suballoc_helper xhci_pci drm_gpuvm ghash_ce xhci_pci_renesas drm_exec xhci_hcd gf128mul sm4 drm_display_helper nvme sha2_ce sha256_arm64 usbcore nvme_core cec sha1_ce sbsa_gwdt rc_core nvme_auth usb_common xgene_hwmon gpio_dwapb btrfs blake2b_generic libcrc32c xor xor_neon raid6_pq ip6_tables x_tables br_netfilter bridge stp llc efivarfs
[ 12.245579][ T660] CPU: 0 UID: 0 PID: 660 Comm: kworker/0:2 Tainted: G U 6.11.3-1-default #1 openSUSE Tumbleweed 1400000003000000474e5500fd56bd985baac2f4
[ 12.260779][ T660] Tainted: [U]=USER
[ 12.264429][ T660] Hardware name: ADLINK Ampere Altra Developer Platform/Ampere Altra Developer Platform, BIOS TianoCore 2.04.100.11 (SYS: 2.06.20220308) 10/05/2
[ 12.278930][ T660] Workqueue: events work_for_cpu_fn
[ 12.283975][ T660] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 12.291620][ T660] pc : logic_inb+0xc0/0x108
[ 12.295968][ T660] lr : intel_vga_reset_io_mem+0x38/0x68 [xe]
[ 12.302000][ T660] sp : ffff800083c7ba90
[ 12.305997][ T660] x29: ffff800083c7ba90 x28: ffffc97776d89408 x27: 0000000000000001
[ 12.313817][ T660] x26: ffff07ff955c4000 x25: ffff07ff8322f0c8 x24: ffffc97776dc0d70
[ 12.321636][ T660] x23: ffff07ff955c5760 x22: 0000000000045404 x21: 0000000000000000
[ 12.329455][ T660] x20: 00000000000003cc x19: ffff07ff8322f000 x18: ffffffffffffffff
[ 12.337274][ T660] x17: 2c6d656d2b6f693d x16: ffffc977a1414a68 x15: 6c6f203a6465676e
[ 12.345092][ T660] x14: 616863207365646f x13: 205d303636542020 x12: 5b5d363534303031
[ 12.352911][ T660] x11: 65646f6365642c6d x10: 656d2b6f693d7365 x9 : ffffc977a0d26250
[ 12.360730][ T660] x8 : 3a62726161677620 x7 : 205b5d3635343030 x6 : 00000000000000ff
[ 12.368549][ T660] x5 : 0000000000000000 x4 : 000000000000000a x3 : 0000000000000000
[ 12.376368][ T660] x2 : 0000000000000000 x1 : 0000000000ffbffe x0 : ffffffffc08003cc
[ 12.384187][ T660] Call trace:
[ 12.387316][ T660] logic_inb+0xc0/0x108
[ 12.391315][ T660] intel_vga_reset_io_mem+0x38/0x68 [xe 1400000003000000474e5500a641a49136d814c9]
[ 12.400545][ T660] hsw_power_well_enable+0x150/0x1d0 [xe 1400000003000000474e5500a641a49136d814c9]
[ 12.409854][ T660] intel_power_well_enable+0x74/0xa0 [xe 1400000003000000474e5500a641a49136d814c9]
[ 12.419156][ T660] intel_power_well_get+0x2c/0x40 [xe 1400000003000000474e5500a641a49136d814c9]
[ 12.428193][ T660] __intel_display_power_get_domain.part.0+0x78/0xc8 [xe 1400000003000000474e5500a641a49136d814c9]
[ 12.438876][ T660] intel_power_domains_init_hw+0x8c/0x300 [xe 1400000003000000474e5500a641a49136d814c9]
[ 12.448602][ T660] intel_display_driver_probe_noirq+0xa0/0x1f8 [xe 1400000003000000474e5500a641a49136d814c9]
[ 12.458761][ T660] xe_display_init_noirq+0x68/0xc8 [xe 1400000003000000474e5500a641a49136d814c9]
[ 12.467876][ T660] xe_device_probe+0x2c0/0x590 [xe 1400000003000000474e5500a641a49136d814c9]
[ 12.476647][ T660] xe_pci_probe+0x634/0x9b0 [xe 1400000003000000474e5500a641a49136d814c9]
[ 12.485151][ T660] local_pci_probe+0x48/0xc0
[ 12.489587][ T660] work_for_cpu_fn+0x24/0x40
[ 12.494019][ T660] process_one_work+0x174/0x418
[ 12.498713][ T660] worker_thread+0x2d4/0x3f8
[ 12.503146][ T660] kthread+0x118/0x130
[ 12.507058][ T660] ret_from_fork+0x10/0x20
[ 12.511318][ T660] Code: d65f03c0 929fffe0 f2b81000 8b000280 (39400000)
[ 12.518094][ T660] ---[ end trace 0000000000000000 ]---

Note: the i915 driver does not appear to be available on aarch64, only xe.

Is the Intel documentation wrong? Anything to check other than waiting for 6.12 packages to re-test?

--
You are receiving this mail because:
You are on the CC list for the bug.
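
For anyone cross-checking the "Mem abort info" lines in the oops above, here is a minimal Python sketch. It assumes the architectural ESR_ELx layout (EC in bits 31:26, IL in bit 25, ISS in bits 24:0; for data aborts, WnR is ISS bit 6 and the fault status code is ISS bits 5:0). The function name decode_data_abort_esr is made up for illustration and is not a kernel interface.

```python
def decode_data_abort_esr(esr: int) -> dict:
    """Split an AArch64 ESR value into the fields printed in the oops."""
    iss = esr & 0x1FFFFFF              # ISS: bits [24:0]
    return {
        "ec": (esr >> 26) & 0x3F,      # exception class: bits [31:26]
        "il": (esr >> 25) & 0x1,       # 1 = 32-bit instruction
        "iss": iss,
        "wnr": (iss >> 6) & 0x1,       # 0 = read access, 1 = write access
        "fsc": iss & 0x3F,             # fault status code: ISS bits [5:0]
    }


# Values copied from the log above.
fields = decode_data_abort_esr(0x0000000096000006)
fault_addr = 0xFFFFFFFFC08003CC        # "virtual address" in the oops, also x0
x20 = 0x3CC                            # register x20 at the time of the fault

# The low 16 bits of the faulting address equal x20.
low_bits_match = (fault_addr & 0xFFFF) == x20
# ASSUMPTION: treating the remainder as an I/O window base is an
# interpretation of the log, not something the log states.
assumed_base = fault_addr - x20

print({name: hex(value) for name, value in fields.items()})
print(low_bits_match, hex(assumed_base))
```

The decoded fields agree with what the kernel printed (EC = 0x25, IL set, WnR = 0, FSC = 0x06), i.e. a read that hit an unmapped address.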
https://bugzilla.suse.com/show_bug.cgi?id=1231756#c1

--- Comment #1 from Patrik Jakobsson <patrik.jakobsson@suse.com> ---
Intel supports the A750 through the i915 driver. The Xe support is experimental. The idea is that if both drivers can "handle" a GPU, it is disabled in one of them and has to be enabled there with the force_probe flag.
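
To illustrate the hand-over described in this comment, here is a simplified Python model of how a force_probe-style ID list could be evaluated. It is a sketch under stated assumptions, not the kernel's code: the helper names id_in_list and driver_will_probe are hypothetical, and the only syntax modelled is what the log message in the report shows ('56a1' to force, '!56a1' to block) plus the "*" wildcard mentioned later in this thread.

```python
def id_in_list(device_id: int, param: str, negative: bool = False) -> bool:
    """Return True if device_id is selected by a force_probe-style list.

    negative=False matches plain entries ("56a1", "*").
    negative=True matches blocking entries ("!56a1", "!*").
    """
    if not param:
        return False
    if param == ("!*" if negative else "*"):
        return True
    for token in param.split(","):
        token = token.strip()
        # Skip entries of the other polarity.
        if token.startswith("!") != negative:
            continue
        try:
            if int(token.lstrip("!"), 16) == device_id:
                return True
        except ValueError:
            continue  # ignore entries that are not hexadecimal IDs
    return False


def driver_will_probe(device_id: int, param: str,
                      requires_force_probe: bool) -> bool:
    """Model of one driver's decision for one device (hypothetical)."""
    if id_in_list(device_id, param, negative=True):
        return False                     # explicitly blocked, e.g. "!56a1"
    if requires_force_probe:
        return id_in_list(device_id, param)
    return True                          # officially supported: probe by default


A750 = 0x56A1
print(driver_will_probe(A750, "", requires_force_probe=True))
print(driver_will_probe(A750, "56a1", requires_force_probe=True))
print(driver_will_probe(A750, "!56a1", requires_force_probe=False))
```

In this model the first call stands for xe without any parameter, the second for xe.force_probe='56a1', and the third for i915.force_probe='!56a1' on a driver that would otherwise claim the device.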
https://bugzilla.suse.com/show_bug.cgi?id=1231756#c2

Andreas Färber <afaerber@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|xe driver unexpectedly does |xe driver oopses with Arc
                   |not support Arc A750 (56a1) |A750 (56a1) on aarch64
                 CC|                            |arm-bugs@suse.de,
                   |                            |ddavis@suse.com,
                   |                            |mbenes@suse.com,
                   |                            |schwab@suse.com

--- Comment #2 from Andreas Färber <afaerber@suse.com> ---
Success has been reported for 6.12 on RISC-V:
https://x.com/Rabenda_Issimo/status/1840775703811567916
https://www.reddit.com/r/RISCV/comments/1ftep9u/intel_arc_a770_on_riscv/

It seems we don't have an aarch64 kernel-default 6.12 in Kernel:HEAD currently, so I assume the master branch configs still need to be updated...

riscv64 Kernel:HEAD failed to build for 6.12~rc3, but 6.10.5 (20240828) with xe.force_probe=56a1 works okay:

[ 36.416080] [ T1036] snd_hda_intel 0000:0a:00.0: enabling device (0000 -> 0002)
[ 36.416144] [ T1036] snd_hda_intel 0000:0a:00.0: Force to snoop mode by module option
[ 41.530363] [ T1026] Setting dangerous option force_probe - tainting kernel
[ 41.534111] [ T1026] xe 0000:09:00.0: enabling device (0000 -> 0002)
[ 41.618801] [ T1026] xe 0000:09:00.0: [drm] Using GuC firmware from i915/dg2_guc_70.bin version 70.29.2
[ 41.622473] [ T1026] xe 0000:09:00.0: [drm] GT0: using 65535 GUC ID(s)
[ 41.711650] [ T1026] xe 0000:09:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none
[ 41.716110] [ T1026] xe 0000:09:00.0: [drm] Attempting to resize bar from 256MiB -> 8192MiB
[ 41.716149] [ T1026] xe 0000:09:00.0: BAR 2 [mem 0x2000000000-0x200fffffff 64bit pref]: releasing
[ 41.716218] [ T1026] pcieport 0000:08:01.0: bridge window [mem 0x2000000000-0x200fffffff 64bit pref]: releasing
[ 41.716235] [ T1026] pcieport 0000:07:00.0: bridge window [mem 0x2000000000-0x200fffffff 64bit pref]: releasing
[ 41.716292] [ T1026] pcieport 0000:07:00.0: bridge window [mem size 0x200000000 64bit pref]: can't assign; no space
[ 41.716305] [ T1026] pcieport 0000:07:00.0: bridge window [mem size 0x200000000 64bit pref]: failed to assign
[ 41.716320] [ T1026] pcieport 0000:08:01.0: bridge window [mem size 0x200000000 64bit pref]: can't assign; no space
[ 41.716331] [ T1026] pcieport 0000:08:01.0: bridge window [mem size 0x200000000 64bit pref]: failed to assign
[ 41.716344] [ T1026] xe 0000:09:00.0: BAR 2 [mem size 0x200000000 64bit pref]: can't assign; no space
[ 41.716354] [ T1026] xe 0000:09:00.0: BAR 2 [mem size 0x200000000 64bit pref]: failed to assign
[ 41.716367] [ T1026] pcieport 0000:02:08.0: PCI bridge to [bus 07-0a]
[ 41.716385] [ T1026] pcieport 0000:02:08.0: bridge window [mem 0x60400000-0x61ffffff]
[ 41.716399] [ T1026] pcieport 0000:02:08.0: bridge window [mem 0x2000000000-0x2017ffffff 64bit pref]
[ 41.716421] [ T1026] pcieport 0000:07:00.0: PCI bridge to [bus 08-0a]
[ 41.716437] [ T1026] pcieport 0000:07:00.0: bridge window [mem 0x60400000-0x61ffffff]
[ 41.716452] [ T1026] pcieport 0000:07:00.0: bridge window [mem 0x2000000000-0x200fffffff 64bit pref]
[ 41.716472] [ T1026] pcieport 0000:08:01.0: PCI bridge to [bus 09]
[ 41.716488] [ T1026] pcieport 0000:08:01.0: bridge window [mem 0x60800000-0x61ffffff]
[ 41.716502] [ T1026] pcieport 0000:08:01.0: bridge window [mem 0x2000000000-0x200fffffff 64bit pref]
[ 41.716543] [ T1026] xe 0000:09:00.0: [drm] Failed to resize BAR2 to 8192M (-ENOSPC). Consider enabling 'Resizable BAR' support in your BIOS
[ 41.716563] [ T1026] xe 0000:09:00.0: BAR 2 [mem 0x2000000000-0x200fffffff 64bit pref]: assigned
[ 41.716661] [ T1026] xe 0000:09:00.0: [drm] VISIBLE VRAM: 0x0000002000000000, 0x0000000010000000
[ 41.717002] [ T1026] xe 0000:09:00.0: [drm] Small BAR device
[ 41.717011] [ T1026] xe 0000:09:00.0: [drm] VRAM[0, 0]: Actual physical size 0x0000000200000000, usable size exclude stolen 0x00000001fc000000, CPU accessible size 0x0000000010000000
[ 41.717024] [ T1026] xe 0000:09:00.0: [drm] VRAM[0, 0]: DPA range: [0x0000000000000000-200000000], io range: [0x0000002000000000-2010000000]
[ 41.717037] [ T1026] xe 0000:09:00.0: [drm] VRAM: 0x0000000200000000 is larger than resource 0x0000000010000000
[ 41.717046] [ T1026] xe 0000:09:00.0: [drm] Total VRAM: 0x0000002000000000, 0x0000000200000000
[ 41.717055] [ T1026] xe 0000:09:00.0: [drm] Available VRAM: 0x0000002000000000, 0x00000001fc000000
[ 41.738616] [ T59] xe 0000:09:00.0: [drm] Finished loading DMC firmware i915/dg2_dmc_ver2_08.bin (v2.8)
[ 43.147740] [ T1026] xe 0000:09:00.0: [drm] vcs1 fused off
[ 43.147773] [ T1026] xe 0000:09:00.0: [drm] vcs3 fused off
[ 43.147781] [ T1026] xe 0000:09:00.0: [drm] vcs4 fused off
[ 43.147789] [ T1026] xe 0000:09:00.0: [drm] vcs5 fused off
[ 43.147796] [ T1026] xe 0000:09:00.0: [drm] vcs6 fused off
[ 43.147804] [ T1026] xe 0000:09:00.0: [drm] vcs7 fused off
[ 43.147812] [ T1026] xe 0000:09:00.0: [drm] vecs2 fused off
[ 43.147820] [ T1026] xe 0000:09:00.0: [drm] vecs3 fused off
[ 43.283560] [ T1026] xe 0000:09:00.0: [drm] GT0: CCS_MODE=0 config:00400000, num_engines:1, num_slices:4
[ 43.308597] [ T1026] [drm] Initialized xe 1.1.0 20201103 for 0000:09:00.0 on minor 0
[ 43.790221] [ T1026] xe 0000:09:00.0: [drm] Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.
[ 43.825331] [ T1026] Console: switching to colour frame buffer device 240x67
[ 43.901845] [ T1026] xe 0000:09:00.0: [drm] fb0: xedrmfb frame buffer device
[ 43.933816] [ T56] snd_hda_intel 0000:0a:00.0: Force to snoop mode by module option
[ 44.062252] [ T56] snd_hda_intel 0000:0a:00.0: bound 0000:09:00.0 (ops i915_audio_component_bind_ops [xe])
[ 44.141316] [ T1201] macb 10090000.ethernet end0: PHY [10090000.ethernet-ffffffff:00] driver [Microsemi VSC8541 SyncE] (irq=POLL)
[ 44.141343] [ T1201] macb 10090000.ethernet end0: configuring for phy/gmii link mode
[ 44.142219] [ T59] snd_hda_intel 0000:0a:00.0: Unknown capability 0
[ 44.276205] [ T59] input: HDA Intel PCH HDMI/DP,pcm=3 as /devices/platform/soc/e00000000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0/0000:02:08.0/0000:07:00.0/0000:08:04.0/0000:0a:00.0/sound/card0/input1
[ 44.278346] [ T59] input: HDA Intel PCH HDMI/DP,pcm=7 as /devices/platform/soc/e00000000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0/0000:02:08.0/0000:07:00.0/0000:08:04.0/0000:0a:00.0/sound/card0/input2
[ 44.278799] [ T59] input: HDA Intel PCH HDMI/DP,pcm=8 as /devices/platform/soc/e00000000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0/0000:02:08.0/0000:07:00.0/0000:08:04.0/0000:0a:00.0/sound/card0/input3
[ 44.279162] [ T59] input: HDA Intel PCH HDMI/DP,pcm=9 as /devices/platform/soc/e00000000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0/0000:02:08.0/0000:07:00.0/0000:08:04.0/0000:0a:00.0/sound/card0/input4

And 6.11.3 on riscv64 (20241016) works fine as well with xe.force_probe.

So it's specifically aarch64 (or Altra) that appears to have a problem here.
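
As a reading aid for the BAR and VRAM numbers in the riscv64 log above, here is a short Python sketch that converts the hexadecimal sizes to MiB. The three constants are copied from the "VRAM[0, 0]" line; the variable names and the is_small_bar comparison are assumptions made for illustration, not the driver's logic.

```python
MIB = 1 << 20

# Sizes copied from the "VRAM[0, 0]" log line.
vram_total = 0x0000000200000000    # "Actual physical size"
vram_usable = 0x00000001FC000000   # "usable size exclude stolen"
cpu_visible = 0x0000000010000000   # "CPU accessible size"


def mib(size_bytes: int) -> int:
    """Convert a byte count to whole MiB."""
    return size_bytes // MIB


stolen_mib = mib(vram_total - vram_usable)
# ASSUMPTION: "Small BAR" is read here as "the CPU-visible window is
# smaller than the VRAM", which is what the sizes in the log show.
is_small_bar = cpu_visible < vram_total

print(mib(vram_total), mib(vram_usable), mib(cpu_visible), stolen_mib)
print(is_small_bar)
```

The 256 MiB and 8192 MiB figures match the "Attempting to resize bar from 256MiB -> 8192MiB" message: the resize failed with -ENOSPC, so BAR 2 stayed at its original 256 MiB window.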
https://bugzilla.suse.com/show_bug.cgi?id=1231756#c3

--- Comment #3 from Andreas Färber <afaerber@suse.com> ---
I've verified that i915 is indeed limited to X86:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/driv...

So at a minimum we could set CONFIG_DRM_XE_FORCE_PROBE="*" on riscv64 and arm64.

Even better would be for upstream to make .force_probe = 1 conditional, so that it does not apply for !X86 (or rather !CONFIG_DRM_I915) and no distro workarounds become necessary.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/driv...
https://bugzilla.suse.com/show_bug.cgi?id=1231756#c4

--- Comment #4 from Andreas Färber <afaerber@suse.com> ---
Mesa does not enable the intel and intel_hasvk Vulkan backends for riscv64 yet (it does for Arm), nor does it enable the iris Gallium backend for Arm or RISC-V.
https://build.opensuse.org/request/show/1208920
https://bugzilla.suse.com/show_bug.cgi?id=1231756#c6

--- Comment #6 from OBSbugzilla Bot <bwiedemann+obsbugzillabot@suse.com> ---
This is an autogenerated message for OBS integration:
This bug (1231756) was mentioned in
https://build.opensuse.org/request/show/1221967 Factory / Mesa