Bug ID | 1117095 |
---|---|
Summary | vc4: Failed to allocate from CMA, graphics freezes |
Classification | openSUSE |
Product | openSUSE Tumbleweed |
Version | Current |
Hardware | aarch64 |
OS | openSUSE Factory |
Status | NEW |
Severity | Normal |
Priority | P5 - None |
Component | Kernel |
Assignee | kernel-maintainers@forge.provo.novell.com |
Reporter | jimc@math.ucla.edu |
QA Contact | qa-bugs@suse.de |
Found By | --- |
Blocker | --- |
On a Raspberry Pi 3B (not plus) with OpenSuSE Tumbleweed openSUSE-release-20181101-934.1.aarch64 and kernel-default-4.18.15-1.2.aarch64.

Kernel command line (/proc/cmdline):

    BOOT_IMAGE=/boot/Image-4.18.15-1-default root=UUID=38fbf451-5579-43d1-bdd2-84cfd886ad00 loglevel=3 splash=silent plymouth.enable=0 swiotlb=512 cma=300M console=ttyS1,115200n8 console=tty resume=/dev/mmcblk0p3

/boot/efi/config.txt (minus comments):

    kernel=u-boot.bin
    gpu_mem=32
    force_turbo=0
    initial_turbo=30
    over_voltage=0
    enable_uart=1
    avoid_warnings=1
    dtoverlay=upstream +upstream-mmc +upstreame-aux-interrupt
    include ubootconfig.txt
    arm_control=0x200
    include extraconfig.txt
    dtparam=audio=on
    dtoverlay=vc4-kms-v3d (similar symptom with vc4-fkms-v3d)

/etc/X11/xorg.conf.d/20-kms.conf says:

    Section "Device"
        Identifier "kms gfx"
        Driver "modesetting"
        #Option "AccelMethod" "none" [Commented out]
    EndSection

/var/log/Xorg.0.log says:

    modeset(0): [DRI2] DRI driver: vc4
    AIGLX: Loaded and initialized vc4
    GLX: Initialized DRI2 GL provider for screen 0

In this configuration, glmark2-0.0+git.20180608-1.1.aarch64 runs without freezing or crashing and gets an overall score of 74, whereas with software rendering the score is 17, so GPU acceleration is really happening.

From the LightDM greeter I log in and start the default XFCE desktop. I start various programs and eventually get the symptom complained about; in the simplest case I start one xterm, one xload -update 2 (secs), and xscreensaver-5.37-4.3.aarch64 is active, blanking the screen only, DPMS off after 20 min. I let it incubate overnight.

At the start, CmaTotal (from /proc/meminfo) is 307200 kB and CmaFree is 206856 kB; CmaFree went up gradually to 241684 kB by the time the screensaver shut off video (DPMS). After 5 hours CmaFree was static at 234252 kB.
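For anyone trying to reproduce the CmaFree observations above, here is a minimal sketch of a logger that samples /proc/meminfo at intervals. The function names (`parse_cma`, `log_cma`) and the 60-second default interval are illustrative choices, not what was used to collect the figures in this report.

```python
import time


def parse_cma(meminfo_text):
    """Return the CmaTotal and CmaFree values (in kB) found in /proc/meminfo text."""
    values = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in ("CmaTotal", "CmaFree"):
            # Lines look like "CmaFree:   206856 kB"; take the numeric field.
            values[key] = int(rest.split()[0])
    return values


def log_cma(interval_s=60, path="/proc/meminfo"):
    """Print a timestamped CMA sample every interval_s seconds until interrupted."""
    while True:
        with open(path) as f:
            cma = parse_cma(f.read())
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(stamp, "CmaFree", cma.get("CmaFree"), "kB of", cma.get("CmaTotal"), "kB",
              flush=True)
        time.sleep(interval_s)
```

Calling `log_cma()` from an ssh session and redirecting the output to a file should keep a record that survives the graphics freeze, since the report notes that the rest of the system keeps running.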
With no change in CmaFree this message appeared in syslog:

    Nov 22 01:15:25 orion kernel: [34890.524661] [drm:vc4_bo_create [vc4]] *ERROR* Failed to allocate from CMA:
    Nov 22 01:15:25 orion kernel: [34890.524683] [drm] kernel: 8100kb BOs (1)
    Nov 22 01:15:25 orion kernel: [34890.524691] [drm] V3D: 26904kb BOs (121)
    Nov 22 01:15:25 orion kernel: [34890.524699] [drm] V3D shader: 272kb BOs (65)
    Nov 22 01:15:25 orion kernel: [34890.524706] [drm] dumb: 48kb BOs (3)
    Nov 22 01:15:25 orion kernel: [34890.524713] [drm] binner: 16384kb BOs (1)
    Nov 22 01:15:25 orion kernel: [34890.524721] [drm] total purged BO: 8kb BOs (2)
    Nov 22 01:15:25 orion kernel: [34890.524741] vc4_v3d 3fc00000.v3d: Failed to allocate memory for tile binning: -12. You may need to enable CMA or give it more memory.

In other tests this message appears at the same time that graphics freezes.

When I woke up the screensaver, video came on, but the screen was black, except the cursor was visible, confined within the screensaver's authentication box. In other tests the screen content at the time of freezing remains unchanging, but the cursor changes shape according to what it's over, including not changing shape if the program (e.g. xterm) owning the window was killed. Keystrokes directed to an xterm are received and executed (with no visible effect on the screen), e.g. "echo Test File > /tmp/testfile", and the file appears. I can do "DISPLAY=:0 XAUTHORITY=/run/lightdm/root/:0 xwd -root > image.xwd" and the image will be complete and will show the current windows, not those at the time of freezing.

The same symptoms can be elicited quicker if I run Firefox or Chromium. Heavy work in the browser did not seem to make the failure happen earlier; the 2 tests (one after the other) were to scroll quickly through 1.16Mb of text/html (no Javascript nor images), then 221 JPEG images in simple HTML pages. The freeze typically happens when I am doing nothing on the RPi, writing up notes on another machine.
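The per-category BO figures in the "Failed to allocate from CMA" message quoted earlier can be totalled mechanically, which makes it easier to compare one failure with another. The following is a rough sketch; the regular expression is written against the line shape seen in this report only, and leaving "total purged BO" out of the sum is my assumption that purged buffers no longer hold memory.

```python
import re

# Matches lines such as "[drm] V3D shader: 272kb BOs (65)" as they appear in this report.
BO_LINE = re.compile(
    r"\[drm\]\s+(?P<label>[^:\[\]]+?):\s+(?P<kb>\d+)kb BOs \((?P<count>\d+)\)"
)


def parse_bo_stats(log_text):
    """Map each BO category label to a (size in kB, number of BOs) pair."""
    stats = {}
    for m in BO_LINE.finditer(log_text):
        stats[m.group("label").strip()] = (int(m.group("kb")), int(m.group("count")))
    return stats


def live_total_kb(stats):
    """Sum the kB of all categories except purged ones (assumed to hold no memory)."""
    return sum(
        kb for label, (kb, _count) in stats.items()
        if not label.startswith("total purged")
    )
```

Fed the syslog excerpt from this report, `live_total_kb(parse_bo_stats(text))` gives 51708 kB, which can then be set beside the CmaFree value sampled at the same time.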
With either web browser, but not in the simple test case, CmaFree declined in non-reproducible patterns until the freeze occurred, and continued to decline to near zero (like 3000 kB). I believe that this "death spiral" behavior is consequential damage from something freezing up, not the actual cause of the freeze.

This is a known bug, though the exact symptoms seem to change with small variations in the test conditions, and with one or another kernel commit being excluded.

https://github.com/raspberrypi/linux/issues/2680 (2018-09-12, OP cbxbiker61)

He reports it began for him with approx. kernel 4.14.62 and someone else reports that it's still there in 4.18.11. I (jimc) see it in 4.18.15. Other forum and bug posters in various distros (Arch, Red Hat) report various similar-sounding problems, starting around 2018-09-xx.

Could the SuSE distro managers please identify a combination of commits that gives the best results in the OpenSuSE context and push out that kernel, and keep an eye on progress in finding and killing the actual bug that is causing these freezeups? Thank you. I'm going to try to do the same thing, and I'll report back if I succeed, which is not a sure thing given my limited skills with git.