[opensuse-arm] JeOS-2017-03-13 image for RPi 1 installs OK, Grub fails on reboot

Greetings,

I was testing:

    openSUSE-Tumbleweed-ARM-JeOS-raspberrypi.armv6l-2017.03.13-Build1.8.raw.xz

from:

    http://download.opensuse.org/repositories/devel:/ARM:/Factory:/Contrib:/Rasp...

on a Raspberry Pi 1 Model B.

It installs fine: it expands the file system, creates the dracut-based init and boots to a usable system. But when I reboot, grub fails with:

    error: attempt to read or write outside of partition.

and drops me to the grub rescue system.

Poking around the rescue system, I've found that I can see 4 partitions, one of which is ext2 and has the normal Linux files. From here, most of the recovery guides say to set the root and prefix variables and then load the normal module. I can execute the commands:

    set root=(hd0,gpt2)
    set prefix=(hd0,gpt2)/boot/grub2

and list the files in the various directories (normal.mod exists in /boot/grub2/arm-efi/), but when I try:

    insmod normal

I still get the error: attempt to read or write outside of partition.

Somewhat interestingly, if I mistype the prefix (say, /boot/grub) the error changes to a "file not found" one.

I was hoping someone might have some ideas about what to try next.

-Alex

--
To unsubscribe, e-mail: opensuse-arm+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-arm+owner@opensuse.org

Hi Alex,

On 23/03/2017 20:48, Alex Armstrong wrote:

This sounds like the repartitioning failed.

Can you also see the size? IIRC you see partition information with

    (grub) ls (hd0)

One thing you could try is to check on a working system what the partition table and file system look like. It almost sounds like the ext4 partition got resized, but the partition table is still at the old, small size.

Alex

Alexander Graf wrote:
For the second partition, grub tells me that it is ext2, but no other details.
    Model: Generic- Multi-Card (scsi)
    Disk /dev/sdb: 7948MB
    Sector size (logical/physical): 512B/512B
    Partition Table: gpt_sync_mbr

    Number  Start   End     Size    File system     Name    Flags
     1      1049kB  211MB   210MB   fat16           UEFI    boot
     2      212MB   430MB   218MB   ext4            lxboot
     3      431MB   7427MB  6996MB                  lxroot
     4      7428MB  7948MB  520MB   linux-swap(v1)  lxswap

I think you're right: the resizing didn't work as expected. I saw the success message in the log and didn't think to check it further.

Additionally, when I try to mount partition 2, it fails with this message:

    EXT4-fs (sdb2): bad geometry: block count 1761625 exceeds size of device (53248 blocks)

Is there a minimum size SD card required?

-Alex

On 27/03/2017 23:15, Alex Armstrong wrote:
The partition table looks pretty sane to me. You have an 8GB disk and / is properly expanded to 7GB.
Ok, so something really is broken there ;). Good.

It looks like your block size is 4kb (default size for ext4 IIRC). 53248 blocks translate to 218MB (with 1000 bytes as kbyte) while 1761625 blocks would be 7215MB.

If I had to guess, I'd say someone dropped the requirement for a separate boot partition but forgot to update the partitioning script in the JeOS package. If I'm right, /dev/sdb3 should not contain a valid file system.

In that case, can you manually try to fix it up for now? Remove partitions 2 and 3. Then create a new partition from the beginning of the current partition 2 until the end of partition 3. Switch to unit type sector (unit s I think in parted) to make sure they really are aligned. Then try to mount that new partition. Does it work? If so, does it boot?

Alex
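The block-count arithmetic in this reply can be checked with a few lines of Python. This is only a sketch: the 4096-byte block size is the assumption made in the reply (it is not confirmed by the kernel message), and the helper name `blocks_to_mb` is made up for illustration.

```python
# Sketch: convert the ext4 block counts from the kernel message to megabytes.
# BLOCK_SIZE is an assumption (the ext4 default guessed in the reply above).
BLOCK_SIZE = 4096


def blocks_to_mb(blocks: int, block_size: int = BLOCK_SIZE) -> float:
    """Return the size in decimal megabytes (1 MB = 1000 * 1000 bytes)."""
    return blocks * block_size / 1_000_000


device_blocks = 53248        # "size of device" in the EXT4-fs message
filesystem_blocks = 1761625  # "block count" recorded in the file system

print(f"device:      {blocks_to_mb(device_blocks):.1f} MB")
print(f"file system: {blocks_to_mb(filesystem_blocks):.1f} MB")
```

Under that assumption the device size comes out at about 218 MB, which matches partition 2 in the parted listing, while the file system claims about 7215 MB, which is roughly the distance from the start of partition 2 (212MB) to the end of partition 3 (7427MB).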

On 27/03/2017 23:27, Alexander Graf wrote:
In fact, is there any particular reason you're using the downstream kernel image? Do you get the same breakage with the upstream-based one from here?

    http://download.opensuse.org/ports/armv6hl/tumbleweed/images/

Alex

Alexander Graf wrote:
-----
Testing the image from:

    http://download.opensuse.org/ports/armv6hl/tumbleweed/images/

yields different results. In summary, it seems that the partition table is not completely written, leaving some entries from the previous table around and leading to much confusion at boot time. I noticed a few errors in the initial boot messages, but nothing that stopped it from completing startup. Hopefully I'm missing something obvious.
-----

Below are my troubleshooting steps.

### First Try:

The very first time I tried the image, it gave me a working system that rebooted to a working system. But it had trouble finding the swap partition, so I (foolishly) took the SD card out and inspected it with parted on a separate machine. I saw that the swap partition was still #4 (as it had been from the previous tests), so I changed the 4 to a 3 and wrote the changes. After I did that it wouldn't boot at all.

### Second Try:

It boots to a working system, but reboot fails to even bring up the boot loader. No u-boot, grub, etc. I'm working through a serial terminal at the moment, so I would expect to see something. When I inspect the SD card on a different computer, I get this:

    GPT fdisk (gdisk) version 0.8.7

    Type device filename, or press <Enter> to exit: /dev/sdb
    Partition table scan:
      MBR: protective
      BSD: not present
      APM: not present
      GPT: present

    Found valid GPT with protective MBR; using GPT.

and then trying to list the partitions:

    Command (? for help): p
    Disk /dev/sdb: 15523840 sectors, 7.4 GiB
    Logical sector size: 512 bytes
    Disk identifier (GUID): 100CFB37-F6C4-48FF-B891-09C49CAB0A5F
    Partition table holds up to 128 entries
    First usable sector is 34, last usable sector is 15523806
    Partitions will be aligned on 2048-sector boundaries
    Total free space is 15523773 sectors (7.4 GiB)

    Number  Start (sector)    End (sector)  Size       Code  Name

Nothing at all in the GPT table, which doesn't seem right. The MBR table shows one partition:

    Expert command (? for help): o

    Disk size is 15523840 sectors (7.4 GiB)
    MBR disk identifier: 0x00000000
    MBR partitions:

    Number  Boot  Start Sector   End Sector   Status      Code
       1                     1     15523839   primary     0xEE

which apparently takes up the entire disk.

### Third Try:

Now, I'm using the same SD card for all of this, and the partition table has been written a number of times. So, thinking the card might be going bad, I formatted it with YaST: 3 partitions laid out in a similar way to what the RPi wants (fat, ext4 and swap). That went OK and I was able to mount the partitions fine.

Next I dd'ed the image back onto the card and tried again. This time it rebooted to a fully functional grub, not just the rescue system. But it complained:

    error: no such device: ce7e3539...

I could list the various partitions, and the first (fat) one showed the expected RPi files. But the other two gave an error:

    No known filesystem detected - Partitions start at 217099KiB - Total size 7289184KiB

and wouldn't show any files.

The interesting thing is that, on inspection, the partition table entries were exactly the same size as I had written with YaST. I would expect them to be slightly different (I'm not that good). So, seemingly, it doesn't write the partition table at all, and you're left with whatever was there to begin with. So the first time I did it I had a valid setup from before and it worked, apart from the swap. And the rest of the times it wasn't even close.

### Fourth Try:

Next I tried reformatting the card as one full FAT partition, then dd'ing the image and testing. After it did its initial boot, it rebooted to the full grub environment. Inspecting the SD card partition table with gdisk gives me this:

    GPT fdisk (gdisk) version 0.8.7

    Partition table scan:
      MBR: hybrid
      BSD: not present
      APM: not present
      GPT: present

    Found valid GPT with hybrid MBR; using GPT.

    Command (? for help): p
    Disk /dev/sdb: 15523840 sectors, 7.4 GiB
    Logical sector size: 512 bytes
    Disk identifier (GUID): 100CFB37-F6C4-48FF-B891-09C49CAB0A5F
    Partition table holds up to 128 entries
    First usable sector is 34, last usable sector is 15523806
    Partitions will be aligned on 2048-sector boundaries
    Total free space is 4029 sectors (2.0 MiB)

    Number  Start (sector)    End (sector)  Size       Code  Name
       1            2048        15521791   7.4 GiB     0700  primary

So there is one GPT table entry. And looking at the MBR table:

    Expert command (? for help): o

    Disk size is 15523840 sectors (7.4 GiB)
    MBR disk identifier: 0x00000000
    MBR partitions:

    Number  Boot  Start Sector   End Sector   Status      Code
       1                  2048     15521791   primary     0x0C
       2                     1         2047   primary     0xEE
       4              15521792     15523839   primary     0x83

Note that the GPT entry is the same as the one I created for the initial FAT partition. The three MBR entries are new, though.

### And the initial boot error messages:

    skiped writing MBR ID for armv6l
    GPT fdisk (gdisk) version 1.0.1

    Caution! After loading partitions, the CRC doesn't check out!
    Partition table scan:
      MBR: MBR only
      BSD: not present
      APM: not present
      GPT: damaged

    Found valid MBR and corrupt GPT. Which do you want to use? (Using the
    GPT MAY permit recovery of GPT data.)
    ...
    Recovery/transformation command (? for help):
    Warning! Mismatched GPT and MBR partition! MBR partition 4, of type 0x83,
    has no corresponding GPT partition! You may continue, but this condition
    might cause data loss in the future!

    Final checks complete. About to write GPT data. THIS WILL OVERWRITE EXISTING
    PARTITIONS!!

My next thought is to try, in the initial running system, somehow writing the partition table. That seems difficult in a running system, so it'll have to wait till tomorrow.

-Alex
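The sector numbers gdisk reports in the fourth try can be cross-checked by hand. The Python sketch below is an illustration only: the `Part` type and both helper functions are made up for this note and are not part of gdisk. It assumes partitions do not overlap and that an MBR entry "corresponds" to a GPT entry when start and end sectors are identical.

```python
from typing import NamedTuple


class Part(NamedTuple):
    number: int
    start: int  # first sector, inclusive
    end: int    # last sector, inclusive
    code: int   # type code as printed by gdisk


# Values copied from the gdisk output of the fourth try.
FIRST_USABLE, LAST_USABLE = 34, 15523806
gpt = [Part(1, 2048, 15521791, 0x0700)]
mbr = [
    Part(1, 2048, 15521791, 0x0C),
    Part(2, 1, 2047, 0xEE),
    Part(4, 15521792, 15523839, 0x83),
]


def free_sectors(parts, first=FIRST_USABLE, last=LAST_USABLE):
    """Usable sectors not covered by any partition (assumes no overlaps)."""
    used = sum(p.end - p.start + 1 for p in parts)
    return (last - first + 1) - used


def unmatched_mbr(mbr_parts, gpt_parts):
    """Numbers of MBR partitions with no GPT entry on the same sectors.

    0xEE entries are skipped because they only protect the GPT area.
    """
    ranges = {(p.start, p.end) for p in gpt_parts}
    return [p.number for p in mbr_parts
            if p.code != 0xEE and (p.start, p.end) not in ranges]


print(free_sectors(gpt))        # free space with the single GPT entry
print(free_sectors([]))         # free space with an empty GPT (second try)
print(unmatched_mbr(mbr, gpt))  # MBR partitions without a GPT counterpart
```

With these inputs the free-space figures agree with gdisk (4029 sectors with one entry, 15523773 with an empty table), and the only unmatched MBR entry is partition 4, which is the one the boot-time warning complains about.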

Alex Armstrong wrote:
Using the JeOS 2017.03.13-Build1.10 fixes this problem. Many thanks.

Two things of note:

- It doesn't add a swap partition, which causes an error on boot up, but I don't need one for my testing. The errors go away after editing fstab.
- Group 'lock' GID 54 is missing, which causes systemd to complain, but adding the group fixes things.

-Alex

On 03/28/2017 08:27 AM, Alexander Graf wrote:
I just had the same thing on a Raspberry Pi 3 with openSUSE-Leap42.3-ARM-JeOS-raspberrypi3.aarch64-2017.07.26-Build1.1.raw.xz from http://download.opensuse.org/ports/aarch64/distribution/leap/42.3/appliances...
In my case, I have:

    # parted /dev/mmcblk0
    GNU Parted 3.1
    Using /dev/mmcblk0
    Welcome to GNU Parted! Type 'help' to view a list of commands.
    (parted) print
    Model: SD SA16G (sd/mmc)
    Disk /dev/mmcblk0: 15.5GB
    Sector size (logical/physical): 512B/512B
    Partition Table: msdos
    Disk Flags:

    Number  Start   End     Size    Type     File system     Flags
     1      1049kB  211MB   210MB   primary  fat16           lba, type=0c
     2      212MB   430MB   218MB   primary  ext4            type=83
     3      431MB   15.0GB  14.5GB  primary                  type=83
     4      15.0GB  15.5GB  519MB   primary  linux-swap(v1)  type=83
That trick worked for me. For the record:

    (parted) unit s
    (parted) print
    Model: SD SA16G (sd/mmc)
    Disk /dev/mmcblk0: 30253056s
    Sector size (logical/physical): 512B/512B
    Partition Table: msdos
    Disk Flags:

    Number  Start      End        Size       Type     File system     Flags
     1      2048s      411651s    409604s    primary  fat16           lba, type=0c
     2      413696s    839683s    425988s    primary  ext4            type=83
     3      841728s    29238300s  28396573s  primary                  type=83
     4      29239296s  30253022s  1013727s   primary  linux-swap(v1)  type=83

    (parted) rm 2
    (parted) rm 3
    (parted) mkpart
    Partition type?  primary/extended? primary
    File system type?  [ext2]? ext4
    Start? 413696s
    End? 29238300s
    (parted) p
    Model: SD SA16G (sd/mmc)
    Disk /dev/mmcblk0: 30253056s
    Sector size (logical/physical): 512B/512B
    Partition Table: msdos
    Disk Flags:

    Number  Start      End        Size       Type     File system     Flags
     1      2048s      411651s    409604s    primary  fat16           lba, type=0c
     2      413696s    29238300s  28824605s  primary  ext4            type=83
     4      29239296s  30253022s  1013727s   primary  linux-swap(v1)  type=83

Rebooted, came up fine (although it takes about a minute and a half to get from the grub screen to the login prompt, and the screen is blank for all this time -- is that normal?)

Regards,

Tim
--
Tim Serong
Senior Clustering Engineer
SUSE
tserong@suse.com
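The start and end values typed into mkpart above follow directly from the sector listing: the new partition begins at the first sector of old partition 2 and ends at the last sector of old partition 3. A small Python sketch of that calculation follows; the function name `merged_partition` and the 2048-sector alignment default are assumptions for illustration, not something parted provides.

```python
def merged_partition(first_start: int, second_end: int, align: int = 2048):
    """Describe one partition spanning two adjacent ones.

    Sector numbers are inclusive, as in parted's `unit s` output.
    `align` is an assumed alignment boundary (2048 sectors = 1 MiB at 512 B).
    """
    size = second_end - first_start + 1
    aligned = first_start % align == 0
    return first_start, second_end, size, aligned


# Start of old partition 2 and end of old partition 3 from the listing above.
start, end, size, aligned = merged_partition(413696, 29238300)
print(f"mkpart primary ext4 {start}s {end}s   # {size} sectors, aligned={aligned}")
```

The computed size of 28824605 sectors is the same figure parted prints for the new partition 2, and the start sector is a multiple of 2048, so the merged partition keeps the original alignment.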
participants (3)
- Alex Armstrong
- Alexander Graf
- Tim Serong