Comment # 12 on bug 1231312 from pallas wept
(In reply to Jean Delvare from comment #9)
> Answering some of the concerns listed in the comments, to make things as
> clear as possible.

Hi Jean, 
Thanks for your help! 

> This was extended to AMD-based x86 systems in kernel v6.11.

Hey, that's me :)

> Therefore you should NEVER perform automated actions on i2c devices by their i2c bus number. 

I certainly wouldn't do that. I mentioned the changing address because, while
I'm aware that it has the potential to change, it never has, until this change.
I guess what I was trying to communicate is that it was very clear to me that
something unusual had happened to my i2c busses. This doesn't surprise or
trouble me now, knowing of the driver changes you've mentioned.

> If you are writing to /sys/bus/i2c/devices/i2c-*/new_device to
> manually instantiate a device, you need to first find the right i2c bus by
> checking /sys/bus/i2c/devices/i2c-*/name.

I appreciate your warnings, because I do know how bad it could be to send
messages to the wrong address - the strange thing here is that I'm confident
that I didn't. I'm pretty sure *something* did, I've seen what it does, and
this looked *exactly* like it (I panicked appropriately). 

But I followed my own system's build instructions, and literally the first
thing they say to do, is to check the address first. I double checked my bash
history just now and confirmed that I did check the address first, but I did so
by running `sudo i2cdetect -l` - is that sufficient? Maybe this was the error
of my ways? Anyway, this was likely why I noticed the changing address - I have
the original output of i2cdetect in that document, and the output is different,
now.

It definitely behaved as if I'd sent the instantiation commands to the GPU's
i2c bus, or something similarly terrifying. The memory modules did work via
jc42 immediately after the instantiation, and did not prior to it. I don't
know if that's an indication that the address was correct, though?

It was only after a subsequent reboot, during the kernel boot messages, that
the monitor started to flash. I immediately powered off, factory reset the
monitor and powered it off, then powered on again. Only then did it refuse to
even POST, with alternating error codes indicating GPU, PCI and memory
failures. Terrifying D:

> This i2c bus number instability is clearly inconvenient and this is the very
> reason why I started working on automating the instantiation of SPD EEPROM a
> few years ago and upstream is currently working on extending it to more
> systems and to memory module thermal sensors devices.

Thanks for the work you're putting in! The new features are worth a few
teething problems. I'm sure we'll knock this one over :)

(In reply to Jean Delvare from comment #10)
> How do you "add new [jc42] devices"?

I have a long document with lots of comments, but the actual commands and
process are:

sudo i2cdetect -l
# Find the SMBus adapter and query it to make sure the RAM is on it. Example is
# for i2c-6, which is (still) correct for <= 6.10, but on >= 6.11 it's always on 1:
sudo i2cdetect -y 6
# Confirm the memory by the four sticks in the address ranges 0x18-0x1b and
# 0x50-0x53
# Instantiate the devices
sudo modprobe jc42
echo jc42 0x18 | sudo tee /sys/bus/i2c/devices/i2c-6/new_device
echo jc42 0x19 | sudo tee /sys/bus/i2c/devices/i2c-6/new_device
echo jc42 0x1a | sudo tee /sys/bus/i2c/devices/i2c-6/new_device
echo jc42 0x1b | sudo tee /sys/bus/i2c/devices/i2c-6/new_device

I hope that's not too far off the mark?
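
For reference, here is a minimal sketch of how those steps could pick the bus
by adapter name rather than by number, as comment #9 advises. `find_i2c_bus` is
a hypothetical helper of my own, not part of i2c-tools, and the adapter name
string "SMBus PIIX4 adapter port 0" is an assumption about what this board's
SMBus reports; the real string should be checked with `sudo i2cdetect -l` or
the `name` files first.

```shell
#!/bin/sh
# Sketch only: look up an i2c bus by adapter name instead of bus number.
# find_i2c_bus is a hypothetical helper, not an i2c-tools command.
#   $1: substring expected in the adapter's name file
#   $2: sysfs directory to scan (defaults to /sys/bus/i2c/devices)
# Prints the first matching bus directory name (e.g. "i2c-1") and
# returns 0, or returns 1 if no adapter name matches.
find_i2c_bus() {
    pattern=$1
    root=${2:-/sys/bus/i2c/devices}
    for name_file in "$root"/i2c-*/name; do
        # Skip the unexpanded glob and unreadable entries.
        [ -r "$name_file" ] || continue
        if grep -qF -- "$pattern" "$name_file"; then
            basename "$(dirname "$name_file")"
            return 0
        fi
    done
    return 1
}

# Example use (left commented out because it writes to sysfs; the
# adapter name below is an assumption, not taken from this system):
# bus=$(find_i2c_bus "SMBus PIIX4 adapter port 0") || exit 1
# sudo modprobe jc42
# echo jc42 0x18 | sudo tee "/sys/bus/i2c/devices/$bus/new_device"
```

This only replaces the hardcoded i2c-6 in the path; it still makes sense to run
`sudo i2cdetect -y` on the bus it finds and confirm the 0x18-0x1b and
0x50-0x53 ranges before instantiating anything.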

(In reply to Takashi Iwai from comment #11)
> FWIW, the *-1 kernel is the reverts of commits:
> b3e992f69c239b0eb99c408c1ca9cd4253d2e7ad
>   hwmon: (jc42)  Strengthen detect function 
> So, if *-1 works for you, it means the jc42 commits that mattered.

I broke the strengthened detect function. That's how strong I am. I'm just that
strong. :D
I'm sorry.

I hope some of this information is useful for you.

