On Wed, Sep 09, 2009 at 10:48:34AM -0400, Alex Deucher wrote:
On Wed, Sep 9, 2009 at 7:39 AM, Luc Verhaegen wrote:
Now, what i also remember noticing then is that the code behind DRM_RADEON_CP_IDLE did a whole lot more than just checking whether the CP was idle.
Perhaps the naming is a bit too exact (it does a bit more than just check for CP idle, but that's what we want). It waits for the command fifo to drain and then waits for the entire GUI engine to idle (CP, 2D, and 3D). Bit 31 of RBBM_STATUS will be set if any of the following is true:
- 2D is busy
- 3D is busy
- Command fifo is not empty
- CP is busy
- CSQ is not empty
- Ring buffer is not empty
You can also poll other bits of that reg for the status of individual engine blocks.
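The polling described above could be sketched roughly as follows. This is only an illustration, not the drm code: `RBBM_ACTIVE`, `wait_for_gui_idle`, the `read_status_fn` callback and the `fake_hw` stand-in for the hardware are all hypothetical names made up for this example, and the only hardware detail taken from the thread is that bit 31 of the status register means "something is still busy".

```c
#include <assert.h>
#include <stdint.h>

/* Bit 31 of the status register, as described in the thread.
 * The macro name is an assumption made for this sketch. */
#define RBBM_ACTIVE (1u << 31)

/* Hypothetical register-read callback; a real driver would do an MMIO read. */
typedef uint32_t (*read_status_fn)(void *ctx);

/* Poll the status register up to max_polls times.
 * Returns 0 as soon as bit 31 is clear, -1 if it is still set afterwards. */
static int wait_for_gui_idle(read_status_fn read_status, void *ctx, int max_polls)
{
    for (int i = 0; i < max_polls; i++) {
        if (!(read_status(ctx) & RBBM_ACTIVE))
            return 0;
    }
    return -1;
}

/* Fake hardware for trying the loop out: reports busy for a fixed
 * number of reads, then reports idle. */
struct fake_hw {
    int busy_reads_left;
};

static uint32_t fake_read_status(void *ctx)
{
    struct fake_hw *hw = ctx;

    if (hw->busy_reads_left > 0) {
        hw->busy_reads_left--;
        return RBBM_ACTIVE;
    }
    return 0;
}
```

With a `struct fake_hw` that stays busy for three reads, `wait_for_gui_idle(fake_read_status, &hw, 10)` returns 0; one that stays busy for a hundred reads makes the same call return -1. A real implementation would also delay between reads instead of spinning.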
Why is there a need to wait for all engines to be idle here? Isn't there this unix mindset which kind of says: do one small thing, but do it well?
This is not a different part of the engine, this is what you want to poll to get the busy status. I pasted the code that actually polls to wait for the engines (CP, 2D, and 3D) to be idle. The function is used in the drm directly as well as in the CP_IDLE ioctl.
We do wait for the CP to idle. See above. We could probably add better hang detection by checking whether the RPTR is actually progressing when we time out waiting for the fifo or busy bit to clear. Unfortunately, if one of the engine blocks has hung, the CP may still be fetching stuff until it all falls over.
If the PTRs are still moving, there is no point in checking whether the CP claims to be idle. Check that first, then when the RPTR reaches the WPTR, start waiting on the CP to become idle. Then, return 0, so that the caller can then go and granularly check in on whichever engine it also requires to idle in whatever way it knows.
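A rough sketch of the sequence proposed above (watch the ring pointers first, and only then wait on the CP itself) might look like this. It is an assumption-laden illustration, not the existing ioctl: `struct cp_ops`, `wait_for_cp_idle`, the result codes and the `fake_cp` stand-in are hypothetical, and the "stalled" test is simply "RPTR did not change for `stall_limit` consecutive reads".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch. */
enum cp_wait_result {
    CP_WAIT_IDLE         =  0, /* ring drained and CP reports idle */
    CP_WAIT_RING_STALLED = -1, /* RPTR stopped moving before reaching WPTR */
    CP_WAIT_CP_BUSY      = -2  /* ring drained but CP never went idle */
};

/* Hypothetical accessors; a real driver would read registers here. */
struct cp_ops {
    uint32_t (*read_rptr)(void *ctx);
    uint32_t (*read_wptr)(void *ctx);
    int      (*cp_busy)(void *ctx);
};

static int wait_for_cp_idle(const struct cp_ops *ops, void *ctx, int stall_limit)
{
    uint32_t last = ops->read_rptr(ctx);
    int stalled = 0;

    /* Phase 1: as long as RPTR keeps moving, just wait for it to reach
     * WPTR. Only give up when it has not moved for stall_limit reads. */
    while (last != ops->read_wptr(ctx)) {
        uint32_t now = ops->read_rptr(ctx);

        if (now != last) {
            last = now;
            stalled = 0;
        } else if (++stalled >= stall_limit) {
            return CP_WAIT_RING_STALLED;
        }
    }

    /* Phase 2: the ring is empty, now wait for the CP alone. Other
     * engines are left for the caller to check. */
    for (int i = 0; i < stall_limit; i++) {
        if (!ops->cp_busy(ctx))
            return CP_WAIT_IDLE;
    }
    return CP_WAIT_CP_BUSY;
}

/* Fake CP for trying the logic out. */
struct fake_cp {
    uint32_t rptr, wptr;
    int advance;    /* non-zero: RPTR moves one step per read */
    int busy_reads; /* number of cp_busy() calls that report busy */
};

static uint32_t fake_rptr(void *ctx)
{
    struct fake_cp *cp = ctx;

    if (cp->advance && cp->rptr != cp->wptr)
        cp->rptr++;
    return cp->rptr;
}

static uint32_t fake_wptr(void *ctx)
{
    return ((struct fake_cp *)ctx)->wptr;
}

static int fake_busy(void *ctx)
{
    struct fake_cp *cp = ctx;

    if (cp->busy_reads > 0) {
        cp->busy_reads--;
        return 1;
    }
    return 0;
}

static const struct cp_ops fake_ops = { fake_rptr, fake_wptr, fake_busy };
```

Returning distinct codes for "ring stalled" and "CP still busy" is a design choice of this sketch; it gives the caller a hint about where things stopped, which the thread suggests is the hard part of diagnosing a hang.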
The ioctl is not broken although perhaps we need some better logic for detecting whether the GPU has actually hung. However, in most cases it has hung if the ioctl fails. Generally this is caused by a bad command stream or combination of command streams. Unfortunately, sorting that out is hard since the command that actually hung the chip may not be the one currently being processed. The new kms-enabled drm adds some debugfs features for dumping IBs and the command fifo which makes this easier to sort out.
I think that the ioctl is broken. One should care about only the state of CP here, and not of anything else. I just threw a tiny bit of money at getting some r5xx hw again, so it is possible that i take a much closer look to this stuff again somewhere next week.

Luc Verhaegen.