Marlier, Ian wrote:
I've looked in /etc/smartd.conf and at the man pages for smartd.conf and smartctl but I'm still confused about exactly what I need to put in the config file! :( There's way too much info and not enough examples.
It could be better, but it could be a whole lot worse, too...
Agreed. The man pages are a lot better than some.
The most basic info you need to know about the 3ware cards is this: The 3ware 9XXX cards use /dev/twaX as their device nodes, and smartctl will happily take that as a parameter. It does require an additional "-d" (device type) parameter as well, though, to say which port on the controller you mean.
`smartctl -a /dev/twa0 -d 3ware,0` should show you the SMART data for the disk attached to port 0 on the 3ware controller card, `smartctl -a /dev/twa0 -d 3ware,1` shows the info for port 1, and so on.
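The same device-plus-port pairing carries over to /etc/smartd.conf, where each monitored disk gets its own line. What follows is only a rough sketch, not the config from any real server: the helper name `gen_smartd_lines`, the port count of 2, and the choice of `-a -m root` (monitor the default attribute set, mail warnings to root) are all assumptions made for illustration. It only prints candidate lines, so it is safe to run on a machine without a 3ware card.

```shell
#!/bin/sh
# Hypothetical helper: print one candidate smartd.conf line per 3ware port.
# $1 = controller device node (e.g. /dev/twa0), $2 = number of ports to cover.
gen_smartd_lines() {
    dev="$1"
    ports="$2"
    p=0
    while [ "$p" -lt "$ports" ]; do
        # -d 3ware,N selects the port; -a and -m root are assumed defaults.
        printf '%s -d 3ware,%d -a -m root\n' "$dev" "$p"
        p=$((p + 1))
    done
}

# Assumed example: a controller at /dev/twa0 with disks on ports 0 and 1.
gen_smartd_lines /dev/twa0 2
```

If the printed lines look right for your setup, they could be pasted into /etc/smartd.conf by hand; check each port with the matching smartctl command first, since an empty port will just produce errors.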
Thanks. That works. I didn't think to try that because I haven't got smartd set up properly so I didn't think there'd be any data there.
I'm also not sure whether to enable any testing with smart. I do have 3dm2 running so I get some info from that via the browser and mail.
This is an interesting question, actually.
The 3ware controller does proactively monitor the SMART data on attached disks, so theoretically you don't need to bother setting up smartd. I have one disk that just dropped out of a RAID-5 on a 3ware card because one of its SMART attributes entered a failure state.
That said, I like having SMART monitoring set up as well, if only because it's a good double-check when something happens. I've had cases where a disk drops offline because of a SMART failure, and both the 3ware card and smartd notify me of those failures. But I've also had cases where the problem is the port on the 3ware card, or the SATA cable connecting the drive to the controller, and in those cases the lack of notification from smartd is a good tip to start at the controller end of the connection and work back to the drive. (It did take a fair amount of experience to figure out when to look where, and which notifications "paired up" -- and there isn't a great shorthand.)
That's very useful info.
Here's an example from one of my servers:
<snip>
If you have other questions, let me know -- this is hard-won experience and I'm more than happy to pass it along :-)
Excellent! That example and the rest of the advice was exactly what I was hoping for. Thanks again, Dave