I've looked in /etc/smartd.conf and at the man pages for smartd.conf and smartctl but I'm still confused about exactly what I need to put in the config file! :( There's way too much info and not enough examples.
It could be better, but it could be a whole lot worse, too... The most basic thing you need to know about the 3ware cards is this: the 3ware 9XXX cards use /dev/twaX as their device nodes, and smartctl will happily take that as a parameter. It does require an additional "device type" parameter (-d) as well, though. `smartctl -a /dev/twa0 -d 3ware,0` should show you the SMART data for the disk attached to port 0 on the 3ware controller card, `smartctl -a /dev/twa0 -d 3ware,1` the data for port 1, and so on.
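If you want a quick look at every port in one go, something like the following rough sketch might help. It assumes the controller node is /dev/twa0 and that the card has 8 ports (0-7); `check_ports` and the CONTROLLER, PORTS and SMARTCTL variables are names made up for this example, not part of smartmontools. It uses smartctl's `-H` (overall health) option instead of `-a` to keep the output short.

```shell
#!/bin/sh
# Sketch: print the SMART health status for each port on a 3ware 9XXX card.
# Assumptions (adjust for your system): controller node and port count.
CONTROLLER="${CONTROLLER:-/dev/twa0}"
PORTS="${PORTS:-8}"
# Set SMARTCTL="echo smartctl" for a dry run that only prints the commands.
SMARTCTL="${SMARTCTL:-smartctl}"

check_ports() {
    port=0
    while [ "$port" -lt "$PORTS" ]; do
        echo "=== $CONTROLLER port $port ==="
        # -d 3ware,N selects the disk on port N behind the controller.
        $SMARTCTL -H -d "3ware,$port" "$CONTROLLER"
        port=$((port + 1))
    done
}
```

Call `check_ports` at the end of the script (as root) to actually query the disks. Setting SMARTCTL to "echo smartctl" first lets you see the commands without touching the controller, which may be reassuring on a production file server.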
I'm also not sure whether to enable any testing with smart. I do have 3dm2 running so I get some info from that via the browser and mail.
This is an interesting question, actually. The 3ware controller does proactively monitor the SMART data on attached disks, so theoretically you don't need to bother setting up smartd. I have one disk that just dropped out of a RAID-5 on a 3ware card because one of its SMART attributes entered a failure state. That said, I like having SMART monitoring set up as well, if only because it's a good double-check when something happens. I've had cases where a disk drops offline because of a SMART failure, and both the 3ware card and smartd notify me of the failure. But I've also had cases where the problem is the port on the 3ware card, or the SATA cable connecting the drive to the controller, and in those cases the lack of notification from smartd is a good tip to start at the controller end of the connection and work back to the drive. (It did take a fair amount of experience to figure out when to look where, and which notifications "paired up" -- and there isn't a great shorthand.)
I have a system with a couple of RAID arrays controlled by 3ware 9500S-8 controllers and I'm trying to configure SMART for them. I recently installed openSUSE 10.3 on the machine and see things like this in /var/log/messages:
Feb 6 10:16:11 suse1 smartd[4133]: Device: /dev/sdb, opened
Feb 6 10:16:11 suse1 smartd[4133]: Device /dev/sdb, please try adding '-d 3ware,N'
Feb 6 10:16:11 suse1 smartd[4133]: Device /dev/sdb, you may need to replace /dev/sdb with /dev/twaN or /dev/tweN
The log file entries that you're seeing are because of the default smartd configuration file, which includes this line (on 10.3; similar though not identical lines have been in that file going back to 9.X or before):

DEVICESCAN -m root@localhost -M exec /usr/lib/smartmontools/smart-notify

According to the inline comments, DEVICESCAN tells smartd to scan for any ATA and SCSI devices to monitor -- but of course /dev/sdb, the 3ware RAID, isn't a "real" device. It also tells smartd to ignore every line that follows it in the smartd.conf file.
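To see whether a DEVICESCAN directive is still active in your own file, a small check along these lines could be used. This is only a sketch: `active_devicescan` is a name invented for the example, and the default path /etc/smartd.conf is taken from the question above.

```shell
#!/bin/sh
# Sketch: list active (uncommented) DEVICESCAN directives in a smartd config.
# Hypothetical helper, not part of smartmontools.
active_devicescan() {
    conf=${1:-/etc/smartd.conf}
    # Match lines whose first non-blank token is DEVICESCAN. Lines that
    # start with '#' are comments and therefore do not match.
    # Prints "lineno:line" for each hit; returns non-zero when none are found.
    grep -n '^[[:space:]]*DEVICESCAN' "$conf"
}
```

Calling `active_devicescan` with no argument prints any offending lines with their line numbers. No output (and a non-zero exit status) means every DEVICESCAN line is already commented out or absent.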
I'm nervous about experimenting because this is my main file server. Has anybody got an example configuration I could copy?
You need to comment out (or remove) all of the "DEVICESCAN" lines, and then add your test/monitoring definitions to the end of the file (or the top of the file, or wherever, as long as all of the DEVICESCAN lines are commented out). Here's an example from one of my servers:

####### smartd configuration for drives attached to a 3ware 9XXX series card
###
### Proactively monitor the disks, and if any SMART attributes trip to "pre-fail" or "fail" state send
### an e-mail to sys@xxxxxxxxxxxxxxx.com
/dev/twa0 -d 3ware,0 -a -m sys@xxxxxxxxxxxxxxx.com
/dev/twa0 -d 3ware,1 -a -m sys@xxxxxxxxxxxxxxx.com
/dev/twa0 -d 3ware,2 -a -m sys@xxxxxxxxxxxxxxx.com
/dev/twa0 -d 3ware,3 -a -m sys@xxxxxxxxxxxxxxx.com
/dev/twa0 -d 3ware,4 -a -m sys@xxxxxxxxxxxxxxx.com
/dev/twa0 -d 3ware,5 -a -m sys@xxxxxxxxxxxxxxx.com
/dev/twa0 -d 3ware,6 -a -m sys@xxxxxxxxxxxxxxx.com
/dev/twa0 -d 3ware,7 -a -m sys@xxxxxxxxxxxxxxx.com
###
### At smartd startup, send an e-mail to sys@xxxxxxxxxxxxxxxxx.com just to confirm mail delivery
/dev/twa0 -d 3ware,0 -m sys@xxxxxxxxxxxxxxx.com -M test
###
### Once a week run a SMART "long test" on each disk
### (the L in the -s pattern selects the long self-test; use S for a short test)
### tests are scheduled to run on Sunday morning
### port 0's test starts between 1AM and 2AM
### port 1's test starts between 2AM and 3AM, etc
/dev/twa0 -d 3ware,0 -a -s L/../../7/01
/dev/twa0 -d 3ware,1 -a -s L/../../7/02
/dev/twa0 -d 3ware,2 -a -s L/../../7/03
/dev/twa0 -d 3ware,3 -a -s L/../../7/04
/dev/twa0 -d 3ware,4 -a -s L/../../7/05
/dev/twa0 -d 3ware,5 -a -s L/../../7/06
/dev/twa0 -d 3ware,6 -a -s L/../../7/07
/dev/twa0 -d 3ware,7 -a -s L/../../7/08
###### End 3ware 9XXX config

If you have other questions, let me know -- this is hard-won experience and I'm more than happy to pass it along :-)

- Ian
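The per-port entries in the example are repetitive, so for a different port count or controller node they could be generated with a small loop rather than typed by hand. The following is a rough sketch only: `gen_smartd_entries` and its three arguments are invented for illustration, and it simply reproduces the monitoring and weekly self-test lines in the same form as the example (Sunday, port N starting in hour N+1).

```shell
#!/bin/sh
# Sketch: emit smartd.conf entries for every port of a 3ware 9XXX controller.
# Hypothetical helper, not part of smartmontools.
gen_smartd_entries() {
    controller=$1   # controller node, e.g. /dev/twa0
    ports=$2        # number of ports on the card, e.g. 8
    mailto=$3       # address that should receive failure notifications

    # Monitoring entries: one per port, mail sent on SMART failures.
    port=0
    while [ "$port" -lt "$ports" ]; do
        printf '%s -d 3ware,%d -a -m %s\n' "$controller" "$port" "$mailto"
        port=$((port + 1))
    done

    # Weekly long self-test (L) on Sunday (day 7); port N starts in hour N+1.
    port=0
    while [ "$port" -lt "$ports" ]; do
        printf '%s -d 3ware,%d -a -s L/../../7/%02d\n' \
            "$controller" "$port" "$((port + 1))"
        port=$((port + 1))
    done
}
```

For example, `gen_smartd_entries /dev/twa0 8 root@localhost` prints sixteen lines that can be reviewed and then pasted into smartd.conf. The hour field only goes up to 23, so the one-test-per-hour scheme as sketched here would need rethinking for cards with more than 23 ports.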