* Greg Freemyer (freemyer@NorcrossGroup.com) [030509 13:36]:
SuSE seems to recommend "heartbeat" for HA clusters.
"heartbeat" is also known as Linux-HA.
I don't know if we'd recommend that for ftp though. It's hard to imagine the address take-over and things like that working properly with a complex protocol like ftp since you have two connections (the data and the control) plus all the complications of passive versus active mode. Of course, I'm not a cluster person so I don't really know for sure. Hopefully Lars will have time to answer.
In the past we've used load-balancing routers and round-robin dns for ftp.suse.com when needed.
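
As a rough illustration of what round-robin dns gives a client: the name resolves to several addresses, and a client can walk the list until one server accepts the connection. This is only a sketch. The function names and the injectable `connect` parameter are made up for the example, and many real ftp clients simply use the first address they get back.

```python
import socket


def resolve_all(host, port=21):
    """Return every distinct IP address the resolver reports for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    seen = []
    for _family, _type, _proto, _canon, sockaddr in infos:
        addr = sockaddr[0]
        if addr not in seen:
            seen.append(addr)
    return seen


def connect_first_available(addresses, port=21, timeout=5.0,
                            connect=socket.create_connection):
    """Try each address in order; return (address, connection) for the first
    one that accepts. `connect` is injectable so the logic can be exercised
    without a network (an assumption of this sketch, not part of any API)."""
    last_error = None
    for addr in addresses:
        try:
            return addr, connect((addr, port), timeout)
        except OSError as exc:
            # Server down or unreachable: remember why and try the next one.
            last_error = exc
    raise ConnectionError(f"no server reachable: {last_error}")
```

Note that this only helps with new connections. A session that is already open to a server that dies is still lost.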
You're right. During a fail-over, all logged-in ftp users would be disconnected and forced to log in again (to the new HA service node). It may even be worse than that, but hopefully the ftp client would handle the loss of a socket fairly cleanly.

Linux-HA with the appropriate shared disk (or replicated disk) would ensure any uploaded data survived the failure. I think the ncftp client may even re-login and resume the upload/download from where it was.

To do it seamlessly seems very difficult to me. (And I do a lot of clustering.) Load-balancing routers and round-robin dns do not help with a seamless fail-over.

I think SSI Linux has a goal to do this type of seamless HA fail-over, but I think they have a long way to go before they are even close. For now, SSI Linux is close to a 1.0 release with LVS used to do load-balancing, but seamless fail-overs are not supported.

FYI: I've been tracking SSI Linux and there is no way I would put the 1.0 release into a mission-critical setup. In this case I think 1.0 means: Consider for use in limited production environments. A few SPOFs still exist.

Greg
-- 
Greg Freemyer
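
P.S. The "re-login and resume" behaviour mentioned above can be sketched with Python's ftplib, whose `retrbinary` takes a `rest` offset and sends a REST command before RETR. This is a minimal sketch, not how ncftp actually does it. It assumes the server supports REST, that the file on the new HA node is identical (shared or replicated disk), and that `connect` is a caller-supplied, hypothetical factory returning a logged-in FTP object for the cluster's service address.

```python
import os
from ftplib import all_errors


def resume_download(connect, remote_name, local_path, max_attempts=3):
    """Download remote_name, logging in again and resuming after a drop.

    `connect` is a hypothetical factory returning a logged-in
    ftplib.FTP-like object; it is called once per attempt.
    """
    for _attempt in range(max_attempts):
        # Bytes already on disk tell us where to restart.
        offset = os.path.getsize(local_path) if os.path.exists(local_path) else 0
        try:
            ftp = connect()
            with open(local_path, "ab") as out:
                # rest=offset makes ftplib send "REST <offset>" before RETR.
                ftp.retrbinary("RETR " + remote_name, out.write,
                               rest=offset or None)
        except all_errors:
            # Socket lost (e.g. during fail-over): log in again and resume.
            continue
        try:
            ftp.quit()
        except all_errors:
            pass  # transfer already finished; a failed QUIT is harmless
        return os.path.getsize(local_path)
    raise ConnectionError("download did not complete after retries")
```

From the user's side this is still a disconnect followed by a new login. It just hides the retry, which is a long way from a seamless fail-over.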