Alexey Eremenko wrote:
> Alternatively, you can write some custom scripts to share hard disk data between nodes, use "webmin" to manage nodes together, and use Cisco's NAT with Load Balancing to use both servers.
Scripts? To "share"?
Seems we've got quite a different definition of "sharing" here ;)
And I think you're really oversimplifying "clustering".

For sharing data between nodes, you definitely need shared storage
(FC-attached, possibly SCSI, or for testing purposes it even works with
Firewire) or a multipathed SAN. Needless to say, that's horrendously
expensive. And with directly shared storage options (i.e. not SAN), you
need a cluster-capable filesystem (GFS or OCFS2) that manages the
distributed locks between the nodes.

A cheaper but less performant option is drbd.
When possible, it's advisable to avoid shared storage completely, e.g.
by storing data in a replicated MySQL (whether that works depends on
the application(s)).

Load balancing is the most trivial issue to solve.
High availability ("hot standby", aka "active-passive") isn't that easy
and, depending on the application and its needs, it can be very
expensive.
Load balancing _and_ high availability ("active-active") is typically a
lot more complex still than "just" high availability.

If it's about "clustering" as in "HA clustering" (HA=High Availability),
which has almost nothing to do with "HPC clustering" (HPC=High
Performance Computing), then heartbeat is also a very, very interesting
option. It's limited to two nodes, but it's pretty easy to configure.
Pity that it doesn't monitor services (just the nodes). You'll need
another tool for that (e.g. monit or mon) and combine the two (which,
surprisingly, isn't that easy to do, nor is there much documentation
about it).

When it's more about load-balancing, LVS (Linux Virtual Server) is a
good candidate. As it does the packet rewriting at kernel level, it's
pretty fast. It doesn't combine all that easily with HA though.
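
To illustrate the difference between monitoring nodes and monitoring
services mentioned for heartbeat above, here is a rough Python sketch.
It is not how heartbeat, monit or mon are implemented; the names
NodeMonitor, service_alive and should_take_over, the deadtime value and
the plain TCP connect check are all assumptions made up for the example.

```python
import socket
import time


class NodeMonitor:
    """Hypothetical heartbeat-style check: is the peer *node* still talking to us?"""

    def __init__(self, deadtime=10.0, clock=time.monotonic):
        self.deadtime = deadtime  # seconds of silence before the peer counts as dead
        self.clock = clock        # injectable so the logic can be tested without sleeping
        self.last_seen = clock()

    def beat(self):
        """Record that a heartbeat packet arrived from the peer."""
        self.last_seen = self.clock()

    def peer_alive(self):
        return (self.clock() - self.last_seen) < self.deadtime


def service_alive(host, port, timeout=2.0):
    """Hypothetical service-level check: does the service port accept TCP connections?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def should_take_over(node_ok, service_ok):
    """The standby takes over if the peer node is gone OR its service stopped answering."""
    return not (node_ok and service_ok)
```

A node can keep sending heartbeats while its httpd is dead, which is
why the two checks have to be combined before deciding on a failover.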
> This should work for basic workloads... such as HTTP server.
That's just load-balancing, which is really trivial to do (with lots and
lots of options on how to implement it) for stateless protocols such as
HTTP.
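
As a minimal sketch of why this is trivial for stateless requests,
here is plain round-robin selection in Python. The class name
RoundRobinBalancer and the backend names are hypothetical, and a real
balancer (LVS, a hardware box, ...) does this at the packet or
connection level rather than in a script.

```python
import itertools


class RoundRobinBalancer:
    """Hypothetical round-robin selector that skips backends marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.down = set()
        self._cycle = itertools.cycle(self.backends)

    def mark_down(self, backend):
        self.down.add(backend)

    def mark_up(self, backend):
        self.down.discard(backend)

    def pick(self):
        # Try each backend at most once per call; any live one will do,
        # because a stateless request does not care which server answers.
        for _ in range(len(self.backends)):
            backend = next(self._cycle)
            if backend not in self.down:
                return backend
        raise RuntimeError("no backend available")
```

Since no request depends on an earlier one, a dead backend is simply
skipped and nobody notices.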
It becomes more complex already when you have stateful (or
"conversational") protocols (EJB/JRMP/RMI, XMPP/Jabber, ...) or even
HTTP with sessions.
Unless you don't care about users having to start their shopping cart
all over again when a server dies, HTTP with sessions needs session
replication. Session replication can be achieved by persisting sessions
in a replicated database (slow), using e.g. Tomcat and its clustered
in-memory session replication mechanisms (fast), or doing "sticky
load-balancing" with a load-balancer in front that is smart enough to
understand session URLs and cookies and to redirect traffic to the
cluster node that originated the session (fastest, but only
load-balances the first request and doesn't help all that much wrt HA).
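
The sticky variant can be sketched roughly as follows in Python. This
is only an illustration under assumptions: StickyBalancer, fail and
route are made-up names, the session id stands in for whatever the
balancer extracts from the URL or cookie, and no real product is being
described.

```python
import itertools


class StickyBalancer:
    """Hypothetical sticky balancer: a session stays on the node that created it."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.alive = set(self.nodes)
        self.sessions = {}  # session id -> node that originated the session
        self._cycle = itertools.cycle(self.nodes)

    def fail(self, node):
        self.alive.discard(node)

    def _next_alive(self):
        for _ in range(len(self.nodes)):
            node = next(self._cycle)
            if node in self.alive:
                return node
        raise RuntimeError("no node available")

    def route(self, session_id):
        """Return (node, session_lost) for a request carrying session_id."""
        owner = self.sessions.get(session_id)
        if owner in self.alive:
            return owner, False  # sticky: same node as before
        # First request of a session, or the owning node died:
        # only here is anything actually load-balanced.
        node = self._next_alive()
        self.sessions[session_id] = node
        # Without session replication the state on the dead node is gone.
        return node, owner is not None
```

When the owning node fails, the request is routed elsewhere but the
session is reported as lost, i.e. the shopping cart starts over unless
sessions are also replicated.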
cheers
--
-o) Pascal Bleser http://linux01.gwdg.de/~pbleser/
/\\