[opensuse] New Subject - building clusters with OpenSuse
Thanks!
I checked Science and Productivity and it's not there -
However, before we do that - it might be worthwhile to consider what
else we need for clustering. It would be really great to see
OpenSuse shipped on clusters of all sizes, from the little guys to the
biggies, but to make that easy the missing parts should be supplied.
Here's my first cut at a list.......
- heartbeat (not sure it's totally necessary but should be semi-easy)
- OpenPBS http://www.openpbs.org/
- Oscar http://oscar.openclustergroup.org/
- OpenMPI http://www.open-mpi.org/
What else?
Michael
On 6/15/07, Sunny wrote:
On 6/14/07, Michael Folsom wrote:
Don't really think this is about yast - the rpm for heartbeat isn't included in OpenSuse 10.2 or 10.3 Alpha 4 and it is in SLES10. If heartbeat isn't there you really don't need the yast2-heartbeat module.
How does one petition the folks developing OpenSuse to include heartbeat?
http://en.opensuse.org/Wishlist
-- Svetoslav Milenov (Sunny)
Even the most advanced equipment in the hands of the ignorant is just a pile of scrap.
--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse+help@opensuse.org
On 6/15/07, Michael Folsom
What else?
- OCFS2 (Oracle's Cluster FS) - it is already in the vanilla kernel and included in SLES10. http://oss.oracle.com/projects/ocfs2/dist/documentation/ocfs2_faq.html#SLES
- drbd - part of SLES10 and integrates well with heartbeat.
- Lustre - Don't know status wrt opensuse.
- OpenSSI - Lots of work, since they are still at the 2.6.11 kernel. They are going to do a kernel upgrade soon, but probably only to 2.6.16 or so first.
- OpenGFS - Don't know status wrt opensuse, but it is a Redhat team that does most of the r&d.
Greg
--
Greg Freemyer
The Norcross Group
Forensics for the 21st Century
Greg Freemyer wrote:
On 6/15/07, Michael Folsom
What else?
http://openmosix.sourceforge.net/
openMosix is a Linux kernel extension for single-system image clustering which turns a network of ordinary computers into a supercomputer.
'What is openMosix useful for?' /1/
openMosix allows you to join together multiple computers running the Linux operating system, and have them appear to the user as one large multiple-processor computer. For example, suppose you had two computers, A and B, joined in an openMosix cluster. Without openMosix, if you ran two programs on A they would only get 50% of the CPU time each. With openMosix, one of the programs could migrate 'automagically' to B, so both processes would run at 100% CPU. As far as the user is concerned, A now behaves like a two-CPU SMP computer with twice the CPU power available.
/1/ http://howto.x-tend.be/openMosixWiki/index.php/FAQ#.27What_is_openMosix.3F.2...
Regards,
--
Patrick Kirsch - Quality Assurance Department
SUSE Linux Products GmbH GF: Markus Rex, HRB 16746 (AG Nuernberg)
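The two-machine example from the openMosix FAQ quoted above can be put in numbers. This is only a back-of-the-envelope sketch, not anything openMosix itself provides: the function name is made up for illustration, and it assumes purely CPU-bound programs, single-CPU machines, and an even spread of processes with no migration overhead.

```python
def cpu_share_per_process(num_processes: int, num_cpus: int) -> float:
    """Fraction of one CPU that each CPU-bound process gets.

    Assumption (not from openMosix): processes are spread evenly over the
    available CPUs and migration itself costs nothing.
    """
    if num_processes <= 0 or num_cpus <= 0:
        raise ValueError("need at least one process and one CPU")
    # A process can never use more than one full CPU.
    return min(1.0, num_cpus / num_processes)


# Two programs stuck on machine A (one CPU in total).
without_migration = cpu_share_per_process(num_processes=2, num_cpus=1)

# One program migrated to machine B (two CPUs in total).
with_migration = cpu_share_per_process(num_processes=2, num_cpus=2)

print(f"without migration: {without_migration:.0%} of a CPU each")
print(f"with migration:    {with_migration:.0%} of a CPU each")
```

Real workloads that wait on disk or network would not see such a clean doubling.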
Michael Folsom wrote:
I checked Science and Productivity and it's not there -
However, before we do that - it might be worthwhile to consider what else we need for clustering. It would be really great to see OpenSuse shipped on clusters of all sizes, from the little guys to the biggies, but to make that easy the missing parts should be supplied.
I think building, running, and maintaining an HPC Linux cluster requires more than just a collection of software tools. Anyway...
Here's my first cut at a list.......
- heartbeat (not sure it's totally necessary but should be semi-easy)
- OpenPBS http://www.openpbs.org/
- Oscar http://oscar.openclustergroup.org/
- OpenMPI http://www.open-mpi.org/
http://ganglia.sourceforge.net/ (might be part of oscar)
http://munin.projects.linpro.no/
http://www.clusterresources.com/pages/products/torque-resource-manager.php
http://www.csm.ornl.gov/torc/C3/
...
I absolutely agree -
I was just wondering aloud if the inclusion of a few basic tools may
make it an easier choice for the cluster building crowd. There is no
way that any distro could do it all -
M-
Alternatively, you can write some custom scripts to share hard disk data between nodes, use "webmin" to manage nodes together, and use Cisco's NAT with Load Balancing to use both servers. This should work for basic workloads... such as HTTP server.
I'm not sure how good or bad openSUSE is at "clustering" compared to Debian and RedHat.
--
-Alexey Eremenko "Technologov"
Alexey Eremenko wrote:
Alternatively, you can write some custom scripts to share hard disk data between nodes, use "webmin" to manage nodes together, and use Cisco's NAT with Load Balancing to use both servers.
Scripts? To "share"? Seems we've got quite a different definition of "sharing" here ;) And I think you're really oversimplifying "clustering".
For sharing data between nodes, you definitely need shared storage (FC-attached, possibly SCSI, or for testing purposes it even works with Firewire) or a multipathed SAN. Needless to say, that's horrendously expensive. And with directly shared storage options (i.e. not SAN), you need a clustering-capable filesystem (GFS or OCFS2) that manages the dlocks between the nodes.
A cheaper but less performant option is drbd. When possible, it's advisable to avoid shared storage completely, e.g. by storing data in a replicated MySQL (when possible, depends on the application(s)).
Load balancing is the most trivial issue to solve. High availability ("hot standby", aka "active-passive") isn't that easy and, depending on the application and its needs, it can be very expensive. Load balancing _and_ high availability ("active-active") is typically even a lot more complex than "just" high availability.
If it's about "clustering" as in "HA clustering" (HA=High Availability), which has almost nothing to do with "HPC clustering" (HPC=High Performance Computing), then heartbeat is also a very, very interesting option. It's limited to two nodes, but it's pretty easy to configure. Pity, what it doesn't do is monitor services (just the nodes). You'll need another tool for that (e.g. monit or mon) and combine the two (which, surprisingly, isn't that easy to do, nor is there that much documentation about it).
When it's more about load-balancing, LVS (Linux Virtual Server) is a good candidate. As it operates the packet rewriting at kernel-level, it's pretty fast. It doesn't combine all that easily with HA though.
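To illustrate the point above that a node-level check says nothing about whether a service is healthy, here is a minimal sketch of a service-level probe. It is not how heartbeat, monit or mon work internally and uses none of their APIs; the function names, host and ports are hypothetical, and a real monitor would also restart or fail over the service.

```python
import socket


def service_is_up(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    The node may answer pings while this still returns False, which is
    exactly the gap between node monitoring and service monitoring.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def node_status(host: str, ports: list[int]) -> dict[int, bool]:
    """Probe several service ports on one node."""
    return {port: service_is_up(host, port) for port in ports}


if __name__ == "__main__":
    # Hypothetical example: check HTTP and MySQL on the local machine.
    for port, up in node_status("127.0.0.1", [80, 3306]).items():
        print(f"port {port}: {'up' if up else 'DOWN'}")
```

A TCP connect only proves that something is listening; an application-level check (for example an HTTP request with an expected response) would be stricter.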
This should work for basic workloads... such as HTTP server.
That's just load-balancing, which is really trivial to do (with lots and
lots of options on how to implement it) for stateless protocols such as
HTTP.
It becomes more complex already when you have stateful (or
"conversational") protocols (EJB/JRMP/RMI, XMPP/Jabber, ...) or even
HTTP with sessions.
Unless you don't care about users having to start their shopping cart
all over again when a server dies, HTTP with sessions needs session
replication. Session replication can be achieved by persisting sessions
in a replicated database (slow), using e.g. Tomcat and its clustered
in-memory session replication mechanisms (fast), or doing "sticky
load-balancing" with a load-balancer upfront that is smart enough to
understand session URLs and cookies and to redirect traffic to the
cluster node that originated the session (fastest, but only
load-balances the first request and doesn't help all that much wrt HA).
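The "sticky load-balancing" option just described can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the behaviour of any particular load-balancer product: the class and node names are hypothetical, the session ID is assumed to have already been extracted from a cookie or URL, and new sessions are simply assigned round-robin.

```python
import itertools


class StickyBalancer:
    """Route each session to the node that first served it."""

    def __init__(self, nodes):
        if not nodes:
            raise ValueError("need at least one node")
        self._nodes = list(nodes)
        self._round_robin = itertools.cycle(self._nodes)
        self._session_to_node = {}

    def route(self, session_id):
        """Return the node for this request; only new sessions are balanced."""
        if not self._nodes:
            raise RuntimeError("no nodes left")
        node = self._session_to_node.get(session_id)
        if node is None:
            # First request of a session: pick the next node and remember it.
            node = next(self._round_robin)
            self._session_to_node[session_id] = node
        return node

    def node_failed(self, node):
        """Drop a dead node and return the session IDs whose state is lost."""
        self._nodes.remove(node)
        self._round_robin = itertools.cycle(self._nodes)
        lost = [sid for sid, n in self._session_to_node.items() if n == node]
        for sid in lost:
            del self._session_to_node[sid]
        return lost


if __name__ == "__main__":
    lb = StickyBalancer(["node-a", "node-b"])
    print(lb.route("cart-1"), lb.route("cart-2"), lb.route("cart-1"))
    print("lost sessions:", lb.node_failed("node-a"))
```

The `node_failed` method shows why this approach does little for HA: sessions pinned to the dead node land on another node with no state unless they are also replicated.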
cheers
--
Pascal Bleser http://linux01.gwdg.de/~pbleser/
The more things change, the more they stay insane.
The new Heartbeat2 with OCFSv2 would be a killer combination in the future; the only problem with heartbeat2 is the lack of documentation.... Maybe by adding openSSI we could get something similar to TruCluster...
Ciro
OCFSv2 is part of the vanilla 10.2 distro and is available via yast2.
And you can find heartbeat 2.0.7 at http://download.opensuse.org/distribution/10.2/repo/oss/suse/i586/
I'm not sure why I'm not finding those same RPMs via yast2.
Greg
--
Greg Freemyer
The Norcross Group
Forensics for the 21st Century
participants (7)
- Alexey Eremenko
- Ciro Iriarte
- Greg Freemyer
- Michael Folsom
- Pascal Bleser
- Patrick Kirsch
- Thomas Hertweck