Mailinglist Archive: opensuse (2831 mails)
Re: [opensuse] Cluster and SuSE
  • From: Pascal Bleser <pascal.bleser@xxxxxxxxx>
  • Date: Thu, 27 Jul 2006 00:10:19 +0200
  • Message-id: <44C7E84B.9000409@xxxxxxxxx>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Yu Safin wrote:
> On 7/22/06, Pascal Bleser <pascal.bleser@xxxxxxxxx> wrote:
>> Yu Safin wrote:
>>> I currently have an application that runs Oracle under one SuSE server
>>> and Java in-house code in another SuSE server.
>>> I was wondering if anybody can suggest a cluster solution so I can
>>> have my Java code duplicated in two+ SuSE servers.
>>
>> What is that in-house Java application ? Running in a servlet container
>> (e.g. Tomcat), an EJB container (e.g. JBoss), ... ?
>>
>> You should consider two things:
>> 1) a cluster software to provide high availability of the cluster nodes
>> themselves (most importantly: heartbeat checking between the nodes +
>> migrating the cluster's IP address (the "VIP")) - I use heartbeat [1]
>> for that, works nicely and necessary RPMs are shipped with SUSE Linux
>>
>> 2) cluster your Java application: you can do that with Tomcat and JBoss
>> (and Weblogic, and Websphere, and Glassfish, and ...)
>>
>> [1] http://www.linux-ha.org/
>>
>>> How about Oracle, can I cluster Oracle to run on two+ servers?
>>> I am trying to achieve not only higher availability but also put to
>>> work some hardware servers I have on the floor collecting dust.
>>
>> You _can_ cluster Oracle, but... you'll have to set up Oracle RAC, which
>> is both very complex *and* very expensive (different license fees).
>
> The Java application is in a JAR that runs on a schedule (batch).

(I'm afraid this is strongly drifting into off-topicness but well..
maybe it's of some interest to others as well ;))

How is remoting involved? I mean, to cluster an application, you need
to have actual entry points where incoming requests can be load-balanced.

If you want some help with this, you'll have to give a lot more detail
than just "runs on a schedule (batch)".
Are there any clients? If so, how do the clients trigger the batch? Or
is it just triggered by e.g. cron?
What are you looking for, High Availability or Load Balancing (or both)?
Hot standby or parallel processing?

If it's a plain Java application that does not run inside an EJB
container (like JBoss) or a Servlet container (like Tomcat), then it
becomes more difficult to cluster.
First off, you have to take care of replicating the JAR to every node
yourself (i.e. whenever you deploy a new version of the JAR, modified
configuration files, etc.).
You can either do that "manually" (with rsync, for example) or with a
shared filesystem.
Shared filesystems are typically very complex and extremely expensive
(SCSI or FC attached disks that can do DLOCK), unless you use an IP and
software based solution such as nbd (network block device), drbd or
similar (which are less reliable than physically shared disks).
OCFS2 (Oracle Cluster FileSystem v2) might be an option as well, haven't
tried it myself though (btw, ocfs2 is included on SUSE Linux.. from 10.0
on, AFAICR):
http://oss.oracle.com/projects/ocfs2/
http://www.eweek.com/article2/0,1895,1847510,00.asp
(OCFS2 is not limited to Oracle RAC databases, it can be used as a
software shared filesystem for virtually anything)

Anyhow, I would say that a "manual" solution would be the best option as
it's cheap and your application is probably simple enough that you can
handle it that way.
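
To give an idea of how small the "manual" route can be, here is a
minimal sketch that pushes the application directory to the other
node(s) with rsync over ssh. The host name and the /opt/batchapp/ path
are made up for illustration, and it assumes passwordless ssh between
the nodes is already set up; take it as a starting point, not a
finished deployment tool.

```python
import subprocess

# Hypothetical values for illustration; adjust to your own setup.
APP_DIR = "/opt/batchapp/"          # trailing slash: sync the directory contents
PEER_NODES = ["node2.example.com"]  # the other application cluster node(s)


def rsync_command(src_dir, host, dest_dir):
    """Build the rsync command line that mirrors src_dir onto host:dest_dir."""
    return [
        "rsync",
        "-az",        # archive mode (recursion, permissions, times) + compression
        "--delete",   # remove files on the peer that no longer exist locally
        "-e", "ssh",  # use ssh as the transport
        src_dir,
        f"{host}:{dest_dir}",
    ]


def push_to_peers(src_dir=APP_DIR, peers=PEER_NODES, dry_run=False):
    """Run rsync once per peer; return the commands that were (or would be) run."""
    commands = [rsync_command(src_dir, host, src_dir) for host in peers]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return commands


if __name__ == "__main__":
    # Only print what would be executed; nothing is copied in this demo.
    for cmd in push_to_peers(dry_run=True):
        print(" ".join(cmd))
```

You would run something like this after every deployment (or from cron)
on the node where you install new versions of the JAR.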

> is there anything similar to RAC in the open-source community?
> what would be missing with linux-ha compared to RAC?

linux-ha has nothing to do with RAC.

Oracle RAC (Real Application Clusters) is an enterprise option of Oracle
9i and 10g (which, as already said, is *very* expensive).
The most important feature of RAC is that you can effectively have an HA
clustered Oracle database with several nodes.
You can even use every node independently (in parallel).
Oracle RAC knows that it runs in a cluster and that there are several
Oracle database nodes (and workers) that access the database (and shared
filesystem) at the same time. Hence, it has the (extremely complex)
necessary mechanisms to check which node owns a filesystem block,
whether it can write directly to disk or must send the data to the node
that owns the block, etc...
Sorry, totally beyond the scope of this list (I think we're quite
off-topic already ;)), would take half a day to explain the nitty gritty
of how Oracle RAC works.

Anyway, I guess it's not an option for you: it's very complex to set up
(even for seasoned Oracle admins) and extremely expensive, both because
of its licenses and because it requires high-performance (hence,
physical) shared disks (typically Fibre Channel + SAN or JBODs, JBODs
being the cheapest option, starting at 10K USD for the slowest ones).

AFAIK there's nothing comparable to RAC, either open source or
proprietary/commercial.

What comes closest as open source is MySQL Cluster [1] but I wouldn't bet
my business on it (yet?). Dunno. I guess it has to mature a little -
it's not like Oracle didn't take several years to have a well-working RAC.
[1] http://www.mysql.com/products/database/cluster/

The cheapest and simplest option is to move the database onto its own
cluster (of two nodes, for example), behind the application cluster.
Instead of having something like RAC that can do failover +
load-balancing at the same time, just set up a hot standby cluster,
where only one of the two database server nodes is active.
The other node is passive and will only be switched to when the first
node fails (that's "hot standby").
Use linux-ha (heartbeat) to monitor the database cluster nodes and
perform the takeover.
To keep the database on the passive node in sync, I would recommend
using MySQL with binary log replication (master is the active node,
slave is the passive node) - it's very fast and simple to set up (the
most complex part being the script that performs the slave->master
switch during the takeover).
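
To make the hot standby idea a bit more concrete, here is a rough
sketch of the logic that runs on the passive node: if no heartbeat has
been seen for longer than a "deadtime", promote the local MySQL slave
and then take over the VIP. This is NOT heartbeat's actual API, just an
illustration; the class, the 30 second deadtime and the run_sql /
take_vip callbacks are all made up. STOP SLAVE followed by RESET MASTER
is the usual way to promote a slave, but a real script should also wait
until the slave has applied its relay log and make sure the old master
is really down.

```python
import time

DEADTIME = 30.0  # assumed: seconds without heartbeat before the peer counts as dead

# Statements issued on the passive node to turn the slave into a master.
PROMOTE_SLAVE_SQL = ["STOP SLAVE", "RESET MASTER"]


class StandbyMonitor:
    """Tracks heartbeats from the active node and triggers the takeover."""

    def __init__(self, deadtime=DEADTIME, clock=time.monotonic):
        self.deadtime = deadtime
        self.clock = clock              # injectable so the logic can be tested
        self.last_heartbeat = clock()
        self.active = False             # becomes True once this node took over

    def heartbeat_received(self):
        """Call this whenever a heartbeat packet from the peer arrives."""
        self.last_heartbeat = self.clock()

    def peer_is_dead(self):
        return self.clock() - self.last_heartbeat > self.deadtime

    def check(self, run_sql, take_vip):
        """Take over if the peer is dead; return True if a takeover happened now."""
        if self.active or not self.peer_is_dead():
            return False
        # Promote the database first, so that clients arriving on the
        # VIP find a writable master.
        for statement in PROMOTE_SLAVE_SQL:
            run_sql(statement)
        take_vip()
        self.active = True
        return True
```

In practice heartbeat does the monitoring and VIP migration for you;
the part you write yourself is the resource script that does the
equivalent of run_sql above.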

Anyhow, SUSE Linux comes with all the software you'd need:
- - MySQL (including binary log replication)
- - heartbeat for doing the cluster monitoring and resource failover
- - rsync to keep the application in sync on both application cluster nodes
- - drbd or nbd if you want cheap shared storage

Hope this helps, but I'm afraid HA clustering is a very complex topic.
The easiest solution to cluster an application is to have it running
inside e.g. Tomcat or JBoss as they take care of most of the ugly stuff
(but not of the database).
But if your application is batch-oriented and only driven by a
scheduling system (like cron, Quartz, ...), then Tomcat or JBoss are of
no help.
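
For a cron-driven batch, one possible approach (an assumption on my
side, not something heartbeat gives you out of the box) is to install
the same cron entry on both nodes and let a small wrapper start the job
only on the node that currently holds the cluster's VIP. A minimal
sketch; the VIP below is a made-up address, and the bind trick does not
work if net.ipv4.ip_nonlocal_bind is enabled on the host:

```python
import socket

VIP = "192.0.2.10"  # hypothetical cluster address (documentation range)


def holds_address(ip):
    """Return True if ip is configured locally (bind only works for local addresses)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((ip, 0))  # port 0: let the kernel pick any free port
        return True
    except OSError:
        return False
    finally:
        sock.close()


def run_if_active(job, vip=VIP, holds=holds_address):
    """Run job() only on the node that currently owns the VIP; return whether it ran."""
    if not holds(vip):
        return False
    job()
    return True
```

The job passed in would simply launch the JAR; on the passive node the
wrapper exits without doing anything.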

cheers
- --
-o) Pascal Bleser http://linux01.gwdg.de/~pbleser/
/\\ <pascal.bleser@xxxxxxxxx> <guru@xxxxxxxxxxx>
_\_v The more things change, the more they stay insane.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)

iD8DBQFEx+hLr3NMWliFcXcRAmk5AKCLd2LsUPYoa90L2t5dGqW9m+bqAACgpJ0A
HFSbp/o5Iy0Zz5bKKCt/SfY=
=2O2u
-----END PGP SIGNATURE-----
