Hi Folks,

I've been working on a project for several years now that requires lots of disk space. We've got three servers, each with 40 directly connected 1-TB SATA disks. Now, the project is expanding and will require as many as 98 directly connected disks. I know that I can get the hardware and RAID (JBOD) controllers to mechanize this, but I'm not sure about the OS. How many individual disk (/dev/sdxx) drives can Linux support?

BTW, we need directly connected disks because of the bandwidth limits that NFS throws up. I'd be happy to listen to alternatives.

Regards,
Lew
On Mon, 2009-07-06 at 08:42 -0700, Lew Wolfgang wrote:
Hi Folks, I've been working on a project for several years now that requires lots of disk space. We've got three servers, each with 40 directly connected 1-TB SATA disks. Now, the project is expanding and will require as many as 98 directly connected disks. I know that I can get the hardware and RAID (JBOD) controllers to mechanize this, but I'm not sure about the OS. How many individual disk (/dev/sdxx) drives can Linux support?
Last number I saw was 2,304. But I think that kind of architecture (and even your number) is unmanageable/unsustainable. You're going to need some high-end controllers to get near that number.
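For what it's worth, running out of device names isn't the problem: the sd driver just keeps extending the letter suffix past /dev/sdz (sdaa, sdab, ...). A small Python sketch of that naming scheme, plus a quick count of whatever sd devices the running kernel actually sees (illustrative only, nothing openSUSE-specific):

import os
import string

def sd_name(index):
    """Return the /dev name for the index-th SCSI/SATA disk (0 -> sda)."""
    letters = ""
    index += 1                      # work in 1-based "spreadsheet column" form
    while index > 0:
        index, rem = divmod(index - 1, 26)
        letters = string.ascii_lowercase[rem] + letters
    return "/dev/sd" + letters

print(sd_name(97))                  # the 98th disk -> /dev/sdct

disks = [d for d in os.listdir("/sys/block") if d.startswith("sd")]
print(len(disks), "sd devices currently present")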
BTW, we need directly connected disks because of the bandwidth limits that NFS throws up. I'd be happy to listen to alternatives.
Fibre Channel? Although I really suspect that if you tested iSCSI you'd find the bandwidth sufficient (I've heard the cannot-do-it-due-to-bandwidth claim a lot and found it to be very rarely true). Is there really no way to introduce HSM (hierarchical storage management) into your app? If the answer is no, I don't mean to be harsh, but your application is effectively broken (it certainly will stop eventually). Maybe use DASD for recent/active data sets and migrate less active data sets to iSCSI-attached storage?
On Mon, Jul 6, 2009 at 11:53 AM, Adam Tauno Williams wrote:
On Mon, 2009-07-06 at 08:42 -0700, Lew Wolfgang wrote:
Hi Folks,
<snip>
Fibre Channel?
HP LeftHand is now supporting 10-Gbit iSCSI. I haven't seen actual performance numbers, but it is likely as fast as or faster than FC, and I believe less expensive.
Although I really suspect that if you tested iSCSI you'd find the bandwidth sufficient (I've heard the cannot-do-it-due-to-bandwidth claim a lot and found it to be very rarely true).
Totally agree, especially if you put a 10-Gbit card in the server. Not cheap, but not that expensive. Even if you use a low-cost 1-Gbit/sec iSCSI server like the DroboPro, you could use software RAID to merge several of those into a very high-performance unit. I have not seen any reviews yet of the DroboPro, so if you buy / benchmark one I'd love to see the perf numbers. It will cost considerably more, but the Dell EqualLogic is highly regarded from what I've seen. You can bind up to four 1-Gbit NICs per shelf, I believe, and then use software RAID in Linux to ramp up to even higher performance.
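Some rough back-of-the-envelope numbers (assuming ~100 MB/s sustained per 1-TB SATA disk and ignoring TCP/iSCSI overhead, so treat these strictly as upper bounds):

GBIT = 1_000_000_000 / 8            # one gigabit, in bytes per second
SATA_DISK = 100e6                   # ~100 MB/s sustained per SATA disk (assumed)

links = {
    "1 x 1 GbE  (single DroboPro-class target)": 1 * GBIT,
    "4 x 1 GbE  (bonded EqualLogic-style shelf)": 4 * GBIT,
    "1 x 10 GbE": 10 * GBIT,
}

for name, bw in links.items():
    print(f"{name}: ~{bw / 1e6:.0f} MB/s raw, "
          f"about {bw / SATA_DISK:.1f} local SATA disks' worth of streaming I/O")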
Is there really no way to introduce HSM (hierarchical storage management) into your app?
And Adam throws me a softball. I'm presenting the Open HSM project at OLS next week. It's not production-ready yet, but I hope it gets there in the next year or less. FYI: I was just an advisor on the project. It was developed by a college grad team as part of a series of competitions. They earned the right to present at OLS and have asked me to do the presentation.

Anyway, with OHSM you can migrate files / folders between storage tiers without having to change the file path / name. They have it working with ext2 as a prototype so far. The concept is to build an LVM volume from different PVs that are on different classes of storage. Then you use various policies to move data between the tiers. It still takes time to move it, but that can happen in the background, or at night. And if you just need to access the data for a quick project, you don't even have to move it. So as an example, you tell OHSM to move a project directory from a slow storage array up to a higher-speed array, and you can immediately start working with the data; as you work, the data is being moved between the tiers transparently to you.

My hope is we can talk the ext4 kernel team into accepting OHSM into their mainline code. The ext4 version should be very non-intrusive and only require 4 or 5 small patches. Of course there will be a separate OHSM kernel module, but being more or less standalone, that could even be housed in the OBS for a while. The key is getting those small patches into ext4.

Greg

--
Greg Freemyer
Head of EDD Tape Extraction and Processing team
Litigation Triage Solutions Specialist
http://www.linkedin.com/in/gregfreemyer
Preservation and Forensic processing of Exchange Repositories White Paper -
http://www.norcrossgroup.com/forms/whitepapers/tng_whitepaper_fpe.html
The Norcross Group
The Intersection of Evidence & Technology
http://www.norcrossgroup.com
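To make the tiering idea above concrete, here is a deliberately naive sketch of the kind of policy OHSM automates. The paths and the 30-day threshold are made up for illustration, and unlike OHSM (which works underneath the filesystem so the visible path never changes) this version simply moves the files:

import os
import shutil
import time

FAST_TIER = "/srv/tier0"            # e.g. an LV on fast PVs (hypothetical path)
SLOW_TIER = "/srv/tier1"            # e.g. an LV on big SATA PVs (hypothetical path)
AGE_DAYS = 30                       # made-up policy threshold

def migrate_cold_files():
    """Move files not accessed for AGE_DAYS from the fast tier to the slow tier."""
    cutoff = time.time() - AGE_DAYS * 86400
    for root, _dirs, files in os.walk(FAST_TIER):
        for name in files:
            src = os.path.join(root, name)
            if os.stat(src).st_atime < cutoff:
                dst = os.path.join(SLOW_TIER, os.path.relpath(src, FAST_TIER))
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                shutil.move(src, dst)   # naive: the path changes, unlike OHSM

if __name__ == "__main__":
    migrate_cold_files()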
Adam Tauno Williams wrote:
On Mon, 2009-07-06 at 08:42 -0700, Lew Wolfgang wrote:
I've been working on a project for several years now that requires lots of disk space. We've got three servers, each with 40 directly connected 1-TB SATA disks. Now, the project is expanding and will require as many as 98 directly connected disks. I know that I can get the hardware and RAID (JBOD) controllers to mechanize this, but I'm not sure about the OS. How many individual disk (/dev/sdxx) drives can Linux support?
Last number I saw was 2,304. But I think that kind of architecture (and even your number) is unmanageable/unsustainable. You're going to need some high-end controllers to get near that number.
Hi Adam, I was thinking about the 3Ware 9690SA-8I, which supports up to 128 individual disks. I could divide the disks among as many controllers as I can fit in one motherboard. This application is rather non-standard. It basically sorts and shuffles lots of data from one disk to another in strings of six. Not much data processing, but lots of disk I/O. The disks are all individual (no RAID) and have to be hot-swappable and accessible from the front of the rack. The reason for this is that data is fed into it by inserting 1-TB disks from another process. Processed data is delivered to subsequent systems by physically moving some of the disks.
BTW, we need directly connected disks because of the bandwidth limits that NFS throws up. I'd be happy to listen to alternatives.
Fibre Channel? Although I really suspect that if you tested iSCSI you'd find the bandwidth sufficient (I've heard the cannot-do-it-due-to-bandwidth claim a lot and found it to be very rarely true).
I tried iSCSI via Fibre Channel recently and found the bandwidth lacking. Maybe I wasn't doing it right? (It was RHEL 5.2!)
Is there really no way to introduce HSM (hierarchical storage management) into your app? If the answer is no, I don't mean to be harsh, but your application is effectively broken (it certainly will stop eventually). Maybe use DASD for recent/active data sets and migrate less active data sets to iSCSI-attached storage?
Some here might argue that the app is broken! :-) But we have to live within the constraints demanded by the project. I don't think that HSM is a fit for this particular requirement.

Thanks for your suggestions.

Lew
participants (3):
- Adam Tauno Williams
- Greg Freemyer
- Lew Wolfgang