[opensuse-amd64] Ext4 and >16TB - how?

We're trying to create a 27TB partition on openSUSE 11.2 x86_64 and seem to be running into issues with e2fsprogs. Here's what happens:
All posts on this problem so far seem to indicate that the underlying tools, such as mkfs itself, are still 32-bit even on a 64-bit system, and that fixing this is top priority. That was as of April 2009. Is it possible that this issue has still not been resolved? Or, to be more specific: what do I need to do in order to create a 27TB ext4 filesystem?
Thanks,
Martin
--
Rieke Computersysteme GmbH, Hellerholz 5, D-82061 Neuried
Email: martin@rhm.de
--
To unsubscribe, e-mail: opensuse-amd64+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse-amd64+help@opensuse.org

Martin Jungowski wrote:
Martin,
This is what I get from my system on the binaries:
/sbin/mke2fs: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.4, stripped
/sbin/mkfs.ext4: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.4, stripped
Not sure if this helps or not.
-Matt
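For readers who want to repeat this check without the file utility, here is a minimal sketch in Python. It relies only on the ELF identification bytes: the magic number in bytes 0-3 and the class in byte 4 (1 for 32-bit, 2 for 64-bit). The function names are made up for illustration, and the script only reports how a binary was built; it says nothing about limits inside the on-disk format or the libraries the binary uses.

```python
import sys

ELF_MAGIC = b"\x7fELF"                       # bytes 0-3 of every ELF file
EI_CLASS_NAMES = {1: "32-bit", 2: "64-bit"}  # byte 4 (EI_CLASS)


def elf_class_from_header(header: bytes) -> str:
    """Return '32-bit' or '64-bit' given the first bytes of an ELF file."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    return EI_CLASS_NAMES.get(header[4], "unknown")


def elf_class(path: str) -> str:
    """Read the identification bytes of the file at path and classify it."""
    with open(path, "rb") as f:
        return elf_class_from_header(f.read(5))


if __name__ == "__main__":
    # Hypothetical usage: python elfclass.py /sbin/mke2fs /sbin/mkfs.ext4
    for p in sys.argv[1:]:
        print(f"{p}: ELF {elf_class(p)}")
```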

Matt, thanks. I checked that too, and the binaries themselves are indeed 64-bit.
What I meant by "underlying tools" and "32-bit" is the following statement, which - as of today - can still be found on the kernel.org Wiki (see http://ext4.wiki.kernel.org/index.php/Ext4_Howto)
Thus there seems to be some 32-bit limit still implemented as we speak. Is there any way around it? I refuse to believe that Linux cannot handle partitions and filesystems larger than 16TB ;)
Martin

On Tue, Dec 1, 2009 at 11:42 AM, Martin Jungowski <martin@rhm.de> wrote:
Have you tried JFS? It can support filesystems up to 32 PiB. I have been using it on a 9 TB RAID 6 array for a while now and it seems to be working well. It has even survived a sudden power outage or two. Just something to consider.
Preston

On Tue, Dec 01, Martin Jungowski wrote:
Use XFS. The limit there is 8 EiB and I would think it offers you all the features you are currently expecting from EXT4 (and more).
-Daniel

Daniel, thanks for your suggestion. I've tried XFS before and we have ruled it out for one simple reason: while creating a 27TB slice worked perfectly fine and finished within seconds, it's impossible to repair an XFS filesystem this large. xfs_check crashes immediately (out of memory), and xfs_repair hogs all available memory and swap and brings the machine to a screeching halt. We gave up after 72 hours and had to use the reset button for the first time in years.
The tools don't seem to be fixed yet (I wonder what "high priority" could possibly mean if that statement was issued more than six months ago and nothing has been done since), so it seems to be impossible to create an Ext4 filesystem larger than 16TB with a block size of 4096. My next question would therefore be: has anyone ever tried a block size of 8192? Do all the filesystem repair/check tools still work? Is there anything that can possibly go wrong?
Martin
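The 16TB figure and the 8192 question can be checked with a little arithmetic. The sketch below assumes, as the thread suggests, that the limit comes from 32-bit block numbers, so a filesystem can address at most 2^32 blocks; the function name is made up for illustration. Note that it only computes addressable size: the mke2fs documentation states that the Linux kernel can only mount filesystems whose block size does not exceed the system page size, which is 4096 bytes on x86 systems, so the 8192 row should not be read as a workaround.

```python
TIB = 2 ** 40  # one tebibyte in bytes


def max_fs_bytes(block_size: int, block_number_bits: int = 32) -> int:
    """Largest filesystem addressable with the given block size,
    assuming block numbers are block_number_bits wide."""
    return (2 ** block_number_bits) * block_size


if __name__ == "__main__":
    for bs in (1024, 2048, 4096, 8192):
        print(f"{bs:>5}-byte blocks: {max_fs_bytes(bs) // TIB} TiB")
```

With 4096-byte blocks this gives 16 TiB, which matches the limit discussed above; a 27TB volume is beyond that whether TB is read as decimal or binary.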

We have been using JFS for nearly 10 years without any problems, although we don't use it on our file server (there we are using a GFS system). Personally I would prefer it over ext-X, XFS, and all others. It's stable and fast. It would be interesting to see how JFS responds when you repair a 30TB partition. By the way, does it make sense to keep this amount of data on a single server? Why don't you look at GFS or cluster file systems?
Bye,
Peer
Martin Jungowski wrote:
-- With kind regards
Peer-Joachim Koch
_________________________________________________________
Max-Planck-Institut fuer Biogeochemie
Dr. Peer-Joachim Koch
Hans-Knöll Str.10    Phone: ++49 3641 57-6705
D-07745 Jena         Fax:   ++49 3641 57-7705

By the way, does it make sense to use this amount of data on a single server? Why don't you look for GFS or Cluster file systems?
Clustering has been ruled out for several reasons, one of them being that it's not really necessary. We're dealing with huge amounts of data, but we don't need to have them distributed over the network or accessed simultaneously. The system in question is an external replicated backup storage. I'll take a look at JFS. Since all other filesystems on our servers are ReiserFS, and we haven't had a single problem since we switched to Reiser five years ago, we've split our 27TB array into two partitions of 15TB and 12TB. It's a temporary solution, but at least it works. We'd prefer to have one large 27TB partition but are not willing to take any chances.
-Martin

On Friday, 04 December 2009 07:51:22, Dr. Peer-Joachim Koch wrote:
By the way, does it make sense to use this amount of data on a single server? Why don't you look for GFS or Cluster file systems?
Is 30 TB still a lot? With contemporary 2TB disks, it's 15 disks plus redundancy (maybe a total of 20 or so). It fits easily in a 5U server.
--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/

On Fri, Dec 04, Martin Jungowski wrote:
Is there a bug open on it? That's the first time I have heard such a complaint in years. We have quite a few customers happily using XFS with filesystems far beyond your 27TB.
-Daniel

participants (6)
- Bernd Paysan
- Daniel Rahn
- Dr.Peer-Joachim Koch
- Martin Jungowski
- Matt Hayes
- Preston Hagar