[opensuse] Idle Time Garbage Collection (ITGC) On (openSUSE) Linux
Hi, My next question relating to using SSDs on openSUSE is whether they enable or implement the so-called Idle Time Garbage Collection used by SandForce SSD chips (such as the SandForce 1222 chips on the RevoDrive x2 card I bought). Nothing I've found has made it clear whether this is something that requires OS / filesystem support, but the descriptions I've found (which have been very sketchy) suggest filesystem support is required. Randall Schulz -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org For additional commands, e-mail: opensuse+help@opensuse.org
On 12/06/11 00:57, Randall R Schulz wrote:
> Hi,
> My next question relating to using SSDs on openSUSE is whether they enable or implement the so-called Idle Time Garbage Collection used by SandForce SSD chips (such as the SandForce 1222 chips on the RevoDrive x2 card I bought).
afaics, "Idle Time Garbage Collection" is implemented in the controller.
> Nothing I've found has made it clear whether this is something that requires OS / filesystem support, but the descriptions I've found (which have been very sketchy) suggest filesystem support is required.
TRIM needs support in the filesystem. ext4 has support for it, using the "discard" mount option, which is apparently in its infancy.
Cheers.
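A minimal Python sketch of checking whether a filesystem is mounted with the "discard" option mentioned above. It only parses text in the /proc/mounts layout (device, mount point, type, options); the function name and the sample lines are made up for illustration and are not from the thread.

```python
def mounted_with_discard(mounts_text, mountpoint):
    """Look up `mountpoint` in /proc/mounts-style text.

    Returns True if it is mounted with the 'discard' option, False if it is
    mounted without it, and None if the mount point is not listed.
    """
    result = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        if fields[1] == mountpoint:
            # Later entries override earlier ones, as with stacked mounts.
            result = "discard" in fields[3].split(",")
    return result


# Hypothetical sample in the /proc/mounts layout.
sample = """\
/dev/sda2 / ext4 rw,relatime,discard 0 0
/dev/sdb1 /home ext4 rw,relatime 0 0
"""

print(mounted_with_discard(sample, "/"))      # True
print(mounted_with_discard(sample, "/home"))  # False
```

On a live system the same function could be fed the contents of /proc/mounts instead of the sample string.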
On Saturday June 11 2011, Cristian Rodríguez wrote:
> afaics, "Idle Time Garbage Collection" is implemented in the controller.
> TRIM needs support in the filesystem. ext4 has support for it, using the "discard" mount option, which is apparently in its infancy.
Thanks.
The only thing I can't make sense of is how the controller knows, without filesystem interaction, which sectors (or pages or blocks or whatever they call them on SSDs) are being used for file contents and which are not. One description I found said the controller "asks the operating system" if a block is in use.
TRIM is apparently not relevant because, since the card is a RAID device, the TRIM command cannot be passed through to the individual SSD controller chips. (I don't really understand that, either.)
These presumed facts are all I've been able to discover. (I did also find that ext4 supports TRIM.)
Randall Schulz
On Sunday June 12 2011, Randall R Schulz wrote:
> On Saturday June 11 2011, Cristian Rodríguez wrote:
> > afaics, "Idle Time Garbage Collection" is implemented in the controller.
Yes, that much I understood. I just couldn't see how it could "garbage collect" anything if the OS didn't tell it which logical sectors it was actually using.
> The only thing I can't make sense of is how the controller knows which sectors (or pages or blocks or whatever they call them on SSDs) are being used for file contents and which are not without filesystem interaction. One description I found said the controller "asks the operating system" if a block is in use.
After reading a rather contentious thread on www.ocztechnologyforum.com [1], I think I understand, at least in very rudimentary terms, how the ITGC thing works.
...
[1] <http://www.ocztechnologyforum.com/forum/showthread.php?60219-How-s-GC-gonna-work-for-RAID>
Randall Schulz
On Sun, Jun 12, 2011 at 12:45 PM, Randall R Schulz <rschulz@sonic.net> wrote:
> After reading a rather contentious thread on www.ocztechnologyforum.com [1] I think I understand, at least in very rudimentary terms, how the ITGC thing works.
> ...
> [1] <http://www.ocztechnologyforum.com/forum/showthread.php?60219-How-s-GC-gonna-work-for-RAID>
I just read it. Very uninformative. It says nothing about the core issue: data overwrites and how they leave partially valid EBs (erase blocks), and the fact that a "Garbage Collector" is needed even in the absence of file deletes.
Think about what one poster actually said. He ran IOmeter against a RAID device with SSDs behind it and his performance sucked. Then the next morning it was good again.
There is no way a single SSD which is part of a RAID can interpret the filesystem metadata and determine which pages are free.
And few if any RAID controllers pass down trim commands yet.
Especially if it's RAID 5 or 6, you have to trim in units of stripes, i.e. either an entire stripe is valid, or the entire thing is invalid. That's the way RAID 5 and 6 are designed. There is no way a single RAID component drive can make that determination.
OTOH, if there is lots of overwriting going on due to IOmeter writing to the same sectors over and over, then each overwrite leaves a partially used EB behind.
Letting the SSD consolidate all those partial EBs into full EBs overnight would cause the performance increase the user saw.
Greg
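To make the "trim in units of stripes" point concrete, here is a small Python sketch. It is only an illustration of the all-or-nothing idea, not how any real RAID controller handles TRIM; the function name and the stripe size of 384 sectors are hypothetical.

```python
def whole_stripes_in_range(start, length, stripe_size):
    """Return (first_stripe, count) for the stripes that lie entirely inside
    the discard range [start, start + length), all measured in sectors.

    Stripes that are only partly covered cannot be trimmed and are skipped.
    """
    end = start + length
    first = -(-start // stripe_size)  # round the start up to a stripe boundary
    last = end // stripe_size         # round the end down (exclusive)
    return first, max(0, last - first)


# Hypothetical array: 3 data disks x 128-sector chunks = 384 sectors per stripe.
print(whole_stripes_in_range(100, 1000, 384))  # (1, 1): only stripe 1 is fully covered
print(whole_stripes_in_range(10, 100, 384))    # (1, 0): no whole stripe, nothing to trim
```

A discard request that does not cover at least one complete stripe ends up trimming nothing under this model.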
All,
Since I don't recall this being discussed here before, I wrote a wiki article about it.
Please review and correct: http://en.opensuse.org/SDB:SSD_Idle_Time_Garbage_Collection_support
Thanks
Greg
--
Greg Freemyer
Head of EDD Tape Extraction and Processing team
Litigation Triage Solutions Specialist
http://www.linkedin.com/in/gregfreemyer
CNN/TruTV Aired Forensic Imaging Demo - http://insession.blogs.cnn.com/2010/03/23/how-computer-evidence-gets-retriev...
The Norcross Group
The Intersection of Evidence & Technology
http://www.norcrossgroup.com
On Sunday June 12 2011, Greg Freemyer wrote:
> Since I don't recall this being discussed here before, I wrote a wiki article about it.
> Please review and correct: http://en.opensuse.org/SDB:SSD_Idle_Time_Garbage_Collection_support
Thank you!
Randall Schulz
On Sun, Jun 12, 2011 at 1:10 AM, Cristian Rodríguez <crrodriguez@opensuse.org> wrote:
> afaics, "Idle Time Garbage Collection" is implemented in the controller.
> TRIM needs support in the filesystem. ext4 has support for it, using the "discard" mount option, which is apparently in its infancy.
Randall,
I'm not an SSD designer, etc., but I like to keep up and I have talked to a couple of "data recovery" people who have tried to reverse engineer how an SSD works.
The below is based on that process, and may not relate to reality at all, but I think it is pretty close.
===
Mounting with "-o discard" is implemented by interlacing trims with normal I/O. For current generation SSDs, this reduces performance instead of increasing it.
Windows 7 implements a background discard feature. Thus if you monitor the SATA bus, you will see trim commands coming even when there is no normal I/O. A far more efficient approach. (Apparently SSDs flush their cache when they get a trim command, so interlacing them is like adding tons of flush commands.)
There is a userspace tool in 11.4 (and newer): fstrim. It is part of the util-linux package, I believe. It is a simple userspace tool that invokes a kernel filesystem scan for free blocks and sends trim commands to the underlying SSD.
If you set fstrim up to run from cron every now and then, you have a good situation. It typically takes less than 30 seconds, so just calling it during boot might be good enough for a laptop that is booted every week or two.
As to garbage collection: I don't know what the SandForce does, but in general all modern SSDs need to have an Erase Block (EB) consolidator.
Modern SSDs have EBs which are typically 128KB, but they allow those EBs to hold non-contiguous pages of data. The key thing to understand is that EB re-use is all or nothing. To re-write any of the data, you have to erase the entire EB, then re-write it.
Very slow / cheap thumb drives live with that restriction, and write speeds can be as slow as MBs/minute. A decent SSD has much more complicated algorithms to make the drive much more responsive.
So let's assume an SSD tracks and maps data in 4KB pages, even though it has a 128KB EB. So a newly allocated and written EB would typically have 32 4KB pages in it.
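The "run fstrim from cron or at boot" idea above could be wrapped in a small Python script along these lines. This is a sketch under assumptions: fstrim and its -v (verbose) flag come from util-linux, but the helper names, the mount points and the dry-run default are made up here, and a real run needs root and a filesystem that supports trim.

```python
import shutil
import subprocess
from typing import List


def fstrim_command(mountpoint: str) -> List[str]:
    """Build the fstrim invocation for one mounted filesystem."""
    return ["fstrim", "-v", mountpoint]


def trim_filesystems(mountpoints, dry_run=True):
    """Trim each mount point; with dry_run=True only report the commands."""
    reports = []
    for mountpoint in mountpoints:
        cmd = fstrim_command(mountpoint)
        if dry_run:
            reports.append(" ".join(cmd))
            continue
        if shutil.which("fstrim") is None:
            raise RuntimeError("fstrim not found; it ships with util-linux")
        done = subprocess.run(cmd, check=True, capture_output=True, text=True)
        reports.append(done.stdout.strip())
    return reports


if __name__ == "__main__":
    # Hypothetical mount points; switch dry_run off only when running as root.
    for line in trim_filesystems(["/", "/home"], dry_run=True):
        print(line)
```

A cron entry or boot script would then simply call this script with dry_run disabled.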
If some of the pages are overwritten, the most efficient thing for the SSD to do is mark the pages in the allocated EB free, and queue the new pages to go to a newly allocated EB. When the SSD has 32 pages of new data queued, it will allocate a new EB and write out the data.
Seems pretty smart, but think about the original EB. Over time, as data is overwritten, it goes from 32 of 32 pages in use to fewer and fewer pages in use. At some point the SSD controller says: what a waste of space, let me consolidate some of these partially used EBs.
That consolidation activity is what I assume SandForce is calling garbage collection. And you can see it is needed even in the absence of OS-level support. And it does not need to understand the filesystem layout etc.
The trouble is that if your drive is 100% full or close to it, the EBs will tend to be fairly full as well, and so you can end up with all the EBs in use with 75% or more good data. Having a consolidator be efficient with such a high average in-use percentage is very difficult.
Thus the concept of trim was created a few years ago. As I understand it, the goal is for the SSD to have greater knowledge of which pages are really in use vs. not in use, and with that knowledge be able to more efficiently consolidate EBs for background erasing.
Thus, my opinion is that the SandForce Idle Time Garbage Collection is implemented in the SSD and does not need Linux support to work, but calling fstrim on occasion will let the SSD know better which data pages are free and thus allow the process to work better.
Hope that helps
Greg
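The consolidation argument above can be played through with a toy Python model. This is only a sketch of the reasoning in the post, using its assumed sizes (128KB EB, 4KB pages, so 32 pages per EB); it is not how SandForce or any other controller actually works, and the function name is made up.

```python
PAGES_PER_EB = 32  # 128 KB erase block / 4 KB page, as assumed in the post


def consolidate(valid_counts, pages_per_eb=PAGES_PER_EB):
    """Pack the valid pages of partially used EBs into as few EBs as possible.

    valid_counts holds the number of still-valid pages in each allocated EB.
    Returns (new_counts, freed_ebs, pages_copied).
    """
    full = [c for c in valid_counts if c == pages_per_eb]
    partial = [c for c in valid_counts if 0 < c < pages_per_eb]
    live = sum(partial)  # every valid page in a partial EB gets copied
    packed_full, remainder = divmod(live, pages_per_eb)
    new_counts = full + [pages_per_eb] * packed_full
    if remainder:
        new_counts.append(remainder)
    freed = len(valid_counts) - len(new_counts)
    return new_counts, freed, live


# Eight EBs that are 25% valid: cheap to consolidate.
_, freed, copied = consolidate([8] * 8)
print(freed, copied)   # 6 64

# Eight EBs that are 75% valid: three times the copying frees a third as many EBs.
_, freed, copied = consolidate([24] * 8)
print(freed, copied)   # 2 192
```

In this model a trim simply lowers the valid-page counts before consolidation runs, which is why telling the drive about free pages makes the same work reclaim more erase blocks.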
On Sunday June 12 2011, Greg Freemyer wrote:
> ...
> Randall,
> I'm not a SSD designer, etc., but I like to keep up and I have talked to a couple "data recovery" people that have tried to reverse engineer how a SSD works.
> The below is based on that process, and may not relate to reality at all, but I think it is pretty close.
> ...
> Hope that helps Greg
Thanks again.
I've kind of got two threads going in my mind on this topic. One is I'd like to have a decent understanding of the technology. Your Wiki and this post have helped a lot (actually, I've yet to read that most recent post, but I will).
The other thing I'd like is a high-level overview of which filesystems (and which options) work well (in what ways) on SSDs. I've found a couple of articles by the Ts'o guy about properly formatting (partitioning) an SSD for optimum performance. But the Phoronix benchmarks I found looked all over the map, and the associated discussion on their bulletin board raised all sorts of questions about which options to use and the consequences for speed and data safety.
So far, it's more a confusion to me than any kind of enlightenment. I hope that I can learn enough by Tuesday evening, when my SSD card arrives, to put it to reasonably good use.
C'est la vie. C'est la guerre...
Randall Schulz
On Sun, Jun 12, 2011 at 2:54 PM, Randall R Schulz <rschulz@sonic.net> wrote:
> The other thing I'd like is a high-level overview of which filesystems (and which options) work well (in what ways) on SSDs. I've found a couple of articles by the Ts'o guy about properly formatting (partitioning) and SSD for optimum performance. But the Phoronix benchmarks I found looked all over the map and the associated discussion on their bulletin board raised all sorts of questions about which options to use and the consequences for speed and data safety.
The biggest thing to get right is the partition alignment. I would ensure all partitions are on 1 MB boundaries. (A modern YaST Partitioner should do that automatically.)
Then read http://en.opensuse.org/SDB:SSD_discard_%28trim%29_support
The biggest thing from that is: with 11.4, use fstrim called from cron or a bootup script.
Greg
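The 1 MB alignment advice is easy to check by hand, and a short Python sketch shows the arithmetic. It assumes 512-byte logical sectors and takes the partition's start sector as input (for example the "Start" column printed by fdisk -l); the function names are made up for illustration.

```python
SECTOR_BYTES = 512   # assumed logical sector size
MIB = 1024 * 1024    # the 1 MB boundary recommended above


def is_aligned(start_sector, sector_bytes=SECTOR_BYTES, boundary=MIB):
    """True if the partition's first byte falls on a `boundary` multiple."""
    return (start_sector * sector_bytes) % boundary == 0


def next_aligned_sector(start_sector, sector_bytes=SECTOR_BYTES, boundary=MIB):
    """Smallest sector number >= start_sector that is boundary-aligned."""
    sectors_per_boundary = boundary // sector_bytes
    return -(-start_sector // sectors_per_boundary) * sectors_per_boundary


print(is_aligned(2048))         # True: 2048 * 512 bytes is exactly 1 MiB
print(is_aligned(63))           # False: the old DOS-style start sector
print(next_aligned_sector(63))  # 2048
```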
Participants (3): Cristian Rodríguez, Greg Freemyer, Randall R Schulz