[opensuse] SSD with disabled TRIM capability?
I have two SSD in a machine running a fully updated openSUSE 13.1.

For the first disk, I can verify that TRIM is supported by the disk's firmware, and that the kernel knows it has this capability:

blackbox:~ # hdparm -I /dev/sda | grep TRIM
   * Data Set Management TRIM supported (limit 8 blocks)
   * Deterministic read ZEROs after TRIM
blackbox:~ # lsblk -D /dev/sda
NAME   DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
sda           0      512B       2G         1
├─sda1        0      512B       2G         1
├─sda2        0      512B       2G         1
├─sda3        0      512B       2G         1
└─sda4        0      512B       2G         1

However, for the second disc, things look different:

blackbox:~ # hdparm -I /dev/sde | grep TRIM
   * Data Set Management TRIM supported (limit 8 blocks)
   * Deterministic read data after TRIM
blackbox:~ # lsblk -D /dev/sde
NAME         DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
sde                 0      512B       2G         0
├─vgsys-kvm         0      512B       2G         0
└─vgsys-xfer        0      512B       2G         0

Why, or on what basis, does the kernel decide that the DISC-ZERO capability should be disabled for sde?

What difference does it make when lsblk reports a 0 value for DISC-ZERO?

As is evident from the output above, sde is a physical device in the LVM volume group vgsys. I have set "issue_discards = 1" in the devices section of /etc/lvm/lvm.conf (but that doesn't seem to make a difference to lsblk).

Suppose the ext4 file system in vgsys-xfer is mounted and fstrim is run on it, will that result in TRIM ATA commands being sent to the SSD?

Regards,
Olav

--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse+owner@opensuse.org
On Fri, Jul 11, 2014 at 2:40 PM, Olav Reinert <seroton10@gmail.com> wrote:
I have two SSD in a machine running a fully updated openSUSE 13.1.
For the first disk, I can verify that TRIM is supported by the disk's firmware, and that the kernel knows it has this capability:
blackbox:~ # hdparm -I /dev/sda | grep TRIM
   * Data Set Management TRIM supported (limit 8 blocks)
   * Deterministic read ZEROs after TRIM
blackbox:~ # lsblk -D /dev/sda
NAME   DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
sda           0      512B       2G         1
├─sda1        0      512B       2G         1
├─sda2        0      512B       2G         1
├─sda3        0      512B       2G         1
└─sda4        0      512B       2G         1
However, for the second disc, things look different:
blackbox:~ # hdparm -I /dev/sde | grep TRIM
   * Data Set Management TRIM supported (limit 8 blocks)
   * Deterministic read data after TRIM
blackbox:~ # lsblk -D /dev/sde
NAME         DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
sde                 0      512B       2G         0
├─vgsys-kvm         0      512B       2G         0
└─vgsys-xfer        0      512B       2G         0
Why, or on what basis, does the kernel decide that the DISC-ZERO capability should be disabled for sde?
What difference does it make when lsblk reports a 0 value for DISC-ZERO ?
As is evident from the output above, sde is a physical device in the LVM volume group vgsys. I have set "issue_discards = 1" in the devices section of /etc/lvm/lvm.conf (but that doesn't seem to make a difference to lsblk).
Suppose the ext4 file system in vgsys-xfer is mounted and fstrim is run on it, will that result in TRIM ATA commands being sent to the SSD?
Both of your drives report that they support TRIM, so yes: if you have ext4 set to make discard calls, then on both drives deleting files should cause TRIM commands to be sent to the drive.

Also, I think the lsblk output above is right. The DISC-ZERO flag, as I understand it, is reported as true only if "data read from a discarded sector is guaranteed to be zero."

If you look at the hdparm output for the first drive, it says "Deterministic read ZEROs after TRIM". That drive therefore satisfies the functional requirement of DISC-ZERO, and lsblk reports that.

If you look at the hdparm output for the second drive, it says "Deterministic read data after TRIM". The drive itself is not claiming that it guarantees to return zeros when discarded sectors are read.

For clarity, think about this typical timeline:

- sector contains valid data
- sector discard initiated via a TRIM command
- a period of time later, garbage collection runs
- after garbage collection the sector is not even allocated, i.e. no physical NAND structure is associated with the logical sector
- a sector read command is sent to the drive
- the drive says: I've got absolutely nothing for you, so I'll send back a bunch of zeros

Likely both of your drives would behave per that timeline. But now let's move the sector read in front of the garbage collection (which may not happen for days, months, or years for any given discarded sector). Now you have:

- sector contains valid data
- sector discard initiated via a TRIM command
- a period of time goes by, but the sector has not been garbage collected
- sector is still physically allocated
- a sector read command is sent to the drive
- the drive says: I have a choice, I can either send the old data because I still have it, or I can return zeros

Your first drive says that for that last read command it will always return zeros.

Your second drive says that for that last read command it "may" return something other than zeros, but if it does, it will keep track of what it returned and deterministically return the same data on future reads.

The ATA spec gives the drive a way to report which of these two behaviors it follows. Both hdparm and lsblk simply interpret what the drive reports and tell you what the drive says its behavior is.

Greg
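To make the distinction above concrete, here is a minimal sketch that reads `hdparm -I` output on stdin and prints which of the two behaviors the drive advertises. The function name classify_trim and its output labels are invented for this example; only the matched strings are taken from the hdparm output quoted in this thread.

```shell
#!/bin/sh
# Classify the read-after-TRIM behaviour a drive advertises, based on the
# text printed by `hdparm -I` (read from stdin).
# classify_trim and its labels are hypothetical names for this sketch.
classify_trim() {
    awk '
        /Data Set Management TRIM supported/  { trim = 1 }
        /Deterministic read ZEROs after TRIM/ { kind = "zeros" }
        /Deterministic read data after TRIM/  { if (kind == "") kind = "deterministic" }
        END {
            if (!trim)           print "no-trim"        # drive does not advertise TRIM
            else if (kind == "") print "unspecified"    # TRIM, but no read-back promise
            else                 print kind
        }'
}

# Real use (needs root):
#   hdparm -I /dev/sda | classify_trim
```

On the two drives above, sda should come out as "zeros" and sde as "deterministic", which lines up with the 1 and 0 that lsblk shows in the DISC-ZERO column.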
On Friday 11 July 2014 15.17:46 Greg Freemyer wrote:
Your first drive says that for that last read command it will return zeros always.
Your second drive says that for that last read command it "may" return something other than zeros, but if it does it will keep track of what it returned and always deterministically return for future reads.
The ATA spec provides the drive a way to report which of the above 2 behaviors it follows. Both hdparm and lsblk are simply interpreting what the drive is saying and letting you know what the drive is saying is its behavior.
I did in fact overlook the slight difference in wording of the second TRIM-related message produced by hdparm. Thank you for your explanation - it makes sense now that lsblk reports the way it does.

Is there a way of checking whether TRIM requests are actually sent to the drive? For example, is there a counter somewhere I can look at, similar to the counters for blocks read and written?

Olav
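For what it is worth, kernels much newer than the one shipped with 13.1 do expose such a counter: since Linux 4.18 the per-device stat file (/sys/block/<dev>/stat) carries four extra discard fields after the eleven read/write ones. The sketch below assumes that layout; the function name discard_stats is made up here, and on an older kernel it simply reports that the fields are missing.

```shell
#!/bin/sh
# Print discard counters from a /sys/block/<dev>/stat line read on stdin.
# Assumes the Linux >= 4.18 layout: field 12 = discard I/Os completed,
# field 14 = sectors discarded. discard_stats is a hypothetical name.
discard_stats() {
    awk '{
        if (NF >= 15)
            printf "discard_ios=%s discard_sectors=%s\n", $12, $14
        else
            print "no-discard-counters"   # older kernel: only 11 fields
    }'
}

# Real use:
#   discard_stats < /sys/block/sde/stat
```

Comparing the numbers before and after an fstrim run would show whether discards reached the block device, though not what the drive firmware did with them.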
On July 13, 2014 7:05:48 AM EDT, Olav Reinert <seroton10@gmail.com> wrote:
I did in fact overlook the slight difference in wording of the second TRIM-related message produced by hdparm. Thank you for your explanation - it makes sense now that lsblk reports the way it does.
Is there a way of checking whether TRIM-requests are actually sent to the drive? For example, is there a counter somewhere I can look at, similar to the counters for blocks read and written?
If there is, I don't know it.

FYI: most SSDs treat TRIM as a synchronous command and flush the cache when it is processed. That has a large negative performance effect if TRIM is issued interspersed with normal I/O patterns. You can use fstrim called via cron to trim the filesystem during idle periods.

Some newer drives support asynchronous TRIM, but I don't know how to tell which drives support it.

Greg
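A minimal sketch of the cron approach, assuming the script is dropped into something like /etc/cron.weekly/ and that the mount points listed are the ones you want trimmed. The file location, the function name trim_mounts, the FSTRIM override and the example mount points are all placeholders, not anything from this thread.

```shell
#!/bin/sh
# Hypothetical cron script: run fstrim on a fixed list of mount points.
# FSTRIM can be overridden (e.g. FSTRIM=echo) for a dry run.
FSTRIM=${FSTRIM:-fstrim}

trim_mounts() {
    status=0
    for mnt in "$@"; do
        # -v makes fstrim report how many bytes it discarded
        "$FSTRIM" -v "$mnt" || status=1
    done
    # Non-zero if any mount point failed, so cron can mail the error
    return "$status"
}

# Example call; replace the placeholders with real mount points:
#   trim_mounts / /srv/xfer
```

Listing mount points explicitly keeps the script independent of the util-linux version; it also makes it easy to leave out filesystems that do not sit on an SSD.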
participants (2)
- Greg Freemyer
- Olav Reinert