Mailinglist Archive: opensuse (686 mails)

Re: [opensuse] SSD and smartctl: Percentage Used Endurance Indicator: 34 ?
On Sun, Jul 7, 2013 at 1:42 PM, Carlos E. R.
<robin.listas@xxxxxxxxxxxxxx> wrote:

On Sunday, 2013-07-07 at 19:26 +0200, MarkusGMX wrote:

On 07/07/13 04:00, Rajko wrote:


(rev 1) == 7 0x008 1 34~ Percentage Used Endurance


Hmm, 33.5GB per day cannot be possible.
The whole /var has only ~18GB which would mean writing the complete /var
approx. twice per day, which surely does not happen.

However... you can not write, say, 100 bytes on flash media, you have to
write the entire 32KiB block. That is, read it, modify the 100 bytes you
need (in memory) and then write again the 32 KiB block. That makes for a
much larger figure.


I don't know where your 32 KiB block came from. It seems too big to be
a page and too small to be an erase block.

From a physical perspective you have to write an entire erase block
(EB) at a time. EBs these days are often 2 MiB or even bigger, and even
3 or 4 years ago they were 128 KiB or bigger.

In 10-year-old flash designs, that meant that to modify anything of EB
size or smaller you did a read/modify/write (RMW) cycle of an entire
EB. I suspect that is what you are describing.
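
To put a rough number on that old RMW behaviour, here is a small
Python sketch. It assumes the naive design described above, where any
change rewrites every erase block it touches, and it assumes the write
is aligned to the start of an EB. The function name and the sizes are
only illustrative, not taken from any real drive.

```python
KIB = 1024
MIB = 1024 * KIB


def rmw_write_amplification(payload_bytes, erase_block_bytes):
    """Bytes physically written per byte the host asked to write.

    Assumes the naive design: every erase block touched by the write is
    read, modified in memory and written back in full. The write is
    assumed to start on an erase-block boundary.
    """
    if payload_bytes <= 0:
        raise ValueError("payload_bytes must be positive")
    # Ceiling division: how many erase blocks the payload spans.
    blocks_touched = -(-payload_bytes // erase_block_bytes)
    return blocks_touched * erase_block_bytes / payload_bytes


for eb_size in (128 * KIB, 2 * MIB):
    factor = rmw_write_amplification(100, eb_size)
    print(f"100-byte write, {eb_size // KIB} KiB EB: {factor:.0f}x")
```

That is the worst case for tiny writes; the mapping layer described
next exists precisely to avoid it.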

You need to give the SSD devs some credit: they realized that was a
stupid design a long time ago.

From day 1, anything worthy of the name SSD put a mapping layer above
the flash storage to allow smart data tracking algorithms to avoid the
RMW cycle as much as possible. What they do is track the EB's data at
a page level. Let's say a page is 4KB. (The Linux kernel uses a 4KB
page most of the time and so does Windows, so it is the most logical
thing for an SSD designer to do as well.)

Now the _firmware_ in the SSD controller requires writes to be a full
page at a time. If you write a full page, then all the firmware does
is put the 4KB in a small cache and invalidate the single page of data
in the EB. Note that is very fast and no data write to the flash has
even taken place yet.

The firmware accumulates data in the cache until there is a full EB's
worth, then it grabs an empty EB and writes out the contents of the
cache to it. The firmware then updates the page mapping so it knows
where to go find that page when a read request comes in.
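
A toy model of that cache-and-map scheme, in Python. This is only a
sketch of the idea as I described it, not how any particular
controller's firmware works: the class name ToyFTL, the 4-page erase
block and the data structures are all made up for illustration, and
garbage collection is not modelled, so it simply runs out of empty
EBs eventually.

```python
class ToyFTL:
    """Toy page-mapping layer: cache full pages, flush one EB at a time."""

    def __init__(self, num_ebs, pages_per_eb=4):
        self.pages_per_eb = pages_per_eb
        self.flash = [None] * num_ebs      # contents of each erase block
        self.free_ebs = list(range(num_ebs))
        self.mapping = {}                  # logical page -> (eb, slot)
        self.cache = {}                    # logical page -> pending data
        self.eb_writes = 0                 # physical EB writes so far

    def write_page(self, lpn, data):
        # Invalidate the old copy (if any) by forgetting where it lives.
        # Nothing is written to flash at this point.
        self.mapping.pop(lpn, None)
        self.cache[lpn] = data
        if len(self.cache) == self.pages_per_eb:
            self._flush()

    def _flush(self):
        # Grab an empty EB and write the whole cache into it in one go.
        eb = self.free_ebs.pop(0)
        self.flash[eb] = list(self.cache.items())
        for slot, (lpn, _data) in enumerate(self.flash[eb]):
            self.mapping[lpn] = (eb, slot)
        self.cache.clear()
        self.eb_writes += 1

    def read_page(self, lpn):
        # Pending data in the cache is newer than anything on flash.
        if lpn in self.cache:
            return self.cache[lpn]
        eb, slot = self.mapping[lpn]
        return self.flash[eb][slot][1]


ftl = ToyFTL(num_ebs=4)
for lpn in range(4):
    ftl.write_page(lpn, f"v1-{lpn}")
ftl.write_page(2, "v2-2")  # overwrite: no flash write, no RMW
print(ftl.eb_writes, ftl.read_page(2))
```

Overwriting page 2 only drops the old mapping entry and parks the new
data in the cache; the EB holding the stale copy is never rewritten.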

Notice that there are basically no extraneous EB writes. Every time an
EB is written to, it is with a brand new set of data, no RMW cycles at
all. If you write out less than a page worth of data, then a RMW
cycle has to take place, but the Linux kernel actually does that
anyway, i.e. the Linux block layer works with pages. If you write out
data that is smaller than a page in size, then it will implement a RMW
cycle of its own just to keep the data handling in the kernel easy.
Thus the SSD only sees page-size reads and writes with normal
filesystem I/O.
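
The page-level RMW done on the host side looks roughly like this. It
is a user-space Python illustration of the idea only, not kernel code:
the "device" is just a bytearray, and write_partial is a hypothetical
helper name.

```python
PAGE_SIZE = 4096


def write_partial(device, offset, payload):
    """Write payload at offset, touching the device only in whole pages.

    device is a bytearray whose length is a multiple of PAGE_SIZE.
    Returns the list of page numbers that were written back.
    """
    if not payload:
        return []
    end = offset + len(payload)
    first_page = offset // PAGE_SIZE
    last_page = (end - 1) // PAGE_SIZE
    written = []
    for page_no in range(first_page, last_page + 1):
        start = page_no * PAGE_SIZE
        # Read: fetch the whole page.
        page = bytearray(device[start:start + PAGE_SIZE])
        # Modify: patch only the bytes that fall inside this page.
        lo = max(offset, start)
        hi = min(end, start + PAGE_SIZE)
        page[lo - start:hi - start] = payload[lo - offset:hi - offset]
        # Write: the device only ever sees a full page.
        device[start:start + PAGE_SIZE] = page
        written.append(page_no)
    return written


disk = bytearray(3 * PAGE_SIZE)
print(write_partial(disk, 5000, b"x" * 100))
```

A 100-byte write at offset 5000 ends up as one full 4KB page write, so
from the SSD's point of view nothing smaller than a page ever arrives.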

The end result is that EBs are now able to grow as big as makes sense
to the chip designers and the SSD controller implements the page
management mapping layer that keeps it all efficient.

So that catches you up to 5+ year old designs. The SSD devs said: I
can be smarter than that, and I've got a little microprocessor handling
the mapping anyway, so why don't I start adding compression,
de-duplication, consolidation of partially used EBs (garbage
collection), etc.
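
As an illustration of the consolidation step only, here is a
deliberately simplified Python sketch. It assumes each EB is a list of
pages where None marks an invalidated page, and it repacks everything
at once; a real controller would pick victim EBs selectively and in
the background. The function name consolidate is hypothetical.

```python
def consolidate(erase_blocks, pages_per_eb):
    """Pack the still-valid pages of partially used EBs into fewer EBs.

    erase_blocks is a list of lists; each inner list holds page data,
    with None for pages that have been invalidated.
    Returns (packed_blocks, number_of_ebs_freed).
    """
    # Collect every page that is still valid, in order.
    live = [page for eb in erase_blocks for page in eb if page is not None]
    # Repack them densely, one full EB at a time.
    packed = [live[i:i + pages_per_eb]
              for i in range(0, len(live), pages_per_eb)]
    # Pad the last EB with empty slots so every EB has the same shape.
    if packed and len(packed[-1]) < pages_per_eb:
        packed[-1] = packed[-1] + [None] * (pages_per_eb - len(packed[-1]))
    return packed, len(erase_blocks) - len(packed)


blocks = [["a", None, None, "b"],
          [None, "c", None, None],
          ["d", None, "e", None]]
packed, freed = consolidate(blocks, pages_per_eb=4)
print(packed, freed)
```

The freed EBs can then be erased and handed back to the pool of empty
EBs that new writes draw from.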

The reality is that modern SSDs have all of that and more. Given the
complexity of the mapping algorithm, one of the hardest things to do
is "forensically wipe" an SSD. I tell my clients that in general it
can't be done and that they should use physical destruction instead.

(For those about to propose an ATA Security Erase: that is implemented
in the firmware, and several SSDs have had faulty implementations that
left the data in place. Basically it can't be trusted as a general
