On Fri, Jun 03, 2016 at 11:26:22AM +0300, Andrei Borzenkov wrote:
On Fri, Jun 3, 2016 at 11:06 AM, Johannes Thumshirn email@example.com wrote:
OK, this card above connects the Serial ATA AHCI via PCIe to your CPU instead of using the SoC's or Chipset's AHCI.  is a good read for that.
Do you say that this card internally has SATA link between actual storage and host controller? Do you *know* it? How can it achieve 1170 MB/s (from data sheet) during sequential read in this case?
Do not be confused by reusing AHCI as host controller interface; it does not imply that internal data transfer must be SATA.
OK, I think we're mixing up the physical and protocol layers here. I actually don't care what the physical connection is internally. For the kernel it is SATA, with all its ups and downs.
Protocol- and kernel-driver-wise (which I think is what is of interest here), AHCI will be handled by ahci.ko, libahci.ko, scsi_mod.ko and sd.ko. This will give you the beloved /dev/sd* devices.  has a nice diagram of the principal operation.
Yes. And that is the exact reason for the existence of AHCI emulation in NVMe devices: they do not require additional drivers and are fully transparent. It does not mean that NVMe devices are using SATA *internally*.
Correct. For the kernel this is just a plain old libahci/libata device. Which on one hand gives you the warm and cosy feeling of /dev/sd*, the SCSI midlayer, and all the libata (and sub-library) goodness. But it also implies only one queue and a maximum queue depth of 32 (with NCQ). This might not be a problem at all given that the target platform is a laptop, but highly I/O-intensive parallel workloads (e.g. a kernel compile) benefit greatly from the true parallelism of an NVMe device. Take for example the device I posted as an M.2 NVMe example. It implements 8 hardware queues:
# nvme get-feature /dev/nvme0 -f 7
get-feature:0x07 (Number of Queues), Current value: 0x070007
# ls /sys/block/nvme0n1/mq/
0  1  2  3  4  5  6  7
These hardware queues then get mapped to per-CPU software queues:
# cat /sys/block/nvme0n1/mq/*/cpu_list
0, 1, 12, 13
2, 14
3, 4, 15, 16
5, 17
6, 7, 18, 19
8, 20
9, 10, 21, 22
11, 23
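For comparison, a small sketch (device names like sd* and nvme* depend entirely on your system; the sysfs paths are standard Linux, but which of them exist depends on the hardware present): the single-queue depth on the SATA side and the per-queue directories on the NVMe side can both be read from sysfs.

```shell
#!/bin/sh
# Sketch, not authoritative: whether these paths exist depends on the
# devices actually present in the machine.

# SATA/AHCI devices: a single queue, depth capped at 32 with NCQ.
for f in /sys/block/sd*/device/queue_depth; do
    if [ -e "$f" ]; then
        echo "$f: $(cat "$f")"
    fi
done

# NVMe devices: one directory per hardware queue under .../mq/.
for d in /sys/block/nvme*/mq; do
    if [ -d "$d" ]; then
        echo "$d: $(ls "$d" | wc -l) hardware queues"
    fi
done
```

On a machine with only SATA disks the second loop simply prints nothing, and vice versa.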
Also, no I/O scheduling of any kind is done on these devices.
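You can check this yourself with a quick sketch (device names are examples; what you see depends on your kernel and hardware): blk-mq devices such as NVMe namespaces typically report "none" as their scheduler, meaning requests go straight to the hardware queues.

```shell
#!/bin/sh
# Sketch: print the active I/O scheduler for each block device.
# The one in [brackets] is active; "none" means requests bypass
# scheduling entirely, as is typical for blk-mq NVMe devices.
for f in /sys/block/*/queue/scheduler; do
    if [ -e "$f" ]; then
        echo "$f: $(cat "$f")"
    fi
done
```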
So if Greg really wants to go down that road and buy one of these, I'd go for a true NVMe one.
Let me quote  Section 3, NVMe and AHCI Comparison: "While SATA Express/AHCI has the benefit of legacy software compatibility, the AHCI interface does not deliver optimal performance when talking to a PCIe SSD. This is because AHCI was developed at a time when the purpose of the HBA in a system was to connect the CPU/Memory subsystem with the much slower rotating media-based storage subsystem. Such an interface has some inherent inefficiency when applied to SSD devices, which behave much more like DRAM than spinning media. NVMe has been designed from the ground up to exploit the low latency of today's PCIe-based SSD's, and the parallelism of today's CPU's, platforms, and applications. [...]"
But enough of that, I didn't intend to start a flame war here.
I do not consider it flame war at all. Technology (both NVMe and M.2) is relatively new, there are many marketing buzzwords but not as much technical facts. So it is good if we all can get better understanding.
Good to hear. Marketing-buzzword-wise, these cards are called SATA Express, IIRC.