Jan Kara changed bug 1168089
What       | Removed   | Added
Status     | CONFIRMED | RESOLVED
Resolution | ---       | WONTFIX

Comment # 4 on bug 1168089 from
OK, so I've tracked this down by "bisecting" operations inside reaim workfile
(which was a bit tedious but not too bad). The small workfile that still
reproduces the regression is:

20 disk_dio_rd
20 sync_disk_cp
20 sync_disk_update

You could probably still drop sync_disk_update from it but I didn't really try
since at this point I understood what's going on. The culprit is that
disk_dio_rd opens tmpa.common file with O_DIRECT and reads from it.
sync_disk_cp also opens tmpa.common file and reads from it but without
O_DIRECT. These files are different for different reaim processes (each process
runs in its dedicated dir) but each process alternates between these two
workloads, so the direct IO read usually runs on a file with existing page
cache. And iomap code (unlike legacy direct IO) does evict page cache even
during direct IO reads which then forces sync_disk_cp to reread the file from
the disk.
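
For illustration, the access pattern described above can be sketched roughly as
follows. This is not reaim's code: the helper names (make_file, buffered_read,
direct_read, alternate), the 4 KiB alignment and the 1 MiB read size are all
assumptions made for the example, and it only reproduces the alternation of
O_DIRECT and buffered reads on one file, it does not measure anything.

```python
import mmap
import os
import tempfile

BLOCK = 4096  # assumed alignment for O_DIRECT buffers, offsets and lengths
CHUNK = 1 << 20  # assumed read size, a multiple of BLOCK


def make_file(path, size=4 * BLOCK):
    """Create a test file and push it to disk (hypothetical helper)."""
    with open(path, "wb") as f:
        f.write(os.urandom(size))
        f.flush()
        os.fsync(f.fileno())


def buffered_read(path):
    """Read the whole file through the page cache, as sync_disk_cp would."""
    total = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(CHUNK)
            if not chunk:
                break
            total += len(chunk)
    return total


def direct_read(path):
    """Read the whole file with O_DIRECT, as disk_dio_rd would.

    Returns the byte count, or None when O_DIRECT is unavailable
    (non-Linux platform, or a filesystem that rejects it).
    """
    o_direct = getattr(os, "O_DIRECT", None)
    if o_direct is None:
        return None
    try:
        fd = os.open(path, os.O_RDONLY | o_direct)
    except OSError:
        return None
    buf = mmap.mmap(-1, CHUNK)  # anonymous mappings are page aligned
    total = 0
    try:
        while True:
            n = os.readv(fd, [buf])
            if n == 0:
                break
            total += n
    except OSError:
        return None
    finally:
        buf.close()
        os.close(fd)
    return total


def alternate(path, rounds=3):
    """Alternate direct and buffered reads on the same file."""
    results = []
    for _ in range(rounds):
        # With iomap DIO as described above, this read drops the cached pages...
        results.append(("dio", direct_read(path)))
        # ...so this buffered read has to go back to the disk.
        results.append(("buffered", buffered_read(path)))
    return results
```

Pointing direct_read and buffered_read at two different files corresponds to
the reaim change mentioned below that made the regression disappear.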

When I changed reaim not to share the same file for DIO and buffered test, the
regression went away. Since mixing of buffered & direct IO is not really
interesting wrt performance, I think there's no kernel bug to fix. But as a
side note there's now 54752de928c "iomap: Only invalidate page cache pages on
direct IO writes" sitting in linux-next which makes iomap DIO code behave the
same way as legacy DIO code in this regard so that will make the regression go
away as well.

I'm somewhat undecided whether we should modify reaim to not use the same file
for direct IO read tests and buffered IO read tests. I don't think the sharing
of the file is a particularly interesting use case but OTOH if all the accesses
are reads, it isn't completely insane either. So I'll probably leave it for
now.

