Arjan van de Ven writes:
On Fri, 2004-01-02 at 02:30, Peter Osterlund wrote:
The packet writing code has the restriction that a bio must not span a packet boundary. (A packet is 32*2048 bytes.) If the page when mapped to disk starts 2kb before a packet boundary, merge_bvec_fn therefore returns 2048, which is less than len, which is 4096 if the whole page is mapped, so the bio_add_page() call fails.
devicemapper has similar restrictions for raid0 format; in that case it's device-mapper's job to split the page/bio. Just as it is UDF's task to do the same I suspect...
Old versions of the packet writing code did just that, but Jens told me that bio splitting was evil, so when the merge_bvec_fn functionality was added to the kernel, I started to use it.
http://lists.suse.com/archive/packet-writing/2002-Aug/0044.html
If merge_bvec_fn is not supposed to be able to handle the need of the packet writing code, I can certainly resurrect my bio splitting code.
Btw, for some reason, this bug is not triggered when using the UDF filesystem on a CDRW. I've only seen it with the ext2 filesystem.
--
Peter Osterlund - petero2@telia.com
http://w1.894.telia.com/~u89404340
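The boundary arithmetic described above can be checked with a small userspace sketch. It is not driver code: it assumes 512-byte sectors and packets aligned to sector 0, and the helper names bytes_to_packet_end() and fits_in_packet() are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE    512u                          /* assumed block-layer sector size */
#define PACKET_BYTES   (32u * 2048u)                 /* packet size quoted in the thread */
#define PACKET_SECTORS (PACKET_BYTES / SECTOR_SIZE)  /* 128 sectors */

/* Bytes left between a starting sector and the end of its packet.
 * Assumes packets are aligned to sector 0 (hypothetical simplification). */
static uint32_t bytes_to_packet_end(uint64_t start_sector)
{
    uint64_t offset = start_sector % PACKET_SECTORS; /* sectors into the packet */
    return (uint32_t)((PACKET_SECTORS - offset) * SECTOR_SIZE);
}

/* Nonzero if a write of len bytes at start_sector stays inside one packet. */
static int fits_in_packet(uint64_t start_sector, uint32_t len)
{
    assert(len > 0);
    return len <= bytes_to_packet_end(start_sector);
}
```

For a page that starts 2kb (4 sectors) before a boundary, for example at sector 124, bytes_to_packet_end() gives 2048, so a 4096-byte page does not fit. That matches the case where merge_bvec_fn returns 2048 and bio_add_page() fails.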
On Fri, Jan 02 2004, Peter Osterlund wrote:
Arjan van de Ven writes:
On Fri, 2004-01-02 at 02:30, Peter Osterlund wrote:
The packet writing code has the restriction that a bio must not span a packet boundary. (A packet is 32*2048 bytes.) If the page when mapped to disk starts 2kb before a packet boundary, merge_bvec_fn therefore returns 2048, which is less than len, which is 4096 if the whole page is mapped, so the bio_add_page() call fails.
devicemapper has similar restrictions for raid0 format; in that case it's device-mapper's job to split the page/bio. Just as it is UDF's task to do the same I suspect...
Old versions of the packet writing code did just that, but Jens told me that bio splitting was evil, so when the merge_bvec_fn functionality was added to the kernel, I started to use it.
http://lists.suse.com/archive/packet-writing/2002-Aug/0044.html
Splitting is evil, but unfortunately it's a necessary evil... There are a few kernel helpers to make supporting it easier (see bio_split()). Not so sure how well that'll work for you, you may have to do the grunt work yourself.
If merge_bvec_fn is not supposed to be able to handle the need of the packet writing code, I can certainly resurrect my bio splitting code.
Only partially. Read my email: you _must_ accept a page addition to an empty bio. You can refuse others. For the single page case, you may need to split.
Btw, for some reason, this bug is not triggered when using the UDF filesystem on a CDRW. I've only seen it with the ext2 filesystem.
Does UDF use mpage? The fact that it doesn't trigger on UDF doesn't mean that packet writing isn't breaking the API :)
--
Jens Axboe
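The rule stated in this message (always accept a page into an empty bio, refuse later additions that would cross a packet boundary) can be sketched as a toy model. This is not the kernel's merge_bvec_fn signature: struct toy_bio and toy_merge_limit() are hypothetical, and 512-byte sectors with packets aligned to sector 0 are assumed.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE  512u            /* assumed sector size */
#define PACKET_BYTES (32u * 2048u)   /* packet size quoted in the thread */

/* Hypothetical stand-in for the bio fields the decision needs. */
struct toy_bio {
    uint64_t start_sector;  /* first sector covered by the bio */
    uint32_t size;          /* bytes already added to the bio */
};

/* How many bytes of a len-byte segment may be appended to bio.
 * A caller would treat any result smaller than len as a refusal. */
static uint32_t toy_merge_limit(const struct toy_bio *bio, uint32_t len)
{
    uint64_t used;
    uint32_t room;

    /* Empty bio: the first page must always be accepted, even if it
     * crosses a packet boundary; the driver splits it later. */
    if (bio->size == 0)
        return len;

    /* Bytes from the start of the packet to the current end of the bio. */
    used = (bio->start_sector * SECTOR_SIZE) % PACKET_BYTES + bio->size;
    if (used >= PACKET_BYTES)
        return 0;           /* bio already reaches or crosses the boundary */

    room = (uint32_t)(PACKET_BYTES - used);
    return room < len ? room : len;
}
```

With an empty bio at sector 124 the full 4096 bytes are allowed. A bio that already holds one page starting at sector 116 has only 2048 bytes of room, so a second full page would be refused.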
Jens Axboe writes:
On Fri, Jan 02 2004, Peter Osterlund wrote:
Arjan van de Ven writes:
On Fri, 2004-01-02 at 02:30, Peter Osterlund wrote:
The packet writing code has the restriction that a bio must not span a packet boundary. (A packet is 32*2048 bytes.) If the page when mapped to disk starts 2kb before a packet boundary, merge_bvec_fn therefore returns 2048, which is less than len, which is 4096 if the whole page is mapped, so the bio_add_page() call fails.
devicemapper has similar restrictions for raid0 format; in that case it's device-mapper's job to split the page/bio. Just as it is UDF's task to do the same I suspect...
Old versions of the packet writing code did just that, but Jens told me that bio splitting was evil, so when the merge_bvec_fn functionality was added to the kernel, I started to use it.
http://lists.suse.com/archive/packet-writing/2002-Aug/0044.html
Splitting is evil, but unfortunately it's a necessary evil... There are a few kernel helpers to make supporting it easier (see bio_split()). Not so sure how well that'll work for you, you may have to do the grunt work yourself.
OK, I'll fix the packet writing code.
If merge_bvec_fn is not supposed to be able to handle the need of the packet writing code, I can certainly resurrect my bio splitting code.
Only partially. Read my email: you _must_ accept a page addition to an empty bio. You can refuse others. For the single page case, you may need to split.
Btw, for some reason, this bug is not triggered when using the UDF filesystem on a CDRW. I've only seen it with the ext2 filesystem.
Does UDF use mpage? The fact that it doesn't trigger on UDF doesn't mean that packet writing isn't breaking the API :)
Agreed, this was not meant to excuse the packet writing code, just some additional trivia.
Btw, is this API documented somewhere, or does it have to be reverse engineered by means of understanding implementation details in mpage_writepage() and similar functions? ;) May I suggest this patch?
--- linux/drivers/block/ll_rw_blk.c.old	2004-01-02 12:56:55.000000000 +0100
+++ linux/drivers/block/ll_rw_blk.c	2004-01-02 13:07:25.000000000 +0100
@@ -173,9 +173,11 @@
  * are dynamic, and thus we have to query the queue whether it is ok to
  * add a new bio_vec to a bio at a given offset or not. If the block device
  * has such limitations, it needs to register a merge_bvec_fn to control
- * the size of bio's sent to it. Per default now merge_bvec_fn is defined for
- * a queue, and only the fixed limits are honored.
- *
+ * the size of bio's sent to it. Note that a block device *must* allow a
+ * single page to be added to an empty bio. The block device driver may want
+ * to use the bio_split() function to deal with these bio's. Per default
+ * no merge_bvec_fn is defined for a queue, and only the fixed limits are
+ * honored.
  */
 void blk_queue_merge_bvec(request_queue_t *q, merge_bvec_fn *mbfn)
 {
--
Peter Osterlund - petero2@telia.com
http://w1.894.telia.com/~u89404340
On Fri, Jan 02 2004, Peter Osterlund wrote:
Does UDF use mpage? The fact that it doesn't trigger on UDF doesn't mean that packet writing isn't breaking the API :)
Agreed, this was not meant to excuse the packet writing code, just some additional trivia.
Btw, is this API documented somewhere, or does it have to be reverse engineered by means of understanding implementation details in mpage_writepage() and similar functions? ;) May I suggest this patch?
--- linux/drivers/block/ll_rw_blk.c.old	2004-01-02 12:56:55.000000000 +0100
+++ linux/drivers/block/ll_rw_blk.c	2004-01-02 13:07:25.000000000 +0100
@@ -173,9 +173,11 @@
  * are dynamic, and thus we have to query the queue whether it is ok to
  * add a new bio_vec to a bio at a given offset or not. If the block device
  * has such limitations, it needs to register a merge_bvec_fn to control
- * the size of bio's sent to it. Per default now merge_bvec_fn is defined for
- * a queue, and only the fixed limits are honored.
- *
+ * the size of bio's sent to it. Note that a block device *must* allow a
+ * single page to be added to an empty bio. The block device driver may want
+ * to use the bio_split() function to deal with these bio's. Per default
+ * no merge_bvec_fn is defined for a queue, and only the fixed limits are
+ * honored.
  */
 void blk_queue_merge_bvec(request_queue_t *q, merge_bvec_fn *mbfn)
 {
I just looked but could not find anything about it, there's been some talk on this list. But it doesn't look like it ever got documented in text writing. That needs to be fixed for sure, thanks for the patch. It probably wants documenting in fs/bio.c:bio_add_page() too.
--
Jens Axboe
Jens Axboe writes:
On Fri, Jan 02 2004, Peter Osterlund wrote:
Old versions of the packet writing code did just that, but Jens told me that bio splitting was evil, so when the merge_bvec_fn functionality was added to the kernel, I started to use it.
http://lists.suse.com/archive/packet-writing/2002-Aug/0044.html
Splitting is evil, but unfortunately it's a necessary evil... There are a few kernel helpers to make supporting it easier (see bio_split()). Not so sure how well that'll work for you, you may have to do the grunt work yourself.
It seems like bio_split() does exactly what I need. The patch below makes 2kb blocksize ext2 work and also fixes the performance problem compared to 4kb blocksize ext2. Thanks to everyone involved for their help.
--- linux/drivers/block/pktcdvd.c.old	2004-01-02 16:58:52.000000000 +0100
+++ linux/drivers/block/pktcdvd.c	2004-01-02 16:59:26.000000000 +0100
@@ -2083,11 +2083,23 @@
 		(unsigned long long)bio->bi_sector,
 		(unsigned long long)(bio->bi_sector + bio_sectors(bio)));
 
-	/* Some debug code to make sure the merge_bvec_fn function is working. */
+	/* Check if we have to split the bio */
 	{
+		struct bio_pair *bp;
 		sector_t last_zone;
+		int first_sectors;
+
 		last_zone = ZONE(bio->bi_sector + bio_sectors(bio) - 1, pd);
-		BUG_ON(last_zone != zone);
+		if (last_zone != zone) {
+			BUG_ON(last_zone != zone + pd->settings.size);
+			first_sectors = last_zone - bio->bi_sector;
+			bp = bio_split(bio, bio_split_pool, first_sectors);
+			BUG_ON(!bp);
+			pkt_make_request(q, &bp->bio1);
+			pkt_make_request(q, &bp->bio2);
+			bio_pair_release(bp);
+			return 0;
+		}
 	}
 
 	/*
@@ -2153,6 +2165,15 @@
 	sector_t zone = ZONE(bio->bi_sector, pd);
 	int used = ((bio->bi_sector - zone) << 9) + bio->bi_size;
 	int remaining = (pd->settings.size << 9) - used;
+	int remaining2;
+
+	/*
+	 * A bio <= PAGE_SIZE must be allowed. If it crosses a packet
+	 * boundary, pkt_make_request() will split the bio.
+	 */
+	remaining2 = PAGE_SIZE - bio->bi_size;
+	remaining = max_t(int, remaining, remaining2);
+	BUG_ON(remaining < 0);
 	return remaining;
 }
--
Peter Osterlund - petero2@telia.com
http://w1.894.telia.com/~u89404340
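The split-point computation in the first hunk can be tried outside the kernel with a rough sketch. It mirrors only the arithmetic (first_sectors = last_zone - bi_sector), not bio_split() itself. struct toy_range, packet_start() and split_at_packet() are hypothetical names, and a 128-sector packet aligned to sector 0 is assumed.

```c
#include <assert.h>
#include <stdint.h>

#define PACKET_SECTORS 128u   /* 32 * 2048 bytes in 512-byte sectors (assumed) */

/* Hypothetical stand-in for a bio: a run of sectors. */
struct toy_range {
    uint64_t sector;      /* first sector */
    uint32_t nr_sectors;  /* length in sectors */
};

/* First sector of the packet containing sector (power-of-two packet size). */
static uint64_t packet_start(uint64_t sector)
{
    return sector & ~(uint64_t)(PACKET_SECTORS - 1);
}

/* Split r at the packet boundary it crosses, if any.
 * Returns the number of pieces written to out (1 or 2). */
static int split_at_packet(struct toy_range r, struct toy_range out[2])
{
    uint64_t last_zone;
    uint32_t first_sectors;

    /* At most one packet long, so at most one boundary can be crossed. */
    assert(r.nr_sectors > 0 && r.nr_sectors <= PACKET_SECTORS);

    last_zone = packet_start(r.sector + r.nr_sectors - 1);
    if (last_zone == packet_start(r.sector)) {
        out[0] = r;                      /* no boundary crossed */
        return 1;
    }

    /* Same arithmetic as the patch: sectors before the boundary. */
    first_sectors = (uint32_t)(last_zone - r.sector);
    out[0] = (struct toy_range){ r.sector, first_sectors };
    out[1] = (struct toy_range){ last_zone, r.nr_sectors - first_sectors };
    return 2;
}
```

An 8-sector (one page) range starting at sector 124 is split into sectors 124-127 and 128-131, each piece inside a single packet.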