[opensuse-buildservice] support for http range requests
Hi, what about adding http range request support [1] to the backend? After I committed r8169 ("get only the last N bytes of a logfile") last week, darix suggested using the http "Range" header instead of the "start" and "end" query parameters. The advantage of the range header is that it is already part of the protocol and isn't limited to the "getlogfile()" function (we can use it whenever a file is sent in a reply). The attached patch adds initial range header support to the backend (it also supports multiple range requests [2]). Comments, objections etc. are welcome :) If there are no objections I'm going to commit it, and the next step will be to implement range support in BSServerEvents::reply_file (or rather in the corresponding "stream" functions). Marcus [1] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35.2 [2] http://www.w3.org/Protocols/rfc2616/rfc2616-sec19.html#sec19.2
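To make the proposal concrete, here is a minimal sketch of how a "Range" header value (including the multiple-range form) could be resolved into byte offsets. It is written in Python for illustration only; the function name `parse_byte_ranges` is hypothetical and this is not the attached patch, which targets the Perl backend.

```python
import re

def parse_byte_ranges(header, size):
    """Resolve a Range header value into inclusive (first, last) offsets.

    Returns None if the header is not a usable "bytes" range (a server
    would then typically send the whole file). Returns an empty list if
    no requested part can be satisfied for a file of `size` bytes.
    """
    unit, _, spec = header.partition("=")
    if unit.strip().lower() != "bytes":
        return None
    ranges = []
    for part in spec.split(","):
        m = re.fullmatch(r"\s*(\d*)-(\d*)\s*", part)
        if m is None:
            return None
        first, last = m.groups()
        if first == "" and last == "":
            return None
        if first == "":
            # Suffix form "-N": the last N bytes of the file.
            n = int(last)
            if n == 0:
                continue
            start, end = max(size - n, 0), size - 1
        else:
            start = int(first)
            if last != "" and int(last) < start:
                return None  # invalid: last byte before first byte
            end = size - 1 if last == "" else min(int(last), size - 1)
        if start >= size:
            continue  # this part lies beyond the end of the file
        ranges.append((start, end))
    return ranges
```

With this sketch, "bytes=-500" on a 1000-byte file resolves to the single range (500, 999), and "bytes=0-0,-1" resolves to the first and the last byte.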
On Wed, Oct 14, 2009 at 04:21:11PM +0200, Marcus Hüwe wrote:
what about adding http range request support [1] to the backend? After I committed r8169 ("get only the last N bytes of a logfile") last week, darix suggested using the http "Range" header instead of the "start" and "end" query parameters. The advantage of the range header is that it is already part of the protocol and isn't limited to the "getlogfile()" function (we can use it whenever a file is sent in a reply).
Dunno, I don't see so many use cases. For the log file, I'd rather have "give me the last 1000 lines" instead of bytes. Cheers, Michael. -- Michael Schroeder mls@suse.de SUSE LINUX Products GmbH, GF Markus Rex, HRB 16746 AG Nuernberg main(_){while(_=~getchar())putchar(~_-1/(~(_|32)/13*2-11)*13);} -- To unsubscribe, e-mail: opensuse-buildservice+unsubscribe@opensuse.org For additional commands, e-mail: opensuse-buildservice+help@opensuse.org
Am Mittwoch, 14. Oktober 2009 16:28:19 schrieb Michael Schroeder:
On Wed, Oct 14, 2009 at 04:21:11PM +0200, Marcus Hüwe wrote:
what about adding http range request support [1] to the backend? After I committed r8169 ("get only the last N bytes of a logfile") last week, darix suggested using the http "Range" header instead of the "start" and "end" query parameters. The advantage of the range header is that it is already part of the protocol and isn't limited to the "getlogfile()" function (we can use it whenever a file is sent in a reply).
Dunno, I don't see so many use cases. For the log file, I'd rather have "give me the last 1000 lines" instead of bytes.
Yes, me too. However, in general this is really great and important stuff. We should also use it for the logfile in the web client by default IMHO. Usually only the last ~100 lines are interesting, and I really hate downloading > 10 MB of logfile just to see the broken file list errors ;) bye adrian -- Adrian Schroeter SUSE Linux Products GmbH email: adrian@suse.de
On Wed, Oct 14, 2009 at 9:34 AM, Adrian Schröter <adrian@suse.de> wrote:
Am Mittwoch, 14. Oktober 2009 16:28:19 schrieb Michael Schroeder:
On Wed, Oct 14, 2009 at 04:21:11PM +0200, Marcus Hüwe wrote:
what about adding http range request support [1] to the backend? After I committed r8169 ("get only the last N bytes of a logfile") last week, darix suggested using the http "Range" header instead of the "start" and "end" query parameters. The advantage of the range header is that it is already part of the protocol and isn't limited to the "getlogfile()" function (we can use it whenever a file is sent in a reply).
Dunno, I don't see so many use cases. For the log file, I'd rather have "give me the last 1000 lines" instead of bytes.
Yes, me too.
However, in general this is really great and important stuff. We should also use it for the logfile in the web client by default IMHO. Usually only the last ~100 lines are interesting, and I really hate downloading > 10 MB of logfile just to see the broken file list errors ;)
From an http perspective there is no such thing as "lines" - only bytes. Using the range header instead of several parameters is, IMO, a more elegant solution and puts the request at the proper layer. I'd rather see Range headers used than yet-another range-like-thing-that-isn't-ranges. In this case, I see it as a correction. ;-)
-- Jon
On Wed, Oct 14, 2009 at 09:39:10AM -0500, Jon Nelson wrote:
bytes. Using the range header instead of several parameters is, IMO, a more elegant solution and puts the request at the proper layer. I'd rather see Range headers used than yet-another range-like-thing-that-isn't-ranges. In this case, I see it as a correction. ;-)
Yes, but I was talking about a user's point of view. I don't want bytes, I want lines. Cheers, Michael.
Michael Schroeder wrote:
On Wed, Oct 14, 2009 at 09:39:10AM -0500, Jon Nelson wrote:
bytes. Using the range header instead of several parameters is, IMO, a more elegant solution and puts the request at the proper layer. I'd rather see Range headers used than yet-another range-like-thing-that-isn't-ranges. In this case, I see it as a correction. ;-)
Yes, but I was talking about a user's point of view. I don't want bytes, I want lines.
Cheers, Michael.
Does the server have to exactly respect the range request or can it round to the line and in the return headers inform the client that it gave a slightly different range?
On Wed, Oct 14, 2009 at 9:44 AM, Luke Imhoff <luke@cray.com> wrote:
Michael Schroeder wrote:
On Wed, Oct 14, 2009 at 09:39:10AM -0500, Jon Nelson wrote:
bytes. Using the range header instead of several parameters is, IMO, a more elegant solution and puts the request at the proper layer. I'd rather see Range headers used than yet-another range-like-thing-that-isn't-ranges. In this case, I see it as a correction. ;-)
Yes, but I was talking about a user's point of view. I don't want bytes, I want lines.
Cheers, Michael.
Does the server have to exactly respect the range request or can it round to the line and in the return headers inform the client that it gave a slightly different range?
The http spec is pretty clear: the server *must* return the requested bytes as requested, in the order they were requested, or return an error. The client is responsible for taking the raw data and presenting it. This is a classic example of layering: http is about data, the next layer up is about presentation.

Let's take for example "I want the last 200 lines". If you were to want *exactly* 200 lines, that's not very easy to do. However, given some leniency, and a reasonable average for "longish" lines of, say, 80 characters (bytes if using ASCII), that's about 16000 bytes, scandalously close to 16KB. So, if I were implementing a client, I'd do this: request the last 16KB of the file (Range: bytes=-16384), and after getting the data scan for the first CRLF (or whatever), and start printing lines from that byte. Subsequent requests don't require any scanning.

If using http range requests, some of the log tailing can be farmed off entirely to the httpd instead of having to call out to *any* code, if the log file can be located in a predictable location. This is something that cannot be done right now because the interpretation of the parameters requires execution of code outside of the http context (probably ruby). Using the httpd for this is vastly less expensive. "Tailing" logs over http is also much easier using range requests, and potentially vastly less expensive if the httpd can service the request directly.

-- Jon
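The client-side approach described above could look roughly like the following Python sketch. The helper names `last_lines` and `fetch_log_tail` and any URL passed in are hypothetical; only the "Range: bytes=-16384" request and the "skip to the first line break" step come from the message. A follow-up request while tailing would use the open-ended form "bytes=<offset>-" with the number of bytes already seen.

```python
import urllib.request

def last_lines(chunk, count, starts_mid_file=True):
    """Return up to `count` complete lines from the end of `chunk` (bytes)."""
    if starts_mid_file:
        # The chunk probably begins in the middle of a line: drop
        # everything up to and including the first newline.
        _, sep, rest = chunk.partition(b"\n")
        chunk = rest if sep else b""
    lines = chunk.splitlines()
    return lines[-count:] if count > 0 else []

def fetch_log_tail(url, nbytes=16384, count=200):
    """Fetch roughly the last `nbytes` of `url` and return its last lines."""
    req = urllib.request.Request(url, headers={"Range": f"bytes=-{nbytes}"})
    with urllib.request.urlopen(req) as resp:
        body = resp.read()
        content_range = resp.headers.get("Content-Range", "")
        # 206 means the range was honoured. A plain 200 carries the whole
        # file, and a range starting at byte 0 also begins on a line
        # boundary, so nothing needs to be dropped in those cases.
        partial = resp.status == 206 and not content_range.startswith("bytes 0-")
    return last_lines(body, count, starts_mid_file=partial)
```

The line extraction is kept separate from the network call so it can be reused by any client (osc, a web UI) regardless of how the bytes were fetched.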
On Wed, Oct 14, 2009 at 9:43 AM, Michael Schroeder <mls@suse.de> wrote:
On Wed, Oct 14, 2009 at 09:39:10AM -0500, Jon Nelson wrote:
bytes. Using the range header instead of several parameters is, IMO, a more elegant solution and puts the request at the proper layer. I'd rather see Range headers used than yet-another range-like-thing-that-isn't-ranges. In this case, I see it as a correction. ;-)
Yes, but I was talking about a user's point of view. I don't want bytes, I want lines.
Oh, so do I! However, I see that as a layering issue, and the bytes-to-lines conversion should be taken care of by the client (osc, your-favorite-gui, etc.). The way I see it, this change fixes what I see as a layering violation, makes better use of http and doesn't change the fundamental semantics of the existing software (which uses bytes already...) -- Jon
On 2009-10-14 16:34:56 +0200, Adrian Schröter wrote:
Am Mittwoch, 14. Oktober 2009 16:28:19 schrieb Michael Schroeder:
On Wed, Oct 14, 2009 at 04:21:11PM +0200, Marcus Hüwe wrote:
what about adding http range request support [1] to the backend? After I committed r8169 ("get only the last N bytes of a logfile") last week, darix suggested using the http "Range" header instead of the "start" and "end" query parameters. The advantage of the range header is that it is already part of the protocol and isn't limited to the "getlogfile()" function (we can use it whenever a file is sent in a reply).
Dunno, I don't see so many use cases. For the log file, I'd rather have "give me the last 1000 lines" instead of bytes.
Yes, me too.
Yes, that sounds cool and especially more "user-friendly". The question is whether something like that should be implemented in the backend or in the client (which could use the "Range" header to find the last N lines). If the backend supported it, the client would just need to pass a "lines" query parameter with the request (=> no extra code for the webclient, osc etc.). Marcus
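For comparison, a backend-side "last N lines" lookup does not have to read the whole logfile. The sketch below (Python, for illustration; the function name `tail_lines` and the block size are assumptions, and nothing here is taken from the actual backend) reads fixed-size blocks backwards from the end of the file until enough line breaks have been seen.

```python
import os

def tail_lines(path, count, block_size=8192):
    """Return the last `count` lines of the file at `path` as bytes."""
    if count <= 0:
        return b""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        data = b""
        # Read backwards until the buffer holds more than `count`
        # newlines, so that the last `count` lines are complete even if
        # the first buffered line is cut off.
        while pos > 0 and data.count(b"\n") <= count:
            step = min(block_size, pos)
            pos -= step
            f.seek(pos)
            data = f.read(step) + data
    lines = data.splitlines(keepends=True)
    return b"".join(lines[-count:])
```

The cost then depends on the size of the requested tail rather than on the size of the whole logfile.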
Am Mittwoch, 14. Oktober 2009 17:57:43 schrieb Marcus Hüwe:
On 2009-10-14 16:34:56 +0200, Adrian Schröter wrote:
Am Mittwoch, 14. Oktober 2009 16:28:19 schrieb Michael Schroeder:
On Wed, Oct 14, 2009 at 04:21:11PM +0200, Marcus Hüwe wrote:
what about adding http range request support [1] to the backend? After I committed r8169 ("get only the last N bytes of a logfile") last week, darix suggested using the http "Range" header instead of the "start" and "end" query parameters. The advantage of the range header is that it is already part of the protocol and isn't limited to the "getlogfile()" function (we can use it whenever a file is sent in a reply).
Dunno, I don't see so many use cases. For the log file, I'd rather have "give me the last 1000 lines" instead of bytes.
Yes, me too.
Yes, that sounds cool and especially more "user-friendly". The question is whether something like that should be implemented in the backend or in the client (which could use the "Range" header to find the last N lines). If the backend supported it, the client would just need to pass a "lines" query parameter with the request (=> no extra code for the webclient, osc etc.).
Right, I would go for the backend (repserver) here as well. -- Adrian Schroeter
On Wed, Oct 14, 2009 at 04:21:11PM +0200, Marcus Hüwe wrote:
If there are no objections I'm going to commit it and the next step will be to implement range support in BSServerEvents::reply_file (or rather in the corresponding "stream" functions).
I think your change breaks the magic "split/join logfile stream request" code in BSWatcher.pm, which currently relies on advancing a "start" cgi parameter and doesn't look at http headers. Cheers, Michael.
On 2009-10-14 16:44:52 +0200, Michael Schroeder wrote:
On Wed, Oct 14, 2009 at 04:21:11PM +0200, Marcus Hüwe wrote:
If there are no objections I'm going to commit it and the next step will be to implement range support in BSServerEvents::reply_file (or rather in the corresponding "stream" functions).
I think your change breaks the magic "split/join logfile stream request" code in BSWatcher.pm, which currently relies on advancing a "start" cgi parameter and doesn't look at http headers.
Hmm, what exactly do you mean? Currently we pass the range header to the BSWatcher::rpc(..) call, which does some "magic", and finally the request hits the worker, which simply uses BSHTTP::reply_file(..) to fulfill the range request (in case a range header was specified). So the rpc magic should still work correctly (at least I hope so :) / I couldn't find a place where the current code might break the rpc stuff). Marcus
On Wed, Oct 14, 2009 at 05:48:17PM +0200, Marcus Hüwe wrote:
On 2009-10-14 16:44:52 +0200, Michael Schroeder wrote:
On Wed, Oct 14, 2009 at 04:21:11PM +0200, Marcus Hüwe wrote:
If there are no objections I'm going to commit it and the next step will be to implement range support in BSServerEvents::reply_file (or rather in the corresponding "stream" functions).
I think your change breaks the magic "split/join logfile stream request" code in BSWatcher.pm, which currently relies on advancing a "start" cgi parameter and doesn't look at http headers.
Hmm, what exactly do you mean? Currently we pass the range header to the BSWatcher::rpc(..) call, which does some "magic", and finally the request hits the worker, which simply uses BSHTTP::reply_file(..) to fulfill the range request (in case a range header was specified). So the rpc magic should still work correctly (at least I hope so :) / I couldn't find a place where the current code might break the rpc stuff).
See BSWatcher.pm's rpc_recv_forward_data_handler() function. The idea is that multiple requests for the same logfile get joined in the server, so the build client only needs to serve one request. This obviously only makes sense for requests that stream the logfile while the build is running. As a workaround, you could simply not join requests that carry a range header. Cheers, Michael.
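The "joinable" idea and the suggested workaround can be illustrated with a toy model. This is a hypothetical Python sketch, not the logic of BSWatcher.pm: the class `StreamJoiner` and its fields are invented for illustration. Streaming requests for the same logfile share one upstream request, while any request carrying a Range header is kept separate.

```python
from collections import defaultdict

class StreamJoiner:
    """Toy model: share one upstream fetch among identical stream requests."""

    def __init__(self):
        self.joined = defaultdict(list)  # logfile URL -> waiting client ids
        self.upstream_requests = 0       # requests that reach the build client

    def add(self, client_id, url, headers):
        # Workaround from the thread: a request with a Range header is
        # never joined; it always gets its own upstream request.
        if "range" in {name.lower() for name in headers}:
            self.upstream_requests += 1
            return "separate"
        first = url not in self.joined
        self.joined[url].append(client_id)
        if first:
            self.upstream_requests += 1
            return "new"
        return "joined"
```

In this model, two plain streaming requests for the same logfile cost one upstream request, and a third request with "Range: bytes=-100" costs one more.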
On 2009-10-15 10:50:53 +0200, Michael Schroeder wrote:
On Wed, Oct 14, 2009 at 05:48:17PM +0200, Marcus Hüwe wrote:
On 2009-10-14 16:44:52 +0200, Michael Schroeder wrote:
On Wed, Oct 14, 2009 at 04:21:11PM +0200, Marcus Hüwe wrote:
If there are no objections I'm going to commit it and the next step will be to implement range support in BSServerEvents::reply_file (or rather in the corresponding "stream" functions).
I think your change breaks the magic "split/join logfile stream request" code in BSWatcher.pm, which currently relies on advancing a "start" cgi parameter and doesn't look at http headers.
Hmm, what exactly do you mean? Currently we pass the range header to the BSWatcher::rpc(..) call, which does some "magic", and finally the request hits the worker, which simply uses BSHTTP::reply_file(..) to fulfill the range request (in case a range header was specified). So the rpc magic should still work correctly (at least I hope so :) / I couldn't find a place where the current code might break the rpc stuff).
See BSWatcher.pm's rpc_recv_forward_data_handler() function. The idea is that multiple requests for the same logfile get joined in the server, so the build client only needs to serve one request. This obviously only makes sense for requests that stream the logfile while the build is running. As a workaround, you could simply not join requests that carry a range header.
Ah yes, you're right. Thanks for the explanation! Now I finally understand this "joinable" concept :) Marcus
participants (5)
- Adrian Schröter
- Jon Nelson
- Luke Imhoff
- Marcus Hüwe
- Michael Schroeder