[opensuse-programming] line-oriented TCP protocol?
TCP is of course stream oriented. It does not have a concept of lines. Serial ports, of course, can be configured as stream, line, character, whatever. I have a function that I want to make work on either a serial or a TCP port. I want this to be line oriented. Of course, I know how to do this reading a character at a time and taking appropriate action. On the other hand: On the serial port I can arrange it so reads are satisfied when there is a whole line. That is, each read returns only when a whole line is available. My question: Is there some obscure TCP ioctl() that would let me get the same for TCP streams? I have looked, and I do not see anything in the standard Linux networking docs. I don't really expect that there is such a mode, but I thought I would ask around a bit before doing the character-at-a-time thing. TIA for any pointers. -- Roger Oberholtzer OPQ Systems / Ramböll RST Ramböll Sverige AB Krukmakargatan 21 P.O. Box 17009 SE-104 62 Stockholm, Sweden Office: Int +46 10-615 60 20 Mobile: Int +46 70-815 1696 -- To unsubscribe, e-mail: opensuse-programming+unsubscribe@opensuse.org For additional commands, e-mail: opensuse-programming+help@opensuse.org
First of all, TCP is not stream oriented. It is a protocol that requires a connection with a remote host. The IP protocol is a windowing protocol (in contrast to an old ack/nak protocol). Data sent via IP is transparent, so a '\n' has no special meaning. There is really no concept of a line in Linux WRT communications. Essentially a line is just some data with a '\n'. I suggest you consult a good book on Unix/Linux network programming, such as /UNIX Network Programming/, by the late W. Richard Stevens. This was at one time the best set of books. I think that this can answer all your questions. In the past I wrote a satellite communications system using either TCP or UDP based on lines. The issue is not so much the IP, TCP or UDP protocols, but how the end-user application handles the data. On 09/30/2010 09:57 AM, Roger Oberholtzer wrote:
TCP is of course stream oriented. It does not have a concept of lines. Serial ports, of course, can be configured as stream, line, character, whatever.
I have a function that I want to make work on either a serial or a TCP port. I want this to be line oriented. Of course, I know how to do this reading a character at a time and taking appropriate action.
On the other hand:
On the serial port I can arrange it so reads are satisfied when there is a whole line. That is, each read returns only when a whole line is available.
My question: Is there some obscure TCP ioctl() that would let me get the same for TCP streams? I have looked, and I do not see anything in the standard Linux networking docs. I don't really expect that there is such a mode, but I thought I would ask around a bit before doing the character-at-a-time thing.
TIA for any pointers.
-- Jerry Feldman <gaf@blu.org> Boston Linux and Unix PGP key id: 537C5846 PGP Key fingerprint: 3D1B 8377 A3C0 A5F2 ECBB CA3B 4607 4319 537C 5846
On Thu, 2010-09-30 at 11:41 -0400, Jerry Feldman wrote:
First of all, TCP is not stream oriented.
Yes, it is. See RFC793. It provides what is in essence a stream and is referred to as such in the specification; a bidirectional stream between two sockets.
It is a protocol that requires a connection with a remote host. The IP protocol is a windowing protocol (in contrast to an old ack/nak protocol).
TCP layers on top of IP to create a reliable streaming protocol. "reliable" being a relative term, of course.
I suggest you consult a good book on Unix/Linux network programming, such as /UNIX Network Programming/, by the late W. Richard Stevens. This was at one time the best set of books.
+1 But, sadly, it doesn't cover SCTP.
On Thu, 2010-09-30 at 11:41 -0400, Jerry Feldman wrote:
First of all, TCP is not stream oriented. It is a protocol that requires a connection with a remote host. The IP protocol is a windowing protocol (in contrast to an old ack/nak protocol). Data sent via IP is transparent as the '\n' has no meaning.
TCP is very much a stream. In fact, I would call that its defining characteristic. This is in contrast to UDP, which is packet based (a write on one side corresponds directly to a read on the other).
There is really no concept of a line in Linux WRT communications. Essentially a line is just some data with a '\n'. I suggest you consult a good book on Unix/Linux network programming, such as /UNIX Network Programming/, by the late W. Richard Stevens. This was at one time the best set of books. I think that this can answer all your questions. In the past I wrote a satellite communications system using either TCP or UDP based on lines. The issue is not so much the IP, TCP or UDP protocols, but how the end-user application handles the data.
Indeed. Lines are imposed by the receiver for whatever nefarious purposes they have. But, some device drivers let you define a termination character so that a read() call is limited to the data between these delimiters, and only satisfied when a delimiter arrives. The serial port driver is such a driver. Unix SVR4.2 has a concept of stream drivers. These were modules that you could stack to process data (e.g. TCP) in a pipeline. So, you could push a 'line locating' stream module on your TCP stream and it would ensure that reads were satisfied according to whatever logic was in the module. These modules were loaded in a program and pushed on the TCP socket file descriptor. It is a powerful concept. I was just curious if there was anything similar in Linux. And I do have the Stevens books. They are indeed excellent. We used them when implementing a network stack in a small embedded device that we make. What I am curious about would be something outside the scope of these books. Currently it is in the end-user app, as you stated. But I am trying to unify the implementation for RS-232 line-orientated communications (which is what I want), with the TCP code that looks for lines in the stream. I thought that before refactoring the code, I would check my options. -- Roger Oberholtzer
On 10/01/2010 02:50 AM, Roger Oberholtzer wrote:
On Thu, 2010-09-30 at 11:41 -0400, Jerry Feldman wrote:
First of all, TCP is not stream oriented. It is a protocol that requires a connection with a remote host. The IP protocol is a windowing protocol (in contrast to an old ack/nak protocol). Data sent via IP is transparent as the '\n' has no meaning.
TCP is very much a stream. In fact, I would call that its defining characteristic. This is in contrast to UDP, which is packet based (a write on one side corresponds directly to a read on the other).
There is really no concept of a line in Linux WRT communications. Essentially a line is just some data with a '\n'. I suggest you consult a good book on Unix/Linux network programming, such as /UNIX Network Programming/, by the late W. Richard Stevens. This was at one time the best set of books. I think that this can answer all your questions. In the past I wrote a satellite communications system using either TCP or UDP based on lines. The issue is not so much the IP, TCP or UDP protocols, but how the end-user application handles the data.
Indeed. Lines are imposed by the receiver for whatever nefarious purposes they have. But, some device drivers let you define a termination character so that a read() call is limited to the data between these delimiters, and only satisfied when a delimiter arrives. The serial port driver is such a driver.
Unix SVR4.2 has a concept of stream drivers. These were modules that you could stack to process data (e.g. TCP) in a pipeline. So, you could push a 'line locating' stream module on your TCP stream and it would ensure that reads were satisfied according to whatever logic was in the module. These modules were loaded in a program and pushed on the TCP socket file descriptor. It is a powerful concept.
I was just curious if there was anything similar in Linux.
And I do have the Stevens books. They are indeed excellent. We used them when implementing a network stack in a small embedded device that we make. What I am curious about would be something outside the scope of these books.
Currently it is in the end-user app, as you stated. But I am trying to unify the implementation for RS-232 line-orientated communications (which is what I want), with the TCP code that looks for lines in the stream. I thought that before refactoring the code, I would check my options.
The entire IP suite is packet driven, TCP, UDP, ICMP. The difference between UDP and TCP is that TCP sets up a connection between the two ends with a guarantee that the entire message (1 or more packets) will be delivered. UDP on the other hand is a datagram where packets are sent to the destination without any checking. A lot of network management protocols, such as SNMP, use UDP and provide end-to-end delivery. That way if a remote host goes down, the application can manage the time outs rather than the packet protocol. What TCP does is to send several packets out up to a window size (usually 7). Each packet is numbered. If the sender detects an error by not receiving the appropriate responses, it may resend packets that fail to arrive. Packets can arrive at the destination out of order, and it is the receiver's responsibility to assemble the packets in the correct order before presenting the data to the application. -- Jerry Feldman
On Fri, 2010-10-01 at 14:48 -0400, Jerry Feldman wrote:
The entire IP suite is packet driven, TCP, UDP, ICMP. The difference between UDP and TCP is that TCP sets up a connection between the two ends with a guarantee that the entire message (1 or more packets) will be delivered. UDP on the other hand is a datagram where packets are sent to the destination without any checking. A lot of network management protocols, such as SNMP, use UDP and provide end-to-end delivery. That way if a remote host goes down, the application can manage the time outs rather than the packet protocol. What TCP does is to send several packets out up to a window size (usually 7). Each packet is numbered. If the sender detects an error by not receiving the appropriate responses, it may resend packets that fail to arrive. Packets can arrive at the destination out of order, and it is the receiver's responsibility to assemble the packets in the correct order before presenting the data to the application.
Obviously all things that arrive over the Ethernet interface are in discrete packets. I am referring to what the Berkeley socket interface exposes via read/write and recvfrom/sendto. At that level, TCP is a stream of values that bear no correspondence to the packets in which they were delivered over the Ethernet interface. Still, what I was interested in would have to be a layer on top of TCP. As Per reminded me, fdopen may provide that layer. I will next determine at what cost I can get this 'service'. Thanks all for the answers. -- Roger Oberholtzer
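The point about the socket interface exposing a byte stream with no write boundaries can be seen in a few lines. This is a sketch only: it uses an AF_UNIX SOCK_STREAM socketpair as a local stand-in for a TCP connection (an assumption made so the example needs no network), and the function name coalesced_read is hypothetical.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send two "lines" with two separate write() calls, then read once.
 * Returns the byte count delivered by that single read(), or -1 on
 * error. On a stream socket the count is not tied to either write. */
ssize_t coalesced_read(char *buf, size_t len)
{
    int sv[2];
    ssize_t n = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    if (write(sv[0], "one\n", 4) == 4 && write(sv[0], "two\n", 4) == 4)
        n = read(sv[1], buf, len);   /* may span both writes */

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

With a large enough buffer, the single read() here normally returns both lines at once. Over a real TCP connection it could just as well return part of a line, which is why the receiver has to locate the '\n' itself.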
On 10/04/2010 04:39 AM, Roger Oberholtzer wrote:
On Fri, 2010-10-01 at 14:48 -0400, Jerry Feldman wrote:
The entire IP suite is packet driven, TCP, UDP, ICMP. The difference between UDP and TCP is that TCP sets up a connection between the two ends with a guarantee that the entire message (1 or more packets) will be delivered. UDP on the other hand is a datagram where packets are sent to the destination without any checking. A lot of network management protocols, such as SNMP, use UDP and provide end-to-end delivery. That way if a remote host goes down, the application can manage the time outs rather than the packet protocol. What TCP does is to send several packets out up to a window size (usually 7). Each packet is numbered. If the sender detects an error by not receiving the appropriate responses, it may resend packets that fail to arrive. Packets can arrive at the destination out of order, and it is the receiver's responsibility to assemble the packets in the correct order before presenting the data to the application.
Obviously all things that arrive over the Ethernet interface are in discrete packets. I am referring to what the Berkeley socket interface exposes via read/write and recvfrom/sendto. At that level, TCP is a stream of values that bear no correspondence to the packets in which they were delivered over the Ethernet interface. Still, what I was interested in would have to be a layer on top of TCP. As Per reminded me, fdopen may provide that layer. I will next determine at what cost I can get this 'service'.
Yes. The application can do a lot of things, including changing the buffer size. The issue might be terminology. The IP protocol suite has nothing to do with Ethernet directly, although there are things like ARP packets. The Internet Protocol is a packet-oriented protocol. TCP is an end-to-end connection protocol, whereas UDP is connectionless. IP will work over Ethernet (the most ubiquitous), Token Ring, and other hardware protocols. The Berkeley Socket Interface is the set of application-level calls (e.g. system calls and library functions). IP is tunneled through Ethernet on a LAN, but Ethernet is restricted to a physical subnet. When you send an IP packet, whether TCP or UDP, it tunnels through Ethernet to the gateway where the gateway may tunnel through more Ethernet gateways or through other WAN protocols. -- Jerry Feldman
[Sorry for jumping in late but I just returned from holiday in beautiful Provence] On Fri, 01 Oct 2010 08:50:37 +0200, Roger Oberholtzer <roger@opq.se> wrote:
Unix SVR4.2 has a concept of stream drivers. These were modules that you could stack to process data (e.g. TCP) in a pipeline.
And that was utter nonsense! Very nice in theory but a nightmare in reality. Streams were invented for serial lines and there the concept of stackable modules made sense. But extending that idea to TCP/IP was a grave mistake. Guess why it was never really implemented in Linux? Philipp
On Mon, 2010-10-11 at 00:11 +0200, Philipp Thomas wrote:
[Sorry for jumping in late but I just returned from holiday in beautiful Provence]
On Fri, 01 Oct 2010 08:50:37 +0200, Roger Oberholtzer <roger@opq.se> wrote:
Unix SVR4.2 has a concept of stream drivers. These were modules that you could stack to process data (e.g. TCP) in a pipeline.
And that was utter nonsense! Very nice in theory but a nightmare in reality. Streams were invented for serial lines and there the concept of stackable modules made sense. But extending that idea to TCP/IP was a grave mistake. Guess why it was never really implemented in Linux?
I never used it. I just saw a connection between streams and what I was/am looking for. I do confess to be confused about why you considered it to be good for serial lines and not TCP/IP. Granted TCP/IP offers more variations than a serial line, and that could make implementation of stream modules more complicated. But that does not mean the streams concept was wrong. As to why it is not in Linux, well, I guess it never really caught on like wild fire on SVR4 (including Solaris at the time, which I think also had streams - does Solaris have streams today?). I think the reason was primarily that it was not a portable concept more than it being a bad one. An application relying on streams would not port to other flavors of Unix. If the core streams implementation had somehow been portable, I venture a guess that it may have been more popular. It moves lots of file handling from the kernel into user space via a well-defined interface. -- Roger Oberholtzer
On Mon, 11 Oct 2010 08:09:01 +0200, Roger Oberholtzer <roger@opq.se> wrote:
I do confess to be confused about why you considered it to be good for serial lines and not TCP/IP. Granted TCP/IP offers more variations than a serial line, and that could make implementation of stream modules more complicated. But that does not mean the streams concept was wrong.
First of all documentation: 1) http://cm.bell-labs.com/cm/cs/who/dmr/st.html is the original paper from Dennis Ritchie describing streams. 2) The document describing the Linux implementation: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.55.1500&rep=rep1&ty... SVR4 streams suck extremely performance wise. Those performance penalties were acceptable for the rather slow serial lines but weren't for networks.
As to why it is not in Linux, well, I guess it never really caught on like wild fire on SVR4 (including Solaris at the time, which I think also had streams - does Solaris have streams today?). I think the reason was primarily that it was not a portable concept more than it being a bad one.
No, the reason was performance. Here's a quote from a post from Dave Miller (taken from http://cryptnet.net/mirrors/texts/kissedagirl.html):
1) Solaris's networking stack, in all of it's incantations (one breed of it was the Lochman code in 2.0, 2.1 and early 2.2 releases, then it was rewritten by another company for 2.3 onward) is SVR4 streams based. The performance penalty, even with lots of tricks, for using a SVR4 streams networking architecture is well known. Someone who happens to have a 2.2 Solaris CD around, or even a 2.3 Solaris CD, should install that thing and run lmbench on it to see what "pure Streams based networking" without the tricks can really do.
Linux on the other hand has a "no bullshit" networking architecture that is not streams based. Yet we also take advantage of the many known networking performance enhancements that exist in the research realm (ie. copy/checksum, the van jacobson hacks, etc.)
portable, I venture a guess that it may have been more popular. It moves lots of file handling from the kernel into user space via a well-defined interface.
And it sucks rocks through straws performance wise. The idea was rather neat, the implementation in svr4 streams was not. That's also why even commercial unix vendors moved to sockets based implementation. Philipp
On Mon, 2010-10-11 at 23:17 +0200, Philipp Thomas wrote:
On Mon, 11 Oct 2010 08:09:01 +0200, Roger Oberholtzer <roger@opq.se> wrote:
I do confess to be confused about why you considered it to be good for serial lines and not TCP/IP. Granted TCP/IP offers more variations than a serial line, and that could make implementation of stream modules more complicated. But that does not mean the streams concept was wrong.
First of all documentation:
1) http://cm.bell-labs.com/cm/cs/who/dmr/st.html is the original paper from Dennis Ritchie describing streams.
2) The document describing the Linux implementation: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.55.1500&rep=rep1&ty...
SVR4 streams suck extremely performance wise. Those performance penalties were acceptable for the rather slow serial lines but weren't for networks.
I can see that bad performance on a network would be more noticeable than on a serial port. But I would not conclude that streams are generally bad. To be clear, I was not extolling the virtues of streams over sockets. I saw an analogy to a feature I was interested in. Performance is always a concern for me, as I do near-real-time programming (whatever that is...) so streams (as implemented in SVR4 - not as a concept) would probably not fit the bill. Still, this has been an interesting discussion. -- Roger Oberholtzer
Roger Oberholtzer wrote:
TCP is of course stream oriented. It does not have a concept of lines. Serial ports, of course, can be configured as stream, line, character, whatever.
I have a function that I want to make work on either a serial or a TCP port. I want this to be line oriented. Of course, I know how to do this reading a character at a time and taking appropriate action.
On the other hand:
On the serial port I can arrange it so reads are satisfied when there is a whole line. That is, each read returns only when a whole line is available.
My question: Is there some obscure TCP ioctl() that would let me get the same for TCP streams? I have looked, and I do not see anything in the standard Linux networking docs.
fdopen() ? I'm pretty certain I have done that in the past, appropriately wrapped in select() or poll(). /Per Jessen, Zürich
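As an illustration of the select()/poll() wrapping mentioned here, a small helper might look like the sketch below. The name wait_readable and its return convention are assumptions for the example, not something taken from Per's code.

```c
#include <poll.h>

/* Wait up to timeout_ms milliseconds for fd to become readable.
 * Returns 1 if data (or a hang-up/EOF) is pending, 0 on timeout,
 * -1 on error, including an error condition reported for fd. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc = poll(&pfd, 1, timeout_ms);

    if (rc <= 0)
        return rc;                       /* 0 = timeout, -1 = poll failed */
    return (pfd.revents & (POLLIN | POLLHUP)) ? 1 : -1;
}
```

One thing to watch when combining this with a stdio stream from fdopen(): stdio may already hold further complete lines in its own buffer, and poll() on the descriptor knows nothing about those. It is usually safer to drain the buffered lines before waiting again.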
On Thu, 2010-09-30 at 20:49 +0200, Per Jessen wrote:
fdopen() ? I'm pretty certain I have done that in the past, appropriately wrapped in select() or poll().
Too obvious :) As in, why didn't I think of that... I wonder if it does a read of one character at a time (lots of system calls!) or if it reads blocks and fiddles with buffers. I guess I will try and see what strace tells me. Thx. -- Roger Oberholtzer
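For comparison, the "reads blocks and fiddles with buffers" approach can also be done by hand in the application. The sketch below is one possible way, under the assumption that lines end in '\n' and fit in a 4096-byte buffer; struct linebuf and read_line are hypothetical names, not an existing API.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

struct linebuf {
    char   data[4096];
    size_t used;          /* bytes currently held in data[] */
};

/* Store the next '\n'-terminated line in out, without the newline.
 * Blocks until a whole line is available. Returns 1 when a line was
 * stored, 0 on end of file (a trailing partial line is dropped),
 * -1 on a read error or when a line does not fit in the buffers. */
int read_line(struct linebuf *lb, int fd, char *out, size_t outlen)
{
    for (;;) {
        char *nl = memchr(lb->data, '\n', lb->used);
        if (nl != NULL) {
            size_t len = (size_t)(nl - lb->data);
            if (len + 1 > outlen)
                return -1;
            memcpy(out, lb->data, len);
            out[len] = '\0';
            lb->used -= len + 1;                 /* drop line + newline */
            memmove(lb->data, nl + 1, lb->used); /* keep the remainder  */
            return 1;
        }
        if (lb->used == sizeof lb->data)
            return -1;                           /* line too long */

        /* One system call fetches whatever is available, not one byte. */
        ssize_t n = read(fd, lb->data + lb->used, sizeof lb->data - lb->used);
        if (n < 0)
            return -1;
        if (n == 0)
            return 0;
        lb->used += (size_t)n;
    }
}
```

The same function works on a serial descriptor and on a TCP socket, since it only relies on read(). The cost is one memmove per line, not one system call per character.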
Roger Oberholtzer wrote:
On Thu, 2010-09-30 at 20:49 +0200, Per Jessen wrote:
fdopen() ? I'm pretty certain I have done that in the past, appropriately wrapped in select() or poll().
Too obvious :) As in, why didn't I think of that...
I wonder if it does a read of one character at a time (lots of system calls!) or if it reads blocks and fiddles with buffers. I guess I will try and see what strace tells me.
I've checked some of my code, and I've used fdopen() for things like line-based protocols (SMTP). AFAICT, there is not much difference between buffered reads on a file and a socket. The one thing is that the socket can get disconnected, but that should get represented appropriately. /Per Jessen, Zürich
participants (5): Adam Tauno Williams, Jerry Feldman, Per Jessen, Philipp Thomas, Roger Oberholtzer