[Bug 469222] New: formail does not concatenate folder headers correctly
https://bugzilla.novell.com/show_bug.cgi?id=469222

           Summary: formail does not concatenate folder headers correctly
    Classification: openSUSE
           Product: openSUSE 11.1
           Version: Final
          Platform: i686
        OS/Version: Other
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Other
        AssignedTo: bnc-team-screening@forge.provo.novell.com
        ReportedBy: per.jessen@enidan.com
         QAContact: qa@suse.de
          Found By: ---

Created an attachment (id=267494)
 --> (https://bugzilla.novell.com/attachment.cgi?id=267494)
an excerpt from a current email as written by postfix.

User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-GB; rv:1.8.1.6) Gecko/20070730 SUSE/2.0.0.6-25 Firefox/2.0.0.6

formail appears to omit whitespace when removing the newlines of a folded header. A formail compiled from the current source does not have this problem.

http://www.procmail.org/procmail-3.22.tar.gz

Reproducible: Always

Steps to Reproduce:
1. Run formail -c <testemail | grep ^Received | head -1
(testemail - see attached)

Actual Results:
Received: from dingbat.example.com (dingbat.example.com [21.21.21.21])by srv1.example.com (Postfix) with ESMTP id 1709C4D0D3for <catchthismail@jessen.ch>; Tue, 20 Jan 2009 10:12:48 +0100 (CET)

Notice: there is no tab/whitespace before 'by' and 'for'.

Expected Results:
Received: from dingbat.example.com (dingbat.example.com [21.21.21.21]) by srv1.example.com (Postfix) with ESMTP id 1709C4D0D3 for <catchthismail@jessen.ch>; Tue, 20 Jan 2009 10:12:48 +0100 (CET)

Notice: there is now tab/whitespace before 'by' and 'for'. This is with new formail compiled from source.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=469222

Marcus Meissner <meissner@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|bnc-team-screening@forge.pr |werner@novell.com
                   |ovo.novell.com              |
https://bugzilla.novell.com/show_bug.cgi?id=469222

User werner@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=469222#c1

Dr. Werner Fink <werner@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
                 CC|                            |odabrunz@novell.com
      Info Provider|                            |odabrunz@novell.com

--- Comment #1 from Dr. Werner Fink <werner@novell.com> 2009-01-26 03:29:23 MST ---
Olaf? The patch procmail-3.22-headerconcat.dif was added by you in 2004. From the changelog:

Sun May 9 23:24:16 CEST 2004 - od@suse.de

- fixed handling of folded headers: delete leading whitespace

Do you know more about this?
https://bugzilla.novell.com/show_bug.cgi?id=469222

User odabrunz@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=469222#c2

Olaf Dabrunz <odabrunz@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW
      Info Provider|odabrunz@novell.com         |

--- Comment #2 from Olaf Dabrunz <odabrunz@novell.com> 2009-01-26 10:36:30 MST ---
The patch should be changed. Done and testing.

Rationale:

The original patch removes the folding CRLF (or LF in the internal representation) as well as the leading character (whitespace) on the continuation line. This was done because many mail programs at that time folded headers by inserting CRLF<TAB> or CRLF<SP>, rather than just CRLF. This was permitted by the wording in RFC822 (section 3.1.1), but it introduced the problem that unfolding was unable to reliably restore the original spacing.

Nowadays more clients seem to follow RFC2822. The wording there (section 2.2.3) makes clear that only a CRLF may be inserted to fold a line. This enables unfolding to restore the original spacing.

Expecting RFC2822-compliant mail clients seems to be the right choice today. The downside is that mails created with older clients may have too little whitespace removed on a continuation line. But that is becoming rare now, and it should be fixed by upgrading or replacing these clients.

Will submit the updated patch as soon as I am done testing.
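The difference between the two unfolding behaviours described in comment #2 can be sketched in Python. This is an illustration only, not the procmail patch itself: the function names are hypothetical, and the folded input assumes that the continuation lines of the test mail start with a single <TAB> (the attachment is not reproduced in this thread).

```python
import re


def unfold_old_patch(header: str) -> str:
    """Approximation of the 2004 patch as described in comment #2:
    drop the line break AND the first whitespace character after it."""
    return re.sub(r"\r?\n[ \t]", "", header)


def unfold_rfc2822(header: str) -> str:
    """RFC 2822 section 2.2.3: remove only the line break that is
    immediately followed by whitespace; the whitespace itself stays."""
    return re.sub(r"\r?\n(?=[ \t])", "", header)


# Assumed shape of the folded header from the report (tab-folded).
folded = (
    "Received: from dingbat.example.com (dingbat.example.com [21.21.21.21])\n"
    "\tby srv1.example.com (Postfix) with ESMTP id 1709C4D0D3\n"
    "\tfor <catchthismail@jessen.ch>; Tue, 20 Jan 2009 10:12:48 +0100 (CET)"
)

print(repr(unfold_old_patch(folded)))  # 'by' and 'for' run into the previous token
print(repr(unfold_rfc2822(folded)))    # the folding whitespace is preserved
```

Under that assumption, the first call reproduces the "Actual Results" from the report and the second keeps one whitespace character before 'by' and 'for'.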
https://bugzilla.novell.com/show_bug.cgi?id=469222

Olaf Dabrunz <odabrunz@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
https://bugzilla.novell.com/show_bug.cgi?id=469222

User odabrunz@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=469222#c3

Olaf Dabrunz <odabrunz@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

--- Comment #3 from Olaf Dabrunz <odabrunz@novell.com> 2009-01-26 14:20:29 MST ---
Tested on my workstation (10.3). Submitted to STABLE and 11.1.
https://bugzilla.novell.com/show_bug.cgi?id=469222

User odabrunz@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=469222#c4

--- Comment #4 from Olaf Dabrunz <odabrunz@novell.com> 2009-01-26 14:28:21 MST ---
Note that unfolding in procmail is now RFC2822-compliant, but I still see continuation lines that start with an inserted <TAB> (or a <TAB> that replaced a <SPACE>).

We probably could implement an additional heuristic that replaces the special cases CRLF<TAB><SPACE> with <SPACE> and CRLF<TAB><OTHER> with <SPACE><OTHER>.

I believe this is even a commonly used heuristic, but it would not be compliant with RFC2822 section 2.2.3.

Any opinions on this?
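The heuristic proposed in comment #4 could look roughly like the following Python sketch. It is not code from procmail or formail; the function name is hypothetical, and two points are assumptions: <OTHER> is read as "any character that is not a <SPACE>", and folds that do not start with a <TAB> are unfolded as in RFC2822 section 2.2.3.

```python
import re


def unfold_with_tab_heuristic(header: str) -> str:
    # CRLF<TAB><SPACE> -> <SPACE>: drop the line break and the tab,
    # keep the space that follows.
    header = re.sub(r"\r?\n\t(?= )", "", header)
    # CRLF<TAB><OTHER> -> <SPACE><OTHER>: the tab becomes one space.
    header = re.sub(r"\r?\n\t", " ", header)
    # Any remaining fold (only CRLF<SPACE> can be left at this point):
    # plain RFC 2822 unfolding, i.e. remove just the line break.
    return re.sub(r"\r?\n(?=[ \t])", "", header)


# Hypothetical headers, one per case.
print(repr(unfold_with_tab_heuristic("Subject: folded\n\twith a tab")))
print(repr(unfold_with_tab_heuristic("Subject: folded\n\t with tab and space")))
print(repr(unfold_with_tab_heuristic("Subject: folded\n with a space")))
```

In all three cases the unfolded line contains a single space at the fold and no <TAB>, which is what the heuristic is meant to achieve.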
https://bugzilla.novell.com/show_bug.cgi?id=469222

Olaf Dabrunz <odabrunz@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |NEEDINFO
      Info Provider|                            |coolo@novell.com
https://bugzilla.novell.com/show_bug.cgi?id=469222

User per.jessen@enidan.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=469222#c6

--- Comment #6 from Per Jessen <per.jessen@enidan.com> 2009-01-27 01:59:43 MST ---
(In reply to comment #4)
> Note that unfolding in procmail is now RFC2822-compliant, but I still see
> continuation lines that start with an inserted <TAB> (or a <TAB> that
> replaced a <SPACE>).

I think a TAB is for readability, but a continuation line will always have at least one leading whitespace character.

> We probably could implement an additional heuristic that replaces the special
> cases CRLF<TAB><SPACE> with <SPACE> and CRLF<TAB><OTHER> with <SPACE><OTHER>.

If we just remove the CRLF, we are in compliance with RFC-2822. Lines may only be folded where there is whitespace, and the CRLF is inserted before any whitespace.

From RFC-2822:

   The general rule is that wherever this standard allows for folding white space (not simply WSP characters), a CRLF may be inserted before any WSP.
   [snip]
   Unfolding is accomplished by simply removing any CRLF that is immediately followed by WSP.

I think compliance is paramount as formail is typically used in scripts and procmail recipes on different distros, and it's important that it behaves in the same way everywhere.
https://bugzilla.novell.com/show_bug.cgi?id=469222

Stephan Kulow <coolo@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Info Provider|coolo@novell.com            |maint-coord@suse.de
https://bugzilla.novell.com/show_bug.cgi?id=469222

Dirk Mueller <dmueller@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Info Provider|maint-coord@suse.de         |ast@novell.com
https://bugzilla.novell.com/show_bug.cgi?id=469222

User odabrunz@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=469222#c7

--- Comment #7 from Olaf Dabrunz <odabrunz@novell.com> 2009-01-27 06:34:16 MST ---
(In reply to comment #6)
> (In reply to comment #4)
> > Note that unfolding in procmail is now RFC2822-compliant, but I still see
> > continuation lines that start with an inserted <TAB> (or a <TAB> that
> > replaced a <SPACE>).
>
> I think a TAB is for readability, but a continuation line will always have at

I also think it is for readability of the transmitted header.

> least one leading whitespace character.

I am sorry that I have not described what this means. Here it is:

The problem is that an RFC2822-compliant procmail needs to leave this "folding-<TAB>" intact when it unfolds the mail header. This means that a mail header like this:

Subject: [Mutt] #3170: mutt: Attachments are automatically displayed even
        with 'Content-Disposition: attachment'
^
<TAB> here

is transformed into this:

Subject: [Mutt] #3170: mutt: Attachments are automatically displayed even       with 'Content-Disposition: attachment'
                                                                         ^
                                                                         |
                                                                     <TAB> here

There is a <TAB> (or more) in the unfolded line which is not in the original line. That means that after unfolding has been done by formail, it is not possible to find out which <TAB>s are artefacts of the folding and which are not.

I find the resulting line quite disturbing to the reader's eye. To make this more visible in the limited line width of Bugzilla, the line will look like this:

Subject: [...] displayed even   with 'Content-Disposition: attachment'
                             ^
                         <TAB> here

MUAs like mutt have the same problem during unfolding. But no MUA I have seen shows the <TAB> in the unfolded line. They chose to use an unfolding heuristic rather than comply with RFC-2822 section 2.2.3 in all cases.

(Note that this unfolding heuristic is also not compliant with RFC-822 section 3.1.1. This section allowed the folding practice that led to this problematic situation, but unfolding was defined there in an equivalent way to RFC-2822. I consider this to be a problem in RFC-822. And migrating away from this has not been addressed in RFC-2822. Not addressing this problem leads to implementations that violate the robustness principle, as they are not working with the input they get. IMHO, we would violate the robustness principle if we were simply RFC-2822-compliant.)

That is why I proposed this for procmail:

> > We probably could implement an additional heuristic that replaces the special
> > cases CRLF<TAB><SPACE> with <SPACE> and CRLF<TAB><OTHER> with <SPACE><OTHER>.

I should also make clear that this heuristic does not apply to CRLF<WSP>, where <WSP> is not <TAB>. A CRLF<WSP> will be unfolded to <WSP>, in compliance with RFC-2822 and RFC-822.

> If we just remove the CRLF, we are in compliance with RFC-2822. Lines may only
> be folded where there is whitespace, and the CRLF is inserted before any
> whitespace.
>
> From RFC-2822:
>
>    The general rule is that wherever this standard allows for folding white space (not simply WSP characters), a CRLF may be inserted before any WSP.
>    [snip]
>    Unfolding is accomplished by simply removing any CRLF that is immediately followed by WSP.

(Right. This is section 2.2.3 that I cited in comment #2.)

> I think compliance is paramount as formail is typically used in scripts and
> procmail recipes on different distros, and it's important that it behaves in
> the same way everywhere.

I agree that (in principle) it should behave the same everywhere. The question is: is this the correct behaviour or should we change it everywhere?

I would say this is in need of a migration heuristic. My point is supported by all MUAs I have seen (mutt, KMail, Thunderbird, ...), as they implement a heuristic rather than being fully RFC2822-compliant.

A problem is that upstream development for procmail (http://www.procmail.org/) stopped in 2001. I do not remember if I submitted my patch to them, but I reckon it will be difficult to make any change to the upstream procmail. Please also note that we have a few other changes to procmail and they did not make it upstream either.

If we could reactivate upstream, this would be possible.

I have not fully researched this, but as a compromise I could try to add a command line option and/or an environment variable to procmail and formail that turns the heuristic on.
https://bugzilla.novell.com/show_bug.cgi?id=469222

User per.jessen@enidan.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=469222#c8

--- Comment #8 from Per Jessen <per.jessen@enidan.com> 2009-01-27 07:14:31 MST ---
(In reply to comment #7)
> That means that after unfolding has been done by formail, it is not possible
> to find out which <TAB>s are artefacts of the folding and which are not.

Yeah, I see what you mean.

> I find the resulting line quite disturbing to the reader's eye.

I only use formail to concatenate headers for programmatic processing, so to me the actual look of the output is of less importance.

> That is why I proposed this for procmail:
>
> > We probably could implement an additional heuristic that replaces the special
> > cases CRLF<TAB><SPACE> with <SPACE> and CRLF<TAB><OTHER> with <SPACE><OTHER>.

That is almost the same as: simple concatenation = replacement of CRLF<WSP>+ with <SP>?
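How close the two rules are can be checked with a small Python sketch. Neither function is taken from procmail: both names are hypothetical, the heuristic follows the wording of comment #4 (with <OTHER> assumed to mean any non-<SPACE> character and non-<TAB> folds unfolded per RFC-2822), and the second function implements the "CRLF<WSP>+ to <SP>" rule from comment #8.

```python
import re


def tab_heuristic(header: str) -> str:
    header = re.sub(r"\r?\n\t(?= )", "", header)   # CRLF<TAB><SPACE> -> <SPACE>
    header = re.sub(r"\r?\n\t", " ", header)       # CRLF<TAB><OTHER> -> <SPACE><OTHER>
    return re.sub(r"\r?\n(?= )", "", header)       # CRLF<SPACE>: drop only the line break


def collapse_to_space(header: str) -> str:
    # CRLF followed by one or more WSP characters becomes a single space.
    return re.sub(r"\r?\n[ \t]+", " ", header)


# Hypothetical folded fragments; "x" and "y" stand for header text.
cases = {
    "tab": "x\n\ty",
    "tab+space": "x\n\t y",
    "space": "x\n y",
    "two spaces": "x\n  y",
    "two tabs": "x\n\t\ty",
}

for name, folded in cases.items():
    a, b = tab_heuristic(folded), collapse_to_space(folded)
    print(f"{name:11} heuristic={a!r:10} collapse={b!r:8} same={a == b}")
```

The two rules agree whenever the fold is a single <TAB>, a single <SPACE>, or <TAB><SPACE>. They differ when the continuation line starts with more whitespace than that: the heuristic keeps the extra characters, while the simple replacement collapses them into one space.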
https://bugzilla.novell.com/show_bug.cgi?id=469222

User werner@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=469222#c12

Dr. Werner Fink <werner@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |RESOLVED
      Info Provider|odabrunz@novell.com         |
         Resolution|                            |FIXED

--- Comment #12 from Dr. Werner Fink <werner@novell.com> 2009-02-10 11:19:35 MST ---
also submitted to SLES11