I have a large mail folder with 20000 mails, of which probably thousands are duplicates. I can see them with my own eyes.

I have a Thunderbird extension to find duplicates: "Remove Duplicate Messages" by Eyal Rozenberg.

https://github.com/eyalroz/removedupes/

But it claims there are no duplicates.

I have read the FAQ. Subsequently, I saved a message and its duplicate to files to compare them, and yes, there is a difference:

X-Spam-Status: No, score=-3.4 required=5.0 tests=BAYES_00,
        HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,NICE_REPLY_A,
        RCVD_IN_DNSWL_MED,RCVD_IN_ZEN_BLOCKED_OPENDNS,RDNS_NONE,SPF_HELO_NONE,
        SPF_PASS autolearn=disabled version=3.4.5

X-Spam-Status: No, score=-4.1 required=5.0 tests=BAYES_00,
        HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,NICE_REPLY_A,
        RDNS_NONE,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE
        autolearn=disabled version=3.4.5

The reason for that difference is that each copy comes from a different run of my sorting script, and the spam filter "thought differently" each time.

(note to myself: disable spam filtering in that script)

Do you know of some other tool to find and remove duplicates from a mail folder, where I can tell it to ignore some header?

("Delete" means moving to the trash folder)

Or maybe a way to remove the spam header:

X-Spam-Checker-Version: SpamAssassin 3.4.5 (2021-03-20) on Telcontar.valinor
X-Spam-Level:
X-Spam-Status: No, score=-3.4 required=5.0 tests=BAYES_00,
        HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,NICE_REPLY_A,
        RCVD_IN_DNSWL_MED,RCVD_IN_ZEN_BLOCKED_OPENDNS,RDNS_NONE,SPF_HELO_NONE,
        SPF_PASS autolearn=disabled version=3.4.5

Logic: IF X-Spam-Checker-Version contains Telcontar.valinor, THEN remove the X-Spam-Level: and X-Spam-Status: headers.

--
Cheers

		Carlos E. R.
		(from 15.4 x86_64 at Telcontar)
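The "Logic" rule above can be sketched in Python with the standard library `email` package. This is only an illustration under assumptions, not a tool anyone in the thread used: the function name `strip_local_spam_headers` and the sample message are made up, and it works on a single parsed message rather than on a folder.

```python
import email
from email.message import Message

HOST = "Telcontar.valinor"                  # host name from the header quoted above
DROP = ("X-Spam-Level", "X-Spam-Status")    # headers the rule wants removed


def strip_local_spam_headers(msg: Message, host: str = HOST) -> bool:
    """Remove the spam verdict headers if this host's SpamAssassin added them.

    Returns True when the message was modified.
    """
    checkers = msg.get_all("X-Spam-Checker-Version", [])
    if not any(host in str(value) for value in checkers):
        return False
    for name in DROP:
        del msg[name]   # removes every occurrence; no error if the header is absent
    return True


# Made-up sample message, only for demonstration.
raw = (
    "From: someone@example.org\n"
    "Message-ID: <abc@example.org>\n"
    "X-Spam-Checker-Version: SpamAssassin 3.4.5 (2021-03-20) on Telcontar.valinor\n"
    "X-Spam-Level: \n"
    "X-Spam-Status: No, score=-3.4 required=5.0 tests=BAYES_00,\n"
    "\tSPF_PASS autolearn=disabled version=3.4.5\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
msg = email.message_from_string(raw)
changed = strip_local_spam_headers(msg)
```

Messages scanned by a different host are left alone, because the check on X-Spam-Checker-Version fails and the function returns False without touching anything.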
Hello,

In the Message;

  Subject    : [oS-en] Finding duplicates in email
  Message-ID : <f7e5e28d-3a25-1ed0-0f62-9c6b5246a9ec@telefonica.net>
  Date & Time: Tue, 4 Jul 2023 10:35:00 +0200 (CEST)

[CER] == "Carlos E. R." <robin.listas@telefonica.net> has written:

CER> I have a large mail folder, with 20000 mails, of which probably
CER> thousands are duplicates. I can see them with my eyes.

Yah, duplicate e-mails are so depressing.

In my case, with the .procmailrc setup, there are no duplicate emails at all.

CER> I have a thunderbird extension to find duplicates. "Remove
CER> Duplicate Messages" by Eyal
CER> Rozenberg. https://github.com/eyalroz/removedupes/

CER> But it claims there are no duplicates.

CER> I have read the FAQ. Subsequently, I saved to file a message and
CER> its duplicate, to do a compare, and yes, there is a difference:
[...]
CER> The reason of that difference, is that each copy comes from a
CER> different run of my sorting script, the spam filter "thought
CER> differently".

CER> (note to myself: disable spam filtering in that script)

Duplicate mails should be determined by the unique Message-ID for each mail.

CER> Do you know some other tool to find and remove duplicates from a
CER> mail folder,
CER> where I can tell it to ignore some header?

CER> ("Delete" means moving to trash folder)

Is there no function in Thunderbird to determine duplicate messages by Message-ID?

Regards.

---
┏━━┓彡 野宮 賢   mail-to: nomiya @ lake.dti.ne.jp
┃\/彡
┗━━┛ "A bachelor’s degree still holds prestige as a ticket to the middle class, but its value has received increasing scrutiny. In the last several years, rising tuition and student loan debt have led more Americans to reconsider an investment in postsecondary education."
       -- Washington Post --
On 2023-07-04 11:21, Masaru Nomiya wrote:
> Hello,
>
> In the Message;
>
>   Subject    : [oS-en] Finding duplicates in email
>   Message-ID : <f7e5e28d-3a25-1ed0-0f62-9c6b5246a9ec@telefonica.net>
>   Date & Time: Tue, 4 Jul 2023 10:35:00 +0200 (CEST)
>
> [CER] == "Carlos E. R." <robin.listas@telefonica.net> has written:
>
> CER> I have a large mail folder, with 20000 mails, of which probably
> CER> thousands are duplicates. I can see them with my eyes.
>
> Yah, duplicate e-mails are so depressing.
>
> In my case, with the .procmailrc setup, there are no duplicate emails at all.

I was merging mail from the laptop with the desktop machine. My sync process had not worked for many moons, so when I decided to repair it there was a backlog of thousands of mails. I happen to have very complex procmail recipes. The result was duplicates.

> CER> I have a thunderbird extension to find duplicates. "Remove
> CER> Duplicate Messages" by Eyal
> CER> Rozenberg. https://github.com/eyalroz/removedupes/
>
> CER> But it claims there are no duplicates.
>
> CER> I have read the FAQ. Subsequently, I saved to file a message and
> CER> its duplicate, to do a compare, and yes, there is a difference:
> [...]
> CER> The reason of that difference, is that each copy comes from a
> CER> different run of my sorting script, the spam filter "thought
> CER> differently".
>
> CER> (note to myself: disable spam filtering in that script)
>
> Duplicate mails should be determined by the unique Message-ID for each mail.

Not exclusively.

For example, when people reply both to the list and to the person, you get two copies of the same mail that are slightly different: same Message-ID, different Received headers, and possibly a different footer and Reply-To.

My procmail recipe moves one of those to a different folder for direct duped replies.

> CER> Do you know some other tool to find and remove duplicates from a
> CER> mail folder,
> CER> where I can tell it to ignore some header?
>
> CER> ("Delete" means moving to trash folder)
>
> Is there no function in Thunderbird to determine duplicate messages by Message-ID?

It is an add-on, and it considers several other criteria (x = enabled):

  [x] Author
  [x] Recipients ('To')
  [x] CC List
  [ ] Status Flags
  [x] Message ID
  [ ] Number of lines in message
  [x] Send time
  [x] Size (headers & Body)
  [x] Subject
  [ ] Folder
  [x] Body
      Time comparison resolution [seconds]

--
Cheers / Saludos,

		Carlos E. R.
		(from 15.4 x86_64 at Telcontar)
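A comparison on selected criteria, as described above, can be sketched in Python. This is not the add-on's implementation. It is a minimal illustration that builds a key from a subset of the listed fields, so that headers left out of the key (X-Spam-Status, Received, and so on) cannot affect the result. The names `duplicate_key` and `find_duplicates` and the sample messages are hypothetical. The "Size" criterion is deliberately left out, on the assumption that a changed spam header also changes the message size.

```python
import email
import hashlib
from email.message import Message
from email.utils import getaddresses, parseaddr


def _addresses(msg: Message, header: str) -> tuple:
    """Sorted, lower-cased addresses from every occurrence of one header."""
    values = [str(v) for v in msg.get_all(header, [])]
    return tuple(sorted(addr.lower() for _, addr in getaddresses(values) if addr))


def duplicate_key(msg: Message) -> tuple:
    """Build a comparison key from selected fields only."""
    # Limitation of this sketch: multipart messages yield None here,
    # so their bodies are not compared.
    body = msg.get_payload(decode=True) or b""
    return (
        parseaddr(str(msg.get("From", "")))[1].lower(),  # Author
        _addresses(msg, "To"),                           # Recipients
        _addresses(msg, "Cc"),                           # CC list
        str(msg.get("Message-ID", "")).strip(),          # Message ID
        str(msg.get("Date", "")).strip(),                # Send time (raw string)
        str(msg.get("Subject", "")).strip(),             # Subject
        hashlib.sha256(body).hexdigest(),                # Body
    )


def find_duplicates(messages) -> list:
    """Return the indexes of messages whose key was already seen earlier."""
    seen = set()
    dups = []
    for index, msg in enumerate(messages):
        key = duplicate_key(msg)
        if key in seen:
            dups.append(index)
        else:
            seen.add(key)
    return dups


# Made-up messages: two differ only in the spam score, one has an extra footer.
TEMPLATE = (
    "From: Alice <alice@example.org>\n"
    "To: users@lists.example.org\n"
    "Message-ID: <1@example.org>\n"
    "Date: Tue, 4 Jul 2023 10:35:00 +0200\n"
    "Subject: hello\n"
    "X-Spam-Status: No, score={score}\n"
    "\n"
    "{body}\n"
)
copies = [
    email.message_from_string(TEMPLATE.format(score="-3.4", body="text")),
    email.message_from_string(TEMPLATE.format(score="-4.1", body="text")),
    email.message_from_string(TEMPLATE.format(score="-3.4", body="text\n-- \nlist footer")),
]
dups = find_duplicates(copies)
```

Because the body is part of the key, a list copy with an added footer is not treated as a duplicate of the direct copy, even though both share the same Message-ID.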
Hello,

In the Message;

  Subject    : Re: [oS-en] Finding duplicates in email
  Message-ID : <c800e49f-6c34-b328-9b01-30af5a1bb4f3@telefonica.net>
  Date & Time: Tue, 4 Jul 2023 11:40:20 +0200

[CER] == "Carlos E. R." <robin.listas@telefonica.net> has written:

[...]
MN> > In my case, with the .procmailrc setup, there are no duplicate emails
MN> > at all.

CER> I was merging mail from the laptop with the desktop machine, and as my sync
CER> process would not work for many moons, when I decided to repair
CER> that there was a backlog of thousands of mails. I happen to have
CER> very complex procmail recipes. The result were duplicates.

Ah, I see.

But the recipe for duplicate emails is not complicated.

[...]
MN> > Duplicate mails should be determined by the unique Message-ID for each
MN> > mail.

CER> Not exclusively.

CER> For example, when people reply both to the list and to the
CER> person, you get two copies of the same mail, slightly
CER> different. Same message-id, different received headers, possibly
CER> a different footer and reply to.

..?
If you look at the To: and Cc: fields of an email, you can see multiple recipients' addresses, can't you?

So, one email should be enough.

CER> My procmail recipe moves one of those to a different folder for direct duped
CER> replies.

I can't understand why, but if you need duplicate mailings in some cases, then have it your way.

Regards.

---
┏━━┓彡 野宮 賢   mail-to: nomiya @ lake.dti.ne.jp
┃\/彡
┗━━┛ "The question of who holds the platform and whether the person or organisation holding it is trustworthy has serious and profound implications in these volatile times. Once trust is broken, it is extremely difficult to restore. It is necessary to diversify in advance."
       -- Financial Times --
On 2023-07-04 12:28, Masaru Nomiya wrote:
> Hello,
>
> In the Message;
>
>   Subject    : Re: [oS-en] Finding duplicates in email
>   Message-ID : <c800e49f-6c34-b328-9b01-30af5a1bb4f3@telefonica.net>
>   Date & Time: Tue, 4 Jul 2023 11:40:20 +0200
>
> [CER] == "Carlos E. R." <robin.listas@telefonica.net> has written:
>
> [...]
> MN> > In my case, with the .procmailrc setup, there are no duplicate emails
> MN> > at all.
>
> CER> I was merging mail from the laptop with the desktop machine, and as my sync
> CER> process would not work for many moons, when I decided to repair
> CER> that there was a backlog of thousands of mails. I happen to have
> CER> very complex procmail recipes. The result were duplicates.
>
> Ah, I see.
>
> But, the recipe for duplicate emails is not complicated.
>
> [...]
> MN> > Duplicate mails should be determined by the unique Message-ID for each
> MN> > mail.
>
> CER> Not exclusively.
>
> CER> For example, when people reply both to the list and to the
> CER> person, you get two copies of the same mail, slightly
> CER> different. Same message-id, different received headers, possibly
> CER> a different footer and reply to.
>
> ..?
> If you look at the To: and Cc: field of an email, you can see multiple recipient's addresses,, don't you.
>
> So, one email should be enough.

I want them all. They are not actual duplicates: many headers are different.

> CER> My procmail recipe moves one of those to a different folder for direct duped
> CER> replies.
>
> I can't understand why, but if you need duplicate mailings in some cases, then have it your way.

No, it is simply that the Message-ID alone is not a sufficient criterion to determine a duplicate in my case.

Now, for a different reason that no procmail recipe can detect, I have several thousand duplicates, with different "X-Spam-Status" content, in a folder that is not an input folder.

I need some post-processing method to filter a folder with 700000 mails and thousands of duplicates, using a method similar to what the Thunderbird plugin does. Procmail doesn't do this.

Alternatively, I need a method to erase the "X-Spam-Status" header in many thousands of emails.

--
Cheers / Saludos,

		Carlos E. R.
		(from 15.4 x86_64 at Telcontar)
* Carlos E. R. <robin.listas@telefonica.net> [07-04-23 06:50]: [...]
> No, it is simply that the Message-ID is not enough criteria to determine a duplicate in my case.
>
> Now, and for a different reason that no procmail recipe can detect, I have, in a folder that is not for input, several thousands of duplicates, with different "X-Spam-Status" content.
>
> I need some postprocess method to filter a folder with 700000 mails and thousands of duplicates with a method similar to what the thunderbird plugin does. Procmail doesn't do this.
>
> Alternatively, I need a method to erase the "X-Spam-Status" header in many thousand of emails.

I am sure procmail/formail can do that, but you will have to craft the recipe.

Some possible examples to use:
https://procmail.rwth-aachen.narkive.com/zC2aChER/using-formail-to-remove-he...
https://www.linuxquestions.org/questions/linux-server-73/procmail-removing-h...

--
(paka)Patrick Shanahan       Plainfield, Indiana, USA          @ptilopteri
http://en.opensuse.org    openSUSE Community Member    facebook/ptilopteri
Photos: http://wahoo.no-ip.org/piwigo               paka @ IRCnet oftc
Hello, Patrick.

In the Message;

  Subject    : Re: [oS-en] Finding duplicates in email
  Message-ID : <20230704114158.GS13204@wahoo.no-ip.org>
  Date & Time: Tue, 4 Jul 2023 07:41:59 -0400

[PS] == Patrick Shanahan <paka@opensuse.org> has written:

[...]
PS> > Alternatively, I need a method to erase the "X-Spam-Status" header in many
PS> > thousand of emails.

PS> I am sure procmail/formail can do that but you will have to craft the recipe
PS> some possible examples to use:
PS> https://procmail.rwth-aachen.narkive.com/zC2aChER/using-formail-to-remove-he...
PS> https://www.linuxquestions.org/questions/linux-server-73/procmail-removing-h...

Deleting a certain field is considered mail tampering, so there is no way that such a recipe exists.

My fully RFC compliant MUA only allows me to make the specified field invisible.

Regards & Good Night.

---
┏━━┓彡 野宮 賢   mail-to: nomiya @ lake.dti.ne.jp
┃\/彡
┗━━┛ "Bill! You married with Computer.
       Not with Me!"
      "No..., with money."
On 2023-07-04 13:41, Patrick Shanahan wrote:
> * Carlos E. R. <robin.listas@telefonica.net> [07-04-23 06:50]:
> [...]
>> No, it is simply that the Message-ID is not enough criteria to determine a duplicate in my case.
>>
>> Now, and for a different reason that no procmail recipe can detect, I have, in a folder that is not for input, several thousands of duplicates, with different "X-Spam-Status" content.
>>
>> I need some postprocess method to filter a folder with 700000 mails and thousands of duplicates with a method similar to what the thunderbird plugin does. Procmail doesn't do this.
>>
>> Alternatively, I need a method to erase the "X-Spam-Status" header in many thousand of emails.
>
> I am sure procmail/formail can do that but you will have to craft the recipe
>
> some possible examples to use:
> https://procmail.rwth-aachen.narkive.com/zC2aChER/using-formail-to-remove-he...
> https://www.linuxquestions.org/questions/linux-server-73/procmail-removing-h...

Thanks, I will have another look at formail. It seems -x or -X does it. This is not my first rodeo with formail, but removing headers is new to me.

It will have to be tonight or tomorrow, though.

--
Cheers / Saludos,

		Carlos E. R.
		(from 15.4 x86_64 at Telcontar)
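A note on the flags: in the formail man page, -x and -X extract a field, while `-I headerfield` given only a field name is the option documented as effectively deleting that field. For comparison, the same "erase a header in every mail of a folder" step can be sketched with Python's standard `mailbox` module. This sketch assumes the folder is a single mbox file (the thread does not confirm that), and `drop_headers_in_mbox` and the sample data are hypothetical names and values chosen for illustration.

```python
import mailbox
import os
import tempfile

HEADERS_TO_DROP = ("X-Spam-Status", "X-Spam-Level")


def drop_headers_in_mbox(path, names=HEADERS_TO_DROP):
    """Delete the named headers from every message of an mbox file, in place.

    Returns the number of messages that were rewritten.
    """
    box = mailbox.mbox(path, create=False)
    changed = 0
    box.lock()                       # keep other well-behaved writers out
    try:
        for key in box.keys():
            msg = box[key]
            if any(name in msg for name in names):
                for name in names:
                    del msg[name]    # no error if the header is absent
                box[key] = msg       # store the modified copy back
                changed += 1
        box.flush()                  # write pending changes to disk
    finally:
        box.unlock()
        box.close()
    return changed


# Made-up two-message mbox, only for demonstration.
SAMPLE = (
    "From sender@example.org Tue Jul  4 10:35:00 2023\n"
    "Message-ID: <1@example.org>\n"
    "X-Spam-Status: No, score=-3.4 required=5.0 tests=BAYES_00,\n"
    "\tSPF_PASS autolearn=disabled version=3.4.5\n"
    "Subject: first\n"
    "\n"
    "body one\n"
    "\n"
    "From sender@example.org Tue Jul  4 10:36:00 2023\n"
    "Message-ID: <2@example.org>\n"
    "Subject: second\n"
    "\n"
    "body two\n"
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.mbox")
    with open(path, "w", newline="\n") as handle:
        handle.write(SAMPLE)
    rewritten = drop_headers_in_mbox(path)
    box = mailbox.mbox(path, create=False)
    remaining = [m["X-Spam-Status"] for m in box]
    subjects = [m["Subject"] for m in box]
    box.close()
```

On real mail this should be tried on a copy of the folder first, with all mail clients and the IMAP server stopped, since the file is rewritten in place.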
Hello,

In the Message;

  Subject    : Re: [oS-en] Finding duplicates in email
  Message-ID : <4852bccf-a68a-f5fd-601c-3787df11d54c@telefonica.net>
  Date & Time: Tue, 4 Jul 2023 14:49:12 +0200

[CER] == "Carlos E. R." <robin.listas@telefonica.net> has written:

[...]
CER>>> Alternatively, I need a method to erase the "X-Spam-Status" header in many
CER>>> thousand of emails.

PS> > I am sure procmail/formail can do that but you will have to craft the recipe
PS> >
PS> > some possible examples to use:
PS> > https://procmail.rwth-aachen.narkive.com/zC2aChER/using-formail-to-remove-he...
PS> > https://www.linuxquestions.org/questions/linux-server-73/procmail-removing-h...

CER> Thanks, I will have another look at formail. Seems -x or -X does it. Not my
CER> first rodeo with it, but removing headers is new to me.

Are you sure?

  -x headerfield
      Extract the contents of this headerfield from the header.
      Line continuations will be left intact; if you want the value
      on a single line then you’ll also need the -c option.

  -X headerfield
      Same as -x, but also preserves/includes the field name.

Where does it say you can erase?

You guys should understand what it means that an email is a legal voucher.

Then you will know what not to do with incoming mail, and you will know that the authors of procmail and other mail utilities are using a format that does not compromise the legal significance of the mail.

Fundamental question!

In the Message;

  Subject    : Re: [oS-en] Finding duplicates in email
  Message-ID : <c800e49f-6c34-b328-9b01-30af5a1bb4f3@telefonica.net>
  Date & Time: Tue, 4 Jul 2023 11:40:20 +0200

[CER] == "Carlos E. R." <robin.listas@telefonica.net> has written:

[..]
CER> I happen to have very complex procmail recipes. The result were
CER> duplicates.
[...]

Can you tell me how to use procmail with Thunderbird, a MUA?

I asked ChatGPT, but she said she doesn't know.

Regards.

---
┏━━┓彡 野宮 賢   mail-to: nomiya @ lake.dti.ne.jp
┃\/彡
┗━━┛ "No Windows, no gains!"
      ...
      "Why, I am wrong?"
       -- Bill --
On Tue, 4 Jul 2023 11:40:20 +0200, "Carlos E. R." <robin.listas@telefonica.net> wrote:
> On 2023-07-04 11:21, Masaru Nomiya wrote:
>> [...]
>> Duplicate mails should be determined by the unique Message-ID for each mail.
>
> Not exclusively.
>
> For example, when people reply both to the list and to the person, you get two copies of the same mail, slightly different. Same message-id, different received headers, possibly a different footer and reply to.
An interesting case, that is. If you receive a reply directly to your email address, then it is a personal message to you. But if later, you receive the same message via the list, then they are both a public message. It is like quantum physics. The nature of one is determined by the observation of the other one at some possibly widely separated time and place. Have fun trying to set up a proper filter for that.
> My procmail recipe moves one of those to a different folder for direct duped replies.
-- Robert Webb
On Wednesday, 2023-07-05 at 08:39 -0000, Robert Webb via openSUSE Users wrote:
> On Tue, 4 Jul 2023 11:40:20 +0200, "Carlos E. R." <robin.listas@telefonica.net> wrote:
>> On 2023-07-04 11:21, Masaru Nomiya wrote:
>>> [...]
>>> Duplicate mails should be determined by the unique Message-ID for each mail.
>>
>> Not exclusively.
>>
>> For example, when people reply both to the list and to the person, you get two copies of the same mail, slightly different. Same message-id, different received headers, possibly a different footer and reply to.
>
> An interesting case, that is. If you receive a reply directly to your email address, then it is a personal message to you. But if later, you receive the same message via the list, then they are both a public message. It is like quantum physics. The nature of one is determined by the observation of the other one at some possibly widely separated time and place. Have fun trying to set up a proper filter for that.

I do :-)

>> My procmail recipe moves one of those to a different folder for direct duped replies.

:0
* ^(it is for this account)
{
   LOG="Logging: section for telefonica-listas (alias) $NL"

   :0f
   * ^List-Id: <security.lists.opensuse.org>
   | $FORMAIL -bfi 'Reply-To: "oS-sec" <opensuse-security@opensuse.org>'
   :0 a:
   $HOME/Mail/_Lists/os-security

   :0f
   * ^List-Id: <kde.lists.opensuse.org>
   | $FORMAIL -bfi 'Reply-To: "oS-kde" <kde@lists.opensuse.org>'
   :0 a:
   $HOME/Mail/_Lists/os-kde

   :0f
   * ^List-Id: <kde3.lists.opensuse.org>
   | $FORMAIL -bfi 'Reply-To: "oS-kde3" <kde3@lists.opensuse.org>'
   :0 a:
   $HOME/Mail/_Lists/os-kde

   ...

   :0f
   * ^List-Id: <users.lists.opensuse.org>
   | $FORMAIL -bfi 'Reply-To: "OS-en" <users@lists.opensuse.org>'
   :0 a:
   $HOME/Mail/_Lists/os-en

   ...

   # ··········· Remove dups (sent both to list and direct)
   :0 :
   * ^TO_((suse-linux-s|suse-linux-e|suse-security|suse-programming-e)@suse.com|(opensuse-support|opensuse-offtopic|opensuse-es|opensuse-programming|opensuse-project|opensuse-security|opensuse-translation|opensuse-translation-es|opensuse-factory|opensuse)@opensuse.org|evergreen@lists.rosenauer.org|(susefaq-spanish|xine-devel|smartmontools-support)@lists.sourceforge.net|alpine-(alpha|info)@u.washington.edu|sflphone@lists.savoirfairelinux.net|linux-uvc-devel@lists.sourceforge.net|packman@links2linux.de|xfs@oss.sgi.com|linux-xfs@vger.kernel.org|offtopic@sweet-haven.com|postfix-users@postfix.org|shotwell-list@gnome.org|lazarus@lists.lazarus-ide.org|dm-crypt@saout.de|ffmpeg-user.ffmpeg.org)
   $HOME/Mail/_Lists/in_dups

   # Catch-all for the alias robin.listas@telefonica.net
   :0 :
   $HOME/Mail/_Lists/in_rst_3
}
#VERBOSE=off
#----- End ALIAS robin.listas --------------------------------

This concoction puts your email to the list in $HOME/Mail/_Lists/os-en, but your direct email in $HOME/Mail/_Lists/in_dups, as desired :-)

--
Cheers,

		Carlos E. R.
		(from openSUSE 15.4 x86_64 at Telcontar)
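The routing idea behind the recipe (list copy to the list folder, direct copy of a list mail to in_dups, everything else to the catch-all) can be sketched in Python for readers who do not use procmail. This is a simplification under stated assumptions. It checks only To and Cc, whereas procmail's ^TO_ macro covers more destination headers. It uses one list folder instead of one per List-Id. `LIST_ADDRESSES` is a shortened stand-in for the long address list, and `route` and the sample messages are hypothetical.

```python
import email
from email.message import Message
from email.utils import getaddresses

# Shortened stand-in for the long address alternation in the recipe.
LIST_ADDRESSES = {"users@lists.opensuse.org", "opensuse@opensuse.org"}


def route(msg: Message) -> str:
    """Return the folder a message would land in under this simplified scheme."""
    if msg.get("List-Id") is not None:
        return "os-en"       # copy that came through the list server
    headers = [str(v) for h in ("To", "Cc") for v in msg.get_all(h, [])]
    recipients = getaddresses(headers)
    if any(addr.lower() in LIST_ADDRESSES for _, addr in recipients):
        return "in_dups"     # addressed to a list, but delivered directly
    return "in_rst_3"        # catch-all


# Made-up copies of one reply: via the list, direct, and an unrelated mail.
list_copy = email.message_from_string(
    "To: robin.listas@telefonica.net\n"
    "Cc: users@lists.opensuse.org\n"
    "List-Id: <users.lists.opensuse.org>\n"
    "Message-ID: <x@example.org>\n"
    "\n"
    "hi\n"
)
direct_copy = email.message_from_string(
    "To: robin.listas@telefonica.net\n"
    "Cc: users@lists.opensuse.org\n"
    "Message-ID: <x@example.org>\n"
    "\n"
    "hi\n"
)
other = email.message_from_string(
    "To: robin.listas@telefonica.net\n"
    "Message-ID: <y@example.org>\n"
    "\n"
    "hi\n"
)
folders = [route(list_copy), route(direct_copy), route(other)]
```

Both copies carry the same Message-ID, yet they are filed separately, which is why a Message-ID-only duplicate check would not suit this setup.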
On 2023-07-04 10:35, Carlos E. R. wrote:
> Do you know some other tool to find and remove duplicates from a mail folder, where I can tell it to ignore some header?

I found this:

https://serverfault.com/questions/255665/remove-duplicate-messages-from-mail...

The easiest seems to be this answer:

    Gnome's Evolution [a graphical mail user agent] has a built-in feature to remove duplicate mail. As explained on this help page, it boils down to:

    Select the suspect messages (or just all messages)
    Go to menu Messages, then choose Remove Duplicate Messages.

    Voilà.

    P.S. Evolution can access your messages locally (MailDir, MH, Mbox) or over IMAP.

But I have to check whether it deletes immediately, asks first, or moves to trash.

--
Cheers / Saludos,

		Carlos E. R.
		(from 15.4 x86_64 at Telcontar)
* Carlos E. R. <robin.listas@telefonica.net> [07-04-23 09:11]:
> On 2023-07-04 10:35, Carlos E. R. wrote:
>> Do you know some other tool to find and remove duplicates from a mail folder, where I can tell it to ignore some header?
>
> I found this:
>
> https://serverfault.com/questions/255665/remove-duplicate-messages-from-mail...
>
> The easier seems to be this answer:
>
> Gnome's Evolution [a graphical mail user agent] has a built-in feature to remove duplicate mail. As explained on this help page, it boils down to:
>
> Select the suspect messages (or just all messages) Go to menu Messages, then choose Remove Duplicate Messages.
>
> Voilà.
>
> P.S. Evolution can access your messages locally (MailDir, MH, Mbox) or over IMAP.
>
> But I have to check if deletes immediately or asks, or moves to trash.

mutt has a feature to tag duplicates and a feature to delete tagged mails.

--
(paka)Patrick Shanahan       Plainfield, Indiana, USA          @ptilopteri
http://en.opensuse.org    openSUSE Community Member    facebook/ptilopteri
Photos: http://wahoo.no-ip.org/piwigo               paka @ IRCnet oftc
On 2023-07-04 15:21, Patrick Shanahan wrote:
> * Carlos E. R. <robin.listas@telefonica.net> [07-04-23 09:11]:
>> On 2023-07-04 10:35, Carlos E. R. wrote:
>>> Do you know some other tool to find and remove duplicates from a mail folder, where I can tell it to ignore some header?
>>
>> I found this:
>>
>> https://serverfault.com/questions/255665/remove-duplicate-messages-from-mail...
>>
>> The easier seems to be this answer:
>>
>> Gnome's Evolution [a graphical mail user agent] has a built-in feature to remove duplicate mail. As explained on this help page, it boils down to:
>>
>> Select the suspect messages (or just all messages) Go to menu Messages, then choose Remove Duplicate Messages.
>>
>> Voilà.
>>
>> P.S. Evolution can access your messages locally (MailDir, MH, Mbox) or over IMAP.
>>
>> But I have to check if deletes immediately or asks, or moves to trash.

Deletes without asking.

> mutt has a feature to tag duplicates and a feature to delete tagged mails.

Huh. I find mutt too hard to use. Alpine I use a lot, but it doesn't have this feature.

It seems I have to go down the delete-header path. But right now I have to get on the road.

--
Cheers / Saludos,

		Carlos E. R.
		(from 15.4 x86_64 at Telcontar)
On Tue, 4 Jul 2023 15:36:31 +0200 Carlos E. R. wrote:
> On 2023-07-04 15:21, Patrick Shanahan wrote:
>> * Carlos E. R. <robin.listas@telefonica.net> [07-04-23 09:11]:
>>> On 2023-07-04 10:35, Carlos E. R. wrote:
>>>> Do you know some other tool to find and remove duplicates from a mail folder, where I can tell it to ignore some header?
>>>
>>> I found this:
>>>
>>> https://serverfault.com/questions/255665/remove-duplicate-messages-from-mail...
>>>
>>> The easier seems to be this answer:
>>>
>>> Gnome's Evolution [a graphical mail user agent] has a built-in feature to remove duplicate mail. As explained on this help page, it boils down to:
>>>
>>> Select the suspect messages (or just all messages) Go to menu Messages, then choose Remove Duplicate Messages.
>>>
>>> Voilà.
>>>
>>> P.S. Evolution can access your messages locally (MailDir, MH, Mbox) or over IMAP.
>>>
>>> But I have to check if deletes immediately or asks, or moves to trash.
>
> Deletes without asking.
>
>> mutt has a feature to tag duplicates and a feature to delete tagged mails.
>
> Huh. I find mutt too hard to use. Alpine I use a lot, but it doesn't have this feature.
>
> It seems I have to go for the delete header path. But now I have to take the road.

Claws-Mail can find and delete duplicates in individual folders or the whole mailbox.

--
Bob Williams

No HTML please. Plain text preferred.
https://useplaintext.email/
http://www.catb.org/~esr/faqs/smart-questions.html
On 2023-07-04 16:32, Bob Williams wrote:
> On Tue, 4 Jul 2023 15:36:31 +0200 Carlos E. R. wrote:

...

> Claws-Mail can find and delete duplicates in individual folders or the whole mailbox.
Deletes directly, or moves to trash? Asks for confirmation or shoots ahead full steam? -- Cheers / Saludos, Carlos E. R. (from Elesar, using openSUSE Leap 15.5)
On Tue, 4 Jul 2023 17:07:51 +0200 Carlos E. R. wrote:
> On 2023-07-04 16:32, Bob Williams wrote:
>> On Tue, 4 Jul 2023 15:36:31 +0200 Carlos E. R. wrote:
>
> ...
>
>> Claws-Mail can find and delete duplicates in individual folders or the whole mailbox.
>
> Deletes directly, or moves to trash? Asks for confirmation or shoots ahead full steam?

Depends whether you've set moving/deletion to execute immediately or wait for confirmation.

--
Bob Williams

No HTML please. Plain text preferred.
https://useplaintext.email/
http://www.catb.org/~esr/faqs/smart-questions.html
On 2023-07-04 17:20, Bob Williams wrote:
> On Tue, 4 Jul 2023 17:07:51 +0200 Carlos E. R. wrote:
>> On 2023-07-04 16:32, Bob Williams wrote:
>>> On Tue, 4 Jul 2023 15:36:31 +0200 Carlos E. R. wrote:
>>
>> ...
>>
>>> Claws-Mail can find and delete duplicates in individual folders or the whole mailbox.
>>
>> Deletes directly, or moves to trash? Asks for confirmation or shoots ahead full steam?
>
> Depends whether you've set moving/deletion to execute immediately or wait for confirmation.

Ah, perfect, it is configurable. I'll have a look when I get back home.

--
Cheers / Saludos,

		Carlos E. R.
		(from Elesar, using openSUSE Leap 15.5)
On 2023-07-04 17:38, Carlos E. R. wrote:
> On 2023-07-04 17:20, Bob Williams wrote:
>> On Tue, 4 Jul 2023 17:07:51 +0200 Carlos E. R. wrote:
>>> On 2023-07-04 16:32, Bob Williams wrote:
>>>> On Tue, 4 Jul 2023 15:36:31 +0200 Carlos E. R. wrote:
>>>
>>> ...
>>>
>>>> Claws-Mail can find and delete duplicates in individual folders or the whole mailbox.
>>>
>>> Deletes directly, or moves to trash? Asks for confirmation or shoots ahead full steam?
>>
>> Depends whether you've set moving/deletion to execute immediately or wait for confirmation.
>
> Ah, perfect, it is configurable. I'll have a look when I get back home.

I just tried. I set up a small folder with some duplicates, including list mail that also had private copies.

It went ahead without asking and deleted all of them. This will not do: I want to keep the private copies. It is probably using the Message-ID alone.

I do not see where to configure it (there is no search function inside the general configuration dialog).

--
Cheers / Saludos,

		Carlos E. R.
		(from 15.4 x86_64 at Telcontar)
On 2023-07-04 15:10, Carlos E. R. wrote:
> On 2023-07-04 10:35, Carlos E. R. wrote:
>> Do you know some other tool to find and remove duplicates from a mail folder, where I can tell it to ignore some header?
>
> I found this:
>
> https://serverfault.com/questions/255665/remove-duplicate-messages-from-mail...
>
> The easier seems to be this answer:
>
> Gnome's Evolution [a graphical mail user agent] has a built-in feature to remove duplicate mail. As explained on this help page, it boils down to:
>
> Select the suspect messages (or just all messages) Go to menu Messages, then choose Remove Duplicate Messages.
>
> Voilà.
>
> P.S. Evolution can access your messages locally (MailDir, MH, Mbox) or over IMAP.
>
> But I have to check if deletes immediately or asks, or moves to trash.

Well, Evolution is unable to retrieve the list of folders to subscribe to from my local dovecot. It apparently triggered some text search operation on my entire dovecot server, which is taking ages: after all, it is several gigabytes in size.

[...]

I had to stop all mail clients, then, while I had lunch, run:

    doveadm -v force-resync -u cer '*'
    #doveadm -v fts rescan -v -u cer}
    doveadm -v index -u cer '*'

Then Evolution could finally run and find the test folder, and I could try to delete duplicates.

It does the correct thing, which results in not deleting all duplicates. It is not configurable, but it asks before processing. And when it finishes, the mails are marked for deletion but not deleted: it needs an explicit "expunge" operation to actually delete them.

I'll have to try formail next to remove the spam headers, then try again with Thunderbird, which is the best of the three, when it works.

--
Cheers / Saludos,

		Carlos E. R.
		(from 15.4 x86_64 at Telcontar)
   From: "Carlos E. R." <robin.listas@telefonica.net>
   Date: Mon, 10 Jul 2023 18:43:16 +0200

   . . .

   I'll have to try formail next to remove spam headers, then try with
   Thunderbird, which is the best of the three, when it works.

   --
   Cheers / Saludos,
   Carlos E. R.
   (from 15.4 x86_64 at Telcontar)

If you still need a filter to remove headers from mbox files, you could try this.

					-- Bob Rogers
					   http://www.rgrjr.com/

#!/usr/bin/perl
#
# Filter to remove one or more headers from an mbox file.
# Reads the mbox on standard input, writes the result to standard output.
#
# [created.  -- rgr, 4-Jul-23.]

use strict;
use warnings;

die "$0:  Need at least one header name (e.g 'sender') on the command line.\n"
    unless @ARGV;
# Build one case-insensitive alternation from all the header names.
my $header_regexp = lc(join('|', @ARGV));
@ARGV = ();	# so that <> reads standard input, not files named after headers.
die "$0:  Header names must not have colons or spaces.\n"
    if $header_regexp =~ /[ :]/;
warn "header_regexp '$header_regexp'";

# State: 0 = in a message body, 1 = in headers, 'skip' = in a header being removed.
my $in_headers = 0;
while (<>) {
    if (/^From /) {
	# mbox separator line: a new message (and its headers) starts here.
	$in_headers = 1;
	print;
    }
    elsif (/^$/) {
	# A blank line ends the headers.
	$in_headers = 0;
	print;
    }
    elsif (! $in_headers) {
	# Body line: pass through unchanged.
	print;
    }
    elsif (/^($header_regexp):/i) {
	# First line of an unwanted header: drop it and its continuation lines.
	$in_headers = 'skip';
    }
    else {
	$in_headers = 1		# new header.
	    if /^\S/;
	print
	    unless $in_headers eq 'skip';
    }
}