On 2023-07-04 11:21, Masaru Nomiya wrote:
Hello,
In the Message;
Subject    : [oS-en] Finding duplicates in email
Message-ID : <f7e5e28d-3a25-1ed0-0f62-9c6b5246a9ec@telefonica.net>
Date & Time: Tue, 4 Jul 2023 10:35:00 +0200 (CEST)
[CER] == "Carlos E. R." <robin.listas@telefonica.net> has written:
CER> I have a large mail folder, with 20000 mails, of which probably
CER> thousands are duplicates. I can see them with my eyes.
Yah, duplicate e-mails are so depressing.
In my case, with the .procmailrc setup, there are no duplicate emails at all.
I was merging mail from the laptop with the desktop machine. My sync process had not worked for many moons, so when I decided to repair it there was a backlog of thousands of mails. I happen to have very complex procmail recipes. The result was duplicates.
CER> I have a thunderbird extension to find duplicates. "Remove
CER> Duplicate Messages" by Eyal Rozenberg.
CER> https://github.com/eyalroz/removedupes/
CER> But it claims there are no duplicates.
CER> I have read the FAQ. Subsequently, I saved to file a message and
CER> its duplicate, to do a compare, and yes, there is a difference:
[...]
CER> The reason of that difference, is that each copy comes from a
CER> different run of my sorting script, the spam filter "thought
CER> differently".
CER> (note to myself: disable spam filtering in that script)
Duplicate mails should be determined by the unique Message-ID for each mail.
Not exclusively. For example, when people reply both to the list and to the person, you get two copies of the same mail, slightly different: same Message-ID, different Received headers, possibly a different footer and Reply-To. My procmail recipe moves one of those to a different folder for direct duped replies.
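To illustrate the point about Message-ID, here is a minimal Python sketch that only groups messages by that header. It is not how the Thunderbird addon works. The function name and the sample addresses are made up for the example. Note that it would also report the list copy and the direct copy of one mail as duplicates, because both carry the same Message-ID.

```python
import email
from collections import defaultdict


def duplicates_by_message_id(items):
    """Return {message_id: [keys]} for every Message-ID seen more than once.

    `items` is any iterable of (key, message) pairs.
    """
    groups = defaultdict(list)
    for key, msg in items:
        msg_id = (msg.get("Message-ID") or "").strip()
        if msg_id:  # messages without a Message-ID are never grouped
            groups[msg_id].append(key)
    return {mid: keys for mid, keys in groups.items() if len(keys) > 1}


# Hypothetical sample: a list copy and a direct copy of one mail,
# plus one unrelated mail.
sample = [
    (0, email.message_from_string(
        "Message-ID: <a@example.invalid>\nReceived: from list\n\nhello\n")),
    (1, email.message_from_string(
        "Message-ID: <a@example.invalid>\nReceived: direct\n\nhello\n")),
    (2, email.message_from_string(
        "Message-ID: <b@example.invalid>\n\nother\n")),
]
print(duplicates_by_message_id(sample))
```

Assuming the folder is stored as an mbox file, the (key, message) pairs could come from `mailbox.mbox(path).iteritems()` in the standard library.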
CER> Do you know some other tool to find and remove duplicates from a
CER> mail folder, where I can tell it to ignore some header?
CER> ("Delete" means moving to trash folder)
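As a rough idea of what "ignore some header" could look like, the following Python sketch computes a fingerprint of each message from its body and all headers except a chosen set. This is an assumption-laden illustration, not an existing tool: the header names in IGNORED_HEADERS are only examples (the headers that actually differed are not shown in this thread), and the function name is invented.

```python
import email
import hashlib

# Assumption: example header names that may differ between two copies of
# the same mail. Adjust to whatever your own filter actually adds.
IGNORED_HEADERS = {"received", "x-spam-status", "x-spam-level"}


def fingerprint(msg, ignored=IGNORED_HEADERS):
    """Hash the body plus every header whose name is not in `ignored`."""
    digest = hashlib.sha256()
    kept = sorted(
        (name.lower(), str(value).strip())
        for name, value in msg.items()
        if name.lower() not in ignored
    )
    for name, value in kept:
        digest.update(f"{name}:{value}\n".encode("utf-8", "replace"))
    # Hash the decoded payload of each non-multipart part.
    for part in msg.walk():
        if not part.is_multipart():
            digest.update(part.get_payload(decode=True) or b"")
    return digest.hexdigest()


# Hypothetical sample: a and b differ only in an ignored header,
# c has a different body.
a = email.message_from_string(
    "Message-ID: <a@example.invalid>\nSubject: hi\nX-Spam-Status: No\n\nhello\n")
b = email.message_from_string(
    "Message-ID: <a@example.invalid>\nSubject: hi\nX-Spam-Status: Yes\n\nhello\n")
c = email.message_from_string(
    "Message-ID: <a@example.invalid>\nSubject: hi\n\nhello, with a footer\n")
print(fingerprint(a) == fingerprint(b))  # True
print(fingerprint(a) == fingerprint(c))  # False
```

Messages with equal fingerprints would be candidates for moving to trash. A copy with a different footer still gets a different fingerprint, since the body is hashed as a whole.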
Is there no function in Thunderbird to determine duplicate messages by Message-ID?
It is an addon, and it considers several other criteria:

 x Author
 x Recipients ('To')
 x CC List
   Status Flags
 x Message ID
   Number of lines in message
 x Send time
 x Size (headers & Body)
 x Subject
   Folder
 x Body
   Time comparison resolution [seconds]

-- 
Cheers / Saludos,

		Carlos E. R.
		(from 15.4 x86_64 at Telcontar)