[opensuse] hypermail to mbox converter (search for help again)
Hi,

I just noticed that I probably have to convert our old hypermail archives for some lists to mbox format so that the new system can archive them again. There are several Perl scripts out there to do that. Unfortunately, none of them works out of the box on our old hypermail layout, so I have to hack them up to be able to convert the old archives. _But_ I'm knee-deep in other things at the moment regarding the new list server.

Now you come into play. This is the opportunity to help the openSUSE project achieve an important goal. I need someone to help me out by "fixing" one of the scripts below, or by writing something from scratch, so that we are able to convert the old archives to the new format.

Here is a gzipped tar archive with an old hypermail archive from lists.suse.com:
http://lists4.opensuse.org/1997-Aug.tar.gz

Here are links to some hypermail to mbox scripts:
http://archive.ncsa.uiuc.edu/lists/mhonarc/oct98/msg00055.html
http://www.tifaware.com/perl/hm2mbox/
http://www.bayesianinvestor.com/hypetombox.pl

It would be very cool if someone other than me could do that. If you want to help out, or need more information or anything else, then reply to this mail or contact me directly.

TIA!

Henne

-- 
Henne Vogelsang, Core Services
"Rules change. The Game remains the same." - Omar (The Wire)
---------------------------------------------------------------------
To unsubscribe, e-mail: opensuse-unsubscribe@opensuse.org
For additional commands, e-mail: opensuse-help@opensuse.org
On Thu, Jun 08, 2006 at 07:09:52PM +0200, Henne Vogelsang wrote:
Here is a gzipped tar archive with an old hypermail archive from lists.suse.com:
http://lists4.opensuse.org/1997-Aug.tar.gz
Here are links to some hypermail to mbox scripts:
http://archive.ncsa.uiuc.edu/lists/mhonarc/oct98/msg00055.html http://www.tifaware.com/perl/hm2mbox/ http://www.bayesianinvestor.com/hypetombox.pl
After running "./hypetombox.pl -a -d 1997-Aug/ -m mbox.mbox" all you need to do is convert the following lines:
From heiner.lamprecht@student.invalid Bogus date
Date: 1 Aug 1997 00:00:56 +0200
to
From heiner.lamprecht@student.invalid Fri Aug 1 00:00:56 1997
Date: 1 Aug 1997 00:00:56 +0200
No idea how to do it, so if somebody beats me to it (or finds another solution), please don't hesitate to post it.

What I will be working on is not so much a rewrite of the script, because I can't do Perl, but a (very ugly) bash approach to edit the dates, although at this moment I have no idea how to make changes to a previous line or how to get the day of the week.

sed -e "s/Bogus date/Fri Jun 9 01:55:45 2006/g" mbox.mbox > 3.mbox

That seems to work at least, although it is the wrong date (DUH). So proof of concept is given.

houghi

-- 
This openSUSE mailinglist is about the community. All discussion about the community is welcome. If you have a technical question just subscribe via this email address: suse-linux-e-subscribe@suse.com, post your original email again there, and you will get a straight answer.
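The two open questions above (changing a previous line, and getting the day of the week) can be sketched as follows. This is only an illustration in Python, not the Perl or bash discussed in the thread, and it is not anyone's actual fix. It assumes that the Date: header sits on the line directly after the "From ... Bogus date" line and that the wall-clock time in that header is what should end up in the From line, as in the example above. The names fix_bogus_dates and asctime_from_date_header are made up for this sketch.

```python
import re
import time
from datetime import datetime
from email.utils import parsedate_tz

# Matches the separator lines that hypetombox.pl could not date.
BOGUS_FROM = re.compile(r"^From (\S+) Bogus date\s*$")


def asctime_from_date_header(header_line):
    """Return e.g. 'Fri Aug  1 00:00:56 1997' for a 'Date: ...' line, else None."""
    name, _, value = header_line.partition(":")
    if name.strip().lower() != "date":
        return None
    parsed = parsedate_tz(value.strip())
    if parsed is None:
        return None
    try:
        # Keep the wall-clock time as written; the zone offset is ignored.
        stamp = datetime(*parsed[:6])
    except ValueError:
        return None
    # timetuple() fills in the correct weekday for asctime().
    return time.asctime(stamp.timetuple())


def fix_bogus_dates(lines):
    """Yield mbox lines, repairing 'Bogus date' from the line that follows."""
    address = None       # sender taken from a held-back From line
    held_line = None     # the original From line, in case it cannot be repaired
    for line in lines:
        if held_line is not None:
            stamp = asctime_from_date_header(line)
            if stamp is None:
                yield held_line              # no usable Date:, leave untouched
            else:
                yield f"From {address} {stamp}\n"
            held_line = None
        match = BOGUS_FROM.match(line)
        if match:
            address, held_line = match.group(1), line
        else:
            yield line
    if held_line is not None:
        yield held_line
```

Holding the From line back for one iteration is what makes "editing the previous line" possible in a single pass. A file could be processed with something like `out.writelines(fix_bogus_dates(src))` on two open file objects. Note that asctime() pads single-digit days with a space, so the result has two spaces before the day.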
On Fri, Jun 09, 2006 at 02:26:00AM +0200, houghi wrote:
No idea how to do it, so if somebody beats me to it (or finds another solution), please don't hesitate to post it.
OK, I have an extremely slow working script. There are two things you can do:

1) Edit ./hypetombox.pl, especially the lines with $received in them. This would be the best option. Unfortunately I am unable to do it. I can guess what things mean, but it did not produce anything workable. :-(

2) Edit the *.html files, especially the lines above. I have something working, but it is sloooooooooow. It works, but slowly: two directories took a total of 1m41.389s.

The script I have made also converts each directory to a $DIR.mbox file. It is available at http://houghi.org/script/SUSEmail.sh

It should be enough info to change it into something fast.

houghi
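The per-directory conversion to $DIR.mbox described above could be driven roughly like this. It is a sketch in Python, not the contents of SUSEmail.sh. The flags -a, -d and -m are copied from the command line quoted earlier in the thread, and the layout (one sub-directory per month under a common root) as well as the names build_commands and convert_all are assumptions made for illustration.

```python
import subprocess
from pathlib import Path


def build_commands(root, script="./hypetombox.pl"):
    """Return one hypetombox.pl command line per archive directory under root."""
    commands = []
    for directory in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        # 1997-Aug/ is written to 1997-Aug.mbox next to it.
        mbox = directory.with_name(directory.name + ".mbox")
        commands.append([script, "-a", "-d", str(directory), "-m", str(mbox)])
    return commands


def convert_all(root, script="./hypetombox.pl"):
    """Run the converter once per directory and stop on the first failure."""
    for command in build_commands(root, script):
        subprocess.run(command, check=True)
```

Building the command lists separately from running them makes it easy to print and check them before anything is converted.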
Hi, On Friday, June 09, 2006 at 05:32:07, houghi wrote:
OK, I have an extremely slow working script. What you can do are two things. 1) Edit ./hypetombox.pl, especially the lines with $received in them. This would be the best option. Unfortunately I am unable to do it. I can guess what things mean, but it did not produce anything workable. :-(
This is what I would like to have. Anyone good with Perl regular expressions here?
It is available at http://houghi.org/script/SUSEmail.sh
This produces more bogus date messages for me than the hypetombox.pl script alone does :-(

Henne
On Fri, Jun 09, 2006 at 11:52:53AM +0200, Henne Vogelsang wrote:
OK, I have an extremely slow working script. What you can do are two things. 1) Edit ./hypetombox.pl, especially the lines with $received in them. This would be the best option. Unfortunately I am unable to do it. I can guess what things mean, but it did not produce anything workable. :-(
This is what I would like to have. Anyone good with Perl regular expressions here?
It is available at http://houghi.org/script/SUSEmail.sh
This produces more bogus date messages for me than the hypetombox.pl script alone does :-(
Strange. I don't get any bogus date message. Do you get them on the directory that you posted (aug'97) or on other directories? I get this:

[12:38:03] [~/tmp/mail] houghi@penne : l
total 1692
drwxr-xr-x 3 houghi users     168 2006-06-09 12:38 ./
drwxr-xr-x 7 houghi users     392 2006-06-09 12:35 ../
drwxr-xr-x 2 houghi users   60368 2006-06-08 17:40 1997-Aug/
-rw-r--r-- 1 houghi users 1654438 2006-06-08 18:50 1997-Aug.tar.gz
-rwxr-xr-x 1 houghi users    8451 2001-12-03 06:31 hypetombox.pl*
-rw-r--r-- 1 houghi users     738 2006-06-09 05:27 SUSEmail.sh
[12:38:05] [~/tmp/mail] houghi@penne : sh SUSEmail.sh
1881 messages processed (1881 good messages : 0 bogus date messages)

houghi
On Fri, Jun 09, 2006 at 12:41:51PM +0200, houghi wrote:
Strange. I don't get any bogus date message. Do you get them on the directory that you posted (aug'97) or on other directories?
Still curious whether it happened with other directories. I can imagine that it did if the date format changed later on. If the error comes from other directories, it would be nice to know where it came from and to have that directory available for testing.

I have found str2time, which turns the date string into a time value. So far so good. The next step is to turn that into "ANSI C asctime() format". time2str turns it into an HTTP-style date instead. So who knows how to turn a time value into asctime() format in Perl? I have something, yet it does NOT give the correct output:

houghi@penne : diff hypetombox.pl.orig hypetombox.pl
40a41
> use HTTP::Date;
143a145,147
> elsif($time = str2time($received)) {
>     $date = time2str($time);
> }

houghi
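To make the mismatch concrete, here is the same timestamp rendered both ways. This is a Python sketch rather than a patch to hypetombox.pl; it uses the standard email.utils and time modules in place of Perl's HTTP::Date, and the function name http_and_asctime is made up for illustration. It assumes the Date: value carries a numeric zone offset such as +0200.

```python
import time
from email.utils import formatdate, mktime_tz, parsedate_tz


def http_and_asctime(date_value):
    """Return (HTTP-style string, asctime-style string) for a Date: value."""
    parsed = parsedate_tz(date_value)
    if parsed is None:
        raise ValueError(f"unparseable date: {date_value!r}")
    epoch = mktime_tz(parsed)      # seconds since the epoch, UTC based
    offset = parsed[9] or 0        # sender's zone offset in seconds
    # HTTP-style output: comma after the weekday, always GMT.
    http_style = formatdate(epoch, usegmt=True)
    # asctime() layout, shifted back to the sender's wall-clock time.
    asctime_style = time.asctime(time.gmtime(epoch + offset))
    return http_style, asctime_style
```

For "1 Aug 1997 00:00:56 +0200" the first string is "Thu, 31 Jul 1997 22:00:56 GMT" and the second is "Fri Aug  1 00:00:56 1997". Only the second has the layout an mbox "From " line expects, which is why an HTTP-style formatter alone does not produce the wanted output.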
Please have a look at http://www.thomashertweck.de/hypetombox.pl - the command "perl ./hypetombox.pl -d 1997-Aug -m mbox.1997Aug" worked for me without problems for the archive that was provided. The script is using Date::Manip, so only 5 lines of additional code were required.

Cheers,
Thomas
On Fri, Jun 09, 2006 at 09:15:32PM +0100, Thomas Hertweck wrote:
Please have a look at http://www.thomashertweck.de/hypetombox.pl - the command "perl ./hypetombox.pl -d 1997-Aug -m mbox.1997Aug" worked for me without problems for the archive that was provided.
The original did give errors. And apparently so with you, otherwise you would not have needed to edit it. ;-)
The script is using Date::Manip, so only 5 lines of additional code were required.
Seems to work flawlessly. If there are still errors, they could be coming from newer archives where the date is set differently.

It also shows the difference between me, who works like a monkey writing Shakespeare, and you, who knows what he is doing. :-)

Thomas: I hope that you will be reporting that back to the hypetombox developers.

Thanks for the solution.

houghi
houghi wrote:
On Fri, Jun 09, 2006 at 09:15:32PM +0100, Thomas Hertweck wrote:
Please have a look at http://www.thomashertweck.de/hypetombox.pl - the command "perl ./hypetombox.pl -d 1997-Aug -m mbox.1997Aug" worked for me without problems for the archive that was provided.
The original did give errors. And apparently so with you, otherwise you would not have needed to edit it. ;-)
Sure, you and Henne reported that. That's why I have changed it and uploaded a new version ;-) The command mentioned above was of course using the new version of hypetombox.pl - I thought this was obvious.
[...] Seems to work flawlessly. If there are still errors, they could be coming from newer archives where the date is set differently.
That's correct. But it should be fairly easy to adjust the script using Date::Manip. Date::Manip offers various functions and ways to format the output. By the way, there might still be some problems with the German umlauts.
It also shows the difference between me, who works like a monkey writing Shakespeare, and you, who knows what he is doing. :-)
Thomas: I hope that you will be reporting that back to the hypetombox developers.
Well, Henne should test it first; I am sure we might see some other problems (e.g. German umlauts). Once these problems are fixed, we might inform the original developers.

Cheers,
Th.
Hi, On Thursday, June 08, 2006 at 19:09:52, Henne Vogelsang wrote:
Now you come into play. This is the opportunity to help the openSUSE project achieve an important goal.
Thanks to Robert, Houghi, Jens-Daniel and Anders I have a working script now :)

Henne
Henne Vogelsang wrote:
On Thursday, June 08, 2006 at 19:09:52, Henne Vogelsang wrote:
Now you come into play. This is the opportunity to help the openSUSE project achieve an important goal.
Thanks to Robert, Houghi, Jens-Daniel and Anders I have a working script now :)
Just one comment: I have followed up Houghi's initial reports on this list and posted a working version of hypetombox.pl on Friday evening. I haven't seen any updates to this posting (except Houghi's reply on Saturday), I haven't seen any follow-up message from Robert, Jens-Daniel, or Anders telling the list that they are working on a solution, or that a solution has already been found earlier. This, from my point of view, is clearly a communication problem!

My conclusion from this incident: this was the first time I really tried to help out and it was certainly also the last time! I have lots of work to do and I don't want to waste my time and try to come up with solutions for problems at openSUSE if these solutions are then just ignored. If the solution did not work, somebody should have posted a message and informed me! If somebody had already found a solution earlier than Friday evening, he should have posted a message. This is just annoying, and it was a waste of time...

Cheers,
Th.
Hi, On Monday, June 12, 2006 at 19:17:52, Thomas Hertweck wrote:
Henne Vogelsang wrote:
On Thursday, June 08, 2006 at 19:09:52, Henne Vogelsang wrote:
Now you come into play. This is the opportunity to help the openSUSE project achieve an important goal.
Thanks to Robert, Houghi, Jens-Daniel and Anders I have a working script now :)
Just one comment: I have followed up Houghi's initial reports on this list and posted a working version of hypetombox.pl on Friday evening.
At that point I was already in the weekend. On Monday I had all the info that I needed to make hypetombox.pl work, including your mail. So your effort also helped. Sorry that I didn't include you in the thanks above. It was certainly helpful and appreciated, Thomas.

Henne
Henne Vogelsang wrote:
[...]
At that point I was already in the weekend. On Monday I had all the info that I needed to make hypetombox.pl work, including your mail. So your effort also helped. Sorry that I didn't include you in the thanks above. It was certainly helpful and appreciated, Thomas.
Thanks for the thanks, Henne ;-) But this was not the point I wanted to make (I didn't want to claim any publicity, I am happily working in the background). It's about working together in general and about efficiency! If several people work on the solution of one and the same problem and they don't know of each other's efforts, we're duplicating work/efforts and might reinvent the wheel... This is not good and I don't want to work like that! We need to inform others not only when there is work to do, but also when work is in progress or when work has been finished. Otherwise it's highly inefficient and it's not really like working in a team. This concerns every member of the community and I hope that this will improve in future.

Cheers,
Th.
participants (3)
- Henne Vogelsang
- houghi
- Thomas Hertweck