On Sun, 12 Aug 2018 13:00:32 +0200 Bernhard Voelker <mail@bernhard-voelker.de> wrote:
On 08/11/2018 01:00 AM, Greg Freemyer wrote:
All,
I don't know the origin of these files, but I have 100GB of corrupted PST files.
From what I can tell, some sort of processing/extraction tool went haywire and prepended binary junk to the real data. The actual data starts with a header whose first 4 chars are !BDN.
From what I've seen, the prepended junk is roughly 10-500 binary octets (chars) long. It is sort of like RAM slack, but at the start of the files. (No idea how that happened.)
If I knew for certain that the binary junk didn't have any newlines in it, this sed script would get rid of the junk:
find . -name \*.pst -exec sed -e '1s/^.*!BDN/!BDN/' -i "{}" \;
I know I can write a program to do the same but working in binary and not worrying about intervening newlines.
Is there a relatively straightforward way to accomplish the above?
fyi: I'm going to try and get replacement uncorrupted data files as well, but that might be easier said than done.
# Create a binary test file with a '!BDN' marker.
$ { cat /usr/bin/cat; printf '!BDN'; cat /usr/bin/cat; } > testfile

# Use awk(1) to print everything after and including the marker.
# (RS is consumed when awk splits records, so it is printed back explicitly.)
$ awk 'BEGIN{RS="!BDN";ORS=""} {if(n++)print RS $0}' < testfile > testfile2
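
To run this over a whole tree of files in place, one possible sketch follows. It is not from the thread and swaps awk for a different technique: grep reports the byte offset of the first '!BDN' and tail cuts the file from there. The helper name strip_prefix is hypothetical, the grep options -a, -b, -o and -m are GNU extensions, and it assumes the junk never contains '!BDN' by chance.

```shell
# strip_prefix FILE  (hypothetical helper, not part of the original thread)
# Removes everything before the first '!BDN' marker in FILE, in place.
strip_prefix() {
    f=$1
    # 0-based byte offset of the first marker. LC_ALL=C and -a make grep
    # treat the file as raw bytes; -F matches the marker as a fixed string.
    off=$(LC_ALL=C grep -aboF -m1 -e '!BDN' -- "$f" | head -n 1 | cut -d: -f1)
    # No marker found: report it and leave the file untouched.
    [ -n "$off" ] || { echo "no marker: $f" >&2; return 1; }
    # Marker already at the very start: nothing to strip.
    [ "$off" -gt 0 ] || return 0
    # tail -c +K starts output at byte K (1-based), hence off + 1.
    tail -c "+$((off + 1))" -- "$f" > "$f.tmp" && mv -- "$f.tmp" "$f"
}

# Example use (assumes file names without newlines; try it on copies first):
#   find . -name '*.pst' | while IFS= read -r f; do strip_prefix "$f"; done
```

Since mv replaces the original, the result is a new file with default permissions, so it is probably worth keeping a backup of the originals until the repaired files have been checked.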
Excellent :)
Have a nice day, Berny
You just made mine!