On Monday 12 November 2007 22:24, Chris Arnold wrote:
> Stefan Hundhammer wrote:
> > perl -p -i -e 's/oldtext/newtext/' *.html
>
> /oldtext will be lines of HTML and /newtext will be lines of PHP. Will perl
> still be able to do it, and if so, do I need to escape some of the code in
> oldtext and newtext? Example: <HTML>, the < and >, and <?php ;?>. Does any
> of that need to be escaped?
I did some experimenting, and admittedly there are some caveats with that stuff. But here is a skeleton for you:

    perl -p -i -0777 \
      -e 's/^.*<!DOCTYPE\s+html.*<body>/PHP-Header\n/si;' \
      -e 's:</body>.*$:<?PHP-Footer?>\n:si;' \
      -e 's:moreoldstuff:newoldstuff:g;' \
      *.html

Note: This is one single shell command. I just split it over several lines (with a backslash at each line end) for better legibility.

Let's take this apart.

perl -p :
This reads the files specified on the command line as input files line by line and prints each single line. If you don't do anything else, this is a glorified "copy" command. But with regular search-and-replace, this becomes more like a "sed" call.

Note there is also "perl -n" which does not print; you'd have to use the regular perl "print" command to write something to the output file. (Unlike sed, perl has no 'p' flag on a substitution that prints the line.)

-i :
This does all changes in-place, i.e. you don't need to supply an input and an output file. Otherwise you'd have to write your own loop in the shell and do something like

    perl -p -e'<do something>' <infile >outfile

perl -i does that loop for you: it reads from each infile on the command line, writes to a new file and afterwards renames the new file to the name of the infile. You can also specify a backup file extension: "perl -i.bak" will back up all old infiles to "infile.bak".

-e :
This specifies one perl expression. You can use several -e args, but then you need to terminate all of them (except the last one) with a semicolon. A bit unlike "sed", unfortunately.

By default, perl reads one single line from the infile, processes it with all your -e expressions and (with -p) writes it to the outfile. This is what most people need in most cases. You can use that as a "sed" substitute with in-place editing - this is what I wrote in my first post to this thread:

    perl -p -i -e 's/oldstuff/newstuff/g' *.html

The '/g' at the end tells perl to do that globally, i.e., for every match on the line. Otherwise it would replace only the first match on each line. Just like "sed".
In your special case, though, you want to search and replace over multiple lines. That's a bit trickier.

For one thing, you need to tell perl to read more than just one line at a time. For example, the entire file at once. This is what -0777 does: It changes the input record delimiter from \n (newline) to character 0777 (octal) which doesn't exist, thus the entire file is read at once. (See "man perlrun").

Then, you also have to make perl match more than just one single line in a regular expression:

    s/oldstuff/morestuff/s

The /s lets '.' match newlines, too; without it, '.*' stops at the end of each line. Note that this is necessary in addition to having the whole file in one single string.

I also added /i to match case-insensitively, which makes sense for HTML tags.

Quoting regular expressions is another tricky part. You have to watch carefully which characters in "oldstuff" have a special meaning in perl's (very powerful!) regular expressions. In your example, < and > are harmless, but ? is a quantifier, so a literal "<?php" in the search part has to be written "<\?php". Typically that's not a problem, because that part is hand-written, not variable stuff coming from a file.

\s is useful: It's a shorthand for "any kind of whitespace character" - blank, tab, newline. "\s+" means "at least one whitespace character, but maybe more (any number)".

In the replace text there are far fewer characters with special meaning. $1, $2, ... $9 come to mind; they are placeholders for an expression in parentheses in "oldstuff".

If your search regexp contains slashes, it makes sense to use some other delimiter character; this is what I did in the other -e expressions:

    s:oldstuff/with/slashes:newstuff:

You could also escape every single slash with a backslash, but that's tedious, error-prone and it looks ugly:

    s/oldstuff\/with\/slashes/newstuff/

Did I forget something? Probably. But hey, I don't want to deprive you of all intellectual challenges. ;-)

I hope I gave you some good starting points, though.

More info:

    man perlre   (Perl regular expressions)
    man perlrun  (Perl command line switches)
    man perlop   (Perl quoting)

CU
--
Stefan Hundhammer <sh@suse.de>    Penguin by conviction.
YaST2 Development
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
Nürnberg, Germany