[opensuse] BASH - rankinism - best way to trim string and add ellipses (or other char)?
DH, All,

I want a function 'ellipse' that will take a string 'str', trim it to length 'len', and set the last 'end' characters (default = 3) to 'chr' (default = '.'). So if I call it like this:

    myvar=0123456789
    echo "mydefvar: $(ellipse $myvar 7)"
    echo "mynewvar: $(ellipse $myvar 7 3 '*')"

I get:

    mydefvar: 0123...
    mynewvar: 0123***

What I have come up with is the following, but I would like comments on how to make it better, etc. Is there a way to eliminate the for loop and just use string substitution to overwrite the last 'end' characters with 'chr' that would be more efficient? What says the brain trust?

    ellipse() {
        # validate sufficient parameters passed
        [[ -z $1 ]] || [[ -z $2 ]] && { echo "ERROR: Insufficient input in fxn ellipse"; return 1; }
        str="$1"
        lim="$2"
        end=${3:-3}
        chr=${4:-.}
        # validate integers
        [ $lim -eq $lim ] 2>/dev/null || return 1
        [ $end -eq $end ] 2>/dev/null || return 1
        [[ ${#str} -gt $lim ]] && {
            newstr=${str:0:$((lim-end))}
            for((i=$((lim-end));i<$lim;i++)); do
                newstr=${newstr}${chr}
            done
            echo "$newstr"
        } || {
            echo "$str"
        }
    }

-- 
David C. Rankin, J.D.,P.E.
-- 
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse+owner@opensuse.org
Hello, On Tue, 29 Nov 2011, David C. Rankin wrote:
I want a function 'ellipse' that will take a string 'str' and trim it to length 'len' and set the 'end' number (default = 3) of characters to 'chr' (default='.'). So if I call it like this:
    myvar=0123456789
    echo "mydefvar: $(ellipse $myvar 7)"
    echo "mynewvar: $(ellipse $myvar 7 3 '*')"
I get:
    mydefvar: 0123...
    mynewvar: 0123***
What I have come up with is the following, but I would like comment on how to make it better, etc.. Is there a way to eliminate the for loop and just use string substitution to overwrite the last 'end' number of characters with 'chr' that would be more efficient? What says the brain trust.
I can't think of a way, but that's probably because I've "perl" in the back of my head.
ellipse() {
    # validate sufficient parameters passed
    [[ -z $1 ]] || [[ -z $2 ]] && { echo "ERROR: Insufficient input in fxn ellipse"; return 1; }
You need to quote $1 and $2 here. Try with empty $1 or $2 to see why.
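A note on where quoting changes the outcome: bash's `[[ ... ]]` keyword does not word-split its operands, so an unquoted empty `$1` is still seen as one empty operand there. With the single-bracket `[` (or `test`) command, the unquoted empty value disappears entirely. The sketch below contrasts the two; the helper names are hypothetical and exist only for this illustration.

```shell
#!/bin/bash
# Hypothetical helper names; they only illustrate the two test forms.

# [[ ... ]] is a bash keyword: no word splitting or globbing happens on $1,
# so an empty or unset value is still seen as one (empty) operand.
is_empty_keyword() {
    [[ -z $1 ]]
}

# [ ... ] is an ordinary command: an unquoted empty $1 vanishes, leaving
# `[ -n ]`, a one-argument test that is true because "-n" is a non-empty string.
is_nonempty_unquoted() {
    [ -n $1 ]
}

# Quoting keeps the empty operand in place, so the test behaves as intended.
is_nonempty_quoted() {
    [ -n "$1" ]
}

is_empty_keyword ""     && echo "[[ -z \$1 ]]  : empty detected"
is_nonempty_unquoted "" && echo "[ -n \$1 ]    : wrongly reports non-empty"
is_nonempty_quoted ""   || echo "[ -n \"\$1\" ]  : empty detected"
```

Quoting everywhere remains a reasonable habit, since it is harmless inside `[[ ]]` and necessary with `[ ]`.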
    str="$1"
    lim="$2"
    end=${3:-3}
    chr=${4:-.}
    # validate integers
    [ $lim -eq $lim ] 2>/dev/null || return 1
    [ $end -eq $end ] 2>/dev/null || return 1
    [[ ${#str} -gt $lim ]] && {
        newstr=${str:0:$((lim-end))}
        for((i=$((lim-end));i<$lim;i++)); do
            newstr=${newstr}${chr}
        done
        echo "$newstr"
    } || {
        echo "$str"
    }
}
JFTR:

    if test ${#str} -gt $lim; then
        newstr=${str:0:$((lim-end))};
        perl -e "printf qq[%s%${end}s\n], qq[$newstr], qq[$chr] x $end;"
    else
        printf "$str\n";
    fi

Of course you could truncate $str in perl too, etc. pp., but once you are using perl you should probably do the whole script (or at least the function) in perl, otherwise it'll be much slower.

HTH,
-dnh

-- 
Who do I have to kill to get some attention around here!
        -- Georgia 'George' Lass, Dead Like Me
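One detail in the else branch deserves care: `printf "$str\n"` uses the data itself as the format string, so a `%` or backslash inside `$str` gets interpreted instead of printed. A minimal sketch of the usual alternative follows; `print_plain` is a hypothetical helper name, not something from the thread.

```shell
#!/bin/bash
# print_plain is a hypothetical helper, not part of the thread's code.
# Passing the data as an argument to a fixed '%s\n' format means any
# '%' or backslash in the string is printed as-is rather than interpreted.
print_plain() {
    printf '%s\n' "$1"
}

print_plain '100% done'
print_plain 'path\name'
```

The same form works for the truncated string, so neither branch needs to trust the content of its input.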
On 11/30/2011 07:32 AM, David Haller wrote:
JFTR:
    if test ${#str} -gt $lim; then
        newstr=${str:0:$((lim-end))};
        perl -e "printf qq[%s%${end}s\n], qq[$newstr], qq[$chr] x $end;"
    else
        printf "$str\n";
    fi
Of course you could truncate $str in perl too, etc. pp., but once you are using perl you should probably do the whole script (or at least the function) in perl, otherwise it'll be much slower.
HTH, -dnh
It does, that was exactly what I was looking for in return. I would have been disappointed if you had come up with a much simpler way in BASH :)

-- 
David C. Rankin, J.D.,P.E.
On Tuesday 29 November 2011 21:57:42 David C. Rankin wrote:
I want a function 'ellipse' that will take a string 'str' and trim it to length 'len' and set the 'end' number (default = 3) of characters to 'chr' (default='.'). So if I call it like this:
    myvar=0123456789
    echo "mydefvar: $(ellipse $myvar 7)"
    echo "mynewvar: $(ellipse $myvar 7 3 '*')"
I get:
    mydefvar: 0123...
    mynewvar: 0123***
What I have come up with is the following, but I would like comment on how to make it better, etc.. Is there a way to eliminate the for loop and just use string substitution to overwrite the last 'end' number of characters with 'chr' that would be more efficient? What says the brain trust.
    ellipse() {
        # validate sufficient parameters passed
        [[ -z $1 ]] || [[ -z $2 ]] && { echo "ERROR: Insufficient input in fxn ellipse"; return 1; }
        str="$1"
        lim="$2"
        end=${3:-3}
        chr=${4:-.}
        # validate integers
        [ $lim -eq $lim ] 2>/dev/null || return 1
        [ $end -eq $end ] 2>/dev/null || return 1
        [[ ${#str} -gt $lim ]] && {
            newstr=${str:0:$((lim-end))}
            for((i=$((lim-end));i<$lim;i++)); do
                newstr=${newstr}${chr}
            done
            echo "$newstr"
        } || {
            echo "$str"
        }
    }
If you're still inclined to go the bash way, perhaps something like this would work (parameter validation and other sanity checks omitted):

    function ellipse() {
        str="${1}"
        lim="${2}"
        rem=${3:-3}
        char=${4:-.}
        [[ ${#str} -gt ${lim} ]] && {
            begin=${str:0:$((lim-rem))};
            end=${str:$((lim-rem)):${rem}};
            echo ${begin}${end//?/${char}};
        } || echo "${str}"
    }

-- 
Martti
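The substitution idea above can also be written with local variables, an explicit if/else, and quoted output. The sketch below is one possible variant, not the code posted in the thread: the name `ellipse_q` and the variable `tail` are made up here, and validation is still omitted. Quoting matters mostly when the fill character is `*`, because an unquoted result such as `0123***` is subject to filename expansion.

```shell
#!/bin/bash
# ellipse_q: a hypothetical, more defensive variant of the substitution approach.
ellipse_q() {
    local str=$1 lim=$2 rem=${3:-3} char=${4:-.}
    if (( ${#str} > lim )); then
        local begin=${str:0:lim-rem}
        local tail=${str:lim-rem:rem}
        # Every character of the tail is replaced by $char; the quotes stop
        # the shell from glob-expanding a result such as 0123***.
        printf '%s\n' "${begin}${tail//?/$char}"
    else
        printf '%s\n' "$str"
    fi
}

ellipse_q 0123456789 7        # prints 0123...
ellipse_q 0123456789 7 3 '*'  # prints 0123***
ellipse_q short 7             # prints short
```

The offset and length fields of `${var:offset:length}` are evaluated as arithmetic, so `lim-rem` can be written there without `$(( ))`.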
On 12/03/2011 12:36 PM, Martti Laaksonen wrote:
[[ ${#str} -gt ${lim} ]]&& { begin=${str:0:$((lim-rem))}; end=${str:$((lim-rem)):${rem}}; echo ${begin}${end//?/${char}}; } || echo "${str}"
OK,

That is pretty darn slick! Thanks Martti. I wouldn't have thought about string substitution using a single char wildcard as the substring no matter how many times I looked at ${string//substring/replacement} :)

Now the question becomes -- which takes fewer clock cycles? ANSWER - both are so close it's hard to call even with 100 repeated calls. For those interested, a brief synopsis of the testing was:

Hmm... elm.sh=Martti's way; eld.sh=David's way (first run nothing in cache):

    10:15 providence:~/scr/dev> time sh elm.sh my-very-long-string-of-stuff 24 4 +
    my-very-long-string-++++

    real 0m0.010s
    user 0m0.003s
    sys 0m0.003s

    10:17 providence:~/scr/dev> time sh eld.sh my-very-long-string-of-stuff 24 4 +
    my-very-long-string-++++

    real 0m0.011s
    user 0m0.007s
    sys 0m0.003s

Martti's wins -- now let's run it a second time with the files in cache:

    10:17 providence:~/scr/dev> time sh elm.sh my-very-long-string-of-stuff 24 4 +
    my-very-long-string-++++

    real 0m0.010s
    user 0m0.007s
    sys 0m0.003s

    10:19 providence:~/scr/dev> time sh eld.sh my-very-long-string-of-stuff 24 4 +
    my-very-long-string-++++

    real 0m0.010s
    user 0m0.003s
    sys 0m0.003s

Now -- I just don't get it -- results are opposite. (although the times are both so small as to be in the noise) Oh well, I still like the elegance of the string substitution that eliminates the for loop.
OK, what about 100 times:

    time for ((i=0;i<100;i++)); do sh elm.sh my-very-long-string-of-stuff 24 4 +; done

    real 0m1.021s
    user 0m0.587s
    sys 0m0.220s

    time for ((i=0;i<100;i++)); do sh eld.sh my-very-long-string-of-stuff 24 4 +; done

    real 0m1.005s
    user 0m0.577s
    sys 0m0.220s

For those interested, the test code was:

    10:19 providence:~/scr/dev> cat elm.sh
    #!/bin/bash

    function emartti() {
        # validate sufficient parameters passed
        [[ -z "$1" ]] || [[ -z "$2" ]] && { echo "ERROR: Insufficient input in fxn emartti"; return 1; }
        str="${1}"
        lim="${2}"
        rem=${3:-3}
        char=${4:-.}
        # validate integers
        [ $lim -eq $lim ] 2>/dev/null || return 1
        [ $rem -eq $rem ] 2>/dev/null || return 1
        [[ ${#str} -gt ${lim} ]] && {
            begin=${str:0:$((lim-rem))};
            end=${str:$((lim-rem)):${rem}};
            echo ${begin}${end//?/${char}};
        } || echo "${str}"
    }

    [[ -n $4 ]] && {
        emartti "$1" "$2" "$3" "$4"
        exit 0
    }
    [[ -n $3 ]] && {
        emartti "$1" "$2" "$3"
        exit 0
    }
    [[ -n $2 ]] && {
        emartti "$1" "$2"
        exit 0
    }
    [[ -z $1 ]] || [[ -z $2 ]] && { echo "ERROR: Insufficient input in fxn emartti"; exit 1; }
    exit 0

    10:22 providence:~/scr/dev> cat eld.sh
    #!/bin/bash

    ellipse() {
        # validate sufficient parameters passed
        [[ -z "$1" ]] || [[ -z "$2" ]] && { echo "ERROR: Insufficient input in fxn ellipse"; return 1; }
        str="$1"
        lim="$2"
        end=${3:-3}
        chr=${4:-.}
        # validate integers
        [ $lim -eq $lim ] 2>/dev/null || return 1
        [ $end -eq $end ] 2>/dev/null || return 1
        [[ ${#str} -gt $lim ]] && {
            newstr=${str:0:$((lim-end))}
            for((i=$((lim-end));i<$lim;i++)); do
                newstr=${newstr}${chr}
            done
            echo "$newstr"
        } || {
            echo "$str"
        }
    }

    [[ -n $4 ]] && {
        ellipse "$1" "$2" "$3" "$4"
        exit 0
    }
    [[ -n $3 ]] && {
        ellipse "$1" "$2" "$3"
        exit 0
    }
    [[ -n $2 ]] && {
        ellipse "$1" "$2"
        exit 0
    }
    [[ -z $1 ]] || [[ -z $2 ]] && { echo "ERROR: Insufficient input in fxn ellipse"; exit 1; }
    exit 0

-- 
David C. Rankin, J.D.,P.E.
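Both test scripts validate their numeric arguments with `[ $lim -eq $lim ] 2>/dev/null`, which also lets negative numbers through and does not check that the fill count fits inside the limit. A stricter check could look like the sketch below; `is_uint` and `check_args` are hypothetical names, and the error messages are made up for the example.

```shell
#!/bin/bash
# is_uint and check_args are hypothetical helper names used only for this sketch.

# True only for a non-empty string made up entirely of decimal digits.
is_uint() {
    [[ $1 =~ ^[0-9]+$ ]]
}

# Validate the numeric arguments before any substring arithmetic is done.
check_args() {
    local lim=$1 rem=${2:-3}
    is_uint "$lim" || { echo "ERROR: limit must be a non-negative integer" >&2; return 1; }
    is_uint "$rem" || { echo "ERROR: count must be a non-negative integer" >&2; return 1; }
    # lim - rem is later used as a substring length, so it must not go negative.
    # The 10# prefix forces base 10 so that input like 08 is not read as octal.
    (( 10#$rem <= 10#$lim )) || { echo "ERROR: count exceeds limit" >&2; return 1; }
}

check_args 24 4 && echo "24 4: ok"
check_args 3 5 2>/dev/null || echo "3 5: rejected"
check_args abc 4 2>/dev/null || echo "abc 4: rejected"
```

Whether a fill count larger than the limit should be an error or simply clamped is a design choice the thread does not settle; the sketch rejects it.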
Hello, On Wed, 07 Dec 2011, David C. Rankin wrote:
On 12/03/2011 12:36 PM, Martti Laaksonen wrote:
[[ ${#str} -gt ${lim} ]]&& { begin=${str:0:$((lim-rem))}; end=${str:$((lim-rem)):${rem}}; echo ${begin}${end//?/${char}}; } || echo "${str}"
OK,
That is pretty darn slick! Thanks Martti. I wouldn't have thought about string substitution using a single char wildcard as the substring no matter how many times I looked at ${string//substring/replacement} :)
Me too. At least at the time.
Hmm... elm.sh=Martti's way; eld.sh=David's way (first run nothing in cache):
    10:15 providence:~/scr/dev> time sh elm.sh my-very-long-string-of-stuff 24 4 +
    my-very-long-string-++++

    real 0m0.010s
    user 0m0.003s
    sys 0m0.003s

    10:17 providence:~/scr/dev> time sh eld.sh my-very-long-string-of-stuff 24 4 +
    my-very-long-string-++++

    real 0m0.011s
    user 0m0.007s
    sys 0m0.003s
Martti's wins -- now let's run it a second time with the files in cache:
That's noise. That's basically the startup time of the shell. And BTW: both versions use bash-specific stuff (I think you should run them with 'bash' rather than 'sh').

====
If bash is invoked with the name sh, it tries to mimic the startup behavior of historical versions of sh as closely as possible, while conforming to the POSIX standard as well.
====
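To see which situation a script is actually in, it can inspect its own interpreter. The following is a small sketch under the assumption that `$BASH_VERSION` is set only by bash; `shell_kind` is a hypothetical helper name. On a system where `sh` is bash, running the script as `sh script` should report POSIX mode.

```shell
#!/bin/bash
# shell_kind is a hypothetical helper; it only reports how the script was started.
shell_kind() {
    if [ -z "${BASH_VERSION:-}" ]; then
        echo "other"                 # not bash at all
    elif shopt -qo posix; then
        echo "bash (posix mode)"     # e.g. started as `sh script`
    else
        echo "bash"
    fi
}

shell_kind
```

On systems where `sh` is a different shell altogether, constructs such as `[[ ]]` and `${var//pattern/replacement}` are not guaranteed to work, which is a practical reason to invoke these scripts with `bash` explicitly.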
    10:17 providence:~/scr/dev> time sh elm.sh my-very-long-string-of-stuff 24 4 +
    my-very-long-string-++++

    real 0m0.010s
    user 0m0.007s
    sys 0m0.003s

    10:19 providence:~/scr/dev> time sh eld.sh my-very-long-string-of-stuff 24 4 +
    my-very-long-string-++++

    real 0m0.010s
    user 0m0.003s
    sys 0m0.003s
Now -- I just don't get it -- results are opposite. (although the times are both so small as to be in the noise) Oh well, I still like the elegance of the string substitution that eliminates the for loop.
Same again. "Noise". Rule of thumb: anything running under ~10s is not reliable. Note the "user" times in above 4 examples, they're a clear indication of the measurement being "noise".
OK, what about 100 times:
    time for ((i=0;i<100;i++)); do sh elm.sh my-very-long-string-of-stuff 24 4 +; done

    real 0m1.021s
    user 0m0.587s
    sys 0m0.220s

    time for ((i=0;i<100;i++)); do sh eld.sh my-very-long-string-of-stuff 24 4 +; done

    real 0m1.005s
    user 0m0.577s
    sys 0m0.220s
Still basically noise. But the explanation is easy: both versions use shell builtins only, and you can bet that ${v//p/r} internally expands to some loop, with the substitution done inside the loop. With the loop version, you have a truncation, an explicit loop and simple concatenations inside the loop, which _might_ be faster than the substitutions.

Please run it again with the outer loop repeated often enough (say 2000 times instead of 100?) that the faster one runs for over 10s, but that's only out of academic interest ;)

-dnh, BTW: one more thing I like about perl: it has the "Benchmark" module where you can e.g. compare stuff, with a sane default minimum running time ;)

-- 
Wow, I'm being shot at from both sides. That means I *must* be right. :-)
        -- Larry Wall in <199710211959.MAA18990@wall.org>
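Since shell startup dominates the numbers above, one way to compare the two approaches is to call both functions repeatedly inside a single bash process. The sketch below does that; the functions are condensed restatements of the loop and substitution versions (validation left out), and the names, the `RUNS` variable, and the default of 2000 iterations are assumptions of this example. It makes no claim about which version comes out ahead.

```shell
#!/bin/bash
# Both functions are condensed restatements of the two approaches in the thread
# (validation omitted); the names and the iteration count are assumptions.
ellipse_loop() {
    local str=$1 lim=$2 end=${3:-3} chr=${4:-.} newstr i
    if (( ${#str} > lim )); then
        newstr=${str:0:lim-end}
        for (( i = lim - end; i < lim; i++ )); do
            newstr+=$chr
        done
        printf '%s\n' "$newstr"
    else
        printf '%s\n' "$str"
    fi
}

ellipse_subst() {
    local str=$1 lim=$2 rem=${3:-3} chr=${4:-.} begin tail
    if (( ${#str} > lim )); then
        begin=${str:0:lim-rem}
        tail=${str:lim-rem:rem}
        printf '%s\n' "${begin}${tail//?/$chr}"
    else
        printf '%s\n' "$str"
    fi
}

runs=${RUNS:-2000}   # raise this until the totals are well above timer noise

echo "loop version:" >&2
time for (( n = 0; n < runs; n++ )); do
    ellipse_loop my-very-long-string-of-stuff 24 4 + >/dev/null
done

echo "substitution version:" >&2
time for (( n = 0; n < runs; n++ )); do
    ellipse_subst my-very-long-string-of-stuff 24 4 + >/dev/null
done
```

Running it as `RUNS=200000 bash bench.sh` (the file name is arbitrary) pushes the totals into a range where a difference, if any, is easier to tell apart from noise.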
On 07/12/11 13:35, David C. Rankin wrote: :
    10:15 providence:~/scr/dev> time sh elm.sh my-very-long-string-of-stuff 24 4 +
    my-very-long-string-++++

    real 0m0.010s
    user 0m0.003s
    sys 0m0.003s
Does "rankinism" equal or include masochism? :D
Hello, On Thu, 08 Dec 2011, Cristian Rodríguez wrote:
On 07/12/11 13:35, David C. Rankin wrote:
    10:15 providence:~/scr/dev> time sh elm.sh my-very-long-string-of-stuff 24 4 +
    my-very-long-string-++++

    real 0m0.010s
    user 0m0.003s
    sys 0m0.003s
Does "rankinism" equals or includes masochism ? :D
You should try "hallerisms" ;P

-dnh, I even got verbed on the German list ...

-- 
Sigmonster was here!
participants (4)
- Cristian Rodríguez
- David C. Rankin
- David Haller
- Martti Laaksonen