[opensuse] grep sed and awk question
I'm regex'd out of it at the moment. Given a string like this:
lynn:*:3000002some other stuff100hellolynn:
Is there a way to get the 3000002 into a variable v1 and the 100 into a variable v2? bash? Any recommended starting point?
Thanks, L x
--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse+owner@opensuse.org
On 02/13/2012 12:05 AM, lynn wrote:
I'm regex'd out of it at the moment. Given a string like this:
lynn:*:3000002some other stuff100hellolynn:
Is there a way to get the 3000002 into a variable v1 and the 100 into a variable v2? bash? Any recommended starting point? Thanks, L x
Thinking out loud:
v1="lynn:*:3000002some other stuff100hellolynn:";echo "${v1//[!0-9]}"
gives: 3000002100
But I may have:
lynn2:*:3000002some other stuff100hellolynn:
which gives: 23000002100
How to select only the 3000002?
Thanks
* lynn <lynn@steve-ss.com> [120213 00:40]:
On 02/13/2012 12:05 AM, lynn wrote:
I'm regex'd out of it at the moment. Given a string like this:
lynn:*:3000002some other stuff100hellolynn:
Is there a way to get the 3000002 into a variable v1 and the 100 into a variable v2? bash? Any recommended starting point? Thanks, L x
Thinking out loud: v1="lynn:*:3000002some other stuff100hellolynn:";echo "${v1//[!0-9]}" gives: 3000002100
But I may have: lynn2:*:3000002some other stuff100hellolynn: which gives: 23000002100
How to select only the 3000002 Thanks
Does it have to be grep, sed and awk?
cut -c 9-15
awk -F\: '{print $3}' |cut -c 1-7
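To see how these two suggestions behave when the name changes length, here is a small sketch run against the two sample lines from the thread. The variable names (`line1`, `col1`, `fld1` and so on) are made up for illustration. It assumes, as both commands do, that the number is always 7 digits long.

```shell
#!/bin/sh
# Sample lines copied from the thread; variable names are illustrative only.
line1='lynn:*:3000002some other stuff100hellolynn:'
line2='lynn2:*:3000002some other stuff100hellolynn:'

# Fixed columns 9-15: lines up with the number only when the name has 5 characters.
col1=$(printf '%s\n' "$line1" | cut -c 9-15)
col2=$(printf '%s\n' "$line2" | cut -c 9-15)

# Split on ':' first, then take the first 7 characters of field 3.
fld1=$(printf '%s\n' "$line1" | awk -F: '{print $3}' | cut -c 1-7)
fld2=$(printf '%s\n' "$line2" | awk -F: '{print $3}' | cut -c 1-7)

echo "cut -c 9-15:  $col1 / $col2"
echo "awk then cut: $fld1 / $fld2"
```

With the 4-character name the fixed-column version is off by one, while splitting on the colon first gives the same answer for both lines.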
On 2/12/2012 6:39 PM, lynn wrote:
On 02/13/2012 12:05 AM, lynn wrote:
I'm regex'd out of it at the moment. Given a string like this:
lynn:*:3000002some other stuff100hellolynn:
Is there a way to get the 3000002 into a variable v1 and the 100 into a variable v2? bash? Any recommended starting point? Thanks, L x
Thinking out loud: v1="lynn:*:3000002some other stuff100hellolynn:";echo "${v1//[!0-9]}" gives: 3000002100
But I may have: lynn2:*:3000002some other stuff100hellolynn: which gives: 23000002100
How to select only the 3000002 Thanks
It's not the most efficient, but I think something like:
string="lynn:*:3000002some other stuff100hellolynn:"
v1=`echo "$string" | sed -e "s/^.*:\*://" -e "s/[a-z ].*//"`
v2=`echo "$string" | sed -e "s/^.*:\*:[a-z0-9]* [a-z ]*//" \
-e "s/[a-z]*://"`
echo v1 $v1
echo v2 $v2
might be somewhat usable, assuming high performance isn't the rule of the day.
lynn wrote:
I'm regex'd out of it at the moment. Given a string like this:
lynn:*:3000002some other stuff100hellolynn:
Is there a way to get the 3000002 into a variable v1 and the 100 into a variable v2? bash? Any recommended starting point?
Assuming you've got a file with such lines, this might get you started:
sed -r -e 's/^[^0-9]*([0-9]+)[^0-9]+([0-9]+).*$/\1 \2/' file
--
Per Jessen, Zürich (-9.9°C)
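One possible way to get from that sed output to the two variables asked for is sketched below. It feeds a single sample line through the same substitution and splits the result at the space. The variable names `line` and `pair` are assumptions for the example, and `-E` is used in place of `-r` (GNU sed accepts both).

```shell
#!/bin/sh
# One sample line from the thread; in practice this would come from a file or command.
line='lynn:*:3000002some other stuff100hellolynn:'

# -E enables extended regular expressions (GNU sed also accepts -r).
# The substitution leaves "first-number second-number" separated by a space.
pair=$(printf '%s\n' "$line" | sed -E 's/^[^0-9]*([0-9]+)[^0-9]+([0-9]+).*$/\1 \2/')

# Split the pair at the space with parameter expansion.
v1=${pair%% *}
v2=${pair##* }

echo "v1=$v1 v2=$v2"
```

As noted earlier in the thread, a digit in the name (for example lynn2) would be picked up as the first number, so this only suits input where the first run of digits is the one wanted.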
On 02/13/2012 08:27 AM, Per Jessen wrote:
lynn wrote:
I'm regex'd out of it at the moment. Given a string like this:
lynn:*:3000002some other stuff100hellolynn:
Is there a way to get the 3000002 into a variable v1 and the 100 into a variable v2? bash? Any recommended starting point? Assuming you've got a file with such lines, this might get you started:
sed -r -e 's/^[^0-9]*([0-9]+)[^0-9]+([0-9]+).*$/\1 \2/' file
Hi
Thanks for the input everyone. It's helped me get started. I should have been more specific. I've narrowed down the task to getting just the first number in a string _but_ the output comes from the wbinfo command, e.g.
wbinfo -i lynn
CACTUS\lynn:*:3000002:100::/home/CACTUS/lynn2:/bin/bash
I want to extract the 3000002.
I've narrowed it down to this:
#!/bin/bash
str=$(wbinfo -i "$1")
echo "$str" | sed -r 's/^([^.]+).*$/\1/; s/^[^0-9]*([0-9]+).*$/\1/'
which gives 3000004. Good.
But if the user is called lynn2, it gives 2.
So the problem comes down to: how to get the first number in the string _after_ the *: sequence?
(This would work for wbinfo --group-info too as it is the same format)
Thanks, L x
On Mon, Feb 13, 2012 at 10:03 AM, lynn <lynn@steve-ss.com> wrote:
Hi Thanks for the input everyone. It's helped me get started. I should have been more specific. I've narrowed down the task to getting just the first number in a string _but_ the output comes from the wbinfo command e.g.
wbinfo -i lynn CACTUS\lynn:*:3000002:100::/home/CACTUS/lynn2:/bin/bash
I want to extract the 3000002
I've narrowed it down to this:
#!/bin/bash str=$(wbinfo -i $1) echo $str | sed -r 's/^([^.]+).*$/\1/; s/^[^0-9]*([0-9]+).*$/\1/'
which gives 3000004. Good.
But if the user is called lynn2, it gives 2
So the problem comes down to: how to get the first number in the string _after_ the *: sequence
(This would work for wbinfo --group-info too as it is the same format) Thanks, L x
It would appear that the field in the wbinfo output you want to extract has a colon before and after it? Hence can't you use ...
wbinfo -i lynn | awk -F":" '{print $3}'
Regards
Tim
--
Tim Hempstead thempstead@gmail.com
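A quick way to try this without a Samba setup is to feed awk a stand-in for the wbinfo output, as in the sketch below. The first line is the example from the thread. The second, for a user named lynn2, is a made-up variation, and the variable names are illustrative only.

```shell
#!/bin/sh
# Stand-ins for `wbinfo -i` output; out2 is a hypothetical variation for user lynn2.
out1='CACTUS\lynn:*:3000002:100::/home/CACTUS/lynn2:/bin/bash'
out2='CACTUS\lynn2:*:3000002:100::/home/CACTUS/lynn2:/bin/bash'

# -F: makes ':' the field separator, so $3 is whatever sits between
# the second and third colon, regardless of digits in the name.
id1=$(printf '%s\n' "$out1" | awk -F: '{print $3}')
id2=$(printf '%s\n' "$out2" | awk -F: '{print $3}')

# The neighbouring field can be picked the same way.
fourth=$(printf '%s\n' "$out1" | awk -F: '{print $4}')

echo "$id1 $id2 $fourth"
```

Because the split happens on the colons, the digit in lynn2 stays in field 1 and never reaches the result.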
On 02/13/2012 11:40 AM, Tim Hempstead wrote:
On Mon, Feb 13, 2012 at 10:03 AM, lynn<lynn@steve-ss.com> wrote:
Hi Thanks for the input everyone. It's helped me get started. I should have been more specific. I've narrowed down the task to getting just the first number in a string _but_ the output comes from the wbinfo command e.g.
wbinfo -i lynn CACTUS\lynn:*:3000002:100::/home/CACTUS/lynn2:/bin/bash
I want to extract the 3000002
I've narrowed it down to this:
#!/bin/bash str=$(wbinfo -i $1) echo $str | sed -r 's/^([^.]+).*$/\1/; s/^[^0-9]*([0-9]+).*$/\1/'
which gives 3000004. Good.
But if the user is called lynn2, it gives 2
So the problem comes down to: how to get the first number in the string _after_ the *: sequence
(This would work for wbinfo --group-info too as it is the same format) Thanks, L x
It would appear that the field in the wbinfo output you want to extract has a colon before and after it? Hence can't you use ...
wbinfo -i lynn | awk -F":" '{print $3}'
Regards
Tim
Hi Tim, hi everyone
Yes. The key was that lifesaving colon. In the end I used cut:
strgid=$(wbinfo --group-info="$1")
gid=$(echo "$strgid" | cut -d ":" -f 3)
Thanks to everyone who helped.
L x
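For comparison, the same split can be done by the shell itself, without cut or awk. This is only a sketch of an alternative, not what was used in the thread. The sample string stands in for the wbinfo output, and the names `name`, `pass` and `rest` are placeholders for the fields that are not needed.

```shell
#!/bin/sh
# Stand-in for the output of `wbinfo --group-info=...`; sample value from the thread.
strgid='suseusers:*:3000028:'

# Split on ':' with the shell's own read; IFS is changed only for this one command.
# The here-document passes the string to read without any globbing of the '*'.
IFS=: read -r name pass gid rest <<EOF
$strgid
EOF

echo "$gid"
```

This avoids starting an extra process per lookup, which may matter only if the script runs it many times in a loop.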
On 2/12/2012 6:05 PM, lynn wrote:
I'm regex'd out of it at the moment. Given a string like this:
lynn:*:3000002some other stuff100hellolynn:
Is there a way to get the 3000002 into a variable v1 and the 100 into a variable v2? bash? Any recommended starting point? Thanks, L x
You haven't defined the problem clearly enough for a correct answer.
Are there really no other field delimiters (":" in this case) between the * and the end of the line?
Is "some other stuff" always the same length? _always_?
Can "some other stuff" contain numbers?
Basically I have to doubt that the file you are reading really looks like this, or if it does, what is generating it? The only way such a file would be useful is if the fields were all fixed length, because there are no delimiters for most of the line, but then it's odd that there are any delimiters at all in that case.
Please describe more about what is creating this file, and provide some more sample records. Don't modify the lines to hide sensitive data; rather, create some junk records in the generating application that aren't sensitive in the first place, then supply those without changing them at all.
There are a few different ways to do what you asked, but no way to say if they will work on any other input except that specific line above. That isn't very useful.
Best guess until you say otherwise is that the fields that are delimited by :'s are variable length, but the 3rd field is made up of fixed-length pieces:

DONE=false
until $DONE ;do
    IFS=: read F1 F2 F3 junk || DONE=true
    case "$F1" in ""|\#*) continue ;; esac
    F3_1=${F3:0:7}
    F3_2=${F3:7:16}
    F3_3=${F3:23:3}
    echo -e "Name:\t\"${F1}\""
    echo -e " Number A:\t\"${F3_1}\""
    echo -e " Description:\t\"${F3_2}\""
    echo -e " Number B:\t\"${F3_3}\""
done < file.txt

Given this line of input:
lynn:*:3000002some other stuff100hellolynn:
the "IFS=: read F1 F2 F3 junk" command will read the line and treat ":" as the word separator, so F1 will be "lynn", F2 will be "*", F3 will be everything up to the 3rd ":", and junk will be anything after the 3rd colon.
Then inside the "until" loop, it counts bytes to extract chunks of $F3 into $F3_1, $F3_2, etc.
${F3:0:7} means output 7 bytes starting at byte 0 of $F3
${F3:7:16} means output 16 bytes starting at byte 7 (counting from 0) of $F3
${F3:23:3} means output 3 bytes starting at byte 23 (counting from 0) of $F3
Then I stuck in stuff to ignore empty lines and lines that begin with #, and the use of the $DONE variable ensures that the last line is processed even if the file ends right at the end of the line with no trailing newline.
So given this sample input:
----
lynn:*:3000002some other stuff100hellolynn:
foob:*:6100013some stuff......201hellofoob:
snac:*:3000562stuff goes here.111hellosnac:
higb:*:5000001gibberish.......007hellohigb:
# a comment
blah:*:7500022blah blah blah..911helloblah:
----
You get:
Name: "lynn"
 Number A: "3000002"
 Description: "some other stuff"
 Number B: "100"
Name: "foob"
 Number A: "6100013"
 Description: "some stuff......"
 Number B: "201"
Name: "snac"
 Number A: "3000562"
 Description: "stuff goes here."
 Number B: "111"
Name: "higb"
 Number A: "5000001"
 Description: "gibberish......."
 Number B: "007"
Name: "blah"
 Number A: "7500022"
 Description: "blah blah blah.."
 Number B: "911"
Which is nice and all, but only works if the number and comment fields are exactly the same length on every record.
--
bkw
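The offsets in `${F3:offset:length}` count from 0, whereas `cut -c` counts columns from 1, which is an easy place to slip by one. The sketch below does the same three-way split with `cut -c` so the two numbering schemes can be compared side by side. It is a substitute for the bash substring expansion, not the method used in the message, and the names `num_a`, `desc` and `num_b` are invented for the example.

```shell
#!/bin/sh
# Third colon-separated field of the first sample record.
F3='3000002some other stuff100hellolynn'

# bash substring      offset/length        cut columns (1-based)
# ${F3:0:7}      ->   offset 0,  length 7   -> 1-7
# ${F3:7:16}     ->   offset 7,  length 16  -> 8-23
# ${F3:23:3}     ->   offset 23, length 3   -> 24-26
num_a=$(printf '%s\n' "$F3" | cut -c 1-7)
desc=$(printf '%s\n' "$F3" | cut -c 8-23)
num_b=$(printf '%s\n' "$F3" | cut -c 24-26)

echo "Number A: \"$num_a\""
echo "Description: \"$desc\""
echo "Number B: \"$num_b\""
```

Either way, the split only holds while every record keeps the same field widths.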
On 02/14/2012 03:59 AM, Brian K. White wrote:
On 2/12/2012 6:05 PM, lynn wrote:
I'm regex'd out of it at the moment. Given a string like this:
lynn:*:3000002some other stuff100hellolynn:
Is there a way to get the 3000002 into a variable v1 and the 100 into a variable v2? bash? Any recommended starting point? Thanks, L x
Hi Brian
Thank you so much for your effort. This thread has given me insight into the hidden power of Linux. You are right. I had not provided enough info. The strings are from the output of wbinfo -i user and wbinfo --group-info=group, and so yes, there are : delimiters and they come at the same place for both user and group. This must have been designed with people like me in mind! cut did it, e.g. for groups:
strgid=$(wbinfo --group-info="$1")
gid=$(echo "$strgid" | cut -d ":" -f 3)
echo $gid
I had marked the thread as [solved] previously and I hope that you don't feel you've wasted your time with your (excellent) explanation here. On the contrary. I have learned a load of new stuff from it. More, I have learned how to ask smarter questions.
L x
Hi
Just a bit confused about what the cut command is actually saying, e.g.
strgid="suseusers:*:3000028:"
gid=$(echo $strgid | cut -d ":" -f 3)
$gid=3000028
I count 3 : delimiters. I think I understand. But then:
strsid="S-1-5-21-2395500911-3560017633-4088823418-1134"
pgrp=$(echo $strsid | cut -d "-" -f 8)
$pgrp=1134
I count 7 - delimiters. I don't understand any more.
What's the -f n saying? 'I go to the n-1'th delimiter and give you whatever is between that and the n'th delimiter'?
Can I assume that the n'th delimiter could either be a delimiter or an end of line character? If so, what is the end of line delimiter? Ahhggh!
Thanks, L x
On Wednesday 15 February 2012 01:00:38 lynn wrote:
strsid="S-1-5-21-2395500911-3560017633-4088823418-1134" pgrp=$(echo $strsid | cut -d "-" -f 8)
$pgrp=1134 I count 7 - delimiters I don't understand any more.
What's the -f n saying? 'I go to the n-1'th delimiter and give you whatever is between that and the n'th delimiter'?
Can I assume that the n'th delimiter could either be a delimiter or an end of line character? If so, what is the end of line delimiter?
-f means "field". S is the first field, 1 the second, 21 the third, and so on. 1134 is the 8th field.
Anders
On Wednesday 15 February 2012 01:03:56 Anders Johansson wrote:
On Wednesday 15 February 2012 01:00:38 lynn wrote:
strsid="S-1-5-21-2395500911-3560017633-4088823418-1134" pgrp=$(echo $strsid | cut -d "-" -f 8)
$pgrp=1134 I count 7 - delimiters I don't understand any more.
What's the -f n saying? 'I go to the n-1'th delimiter and give you whatever is between that and the n'th delimiter'?
Can I assume that the n'th delimiter could either be a delimiter or an end of line character? If so, what is the end of line delimiter?
-f means "field". S is the first field, 1 the second, 21 the third, and so
Sorry, 5 is the third field, 21 the fourth. But you get the point.
Anders
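The field counting can be checked directly, as in the sketch below: a line with 7 delimiters has 8 fields, and the last field simply runs to the end of the line, so no separate end-of-line delimiter is involved. The variable names (`fields`, `delims`, `third`, `last`) are chosen for this example only, and awk is used here just to count the fields.

```shell
#!/bin/sh
# SID string from the question.
strsid='S-1-5-21-2395500911-3560017633-4088823418-1134'

# NF is the number of '-'-separated fields; delimiters are one fewer.
fields=$(printf '%s\n' "$strsid" | awk -F- '{print NF}')
delims=$((fields - 1))

# Field n is the text before the n-th delimiter (or before end of line for the last one).
third=$(printf '%s\n' "$strsid" | cut -d '-' -f 3)
last=$(printf '%s\n' "$strsid" | cut -d '-' -f 8)

echo "fields=$fields delimiters=$delims third=$third last=$last"
```

The same reasoning explains the group example: "suseusers:*:3000028:" has 3 colons and therefore 4 fields, the fourth being empty.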
On 15/02/12 01:09, Anders Johansson wrote:
On Wednesday 15 February 2012 01:03:56 Anders Johansson wrote:
On Wednesday 15 February 2012 01:00:38 lynn wrote:
strsid="S-1-5-21-2395500911-3560017633-4088823418-1134" pgrp=$(echo $strsid | cut -d "-" -f 8)
$pgrp=1134 I count 7 - delimiters I don't understand any more.
What's the -f n saying? 'I go to the n-1'th delimiter and give you whatever is between that and the n'th delimiter'?
Can I assume that the n'th delimiter could either be a delimiter or an end of line character? If so, what is the end of line delimiter?
-f means "field". S is the first field, 1 the second, 21 the third, and so
Sorry, 5 is the third field, 21 the fourth. But you get the point
Anders
Yes I get it now. Sorry folks.
Thanks, Lynn
participants (7)
- Anders Johansson
- Brian K. White
- lynn
- Marko Koski-Vähälä
- Per Jessen
- Tim Hempstead
- zep