Hi all,
I have a number of strings like the one below:
//mnt/autocor/43°13'(33")W/[N]
and I'm trying to extract the numbers from each string, separated by tabs, for example:
43[tab]13[tab]33[tab]
Please help.
Thanks in advance.
You probably don't want to use cut; it's fast but inflexible (to the best of my knowledge you can't have a variable delimiter). Try the following approach:
perl -e 'print join("\t",(split(/[^0-9]+/,$ARGV[0]))),"\n" ;' '//mnt/autocor/43°13(33")W/[N]'
Oops, I re-read your question; the following also puts a tab at the end of the line:
perl -e 'print join("\t",(split(/[^0-9]+/,$ARGV[0])),"\n") ;' '//mnt/autocor/43°13(33")W/[N]'
perl -ne 'while(/(\d+)/g){print "$1\t"}' inputfile
~/unix.com$ awk -F'[^0-9]' '{s="";for(i=1;i<=NF;i++){if($i!="")s=s$i"\t"}print s}' file
tr -cs '[:digit:]' '[\t*]'
Can someone tell me what's wrong with this code, please?
awk -F'[^0-9]' '{s="";i=0;while(++i<=NF&&$i!="")s=s$i"\t"}$0=s' file
The while loop will abort at the first empty field because $i != "" evaluates to false. With the sample data provided and with the field separator you're using (any non-digit), the comparison is false for $1 and the body of the loop never executes.
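To make that concrete, here is a minimal sketch. It assumes the sample line from the original question (with a degree sign standing in for the garbled character); the function names broken and fixed are made up for illustration. The first is the code as posted; the second is one possible repair that walks every field and skips the empty ones instead of stopping at the first.

```shell
# Sample line from the question (degree sign assumed for the garbled character).
sample="//mnt/autocor/43°13'(33\")W/[N]"

# As posted: the loop condition fails on the empty $1, so s stays empty,
# the pattern $0=s evaluates to false, and nothing is printed.
broken() {
  awk -F'[^0-9]' '{s="";i=0;while(++i<=NF&&$i!="")s=s$i"\t"}$0=s'
}

# One possible repair: visit every field and append only the non-empty ones.
fixed() {
  awk -F'[^0-9]' '{s="";for(i=1;i<=NF;i++)if($i!="")s=s$i"\t";print s}'
}

printf '%s\n' "$sample" | broken   # prints nothing
printf '%s\n' "$sample" | fixed    # prints 43<TAB>13<TAB>33<TAB>
```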
Regards,
Alister
Now it's so obvious that I feel stupid.
Thank you, Alister.
It happens to everyone from time to time. Sometimes, when debugging, we see what we expect instead of what's actually there.
Regards,
Alister
I'm fairly sure you can help me understand this notation better: '[\t*]'
Is the wildcard mandatory? What is it used for?
(I mean, isn't the -s option of tr sufficient and intended for that squeezing purpose? Or is there some other reason to prefer that notation over a simple '\t'?)
Thanks in advance
yas:
grep -Eo '[0-9]+' infile | paste - - -
Analogous to the nice tr statement:
sed 's/[^0-9][^0-9]*/\t/g' infile
@Scruti,
I am just wondering about the wildcard in the '[\t*]' notation: why not just use '\t' instead? Is that wildcard a literal one (part of the [ ] list), or is it interpreted?
That notation, as I used it, is not so much a wildcard as a repetition operator. It says to pad the second set with the preceding character, \t, until the second set's length equals the first's (a number can follow the asterisk to indicate an exact count; [\t*3] would include three tabs in the set).
Why bother with that? Historically, BSD and SysV tr implementations behaved differently when the second set is shorter than the first. This notation guarantees that this does not occur.
In practice, at least with open source systems, you'll probably never run into this issue.
For more info, see the POSIX man page for tr. Specifically, the EXTENDED DESCRIPTION section for the details of the syntax and APPLICATION USAGE for the history.
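As a rough illustration of the padding described above, here is a small sketch. It assumes a tr that supports the [c*] form (GNU and BSD tr do) and uses the sample line from the original question with a degree sign in place of the garbled character; the function name digits_to_tabs is made up for the example.

```shell
# Sample line from the question (degree sign assumed for the garbled character).
sample="//mnt/autocor/43°13'(33\")W/[N]"

# -c complements the digit class, so every non-digit byte is translated;
# [\t*] pads the second set with tabs to the length of the first set;
# -s then squeezes each run of tabs down to a single tab.
digits_to_tabs() {
  tr -cs '[:digit:]' '[\t*]'
}

# The leading "//mnt/autocor/" and the trailing newline are non-digits too,
# so the output starts with a tab and ends with a tab instead of a newline.
printf '%s\n' "$sample" | digits_to_tabs
```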
Regards,
Alister
The notation with the square brackets and the asterisk makes sure that the number of characters in the second string is equal to the length of the first string. This is the most portable way. If you use a single character for the second string, then it is not guaranteed to work across all implementations...
@Scruti & alister
Dudes, well ... I confess I haven't read all the POSIX documentation so far...
Anyway, I will go to sleep a little less ignorant! Thx!