Hi,
I have the command below in one of my scripts. Can you please let me know what the curly braces \{1,\} do here? The rest of the code, at least, I am able to understand.
sed -n 's/.*\-\([0-9])\{1,\}\)\-.*/\1/p'
Hi, it means 1 or more occurrences of the preceding character or sub-expression, in this case: ), so 1 or more closing parentheses.
Scrutinizer is right. You can use this device to "multiply" a previous expression, similar to a "*", but with added functionality. For instance:
X # matches exactly one single "X"
X* # matches any number of "X"s, including zero
X\{3\} # matches exactly 3 "X"s
X\{1,\} # matches any number of "X"s, from 1 up
X\{,5\} # matches up to 5 "X"s (a GNU extension; the portable form is X\{0,5\})
X\{3,5\} # matches 3 to 5 "X"s, hence either "XXX", "XXXX" or "XXXXX"
Note that, instead of a single character like "X" here, you can also apply that modifier to complex expressions. For instance:
|[^|]* # matches a "field" in tabular, pipe-separated data
# i.e. "|field1|field2|field3....."
\(|[^|]*\)\{3\} # matches 3 such fields
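A quick way to try these out is sketched below. It feeds a test string to grep and anchors each expression with ^ and $, so the whole string has to match. The helper name "matches" is made up for this illustration, and the expected answers in the comments assume a grep that supports POSIX interval expressions.

```shell
# matches REGEX STRING: print "yes" if the whole STRING matches the
# basic regular expression REGEX, "no" otherwise (hypothetical helper).
matches() {
    if printf '%s\n' "$2" | grep -q "^$1\$"; then
        echo yes
    else
        echo no
    fi
}

matches 'X\{3\}'   'XXX'       # yes: exactly three
matches 'X\{3\}'   'XXXX'      # no: one too many
matches 'X\{1,\}'  ''          # no: at least one is required
matches 'X\{3,5\}' 'XXXX'      # yes: four is within 3 to 5
matches 'X\{3,5\}' 'XXXXXX'    # no: six is above the upper bound
matches '\(|[^|]*\)\{3\}' '|field1|field2|field3'   # yes: three pipe-led fields
```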
I hope this helps.
bakunin
sed -n 's/.*\.\([0-9]\{1,\}\)\..*/\1/p'
I had it slightly wrong, but I got the idea of \{1,\}
However, when I have a value such as
echo "testing.123.xyz.456.txt" | sed -n 's/.*\.\([0-9]\{1,\}\)\..*/\1/p'
I am getting the value "456" instead of the first match. Where am I going wrong?
echo "testing.123.xyz.456.txt" | sed -n 's/.*\.\([0-9]\{1,\}\)\..*/\1/p'
The issue is that the leading `.*' of the regex matches "testing.123.xyz", because `.*' will try to match as much as it can.
Perhaps modify your regex a bit:
echo "testing.123.xyz.456.txt" | sed -n 's/[^.]*\.\([0-9]\{1,\}\)\..*/\1/p'
[^.]* : keeps matching any character that is not a literal period and stops at the first period.
Or to get the first field with numbers, try:
sed -n 's/^\([^0-9.]*\.\)*\([0-9]*\).*/\2/p'
or, if it is always the second field, try:
cut -d. -f2
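To see the difference side by side, here is a small sketch that runs the greedy regex, the negated-class regex and the cut variant on the sample value. The function names greedy_prefix and negated_prefix are only labels invented for this example.

```shell
input='testing.123.xyz.456.txt'

# Leading .* is greedy: it swallows "testing.123.xyz", so the group is 456.
greedy_prefix() { sed -n 's/.*\.\([0-9]\{1,\}\)\..*/\1/p'; }

# Leading [^.]* cannot cross a period, so the group is the first number, 123.
negated_prefix() { sed -n 's/[^.]*\.\([0-9]\{1,\}\)\..*/\1/p'; }

printf '%s\n' "$input" | greedy_prefix     # 456
printf '%s\n' "$input" | negated_prefix    # 123
printf '%s\n' "$input" | cut -d. -f2       # 123 (second dot-separated field)
```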
Great.. This works... is ^ in square brackets used as a negation?
Exactly:
[AB] # matches "A" or "B"
[^AB] # matches any character except "A" or "B"
You need this quite often, because matches in sed are always "greedy". Suppose the following text:
AXXBABABABXABABABAB
the expression /A.*B/
will match the whole string, not just the first four characters! If several possibilities for a match exist, the longest possible one will always be taken:
AXXBABABABXABABABAB
A<------ .* ----->B
If you want to match only up to the first "B" you need to exclude "B" from the characters in between:
AXXBABABABXABABABAB
A[^B]*B
AXXB
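You can make the extent of each match visible with a substitution. In this sketch the & in the replacement stands for whatever the regex matched, so the square brackets in the output mark where the match starts and ends; the variable name "text" is just for the example.

```shell
text='AXXBABABABXABABABAB'

# Greedy: .* runs to the last "B", so the whole line ends up in brackets.
printf '%s\n' "$text" | sed 's/A.*B/[&]/'      # [AXXBABABABXABABABAB]

# Negated class: [^B]* cannot cross a "B", so the match ends at the first one.
printf '%s\n' "$text" | sed 's/A[^B]*B/[&]/'   # [AXXB]ABABABXABABABAB
```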
I hope this helps.
bakunin
Hi,
Thanks for the detailed description.
I tried changing the below sed to fetch the first available numeric, but was unable to get the results.
echo "testing.123.xyz.456.txt" | sed -n 's/[^.]*\.\([0-9]\{1,\}\)\..*/\1/p'
Ideally, my code has to fetch whichever numeric value comes first in a value matching a pattern.
Pattern -> testing.123.xyz.*.txt
Actual Value -> testing.123.xyz.456.txt
----> This should result in 123
If the pattern DOES NOT have a *, only the first available value needs to be extracted:
abc123testing456 ----> should result in 123
123abctesting456 ----> should result in 123
Any ideas how we can change the code to achieve the above result?
Not quite sure what you mean by "above result". However, try
sed -rn 's/[^0-9]*([0-9]{1,})[^0-9]*.*/\1/p' file
In all the cases below (for example), the result should be "1234":
abc.1234.xyz.456.999
abc.1234testing456
abc1234testing456
1234abctesting456
Result :1234
I do not have a system in front of me. I can check it tomorrow.
Does the code you have given return "23"? Please correct me if I am wrong, but do the values 1 and 4 get absorbed by [^0-9]?
Result of the sed script in post#10 applied to your last sample:
1234
1234
1234
1234
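To check that [^0-9]* does not absorb any digits, it can help to print what each part of the regex matched. The sketch below is a variation on the command for illustration only: it adds a ^ anchor and two extra groups, uses -E (available in GNU and BSD sed) instead of -r, and the function name split_parts is made up.

```shell
# Show the non-digit prefix, the first run of digits and the remainder.
split_parts() {
    sed -E 's/^([^0-9]*)([0-9]+)(.*)$/prefix=[\1] digits=[\2] rest=[\3]/'
}

printf '%s\n' 'abc.1234.xyz.456.999' | split_parts
# prefix=[abc.] digits=[1234] rest=[.xyz.456.999]

printf '%s\n' '1234abctesting456' | split_parts
# prefix=[] digits=[1234] rest=[abctesting456]
```

The bracket expression [^0-9] can only match characters that are not digits, so the 1 and the 4 always stay in the digit group.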
$ cat numbers.file
abc.1234.xyz.456.999
abc.1234testing456
nonumbershere
abc1234testing456
voidofnumbers
1234abctesting456
$ perl -nle '/(\d+)/ and print $1' numbers.file
1234
1234
1234
1234
Regular sed:
sed -n 's/^[^0-9]*\([0-9]\{1,\}\).*/\1/p' file
GNU awk:
awk -v FPAT='[0-9]+' 'NF{print $1}' file
Regular awk:
awk 'match($0,/[0-9]+/){print substr($0,RSTART,RLENGTH)}' file
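As a sanity check, the regular sed and regular awk variants can be run against the same sample lines and compared. This is only a sketch: the temporary file stands in for "file" and assumes mktemp is available, and the variable names are invented for the example.

```shell
# Recreate the sample data in a temporary file.
tmp=$(mktemp)
printf '%s\n' 'abc.1234.xyz.456.999' 'abc.1234testing456' 'nonumbershere' \
    'abc1234testing456' 'voidofnumbers' '1234abctesting456' > "$tmp"

# Both commands print the first run of digits and skip lines without digits.
sed_out=$(sed -n 's/^[^0-9]*\([0-9]\{1,\}\).*/\1/p' "$tmp")
awk_out=$(awk 'match($0,/[0-9]+/){print substr($0,RSTART,RLENGTH)}' "$tmp")
rm -f "$tmp"

if [ "$sed_out" = "$awk_out" ]; then
    echo "sed and awk agree:"
fi
printf '%s\n' "$sed_out"
```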