Jin1
September 13, 2013, 3:58am
1
Hi,
Does anybody know how to get the value that follows a regexp and test it against a pattern? If the pattern matches, the entire line should be printed to a separate file.
Here's my raw file:
^_Name^_string^_Apple ^_Color^_string^_Red ^_Code^_string^_121
^_Name^_string^_Banana ^_Code^_string^_123 ^_Color^_string^_Yellow
^_Name^_string^_Citrus ^_Color^_string^_Green ^_Code^_string^_129 ^_Color^_string^_Green
I want to check whether the last digit of the Code string is within the range [1-5]. The Code^_ field is not always in the same column, so I can't just use awk's $3.
E.g. if I search for [1-5], the output should be:
^_Name^_string^_Apple ^_Color^_string^_Red ^_Code^_string^_121
^_Name^_string^_Banana ^_Code^_string^_123 ^_Color^_string^_Yellow
E.g. if I search for [19], the output should be:
^_Name^_string^_Apple ^_Color^_string^_Red ^_Code^_string^_121
^_Name^_string^_Citrus ^_Color^_string^_Green ^_Code^_string^_129 ^_Color^_string^_Green
What I've done so far is extract the code values, store them in a file, and then grep for [range]. But the looping takes a huge amount of time, especially with large files.
By the way, the "^_" is a control character, and the spaces are tabs.
---------- Post updated at 03:58 PM ---------- Previous update was at 03:52 PM ----------
I could also use grep without looping, but I can't use the * in between, so that doesn't work. Here's what I have in mind:
grep Code^_string^_*[1-5]$ [filename]
Something like this?
grep "_Code^_string^_[0-9]\{2\}[1-5]" infile
grep "_Code^_string^_[0-9]\{2\}[19]" infile
How many digits will you have (the above is for 3 digits)? And I suppose you want to check only the last digit?
--ahamed
apmcd47
September 13, 2013, 4:16am
3
If you modify that grep to:
grep Code^_string^_[0-9]+[1-5]
it should work for when only numerals appear after the Code/string construct. I have not tested this.
Andrew
Jin1
September 13, 2013, 4:27am
4
The number of digits is inconsistent as well: it can be 3, it can be 10. Is the solution above going to work?
apmcd47:
If you modify that grep to:
grep Code^_string^_[0-9]+[1-5]
it should work for when only numerals appear after the Code/string construct. I have not tested this.
Andrew
What does the "+" sign do? Is it like the "*" in ls ?
Jotne
September 13, 2013, 7:02am
5
+  one or more matches
*  zero or more matches
?  zero or one match
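A quick way to see the difference, as a rough sketch: the three sample strings and the `count_matches` helper below are made up for illustration. Note that these act as quantifiers with grep -E (extended regular expressions); in plain grep (basic regular expressions) `+` and `?` are ordinary characters.

```shell
# Count how many of the made-up sample strings match an extended regex.
# -x requires the pattern to match the whole line, -c prints the count.
count_matches() {
    printf '%s\n' 'ac' 'abc' 'abbc' | grep -c -E -x "$1"
}

count_matches 'ab+c'   # one or more "b":  abc, abbc
count_matches 'ab*c'   # zero or more "b": ac, abc, abbc
count_matches 'ab?c'   # zero or one "b":  ac, abc
```

So it is not quite like the `*` in ls: in a shell glob `*` on its own matches any string, while in a regexp each of these symbols applies to the item just before it.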
Jin1
September 15, 2013, 7:46pm
6
apmcd47:
If you modify that grep to:
grep Code^_string^_[0-9]+[1-5]
it should work for when only numerals appear after the Code/string construct. I have not tested this.
Andrew
The problem is that when the line also has unrelated numbers at the end, grep still reports a match.
apmcd47
September 17, 2013, 3:43am
7
You have tested this? As I said, I had not. Taking another look I can see that it could match something like:
Code^_string^_12127
To stop this you would need to add something to the end of the pattern to match white space, but then it would no longer match when the code is at the end of the line. You could use two versions of my pattern: one with white space at the end and the other with the end-of-line anchor ($) at the end.
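The two-pattern idea might look something like the sketch below. It has only been tried on made-up lines, not on the real data: `sample` and `two_patterns` are names invented for the example, "@" and "|" are stand-ins that tr turns into the ^_ character (octal 037) and a tab, and -E is added on the assumption that + should act as a quantifier.

```shell
# Made-up test lines: "@" stands in for ^_ (octal 037) and "|" for a tab.
sample() {
    printf '%s\n' \
        '@Name@string@Apple|@Code@string@121' \
        '@Name@string@Banana|@Code@string@123|@Color@string@Yellow' \
        '@Name@string@Plum|@Code@string@12127' |
    tr '@|' '\037\t'
}

US=$(printf '\037')   # the ^_ control character
TAB=$(printf '\t')

# One pattern ends with a tab, the other with the end-of-line anchor.
two_patterns() {
    grep -E -e "Code${US}string${US}[0-9]+[1-5]${TAB}" \
            -e "Code${US}string${US}[0-9]+[1-5]\$"
}

# Print the matching lines with the stand-in characters restored for reading.
sample | two_patterns | tr '\037\t' '@|'
```

The Plum line ends in 12127, so neither pattern accepts it: there is no tab after a digit in [1-5], and the digit before the end of the line is a 7.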
Andrew
Try:
grep -E '^_Code^_string^_[0-9]*[1-5]([[:blank:]]|$)' file
where ^_ stands for the literal control character
or perhaps just
grep -E '.Code.string.[0-9]*[1-5]([[:blank:]]|$)' file
with the dot as a catchall for whatever single control character is used
--
On Solaris use /usr/xpg4/bin/grep -E
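To tie this back to the original question, here is a rough sketch that wraps the dot version of the pattern in a function, so the digit range can be passed in and the matching lines land in a separate file. `filter_by_last_digit`, `workdir` and the file names are made up for the example, and the input is the three sample lines from the first post rebuilt with tr ("@" for ^_, "|" for a tab).

```shell
# Usage: filter_by_last_digit RANGE INFILE OUTFILE   (hypothetical helper)
# RANGE is the inside of a bracket expression, such as 1-5 or 19.
filter_by_last_digit() {
    grep -E ".Code.string.[0-9]*[$1]([[:blank:]]|\$)" "$2" > "$3"
}

workdir=$(mktemp -d)   # scratch directory for the example files

# Rebuild the sample from the first post: "@" stands in for ^_, "|" for a tab.
printf '%s\n' \
    '@Name@string@Apple|@Color@string@Red|@Code@string@121' \
    '@Name@string@Banana|@Code@string@123|@Color@string@Yellow' \
    '@Name@string@Citrus|@Color@string@Green|@Code@string@129|@Color@string@Green' |
tr '@|' '\037\t' > "$workdir/fruits.txt"

filter_by_last_digit 1-5 "$workdir/fruits.txt" "$workdir/range_1_5.txt"   # Apple and Banana lines
filter_by_last_digit 19  "$workdir/fruits.txt" "$workdir/range_19.txt"    # Apple and Citrus lines
```

This is a single pass over the file per range, so there is no per-line loop. On Solaris the grep inside the function would presumably need to be the /usr/xpg4/bin/grep mentioned above.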