I remember telling you a few hours ago that variable expansion doesn't work within single quotes, so $COL1 will never be expanded. On top of that, even if $COL1 were expanded, there would still be a logical error in your sed command: it would remove $COL1 together with the rest of the line. And that sed command is executed on the entire source file for every single line in readfile.
With 250 posts in these forums, you may have learned that without detailed error information, analysis and debugging are close to impossible. So please provide:
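To illustrate the quoting point, here is a minimal sketch. The value of COL1, the sample URL, and the sed expression are all made up for the demonstration (the original command is not quoted here), so treat them as assumptions.

```shell
COL1='aa.bb'                      # hypothetical lookup value
line='http://www.aa.bb.cc'        # hypothetical source line

# Single quotes: the shell hands the text $COL1 to sed literally,
# so nothing in the line matches and it comes back unchanged.
result_sq=$(printf '%s\n' "$line" | sed 's/$COL1.*//')

# Double quotes: the shell substitutes aa.bb first, and this
# expression then deletes the match together with the rest of the line.
result_dq=$(printf '%s\n' "$line" | sed "s/$COL1.*//")

printf 'single quotes: %s\n' "$result_sq"
printf 'double quotes: %s\n' "$result_dq"
```

With double quotes the match itself is deleted as well, which is the logical error mentioned above if the goal is to keep the matched part.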
Any error messages?
Are the output files created?
What are their contents?
Did you run the script with the -x (xtrace) option set?
PS: readfile and source are the same as I posted under this thread.
Hello,
I am sorry for the headache.
I mentioned in my next post that I have more files to process.
So, I edited the main post.
What I typed in the first post gives the expected result.
I need time to check what went wrong at my end.
Thank you
Boris
--- Post updated at 05:09 PM ---
Hello Again,
Here is the output:
root@house:~/test# awk 'NR==FNR {T[]; next} {for (t in T) sub (t.*, t)} 1' readfile source
awk: cmd. line:1: NR==FNR {T[]; next} {for (t in T) sub (t.*, t)} 1
awk: cmd. line:1: ^ syntax error
awk: cmd. line:1: error: invalid subscript expression
awk: cmd. line:1: NR==FNR {T[]; next} {for (t in T) sub (t.*, t)} 1
awk: cmd. line:1: ^ syntax error
awk: cmd. line:1: NR==FNR {T[]; next} {for (t in T) sub (t.*, t)} 1
awk: cmd. line:1: ^ 1 is invalid as number of arguments for sub
Thank you
Boris
--- Post updated at 11:20 PM ---
Hello,
As I faced problems with awk, I worked around it with the short algorithm below:
- Remove the first line of sourcefile
- Grep all lines containing COL1 in sourcefile > output1
- Grep all lines not containing COL1 in sourcefile > output2
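The three steps above could look roughly like the sketch below. The file contents, the COL1 value, and the temporary directory are invented for illustration; only the file names sourcefile, output1 and output2 come from the post.

```shell
workdir=$(mktemp -d)              # scratch directory for the demo
COL1='aa.bb'                      # hypothetical lookup value

# Hypothetical sourcefile: one header line plus two data lines.
printf '%s\n' 'header' 'http://www.aa.bb.cc' 'http://www.xx.yy' > "$workdir/sourcefile"

# Step 1: drop the first line.
tail -n +2 "$workdir/sourcefile" > "$workdir/body"

# Step 2: lines containing COL1 (-F treats it as a fixed string, not a regex).
grep -F -- "$COL1" "$workdir/body" > "$workdir/output1"

# Step 3: lines not containing COL1.
grep -vF -- "$COL1" "$workdir/body" > "$workdir/output2"

cat "$workdir/output1" "$workdir/output2"
```

Note that this handles one COL1 value; looping over every line of readfile would rescan sourcefile each time, which is the inefficiency raised earlier in the thread.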
Wouldn't it make serious sense to read and try to understand the error message(s)? And, compare your code to the proposal given in post #6 ... see the difference?
No, I do not see a difference when I run both separately.
As I do not understand awk, I put the code given in #6 inside an echo to see what it prints:
s2.sh
while read COL1 COL2
do
echo " awk 'NR==FNR {T[$1]; next} {for (t in T) if ($1 ~ t) $0 = $1} 1' readfile source "
done<readfile
output
awk 'NR==FNR {T[]; next} {for (t in T) if ( ~ t) ./s2.sh = } 1' readfile source
awk 'NR==FNR {T[]; next} {for (t in T) if ( ~ t) ./s2.sh = } 1' readfile source
awk 'NR==FNR {T[]; next} {for (t in T) if ( ~ t) ./s2.sh = } 1' readfile source
What in the shown result of s2.sh does not satisfy your needs? Looks perfect to me, considering the code you presented.
The difference is that the T array index is $1 in the working code, and empty in your error case. I am desperately trying to understand why "Awk is more complicated for" you and why you forgo the efficient, complete solutions presented to you in posts #7 or #8, falling back on highly inefficient band-aid pseudo-solutions.
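A small sketch of why the index comes out empty: inside double quotes the shell itself expands $1 before the text ever reaches echo or awk, and a script started without arguments has an empty $1. The shortened awk fragment and the variable names below are only for illustration.

```shell
set --    # no positional parameters, as when the script is run without arguments

# Double quotes outside: the inner single quotes are plain characters,
# so the shell still expands $1 (to nothing here).
shown_dq=$(echo " awk 'NR==FNR {T[$1]; next}' ")

# Single quotes outside: $1 is left untouched for awk to interpret.
shown_sq=$(echo ' awk NR==FNR {T[$1]; next} ')

printf '%s\n' "$shown_dq" "$shown_sq"
```

The same expansion explains why $0 showed up as ./s2.sh in the echoed text: to the shell, $0 is the name of the running script.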
Consider the case the bumper has fallen off your car. The professional repair shop grabs their MIG welder, welds the screw nuts back to the carrier beam, and with a ratchet screws the bumper back on. Amateurs use chewing gum to fill the gaps, and scotch tape to glue the bumper back.
awk (or perl, sed, etc.) is the MIG welder and the ratchet at your fingertips.
So, should I tell awk which column to look up? Sed+grep are like medium-frequency welding technology for me. So far, awk seems incomprehensible to me, even after your detailed explanation. I need to read more and more.
Thank you for your time.
Boris
--- Post updated at 07:16 AM ---
Hello Rudic,
It requires:
perl -i -ne 'print unless ${$_}++' output
Don't worry, the problem was solved in a slightly longer way.
I apologize to the author of the topic for rejection.
I think it is too beautiful to work correctly.
I have simplified:
awk 'NR==FNR {T[$1]; next} {for (t in T) sub (t".*", t)} 1' readfile source
Admittedly, the task as stated is to remove the remaining substring after the match.
But logically, it still seems necessary to keep the full address in which the match is found.
Example
result: http://www._aa.bb
result as I imagine it: http://www.aa.bb.cc
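For anyone who wants to try the simplified one-liner on sample data, here is a small sketch. The contents of readfile and source are invented for the test and live in a temporary directory; only the awk program itself is taken from the post.

```shell
workdir=$(mktemp -d)              # scratch directory for the demo

# Hypothetical inputs: one lookup key, two source lines.
printf '%s\n' 'aa.bb' > "$workdir/readfile"
printf '%s\n' 'http://www.aa.bb.cc' 'http://www.xx.yy' > "$workdir/source"

# First file: collect each first field as an index of T.
# Second file: for every key, replace "key plus the rest of the line"
# with the key alone, then print the (possibly changed) line.
result=$(awk 'NR==FNR {T[$1]; next} {for (t in T) sub (t".*", t)} 1' \
    "$workdir/readfile" "$workdir/source")

printf '%s\n' "$result"
```

One caveat: each key is used as a regular expression here, so the dot in aa.bb matches any character, not only a literal dot.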