OK, this is what you have to do in order to understand how this code works:
- Execute:
sed 's=\(.*\) \(.*\)=if($4~/^\2/)print $0\" \1\"=' country-codes.txt
It converts each line of country-codes.txt into an "if-statement" in the awk language that tells awk:
if the fourth field (of the line being read) starts with the "number", then print the entire line followed by the "name".
Thus, if you run awk on phonelines.txt with those generated "if-statements" as its program, the appropriate Country-City-Name is appended to the end of each line, based on the number the fourth field starts with.
You can see it for yourself, by executing:
awk "{$(sed 's=\(.*\) \(.*\)=if($4~/^\2/)print $0\" \1\"=' country-codes.txt)}" phonelines.txt
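To make this concrete, here is a small sketch you can run in a scratch directory. The contents of both files are made up for illustration (I'm assuming country-codes.txt holds "name number" pairs and that the two phone numbers sit in fields 4 and 5 of phonelines.txt); only the sed and awk commands are the ones from above.

```shell
# Work in a scratch directory so no real files get overwritten.
cd "$(mktemp -d)"

# Hypothetical sample data: one "name number" pair per line.
cat > country-codes.txt <<'EOF'
London 4420
Paris 331
EOF

# Hypothetical phone records: fields 4 and 5 are the two numbers.
cat > phonelines.txt <<'EOF'
2011-01-01 10:00 60 442012345 33198765
2011-01-01 10:05 30 33155555 442099999
EOF

# Step 1: look at the awk statements that sed generates.
sed 's=\(.*\) \(.*\)=if($4~/^\2/)print $0\" \1\"=' country-codes.txt
# if($4~/^4420/)print $0" London"
# if($4~/^331/)print $0" Paris"

# Step 2: run awk with those statements as its program.
awk "{$(sed 's=\(.*\) \(.*\)=if($4~/^\2/)print $0\" \1\"=' country-codes.txt)}" phonelines.txt
# 2011-01-01 10:00 60 442012345 33198765 London
# 2011-01-01 10:05 30 33155555 442099999 Paris
```

Note that a line whose fourth field matches none of the numbers is not printed at all, because none of the generated "if-statements" fires for it.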
Now, we have to add the second Country-City-Name to the end of the line, based on the number the fifth field starts with.
We do this by piping the output of the above command into a second awk that is given no file parameter (so it is forced to read the output of the first awk from the pipe). Its set of "if-statements" is generated by the same sed, but this time it produces "$5~/^ ..." instead of "$4~/^ ...", which tells the second awk:
if the fifth field (of the line being read) starts with the "number", then print the entire line followed by the "name".
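Put together, the two stages could look like the sketch below. Again, the sample files are invented for illustration (hypothetical names, numbers and field layout); the only difference between the two awk calls is $4 versus $5 in the sed replacement.

```shell
# Work in a scratch directory so no real files get overwritten.
cd "$(mktemp -d)"

# Hypothetical sample data: one "name number" pair per line.
cat > country-codes.txt <<'EOF'
London 4420
Paris 331
EOF

# Hypothetical phone records: fields 4 and 5 are the two numbers.
cat > phonelines.txt <<'EOF'
2011-01-01 10:00 60 442012345 33198765
2011-01-01 10:05 30 33155555 442099999
EOF

# First awk reads phonelines.txt and appends the name matching field 4.
# Second awk has no file argument, so it reads the pipe and appends
# the name matching field 5.
awk "{$(sed 's=\(.*\) \(.*\)=if($4~/^\2/)print $0\" \1\"=' country-codes.txt)}" phonelines.txt |
awk "{$(sed 's=\(.*\) \(.*\)=if($5~/^\2/)print $0\" \1\"=' country-codes.txt)}"
# 2011-01-01 10:00 60 442012345 33198765 London Paris
# 2011-01-01 10:05 30 33155555 442099999 Paris London
```

The name added by the first awk becomes field 6, so field 5 is still the second number when the second awk looks at it.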
Now, as for the speed, I'm afraid I don't have a 500000-line phonelines.txt file to test with and, more than that, it depends on the computer on which you're running these commands. But one thing I am very sure of is that both sed and awk are very fast at what they do.
As for the description of how the sed is generating the "if-statements", I would refer you to the description of "Regular Expression" either on the internet or in the man pages.
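As a quick starting point, this small sketch shows what the two \(.*\) groups capture. The input lines and the "name:"/"number:" labels are made up for illustration; the pattern is the same one used above.

```shell
# \(.*\) \(.*\) splits the line at its LAST space:
# \1 is everything before it, \2 is everything after it.
echo 'London 4420' | sed 's=\(.*\) \(.*\)=name:\1 number:\2='
# name:London number:4420

# Because the first .* is greedy, a name containing spaces stays in \1.
echo 'New York 1212' | sed 's=\(.*\) \(.*\)=name:\1 number:\2='
# name:New York number:1212
```

The "=" right after the "s" is just the delimiter; using it instead of the usual "/" lets the replacement contain slashes such as /^\2/ without escaping them.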
Good Luck;)