File Reference Matching

ChicagoBlues · July 10, 2015, 4:16pm

Hi,

I need help with the below scenario in Bash scripting:

I have two files:
tmp.orig with the below contents

"TEST1"
"TEST2"
"TEST5"
"TEST6"

and tmp.new with below contents

"TEST1;921"
"TEST2;34"
"TEST3;2"
"TEST4;33"
"TEST5;98"
"TEST6;1"

I need to match the keys in tmp.orig against tmp.new and print the matching lines from tmp.new, producing the output below:

"TEST1;921"
"TEST2;34"
"TEST5;98"
"TEST6;1"

I would greatly appreciate your help.
Note: Although the problem is real, these are dummy values and filenames.

Thanks.

Don_Cragun · July 10, 2015, 5:35pm

This is pretty similar to the thread you posted in 2009: Matching key fields

What have you tried to solve this problem on your own?

What operating system are you using?

ChicagoBlues · July 10, 2015, 6:43pm

Yeah, that code is not going to work here. This is bash in Sun OS, and that was ksh in AIX.

Don_Cragun · July 10, 2015, 8:08pm

I repeat: "What have you tried to solve this problem on your own?"

bakunin · July 11, 2015, 4:08am

If I read the code in the other thread correctly, the suggestion was to use

fgrep -f <file-with-keys> <file-to-search>

Which part of this is shell-dependent? Have you even read the other thread?

bakunin

Don_Cragun · July 11, 2015, 5:08am

The fgrep solution from that thread won't work exactly because the quotes in the files won't match properly.
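To make the quoting problem concrete, here is a minimal sketch that recreates the sample files from post #1 in a scratch directory and counts the lines a fixed-string search selects. The mktemp -d location and the variable names are assumptions for illustration; grep -F is the POSIX spelling of fgrep.

```shell
# Recreate the sample data from post #1 in a scratch directory (hypothetical location).
workdir=$(mktemp -d)
printf '%s\n' '"TEST1"' '"TEST2"' '"TEST5"' '"TEST6"' > "$workdir/tmp.orig"
printf '%s\n' '"TEST1;921"' '"TEST2;34"' '"TEST3;2"' '"TEST4;33"' '"TEST5;98"' '"TEST6;1"' > "$workdir/tmp.new"

# Every key still carries its closing quote, so the fixed string "TEST1"
# (quotes included) never occurs inside "TEST1;921".
# grep exits non-zero when nothing matches, hence the "|| true".
fgrep_count=$(grep -F -c -f "$workdir/tmp.orig" "$workdir/tmp.new") || true
echo "lines matched: $fgrep_count"
```

Stripping the quotes from the keys, or matching on a field instead of the whole string, avoids the mismatch.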

There was also an awk suggestion in that thread. It won't work exactly either, but would be a good starting point if the field separators were modified.

I had just hoped that with our help in processing 41 requests over these last 7.5 years, ChicagoBlues would be willing to show us that some attempt had been made to come up with a solution for this relatively simple task before using the UNIX & Linux Forums as an unpaid programming staff. (However, I am disappointed at some of the suggestions that have been given to ChicagoBlues over the years.)

bakunin · July 11, 2015, 7:42am

You are right, and I perhaps should have done a better job of expressing myself. Regardless of whether the solution is usable as is or only after some tweaking,

fgrep -f <file-with-keys> <file-to-search>

or any variation thereof will be independent of the shell employed. Therefore pointing out that this is bash on Sun OS while the other thread was ksh on AIX was about as astute as "yes, but it was mentioned on a Friday and today is Saturday".

Absolutely correct, but this (or any solution derived from it) would have nothing to do with the shell or the system - just with the willingness to undergo the effort to actually modify it into a fitting form.

Spot on on both accounts!

bakunin

Don_Cragun · July 11, 2015, 3:27pm

You might note that in post #2 in this thread I didn't ask about the shell being used (partly because it doesn't matter here and partly because bash had already been specified).

I did ask about the OS, and it does matter here if an awk solution is appropriate (as in the suggestion I would provide if ChicagoBlues showed us that any effort had been put into this problem) since on Solaris systems awk would need to be changed to /usr/xpg4/bin/awk or nawk.
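One way a script could cope with this is to pick the awk binary at run time. The sketch below is only an illustration: the AWK variable name is an assumption, and it uses $(...) and command -v, so it is meant for bash or another POSIX shell rather than the Solaris /bin/sh.

```shell
# Prefer the POSIX awk on Solaris, fall back to nawk, then to whatever awk is on PATH.
if [ -x /usr/xpg4/bin/awk ]; then
    AWK=/usr/xpg4/bin/awk
elif command -v nawk >/dev/null 2>&1; then
    AWK=nawk
else
    AWK=awk
fi

# Sanity check: split a sample line on double quotes and semicolons.
# Field 1 is the empty string before the opening quote, field 2 is the key.
second_field=$(printf '%s\n' '"TEST1;921"' | "$AWK" -F'[";]' '{ print $2 }')
echo "$AWK -> $second_field"
```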

Also, the shell does matter more on Solaris systems than on many other systems since /bin/sh there is a pure Bourne shell lacking several standard shell parameter expansions, arithmetic expansions, and the $(command) form of command substitution.
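As a rough illustration of what that means in practice, the following sketch uses only forms that a pre-POSIX Bourne shell accepts and that bash runs unchanged. The path and variable names are made up for the example.

```shell
path=/tmp/data/tmp.orig

# basename utility instead of the ${path##*/} parameter expansion
file_name=`basename "$path"`

# expr instead of the $((4 + 2)) arithmetic expansion
total=`expr 4 + 2`

# backticks instead of the $(command) form of command substitution
summary=`echo "$file_name $total"`
echo "$summary"
```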

ChicagoBlues · July 13, 2015, 11:39am

A simple grep against the whole file worked for me (within the loop). Initially, I wasn't thinking of wrapping it in a loop. If you have a more efficient solution, then please share.

  for platform in $(cat "$DATA_OUT"/tmp_ALLL_Platforms.txt); do
    warning=$(grep "$platform" "$DATA_OUT"/ALL_warnings.txt)

    ... rest of the code

  done

Don_Cragun · July 13, 2015, 4:02pm

For the original problem you posed in post #1 in this thread, the following works perfectly:

/usr/xpg4/bin/awk -F'[";]' '
FNR == NR {
	t[$2] = $0
	next
}
{	print t[$2]
}' tmp.new tmp.orig

But, since the data you showed us in post #1 would not find any matches with the code you showed us in post #9, I have to assume that the data you showed us is not representative of your actual data.
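A side note on the awk script above: for a key in tmp.orig that has no entry in tmp.new, it prints an empty line. If that is not wanted, a guarded variant might look like the sketch below. It is an assumption-laden example, not a replacement: it calls plain awk (substitute /usr/xpg4/bin/awk or nawk on Solaris), builds the sample files in a scratch directory, and adds a hypothetical "TEST9" key to show the unmatched case.

```shell
workdir=$(mktemp -d)
# "TEST9" is a made-up key with no counterpart in tmp.new.
printf '%s\n' '"TEST1"' '"TEST2"' '"TEST5"' '"TEST6"' '"TEST9"' > "$workdir/tmp.orig"
printf '%s\n' '"TEST1;921"' '"TEST2;34"' '"TEST3;2"' '"TEST4;33"' '"TEST5;98"' '"TEST6;1"' > "$workdir/tmp.new"

matched=$(awk -F'[";]' '
FNR == NR {     # first file (tmp.new): remember each full line under its key
    t[$2] = $0
    next
}
$2 in t {       # second file (tmp.orig): print only keys that were seen
    print t[$2]
}' "$workdir/tmp.new" "$workdir/tmp.orig")
printf '%s\n' "$matched"
```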

Aia · July 13, 2015, 4:40pm

A for loop is not good when you need to read from a file. You could eliminate an extra call to the cat command by using a while loop.

while IFS= read -r platform; do
    : # do something here with "$platform"
done < "${DATA_OUT}"/tmp_ALLL_Platforms.txt
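Besides the extra cat, the two loop forms also split their input differently: for over $(cat file) iterates over whitespace-separated words, while read iterates over lines. A small sketch with made-up data (the file name and its contents are assumptions) shows the difference:

```shell
workdir=$(mktemp -d)
# Two lines, each containing a space (hypothetical platform names).
printf '%s\n' 'platform one' 'platform two' > "$workdir/platforms.txt"

# The for loop sees every whitespace-separated word as its own item.
for_count=0
for item in $(cat "$workdir/platforms.txt"); do
    for_count=$((for_count + 1))
done

# The while loop sees one item per line.
while_count=0
while IFS= read -r line; do
    while_count=$((while_count + 1))
done < "$workdir/platforms.txt"

echo "for: $for_count items, while: $while_count items"
```

With one word per line, as in the sample data of this thread, both loops see the same items.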

Concerning your request.

tr -d \" < tmp.orig | grep -Ff - tmp.new
"TEST1;921"
"TEST2;34"
"TEST5;98"
"TEST6;1"
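One caveat with a fixed-string search on the bare keys: it matches substrings, so a key such as TEST1 would also select a line that starts with TEST10. If that can happen in the real data, anchoring each key may help. The sketch below is one possible approach, not the only one; the "TEST10;7" line and the file names are made up, and the sed step assumes the keys contain no regular-expression metacharacters.

```shell
workdir=$(mktemp -d)
printf '%s\n' '"TEST1"' > "$workdir/tmp.orig"
# "TEST10;7" is a hypothetical line added to show the substring problem.
printf '%s\n' '"TEST1;921"' '"TEST10;7"' > "$workdir/tmp.new"

# Substring match: the bare key TEST1 is also found inside "TEST10;7".
tr -d '"' < "$workdir/tmp.orig" > "$workdir/keys.plain"
loose=$(grep -F -f "$workdir/keys.plain" "$workdir/tmp.new")

# Anchored match: rewrite "KEY" as the pattern ^"KEY; so the key must start
# the line and be followed directly by the separator.
sed 's/^"\(.*\)"$/^"\1;/' "$workdir/tmp.orig" > "$workdir/keys.anchored"
strict=$(grep -f "$workdir/keys.anchored" "$workdir/tmp.new")

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
```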