Finding difference between two columns of unequal length

jamie_123 · November 28, 2014, 4:49am

Hi,

I have two files which look like this

cat waitstate.txt 
18.2
82.1

cat gostate.txt 
5.6
5.8
6.1
6.3
6.6
6.9
7.2
7.5
7.7
9.7
22.4
27.1
30.4
33.4
37.5
46.6
50.7
89.4

I want to compute differences between values in these two files. If the subtraction is (A-B), then B is a value from the waitstate file,
and A is the smallest value in the gostate file that is larger than B.

In this case, the solutions should be 22.4-18.2 and 89.4-82.1.

I am not sure how to generate the subset of file 2 based on file 1.

Much appreciated!

RudiC · November 28, 2014, 5:20am

What is the desired output? Try

awk 'FNR==NR {T[NR]=$1; CNT=1; next} {D=$1-T[CNT];if (D>0) {print D, $1, T[CNT]; CNT++}}' file1 file2
4.2 22.4 18.2
7.3 89.4 82.1

In case there are more values in file2 after the last match, try

awk     'FNR==NR        {T[NR]=$1; CNT=1; MAX=NR; next}
                        {D=$1-T[CNT]
                         if (D>0) {print D, $1, T[CNT]; CNT++}}
         CNT > MAX      {exit}
        ' file1 file2
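
Note that both commands above walk file2 once with a single counter, so they appear to rely on file1 and file2 each being in ascending numeric order, as in the sample data. A minimal sketch for unsorted input, assuming a POSIX shell with sort -n and mktemp available; the shuffled sample values, the temporary directory and the .sorted file names are only illustrative. The while loop is an addition, not part of the commands above, so that one file2 value can serve several consecutive file1 values.

```shell
#!/bin/sh
# Sketch: sort both inputs numerically, then apply the same single-pass idea.
# The data below is a shuffled subset of the question's sample values.
dir=$(mktemp -d)
printf '%s\n' 82.1 18.2 > "$dir/waitstate.txt"
printf '%s\n' 89.4 5.6 22.4 27.1 9.7 50.7 > "$dir/gostate.txt"

# LC_ALL=C keeps "." as the decimal point regardless of the user's locale.
LC_ALL=C sort -n "$dir/waitstate.txt" > "$dir/waitstate.sorted"
LC_ALL=C sort -n "$dir/gostate.txt"   > "$dir/gostate.sorted"

result=$(awk '
    BEGIN     { CNT = 1 }
    # First file: remember every B value and how many there are.
    FNR == NR { T[NR] = $1; MAX = NR; next }
    # Second file: stop as soon as every B value has found its A.
    CNT > MAX { exit }
    {
        # Addition to the original: keep matching while this value exceeds
        # the next pending B value.
        while (CNT <= MAX && $1 - T[CNT] > 0) {
            print $1 - T[CNT], $1, T[CNT]
            CNT++
        }
    }
' "$dir/waitstate.sorted" "$dir/gostate.sorted")

echo "$result"
rm -rf "$dir"
```

With the shuffled sample this should again pair 22.4 with 18.2 and 89.4 with 82.1.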

jamie_123 · November 28, 2014, 5:21am

Works like a treat @RudiC. Thanks!

Scrutinizer · November 28, 2014, 5:28am

Alternatively:

awk '$1+0>p+0{if(NR>1)print $1-p,$1 "-" p; if(!((getline p<f)>0)) exit}'  f=file1 file2
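
The one-liner above is dense, so here is an expanded sketch of the same getline idea, assuming both files are sorted in ascending order. The names cur and waitfile and the temporary files are illustrative. Reading the first waitstate value in a BEGIN block is a change from the one-liner, which instead relies on the first gostate value being greater than zero.

```shell
#!/bin/sh
# Sketch: read waitstate values one at a time with getline while awk
# streams through the gostate file. The gostate data is a subset of the sample.
dir=$(mktemp -d)
printf '%s\n' 18.2 82.1 > "$dir/waitstate.txt"
printf '%s\n' 5.6 9.7 22.4 27.1 50.7 89.4 > "$dir/gostate.txt"

# -v is used because the variable must already be set inside BEGIN.
result=$(awk -v waitfile="$dir/waitstate.txt" '
    # Load the first waitstate value before any gostate line is read.
    BEGIN { if ((getline cur < waitfile) <= 0) exit }

    # The first gostate value larger than the current waitstate value matches.
    $1 + 0 > cur + 0 {
        print $1 - cur, $1 "-" cur
        # Fetch the next waitstate value; stop when none are left.
        if ((getline cur < waitfile) <= 0) exit
    }
' "$dir/gostate.txt")

echo "$result"
rm -rf "$dir"
```

As in the one-liner, each gostate value is used for at most one waitstate value.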

Akshay_Hegde · November 28, 2014, 10:43am

@Scrutinizer: elegant solution.

This is not a one-liner, but I wanted to share this function; you can try it:

awk '

# Array, nearest neighbour using simple min-max method
function _get_keyMnMx(Arr,key,arg,   low,upp,i)
{	
	for(i in Arr)
	{ 
		# Make it numeric
		i += 0

		if(i < key)
		{ 
			low =  (low !="" && low > i) ? low : i
		}
		if(i > key)
		{   
			upp =  (upp !="" && upp < i) ? upp : i
		}  
	}		    
	low  =  (low == "") ? "NaN" : low 
	upp  =  (upp == "") ? "NaN" : upp 
			 
	return ( tolower(arg)=="up" ? upp : tolower(arg) == "dw" ? low : low " " upp )	 
}

# Reading your gostate.txt
FNR==NR{
	Array[$1]
	next
}

#  Reading waitstate.txt
{
	# smallest gostate.txt value that is larger than this waitstate.txt value
	largerthanthis = _get_keyMnMx(Array,$1,"up")


	print $1, largerthanthis, largerthanthis - $1
}
   ' gostate.txt waitstate.txt

Result:

18.2 22.4 4.2
82.1 89.4 7.3