awk comparison and substitution

Hi,
here's my - not so easy to describe - problem: I want to compare the values of one file (FileA) with a cutoff-value and, if this comparison is true, substitute those values with those in the second file (FileB). However, there are many FileA's (FileA[1->200]), whereas there is only one FileB. Every FileA has three lines, each containing one value.

3.4
3.5
3.6

FileB has 3 columns and > 200 lines.

0.0154    0.0139    0.0227
0.0198    0.0259    0.0231
0.0126    0.0216    0.0174
0.0115    0.0145    0.0237
...            ...            ...

The three values of FileA1 should now be compared line by line with the cutoff value. If the comparison is true, the corresponding value of FileB should be assigned. However, those corresponding values are all contained within a single line of FileB. So I now need some kind of script that substitutes line x of FileA[1] (if the value is > or < the cutoff) with the field in line 1, column x of FileB.
My script so far:

# getting number of line of FileB in which values of FileA are contained

n=`echo "$1" |sed 's/.*\([0-9]\{1,3\}\).*/\1/'`

# comparison and substitution

awk -v val=$n '{
    getline < "$1"
        for(i=1; i<=NF; i++){
            if($i >= 3.5){
                print $i 
            }
        else{
            getline < "FileB.txt"
            NR==n {print $i}
            }
        }
        } ' $1 FileB.txt > $1_new.txt

Since I'm a beginner in awk, it's very intuitive aaaand - of course - doesn't work.

output should look something like this:

0.0154
3.5
3.6

Any help would be greatly appreciated!

waddle

I still don't understand why 3.4 is chosen and replaced by 0.0154. Why not 3.5 or 3.6? What's the cutoff value for each line in FileB?

Can you explain in more detail?

The script should check each line of FileA[1] for a value greater than or equal to a cutoff, in this example 3.5. If this is true, the original value should be printed; if it's false, that specific value (in this example, line 1 of FileA[1]) should be replaced with the corresponding value in FileB, here line 1 (since it's FileA[1]), column 1 (first value in FileA). Again, notice that the corresponding values are listed in one column in FileA and in one line in FileB.

I hope it's easier to understand now...

Try this:

# cat FileA1
3.4
3.5
3.6
# cat FileA2
3.1
3.3
3.8
awk -v val=3.5 '
NR==FNR{o[NR""1]=$1
        o[NR""2]=$2
        o[NR""3]=$3
        next}
{
if ( FILENAME  != lFN ) 
   L++
val+=0
cmp=$1+0
if ( val <= cmp ) 
   print 
else 
   print o[L""FNR] 
lFN=FILENAME}' FileB FileA*
0.0154
3.5
3.6
0.0198
0.0259
3.8
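
For anyone who finds the command above hard to follow, here is a commented sketch of the same idea. It is not Klashxx's exact code: it assumes awk's comma subscripts (o[NR, c]) in place of the concatenated key, and it uses FNR == 1 rather than tracking FILENAME to notice a new file. The temporary directory and the two-line FileB are made up from the sample data in this thread.

```shell
#!/bin/sh
# Sketch only: sample files are written to a throwaway directory.
tmp=$(mktemp -d) || exit 1

printf '%s\n' '0.0154 0.0139 0.0227' '0.0198 0.0259 0.0231' > "$tmp/FileB"
printf '%s\n' 3.4 3.5 3.6 > "$tmp/FileA1"
printf '%s\n' 3.1 3.3 3.8 > "$tmp/FileA2"

result=$(awk -v val=3.5 '
# First file (FileB): NR equals FNR only while the first file is read.
# Store every field under the key (line number, column number).
NR == FNR { for (c = 1; c <= NF; c++) o[NR, c] = $c; next }

# Remaining files (FileA*): FNR restarts at 1 for each new file,
# so FNR == 1 marks the start of the next FileA and L counts the files.
FNR == 1 { L++ }

# Keep values at or above the cutoff; otherwise look up line L, column FNR.
{ if ($1 + 0 >= val + 0) print $1; else print o[L, FNR] }
' "$tmp/FileB" "$tmp/FileA1" "$tmp/FileA2")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Because L simply counts the FileA's in the order they are given, the n-th file on the command line is matched with line n of FileB, whatever its name is.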

Thanks for your time and work Klashxx,

the code works, though I now have some new problems

  1. i don't really understand it (but i'll try to)
  2. the code only works when i paste it into the shell, not when i try to run the script (but that's not the main point)
  3. i have to type in all FileA's (>200) consecutively (see 4.) (also not the main point)
  4. i can't compare one FileA individually: if I take FileA[145], the values (if necessary) become substituted with those of line 1 from FileB, not with line 145

Thanks again,
waddle

The basic thing here is the naming of the files: you need a constant pattern, say FileA1, FileA2, FileA3, ... FileA200.

# cat FileA4
3.6
3.3
3.8
# cat ren.sh             
#!/usr/bin/ksh

value="${1}"
patFileA="${2}"
FileB="${3}"

awk -v val="${value}" '
NR==FNR{o[NR""1]=$1
        o[NR""2]=$2
        o[NR""3]=$3
        next}
{
if ( FILENAME  != lFN ) 
   extF=substr(FILENAME,match(FILENAME,/[0-9]/))
val+=0
cmp=$1+0
if ( val <= cmp ) 
   print 
else 
   print o[extF""FNR] 
lFN=FILENAME}' ${FileB} ${patFileA}*
# ren.sh 3.5 FileA4 FileB
3.6
0.0145
3.8

Use

ren.sh 3.5 FileA FileB

to process all the files.

Hi Klashxx,
I took your code and modified it slightly for my purposes. The files indeed have a pattern in their naming: [0-9]{1,3}[A-Z]{3}. Your code gives me correct output only for those FileA's in which the comparison is true.
For FileA no. 2, however, I always get the same output: if the comparison is true, I get the three values contained in this file plus as many "2"s as FileB has lines. If the comparison is not true, I get as many empty lines as FileB has lines.
Do you have any explanation for this finding?
cheers,
waddle

Post the example file (content and name) that generates the wrong result.

ok, so here are all my files:

FileA's:
1ABC:

3.75289
3.74839
3.74117

2DEF:

3.45011
3.44657
3.46905

3GHI:

3.27445
3.27389
3.30938

etc. etc.

FileB:

0.0154    0.0139    0.0227
0.0198    0.0259    0.0231
0.0126    0.0216    0.0174
0.0115    0.0145    0.0237
0.0146    0.0124    0.0149
0.0128    0.0142    0.0161
... 
# cat ren.sh             
#!/usr/bin/ksh

echo "Please choose cutoff value", read cut

FileB="${1}"
patFileA="${2}"


awk -v val=cut '
NR==FNR{o[NR""1]=$1
        o[NR""2]=$2
        o[NR""3]=$3
        next}
{
if ( FILENAME  != lFN ) 
   extF=substr(FILENAME,match(FILENAME,/[0-9]/))
val+=0
cmp=$1+0
if ( val <= cmp ) 
   print 
else 
   print o[extF""FNR] 
lFN=FILENAME}' ${FileB} ${patFileA}*

sorry to annoy you that much,
thanks so far
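
One detail in the script above may explain part of the behaviour: `-v val=cut` hands awk the literal word cut rather than the contents of the shell variable, and `val+=0` then turns that word into 0. A minimal sketch of the difference, where the value 3.45 is only an example standing in for whatever `read cut` would store:

```shell
#!/bin/sh
cut=3.45   # example cutoff (assumption), as if typed at the prompt

# Literal word "cut": awk converts the non-numeric string to 0.
literal=$(awk -v val=cut 'BEGIN { val += 0; print val }')

# Expanded shell variable: awk receives 3.45.
expanded=$(awk -v val="$cut" 'BEGIN { val += 0; print val }')

printf 'val=cut    gives %s\n' "$literal"
printf 'val="$cut" gives %s\n' "$expanded"
```

With a cutoff of 0, the test `val <= cmp` is true for every positive value, so nothing would ever be substituted. Note also that `echo "Please choose cutoff value", read cut` only prints its arguments; `read cut` has to be a command of its own for the variable to be set.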

OK, you have to adjust the regex to match the file pattern.

# ls [0-9]*[A-Z]*            
1ABC  2DEF  3GHI
#!/usr/bin/ksh

echo "Please choose cutoff value: \c"
read cut

FileB="${1}"
patFileA="${2}"


awk -v val="${cut}" '
NR==FNR{o[NR""1]=$1
        o[NR""2]=$2
        o[NR""3]=$3
        next}
{
if ( FILENAME  != lFN ) {
   # call match() first so RSTART/RLENGTH are set before substr() uses them
   match(FILENAME,/^[0-9]*/)
   extF=substr(FILENAME,RSTART,RLENGTH)
}
val+=0
cmp=$1+0
if ( val <= cmp ) 
   print 
else 
   print o[extF""FNR] 
lFN=FILENAME}' ${FileB} ${patFileA}*
# ren.sh FileB "[0-9]*[A-Z]*"
Please choose cutoff value: 3.45
3.75289
3.74839
3.74117
3.45011
0.0259
3.46905
0.0126
0.0216
0.0174

For an individual file:

# ren.sh FileB 1ABC          
Please choose cutoff value: 3.75
3.75289
0.0139
0.0227
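
To see what adjusting the regex changes, the two ways of extracting the number from a file name can be compared side by side. This is only a sketch: no files are read, the name 145XYZ is a made-up example of the [0-9]{1,3}[A-Z]{3} pattern, and the variable names from_digit and leading are hypothetical.

```shell
#!/bin/sh
# Compare both extraction expressions on a few sample file names.
result=$(awk 'BEGIN {
    n = split("1ABC 2DEF 145XYZ FileA4", names, " ")
    for (i = 1; i <= n; i++) {
        f = names[i]
        # Earlier version: everything from the first digit to the end.
        from_digit = substr(f, match(f, /[0-9]/))
        # Adjusted version: only the run of digits at the start of the name.
        match(f, /^[0-9]*/)
        leading = substr(f, RSTART, RLENGTH)
        printf "%s old=%s new=%s\n", f, from_digit, leading
    }
}')
printf '%s\n' "$result"
```

For a name like 2DEF the earlier expression returns the whole name, so the lookup key becomes 2DEF1, 2DEF2, ... and nothing was stored under those keys. awk then prints an empty string, which would be consistent with the empty lines reported earlier in the thread. The adjusted expression in turn returns nothing for names like FileA4, so each naming scheme needs its own regex.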

Hi Klashxx,
I applied your code exactly, but it just didn't work out for me. I still got some incorrect output with a lot of empty lines. It's OK though, I solved it with another (more artless) bash script.
Thanks anyway!