Find fields and replace using awk

Roozo · November 14, 2013, 2:43am

Code:
Using ksh

Var1=`awk -F";" '{print $1}' Input2.txt`
 
cat Input1.txt |  awk -F";"  '{$3="Var1"}' > Output.txt

Franklin52 · November 14, 2013, 3:26am

Assuming both files, Input1.txt and Input2.txt, have 1 row:

awk 'NR==FNR{var=$1;next}{$3=var}1' Input2.txt FS=; OFS=; Input1.txt > Output.txt

Roozo · November 14, 2013, 7:15am

After executing the command

awk 'NR==FNR{var=$1;next}{$3=var}1' Input2.txt FS=; OFS=; Input1.txt > Output.txt

I get the error below:

Input1.txt : aster :not found 
Input1.txt :PAGE  :not found
Input1.txt:1233   :not found
Input1.txt: 4 :not found
Input1.txt:1222  :not found

CarloM · November 14, 2013, 7:43am

Try

awk 'NR==FNR{var=$1;next}{$3=var}1' Input2.txt FS=";" OFS=";" Input1.txt > Output.txt
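
The quoting matters because an unquoted `;` is a command separator to the shell, so awk never receives the FS/OFS assignments or Input1.txt as arguments. A minimal sketch of the quoted form, using throwaway files whose contents are assumed for illustration (the single value in Input2.txt is hypothetical):

```shell
# Sample data is assumed for illustration; adjust to your real files.
dir=$(mktemp -d)
printf '1111\n' > "$dir/Input2.txt"
printf 'aster;PAGE;1233;4;1222\n' > "$dir/Input1.txt"

# FS and OFS are quoted, so the shell hands them to awk as operands.
# awk applies each assignment just before it opens the next file,
# which means Input2.txt is still split on whitespace.
out=$(awk 'NR==FNR{var=$1;next}{$3=var}1' "$dir/Input2.txt" FS=";" OFS=";" "$dir/Input1.txt")
echo "$out"
rm -rf "$dir"
```

With this sample data the third field of Input1.txt is replaced by the first field of Input2.txt.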

Yoda · November 14, 2013, 11:22am

Are you trying to join three files?

If yes, something like this might work for the input that you posted:

awk -F'[; ]' '
        NR == FNR {
                i = $2
                next
        }
        !(j) {
                j = $1
                next
        }
        {
                $3 = j
                $5 = i
        }
        1
' OFS=';' config.txt Input2.txt Input1.txt

Roozo · November 19, 2013, 2:23am

Thanks Yoda, your command joined the fields well!

It's taking the 2nd field from config.txt and replacing $5 in Input1.txt. But Input2.txt has two fields: for the first line "1111 A", the command should compare "A" with config.txt and take "DART" from the matching "A DART" line to replace $5 in Input1.txt.

1111 -- input to replace
A -- to compare with config.txt and take the desired field, say "DART"

Example: Input files

Input1.txt:

aster;PAGE;1233;4;1222

Input2.txt

1111 A
1234 B
2355    C

config.txt:

A DART
B MART
B KART
B KAR2
C HOME

So Output is:

aster;PAGE;1111;4;DART
aster;PAGE;1234;4;MART
aster;PAGE;1234;4;KART
aster;PAGE;1234;4;KAR2
aster;PAGE;2355;4;HOME

I tried the code below, but it fails with an error:

awk -F'[; ]' '
NR == FNR {
i = $2
p = $1
}
{
n=awk '$1==2 {print $2}' 
next
}
{
$2 = p
$3 = need
}
1
' OFS=';' Input2.txt config.txt Input1.txt

Akshay_Hegde · November 19, 2013, 3:50am

For given input in #1 this will work

$ cat config.txt
A DART

$ cat input2.txt
1111 A

$ cat input1.txt
aster;PAGE;1233;4;1222

 
awk -F'[ ;]' '
             FNR == 1{
                       ++i
                     } 
               i == 1{
                       replace1 = $2
                     } 
               i == 2{
                       replace2 = $1
                     } 
               i == 3{
                       $3 = replace2
                       $5 = replace1
                       j=1
                     }j
              ' OFS=";" config.txt  input2.txt input1.txt

Resulting

aster;PAGE;1111;4;DART

neutronscott · November 19, 2013, 4:03am

mute@thedoctor:~/temp/roozo$ ./script
aster;PAGE;1234;4;KART
aster;PAGE;2355;4;HOME
aster;PAGE;1234;4;KAR2
aster;PAGE;1234;4;MART
aster;PAGE;1111;4;DART

#!/bin/sh
awk '
        FNR==1{file++}
        file==1{a[$2]=$1;next}
        file==2{b[$2]=$1;next}
        {
                for (i in a) {
                        print $1,$2,b[a[i]],$4,i
                }
        }
' config input2 FS=';' OFS=';' input1
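
The `FNR==1{file++}` line is what lets one awk program treat each input file differently: FNR restarts at 1 for every file, so the counter goes up once per file. A minimal sketch with two throwaway files (their names and contents are made up for illustration):

```shell
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/first"
printf 'c\n' > "$dir/second"

# Print the file counter, the per-file line number and the line itself.
out=$(awk 'FNR==1{file++} {print file ":" FNR ":" $0}' "$dir/first" "$dir/second")
echo "$out"
rm -rf "$dir"
```

Note that an empty file has no line 1, so it does not advance the counter.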

Akshay_Hegde · November 19, 2013, 4:28am

For the input in post #6, if the order is important, you may try:

$ cat config.txt 
A DART
B MART
B KART
B KAR2
C HOME

$ cat input2.txt 
1111 A
1234 B
2355 C

$ cat input1.txt 
aster;PAGE;1233;4;1222

awk -F'[ ;]' '
              FNR == 1{
                        ++i
                      } 
                i == 1{
                         R1[$2]=$1
                      } 
    i==2 && ($1 in R1){
                          S[FNR]= R1[$1] OFS $2
                          s = FNR
                      } 
               i == 3 {
                            for(j =1; j<=s ; j++){
                                                      split(S[j],A,OFS)
                                                      $3 = A[1]
                                                      $5 = A[2]    
                                                      print $0
                                                 }
                      } 
              ' OFS=";" input2.txt config.txt input1.txt

Resulting

aster;PAGE;1111;4;DART
aster;PAGE;1234;4;MART
aster;PAGE;1234;4;KART
aster;PAGE;1234;4;KAR2
aster;PAGE;2355;4;HOME
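
One limitation worth noting: R1 is indexed by the letter alone, so if input2.txt maps two numbers to the same letter, the later assignment replaces the earlier one. A minimal sketch of that overwrite (the extra `9999 B` row is a hypothetical addition to the sample input):

```shell
# input2-style data with two numbers for letter B (hypothetical)
out=$(printf '1111 A\n1234 B\n9999 B\n' |
  awk '{ R1[$2] = $1 }          # key B is assigned twice, last one wins
       END { print R1["A"], R1["B"] }')
echo "$out"
```

Keeping every number would need a compound subscript, for example `R1[$2, count[$2]++]`, instead of the letter on its own.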

Roozo · November 19, 2013, 11:26pm

Thanks Akshay...

$ cat config.txt 
A DART
B MART
B KART
B KAR2
C HOME
 
$ cat input2.txt 
1111 A
1234 B
9999 B
2355 C
 
$ cat input1.txt 
aster;PAGE;1233;4;1222

When I provide more than one number for the same letter (like below), it considers only the last one, "9999 B", when producing the result.

$ cat input2.txt 
1111 A
1234 B
9999 B
2355 C
 
Output (only the last parsed number for B, "9999", is used):
aster;PAGE;1111;4;DART
aster;PAGE;9999;4;MART
aster;PAGE;9999;4;KART
aster;PAGE;9999;4;KAR2
aster;PAGE;2355;4;HOME

I also tried to increment "$4" using a third file, but the increment fails in the output. I am not getting the output below:

Input3.txt 
1111 2
1234 1
9999 3
2355 4
 
Output: 
aster;PAGE;1111;3;DART
aster;PAGE;1234;2;MART
aster;PAGE;1234;3;KART
aster;PAGE;1234;4;KAR2
aster;PAGE;9999;4;MART
aster;PAGE;9999;5;KART
aster;PAGE;9999;6;KAR2
aster;PAGE;2355;5;HOME

And Akshay, can you explain the command you posted?

neutronscott · November 20, 2013, 1:44am

wow this is just getting more and more complex! This was fun. Would be more understandable if I knew what I was working with and could name the variables something..

mute@thedoctor:~/temp/roozo$ ./script
aster;PAGE;1111;3;DART
aster;PAGE;1234;2;MART
aster;PAGE;1234;3;KART
aster;PAGE;1234;4;KAR2
aster;PAGE;9999;4;MART
aster;PAGE;9999;5;KART
aster;PAGE;9999;6;KAR2
aster;PAGE;2355;5;HOME

#!/bin/sh
awk '
  FNR==1{file++}
  file==1{if (!d[$1]) a[b++]=$1;c[$1,d[$1]++]=$2;next}
  file==2{e[$2,f[$2]++]=$1;next}
  file==3{g[$1]=$2;next}
  {
    for (i=0;i<b;i++)
      for (j=0;e[a[i],j];j++)
        for (k=0;c[a[i],k];k++)
          print $1,$2,e[a[i],j],++g[e[a[i],j]],c[a[i],k]
  }
' config input2 input3 FS=';' OFS=';' input1

Roozo · November 20, 2013, 2:18am

Neutronscott, it is working well.
I have just started working with awk and have been trying more examples...
Could you explain the script? That would help me understand more about awk.

neutronscott · November 20, 2013, 2:36am

I really wish I could but I made it rather obscure! Here I try to comment the code..

#!/bin/sh
awk '
  #each time we encounter line 1 of a file, increment the "file"
  #variable, so we know which file we are in.
  FNR==1{file++}

  #inside file1 (config), we keep the order of first column (i.e: A,B,C)
  #if it is a duplicate (its key already exists in d[]), we omit adding to a[]
  #and incrementing b. so that
  #A,B,B,B,C looks like
  # a[0]=A
  # a[1]=B
  # a[2]=C
  #That will make d[] (which is the count of each individually):
  # d[A]=1
  # d[B]=3
  # d[C]=1
  #and finally the meat of our work, the c[] array:
  # c[A,0]=DART
  # c[B,0]=MART
  # c[B,1]=KART
  # c[B,2]=KAR2
  # c[C,0]=HOME
  file==1{if(!d[$1])a[b++]=$1;c[$1,d[$1]++]=$2;next}

  #input2: maps the letters to a number (phone extension?)
  #we simply stuff it all in another array, e[]
  #using f[] like we used d[] before to keep count of each individually
  # e[A,0]=1111
  # e[B,0]=1234
  # e[B,1]=9999
  # e[C,0]=2355
  file==2{e[$2,f[$2]++]=$1;next}

  #input3: simplest one. hold a count for each number, with the
  #starting number from input3
  # g[1111]=2
  # g[1234]=1
  # g[9999]=3
  # g[2355]=4
  file==3{g[$1]=$2;next}

  #this is our processing function. we stored all the data we need into
  #arrays. now lets do work. and a lot of confusing work!
  #there is no condition listed. so this block executes for every line
  #(where file>3 since those condition blocks used "next" to skip all later
  #blocks from processing).
  {
    #i will be used to index a[]
    #go through each letter (A,B,C) in the order obtained
    for (i=0;i<b;i++)
      #j indexes e[], which maps e.g. A->1111
      for (j=0;e[a[i],j];j++)
        #k indexes c[], which maps e.g. A->DART
        for (k=0;c[a[i],k];k++)
          print $1,$2,e[a[i],j],++g[e[a[i],j]],c[a[i],k]
  }
#we start with default FS and OFS, but change to FS=; only for input1
' config input2 input3 FS=';' OFS=';' input1
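
For readers who find the one-letter array names hard to follow, here is a sketch of the same idea with descriptive names. The names (letters, name, number, counter and the per-letter counts) are my own guesses at what the data means, not anything confirmed in this thread, and the sample files are recreated in a temporary directory so the block runs on its own:

```shell
dir=$(mktemp -d)
printf 'A DART\nB MART\nB KART\nB KAR2\nC HOME\n' > "$dir/config"
printf '1111 A\n1234 B\n9999 B\n2355 C\n' > "$dir/input2"
printf '1111 2\n1234 1\n9999 3\n2355 4\n' > "$dir/input3"
printf 'aster;PAGE;1233;4;1222\n' > "$dir/input1"

out=$(awk '
  FNR == 1 { file++ }
  # config: remember letters in first-seen order and every name per letter
  file == 1 {
    if (!($1 in nnames)) letters[nletters++] = $1
    name[$1, nnames[$1]++] = $2
    next
  }
  # input2: every number per letter
  file == 2 { number[$2, nnumbers[$2]++] = $1; next }
  # input3: starting counter per number
  file == 3 { counter[$1] = $2; next }
  # input1: one output line per (letter, number, name) combination
  {
    for (i = 0; i < nletters; i++) {
      l = letters[i]
      for (j = 0; j < nnumbers[l]; j++)
        for (k = 0; k < nnames[l]; k++)
          print $1, $2, number[l, j], ++counter[number[l, j]], name[l, k]
    }
  }
' "$dir/config" "$dir/input2" "$dir/input3" FS=";" OFS=";" "$dir/input1")
echo "$out"
rm -rf "$dir"
```

The loops here stop on explicit per-letter counts rather than on the first empty array element, which is a design choice of this sketch: it avoids creating empty entries while testing and still works if a stored value happens to be 0 or empty.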

Akshay_Hegde · November 20, 2013, 2:55am

for input posted in #10

$ cat input1.txt 
aster;PAGE;1233;4;1222

$ cat input2.txt 
1111 A
1234 B
9999 B
2355 C

$ cat input3.txt 
1111 2
1234 1
9999 3
2355 4

$ cat config.txt 
A DART
B MART
B KART
B KAR2
C HOME

awk -F'[ ;]' '
              # File Counter
              FNR == 1{
                        ++i
                      } 

              # File1 on statements goes here....
                i == 1{
                         R[1,FNR] = $1 ; R[2,FNR] = $2; d = FNR
                      } 

              # File2 on statements goes here....
                i == 2{ 
                              for(j = 1; j <= d; j++)
                                             if(R[1,j] ~ $2)
                                                    S[++s] = R[2,j] OFS $1          
                       } 

              # File3 on statements goes here....
                i == 3{
                          c[$1]=$2
                      }

              # File4 on statements goes here....
                i == 4{
                            for(j = 1; j <= s ; j++){
                                                      split(S[j],A,OFS)
                                                      $3 = A[2]
                                                      $4 = ++c[$3]
                                                      $5 = A[1]     
                                                      print $0
                                                 }
                      } 
              ' OFS=";" config.txt input2.txt input3.txt input1.txt

OR
processing in END block

awk -F'[ ;]' '
           FNR==1{++i}{LC[i]=FNR}{for(j=1;j<=2;j++)A[i,FNR,j]=$j}
              END{
                           for(j=1;j<=LC[2];j++)
                           for(i=1;i<=LC[1];i++)
                                if(A[2,j,2]~A[1,i,1]){
                                               for(k=1;k<=LC[3];k++){
                                                                     n = A[3,k,2]
                                                                     if(A[2,j,1] == A[3,k,1]){
                                                                                              ++n
                                                                                              print A[4,1,1],A[4,1,2],A[2,j,1],n,A[1,i,2]                
                                                                                             }
                                                                     A[3,k,2] = n
                                                                    }
                        
                                                     }
        
                 }
              ' OFS=";" config.txt input2.txt input3.txt input1.txt

aster;PAGE;1111;3;DART
aster;PAGE;1234;2;MART
aster;PAGE;1234;3;KART
aster;PAGE;1234;4;KAR2
aster;PAGE;9999;4;MART
aster;PAGE;9999;5;KART
aster;PAGE;9999;6;KAR2
aster;PAGE;2355;5;HOME