Merging two files, each containing 16 lakh lines, on an HP-UX 11.11 system

Hello All ,

I am trying to merge two files, each containing 16 lakh lines. My requirement is to merge them in alternating blocks of 14 lines from each file.

That is, 14 lines from file1, then 14 lines from file2, and so on. So I wrote the script below.

It works for small files, but for large files the script does not finish and its PID gets killed automatically after merging about 90000 lines.

Can anyone please help me with this? Is there any way I can merge them quickly?

#!/bin/sh
touch test
size=`ls -lrt file1 | awk '{print $5}'`    # size of file1 in bytes
while [ "$size" -gt 0 ]
do
    # append the first 14 lines of file1, then remove them from file1
    sed -n '1,14p' file1 >> test
    sed '1,14d' file1 >testing1.tmp
    cat testing1.tmp >file1
    rm testing1.tmp
    # same for file2
    sed -n '1,14p' file2 >> test
    sed '1,14d' file2 >testing2.tmp
    cat testing2.tmp >file2
    rm testing2.tmp
    size=`ls -lrt file1 | awk '{print $5}'`
done
echo " Files merged successfully "
echo " Files merged successfully "

How about

awk '1; !(NR%14) {for (i=1; i<=14; i++)  {getline < F2; print}}' F2=file2 file1
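For a quick sanity check, the one-liner can be tried on tiny files first. The file names a.txt and b.txt below are made up for this demo, and the block size is shrunk from 14 to 2 so the interleaving is easy to eyeball:

```shell
# Build two small test files (names are arbitrary for this demo)
printf '%s\n' a1 a2 a3 a4 > a.txt
printf '%s\n' b1 b2 b3 b4 > b.txt

# Same idea as above, with 2-line blocks instead of 14:
# the pattern "1" prints every line of a.txt; after every 2nd line,
# pull 2 lines from b.txt with getline and print them as well.
awk '1; !(NR%2) {for (i=1; i<=2; i++) {getline < F2; print}}' F2=b.txt a.txt
# prints a1 a2 b1 b2 a3 a4 b3 b4, one per line
```

Note that `getline < F2` overwrites $0, and if F2 runs out of lines first, $0 is left unchanged and gets reprinted, which is exactly the repetition problem discussed further down.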
#!/usr/bin/awk -f
BEGIN {
    File2 = ARGV[2]
    --ARGC
}
{
    print
    if ((FNR % 14) == 0) {
        for (n = 1; n <= 14; ++n) {
            getline < File2
            print
        }
    }
}
END {
    while (getline < File2) print
}

The above code works for small files, but for big files it merges in a zigzag manner. Can anyone suggest a script that works here?

What is "zigzag manner"?

Lines are repeated many times and trailing lines are suppressed, like 0 1 2 3 4 5 5 5 5 5 5 10 11 12 12 12 .. — that is the output I am receiving.

I neglected error checking. Maybe file2 is at its end and getline failed repeatedly. This doesn't depend on the files' sizes but stems from an imbalance in file lengths. Try

awk '
function FLOK() {return (1 == getline < F2)}
1
!(NR%14)        {for (; (++i)%15 && FLOK();) print}
END             {while (FLOK()) print}
' F2=file2 file1  
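To see that the error-checked version handles an imbalance, here is a small check with deliberately unequal files (4 vs. 7 lines) and a block size of 2 instead of 14 for readability; the file names f1.txt and f2.txt are made up for this demo. FLOK() returns true only while getline actually delivers a line, so a short second file can no longer cause repeated lines, and the END rule drains whatever is left over:

```shell
printf '%s\n' a1 a2 a3 a4 > f1.txt
printf '%s\n' b1 b2 b3 b4 b5 b6 b7 > f2.txt

# Block-size-2 variant of the script above; the extra parentheses
# around (getline < F2) avoid any ambiguity in how awk parses the
# comparison.
awk '
function FLOK() {return ((getline < F2) == 1)}
1
!(NR%2) {for (; (++i)%3 && FLOK();) print}
END     {while (FLOK()) print}
' F2=f2.txt f1.txt
# prints a1 a2 b1 b2 a3 a4 b3 b4 b5 b6 b7, one per line
```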

I am receiving the same error. It merges correctly up to about 1000 lines, then the merging goes wrong: lines get repeated and some lines get suppressed :confused:

How many lines in file1, and in file2?

---------- Post updated at 21:33 ---------- Previous update was at 21:23 ----------

I did a test with around 3000 lines, with one file 100 lines longer than the other. It worked.

It works fine up to 5000 lines (not sure), but in my case each file contains more than 15000 lines.

File1 = 16739 lines
File2 = 17512 lines

The length of each line is below 100 characters.

Can you suggest something for files of this size? :(

Try this shell script:

#!/bin/ksh
while :
do
  for descriptor in 3 4
  do
    for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14
    do
      IFS= read -r line || break 3
      printf "%s\n" "$line"
    done <&$descriptor
  done
done 3< file1 4< file2

The outer loop opens the two files via descriptors 3 and 4 and never ends on its own.
The 2nd loop toggles between the two descriptors.
The 3rd loop reads 14 lines from the current descriptor.
The break 3 breaks out of all 3 nested loops.
If the output looks good, you can redirect the whole thing to a file:

...
done 3< file1 4< file2 > file3

Hi Phani,

Can we split the source file into 16 files of 1 lakh records each, and then process them one by one with the scripts suggested here?
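The splitting step of that idea could be sketched roughly as below; this is only a demo (the 250000-line sample input and the chunk_ prefix are made up here), and the wc placeholder in the loop stands in for whichever merge script from this thread you actually run on each piece:

```shell
# Demo input: a 250000-line stand-in for the real source file
seq 250000 > file1

# Split into pieces of 100000 (1 lakh) lines each;
# split names the pieces chunk_aa, chunk_ab, chunk_ac, ... by default
split -l 100000 file1 chunk_

# Process each piece in turn; replace the wc placeholder with one of
# the merge scripts suggested above, run against "$piece"
for piece in chunk_*
do
    wc -l < "$piece"
done
```

Whether this actually helps depends on why the original script was killed; if the cause was the repeated rewriting of file1 and file2, the awk or file-descriptor approaches above avoid that without splitting.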