Find common lines between multiple files

Hello everyone

A few years ago the user radoulov posted a fancy solution to the problem of finding common lines (gene variation names) between multiple samples (files). The code was:

awk 'END {
  for (R in rec) {
    n = split(rec[R], t, "/")
    if (n > 1) 
      dup[n] = dup[n] ? dup[n] RS sprintf("\t%-20s -->\t%s", rec[R], R) : \
        sprintf("\t%-20s -->\t%s", rec[R], R)
    }
  for (D in dup) {
    printf "records found in %d files:\n\n", D
    printf "%s\n\n", dup[D]
    }  
  }
{  
  rec[$0] = rec[$0] ? rec[$0] "/" FILENAME : FILENAME
  }' f10.lista f12.lista f13.lista f14.lista fs6.lista

The problem now is that I want to find the intersections of lines between 3, 4 and 5 files, but the program only shows the results for 3 files.
I'm very new to AWK, so please help me modify this code to get the result I need.
Thank you in advance.

Sort each file with duplicates removed, merge the sorted results while keeping duplicates, and count how many times each line appears:

sort -m <( sort -u file1 ) <( sort -u file2 ) ... | uniq -c | sort -nr | pg

Thank you DGPickett for your answer, but what I need is to modify the given code so that it also reports the intersections for 4, 5 or more files, not just 3.

Actually, I want this kind of result:

records found in 3 files:
.
.
.
.
records found in 4 files:
.
.
.
.
.
records found in 5 files:
.
.
.
records found in 'n' files:

but the program now is only showing this:

records found in 3 files:

I hope this clarifies any doubts.

try:

awk '
# count the distinct input files
! f[FILENAME]++ {fc++}
# count each line once per file, even if it repeats within that file
! b[$0,FILENAME]++ {a[$0]++}
END {
for (j=3; j<=fc; j++) {
   print "records found in " j " files:"
   for (i in a) {if (a[i]==j) print i}}
}
' file*
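
A minimal sketch for checking this approach on small inputs. The file names s1.lista to s4.lista, their contents, and the function name common_lines are made up for illustration. The awk body compares a[i] with j and counts a line once per file.

```shell
#!/bin/sh
# Hypothetical sample data: four small files in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' geneA geneB geneC geneA > "$tmpdir/s1.lista"
printf '%s\n' geneA geneB geneD       > "$tmpdir/s2.lista"
printf '%s\n' geneA geneB             > "$tmpdir/s3.lista"
printf '%s\n' geneA geneE             > "$tmpdir/s4.lista"

# common_lines FILE...: list lines shared by 3 or more of the given files.
# (Hypothetical helper name; the awk logic follows the reply in this thread.)
common_lines() {
  awk '
  !f[FILENAME]++ { fc++ }            # number of distinct non-empty files
  !b[$0,FILENAME]++ { a[$0]++ }      # count each line once per file
  END {
    for (j = 3; j <= fc; j++) {
      print "records found in " j " files:"
      for (i in a) if (a[i] == j) print i
    }
  }' "$@"
}

out=$(common_lines "$tmpdir"/s*.lista)
printf '%s\n' "$out"
rm -rf "$tmpdir"
```

With this sample data, geneB is listed under 3 files and geneA under 4. Note that awk does not guarantee the order of "for (i in a)", so lines within one group may come out in any order; pipe each group through sort if a stable order matters.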

Thank you so much rdrtx1, It works as I wanted!

With the sort/uniq pipeline, a line that is in 5 files comes up prefixed with 5. Since uniq -c pads the count with leading spaces, you can add "grep -v '^ *1 ' |" before the final sort to toss the lines found in only 1 file.
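
A sketch of the whole sort/uniq pipeline with that filter in place. It assumes a plain POSIX shell, so a loop stands in for the "sort -m <( ... )" process substitution, at the cost of one extra full sort. The function name count_files and the sample files are hypothetical.

```shell
#!/bin/sh
# count_files FILE...: print "count line" for every line present in two or
# more of the given files, most widely shared first.
count_files() {
  for f in "$@"; do
    sort -u "$f"          # one copy of each line per file
  done |
    sort |                # bring identical lines together
    uniq -c |             # prefix each line with the number of files
    grep -v '^ *1 ' |     # drop lines found in only one file
    sort -nr
}

# Hypothetical sample data: four small files in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' geneA geneB geneC geneA > "$tmpdir/s1.lista"
printf '%s\n' geneA geneB geneD       > "$tmpdir/s2.lista"
printf '%s\n' geneA geneB             > "$tmpdir/s3.lista"
printf '%s\n' geneA geneE             > "$tmpdir/s4.lista"

counts=$(count_files "$tmpdir"/s*.lista)
printf '%s\n' "$counts"
rm -rf "$tmpdir"
```

On this sample data the output lists geneA with a count of 4 and geneB with a count of 3. The width of the padding before each count depends on the uniq implementation.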