Help needed with Sort and uniq data

asirohi · August 17, 2009, 2:37pm

Hi All,

After sorting directories and files I got the output shown below. Now I only want the strings that are common to both groups, so the output should look like the listing at the bottom. How do I do that?

Thanks
-adsi

File to be modified:-

Common Components for ----> AA
atria.basement.sun5
atria.msgcat.JPN.sun5
atria.perl.sun5
ClearCaseAdministrationTools-ent-CINSTALLDIR
com.ccl.feedreader.feature
CMServer.CQ.sun5
com.ccl.feedreader.feature
com.ccl.welcome.bits.feature
com.cic.licensing.feature
com.cqweb-ua.war
ClearCaseDotNetClient
com.help.common.feature
com.help.common.rational.feature
com.java.jre

Common Components for ----> BB
CCRCWebServerINSTALLDIR
atria.basement.sun5
ClearCaseAdministrationTools-CINSTALLDIR
ClearCaseAdministrationTools-ent-CINSTALLDIR
ClearCaseAdministrationTools-pro-CINSTALLDIR
ClearCaseAlbdServer-CINSTALLDIR
com.ccl.feedreader.feature
ClearCaseClearQuestIntegration-CINSTALLDIR
ClearCaseClearQuestMultisite-R
ClearCaseClientComponentsINSTALLDIR
ClearCaseConverters-CINSTALLDIR
com.help.common.rational.feature
ClearCaseCoreComponents-CINSTALLDIR
ClearCaseDotNetClient

Desired OUTPUT:-

Common Components for ----> AA
atria.basement.sun5
ClearCaseAdministrationTools-ent-CINSTALLDIR
com.ccl.feedreader.feature
ClearCaseDotNetClient
com.help.common.rational.feature
com.java.jre

Common Components for ----> BB
atria.basement.sun5
ClearCaseAdministrationTools-ent-CINSTALLDIR
com.ccl.feedreader.feature
com.help.common.rational.feature
ClearCaseDotNetClient
com.java.jre

peterro · August 17, 2009, 2:47pm

comm -12 aa bb
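
For readers unfamiliar with it, here is a minimal sketch of what that command does, assuming the two groups have been saved as separate, sorted files. The file contents below are a hypothetical subset of the listing above, and `LC_ALL=C` is only there so that `sort` and `comm` agree on the ordering.

```shell
# Hypothetical sample files; comm requires both inputs to be sorted.
tmp=$(mktemp -d)
printf '%s\n' atria.basement.sun5 atria.perl.sun5 com.java.jre | LC_ALL=C sort > "$tmp/aa"
printf '%s\n' ClearCaseDotNetClient atria.basement.sun5 | LC_ALL=C sort > "$tmp/bb"

# -1 suppresses lines found only in aa, -2 suppresses lines found only in bb,
# which leaves just the lines present in both files.
common=$(LC_ALL=C comm -12 "$tmp/aa" "$tmp/bb")
printf '%s\n' "$common"
rm -rf "$tmp"
```

With this sample data only atria.basement.sun5 is printed, since it is the only line present in both files.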

asirohi · August 17, 2009, 2:51pm

Hi Peterro,

But comm compares two files, and this was just an example. In this case I don't have a second file. I just want the common strings within one file, i.e. the ones that repeat more than once.

Thanks
adsi
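
If the grouping and the original order do not matter, one possible single-file approach is `sort` piped into `uniq -d`, which prints one copy of every line that occurs more than once. This is only a sketch on a hypothetical, shortened sample. It also reports lines duplicated inside a single group and drops the group headers, so it may not match the desired output exactly.

```shell
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
Common Components for ----> AA
atria.basement.sun5
atria.perl.sun5
ClearCaseDotNetClient

Common Components for ----> BB
CCRCWebServerINSTALLDIR
atria.basement.sun5
ClearCaseDotNetClient
EOF

# uniq only compares adjacent lines, so sort first; -d prints one copy of
# each line that occurs more than once. Blank lines are removed up front.
repeated=$(grep -v '^$' "$tmp" | LC_ALL=C sort | uniq -d)
printf '%s\n' "$repeated"
rm -f "$tmp"
```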

vgersh99 · August 17, 2009, 3:52pm

Something along these lines to start with, although I cannot quite correlate the input and the desired output:

nawk -f adsi.awk myFile.txt myFile.txt

adsi.awk:

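BEGIN {
  # Paragraph mode: each blank-line-separated group is one record,
  # and each line of the group is one field.
  RS=""
  FS="\n"
}
# First pass over the file: remember every group.
FNR==NR {
  f[FNR]=$0
  next
}
# Second pass: print the header, then each line that also occurs in another group.
{
  print $1
  for(j=2; j<=NF; j++)
    for(i in f) {
       if (i==FNR) continue
       n=split(f[i], a, "\n")
       for(k=1; k<=n; k++)
          if( $j == a[k] && !($j in dup)){
             dup[$j]
             print $j
          }
    }
  print ""
  split("", dup)
}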
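
To show the two-pass idea (the same file named twice on the command line) end to end, here is a self-contained sketch on a hypothetical, shortened sample. It is a variation, not the script above: the first pass counts in how many groups each line appears, and the second pass prints only the lines seen in more than one group. The array names `groups` and `seen` are made up for this illustration.

```shell
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
Common Components for ----> AA
atria.basement.sun5
atria.perl.sun5
ClearCaseDotNetClient
ClearCaseDotNetClient

Common Components for ----> BB
CCRCWebServerINSTALLDIR
atria.basement.sun5
ClearCaseDotNetClient
EOF

result=$(awk '
BEGIN { RS = ""; FS = "\n" }          # paragraph mode: one group per record
FNR == NR {                            # pass 1: count groups containing each line
    split("", seen)
    for (j = 2; j <= NF; j++)
        if (!($j in seen)) { seen[$j]; groups[$j]++ }
    next
}
{                                      # pass 2: print header, then shared lines
    if (FNR > 1) print ""
    print $1
    split("", seen)
    for (j = 2; j <= NF; j++)
        if (groups[$j] > 1 && !($j in seen)) { seen[$j]; print $j }
}' "$tmp" "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

Each group keeps its header and its original line order, and a line repeated inside one group is printed only once.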

jp2542a · August 17, 2009, 8:25pm

Here's an awk program that will retain the output order. One thing: the java entry is not printed because it doesn't appear in both groups.

BEGIN {
# init the first group's array index
AINDEX=1
# init the second group's index
BINDEX=1
# init the group flag
FIRSTGROUP=1
}


# Function to find a match in the first group's array.
# Returns the index if matched, else 0. The extra parameter i is a local variable.
function check_match(    i)
{
        for (i = 1; i <= AINDEX; i++)
                if (a[i] == $0)
                        return(i)
        return(0)
}

# Store the first group's header record
(FNR == 1) {
        a[AINDEX] = $0
# Mark the header as printable
        p[AINDEX] = 0
# get next record
        next
}

# Do this clause if we are processing first group
(FIRSTGROUP == 1) {
# Check if we have complete first group
        if ($0 == "" ) {
# end of first group
                FIRSTGROUP = 0
# get the second group header
                getline
# initialize the second group by storing the header
                b[1]=$0
# go to next line
                next
        }

# add it to the first group's array
        a[++AINDEX] = $0
}

#Do this clause if we are processing second group
(FIRSTGROUP == 0) {
# Check to see if this in first group
        if(( j = check_match()) != 0) {
# yes - so add it to b array
                b[++BINDEX] = $0
# create index in p to indicate it's a match in a
                p[j] = 0
# next record
        }
}

# display results
END {
        for( i = 1; i <= AINDEX; i++)
# print first group entry only if it matched b group entry
                if( i in p)
                        print a[i]
# blank line between the two groups
        print ""
        for (i = 1; i <= BINDEX; i++)
                print b[i]
}