Making Connection nodes for Graph

Hi Power User,

I have the following data:

file1
aa A
aa B
aa C
bb X
bb Y
bb Z
cc O
cc P
cc Q
. .
. .
. .
. .

and I want to turn it into connection nodes like this:
file2

A aa A
A aa B
A aa C
B aa C
B aa B
C aa C
X bb X
X bb Y
X bb Z
Y bb Z
Y bb Y
Z bb Z
. . .
. . .
. . .
. . .

I am making this relation to create a graph. The file could have more than 100,000 lines. Any suggestions on how to create file2 using perl or awk? Thanks.

join -o 1.2 0 2.2 -1 1 -2 1 file1 file1 | nawk '!a[$3$2$1];{a[$1$2$3]++}'

This may perform better (or not) with a large file1:

join -o 1.2 0 2.2 -1 1 -2 1 file1 file1 | nawk '$1<$3{print;next}{print$3,$2,$1}' | sort -u

Thanks for the answer. I have tried the first script, and it worked great :)

The join step can also be done in awk itself:

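nawk '
# First pass: collect all nodes for each key, space-separated
NR==FNR { c = a[$1]; a[$1] = c?c" "$2:$2; next }
# Second pass: pair the node on this line with every node sharing its key
{ c = a[$1]
  if (c) {
    split(c,b)
    for (k in b) {
      # Put the smaller node first so A-B and B-A give the same string
      p = $2<b[k]?$2" "$1" "b[k]:b[k]" "$1" "$2
      # Print each distinct pair only once
      if (!d[p]++) print p
    }
  }
}
' file1 file1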

Thanks for the scripts. However, I have another problem related to the previous one. For example, if I have this file:

file1
aa A 3
aa B 4
aa C 5
bb X 6
bb Y 7
bb Z 8
cc O 9
cc P 10
cc Q 11
. .
. .
. .
. .

and I want to turn it into connection nodes like this:
file2

A aa A 3 3
A aa B 3 4
A aa C 3 5
B aa C 4 5
B aa B 4 4
C aa C 5 5
X bb X 6 6
X bb Y 6 7
X bb Z 6 8
Y bb Z 7 8
Y bb Y 7 7
Z bb Z 8 8
. . .
. . .
. . .
. . .

I am making this relation to create a graph. As before, the file could have more than 100,000 lines. Any suggestions on how to modify the script, or to create a new one? Thanks.
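
One possible way to carry the third column along, as a minimal sketch rather than a tested solution: read the file once, store each node and its value per key, then print every pair within a key in input order. The function name make_edges is made up for illustration. The sketch assumes each node appears only once per key (duplicates would be printed as duplicate pairs), that the whole file fits in memory, and it uses plain awk (nawk should work the same way).

```shell
# Hypothetical helper: reads "key node value" lines from files or stdin
# and prints "node1 key node2 value1 value2" for every pair within a key.
make_edges() {
  awk '
    {
      if (!($1 in n)) keys[++nk] = $1   # remember keys in first-seen order
      i = ++n[$1]                       # position of this node within its key
      node[$1, i] = $2
      wt[$1, i] = $3
    }
    END {
      for (k = 1; k <= nk; k++) {
        g = keys[k]
        # j starts at i, so each node is paired with itself and with
        # every later node, and no pair is printed twice
        for (i = 1; i <= n[g]; i++)
          for (j = i; j <= n[g]; j++)
            print node[g, i], g, node[g, j], wt[g, i], wt[g, j]
      }
    }
  ' "$@"
}

# Example run on the first group of the sample data
printf 'aa A 3\naa B 4\naa C 5\n' | make_edges
```

For the aa group this gives the same six pairs as the requested file2, although the order within a group can differ slightly (here B aa B comes before B aa C). Because pairs follow input order instead of being compared as strings, the first value always belongs to the first node. The number of output lines grows quadratically with the size of each group, so very large groups will dominate the run time.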