awk runs and works, but only on a copy of the file

Running the awk below on the tab-delimited file below produces blank output. However, when I copy the same lines from the file into a new document and run the awk on that, I get the desired result.
The awk counts the unique values before the : in $7, grouped by the id in $1.
The awk seems to work, but I cannot figure out why it doesn't on the original file. There don't appear to be Windows line endings. Thank you :).

file

PTPN11	5781	13324	28363	genomic	na	LRG_614:g.36663G>T	g.36663G>T	-	-	Yes	No	No
PTPN11	5781	13324	28363	coding	na	LRG_614t1:c.214G>T	c.214G>T	LRG_614p1:p.Ala72Ser	p.Ala72Ser	Yes	No	No
PTPN11	5781	13324	28363	coding	na	NM_002834.4:c.214G>T	c.214G>T	NP_002825.3:p.Ala72Ser	p.Ala72Ser	Yes	No	Yes
PTPN11	5781	13324	28363	coding	na	NM_080601.2:c.214G>T	c.214G>T	NP_542168.1:p.Ala72Ser	p.Ala72Ser	No	No	No

awk

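# Needs GNU awk 4.0 or later: cnt[$1][$7] is an array of arrays.
# Split on tabs and colons, so in the sample $7 is the text before the ":".
BEGIN { FS="[\t:]" }
{
    # Count how often each $7 value occurs for the id in $1.
    cnt[$1][$7]++
    # Remember the highest count seen so far for this id.
    max[$1] = (max[$1] > cnt[$1][$7] ? max[$1] : cnt[$1][$7])
}
END {
    # Print every id/value pair whose count equals the maximum for that id.
    for (word in cnt) {
        for (val in cnt[word]) {
            if (cnt[word][val] == max[word]) {
                print word, val
            }
        }
    }
}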

desired result

PTPN11 LRG_614t1
PTPN11 LRG_614
PTPN11 NM_080601.2
PTPN11 NM_002834.4

What does the file command show?

file filename

It shows hgvs4variation.txt: ASCII text, with very long lines, but I am not sure what to do from here. Thank you :).

How did you "copy the same lines in file to a new document"?

What does ls -l original_file new_document show?

What does diff original_file new_document show?

It might also be worth doing sha1sum on each file to compare the derived checksum value. If that is not available, try using sum which is not so detailed but is more widely available.

Perhaps the copy process has written an end-of-file, or perhaps the original file is open and your awk is refusing to read it because of that, whereas cp is less concerned. Does running fuser on the original file show anything?

Just suggestions to try out. If there are differences, can you push them through od to see what the differences are?
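
As a rough sketch of that comparison, something like the following could be tried. The two temporary files are hypothetical stand-ins for the original file and the copy, and cmp is used here in place of comparing checksums by eye; it only reports whether the bytes differ. od -c then prints every byte, so a tab appears as \t while a space appears as a blank.

```shell
# Hypothetical stand-ins for the original file and its copy.
orig=$(mktemp)
copy=$(mktemp)
printf 'PTPN11 5781\n'  > "$orig"   # fields separated by a space
printf 'PTPN11\t5781\n' > "$copy"   # fields separated by a tab

# cmp -s prints nothing and only sets the exit status:
# 0 if the files are byte-identical, 1 if they differ.
if cmp -s "$orig" "$copy"; then
    status=identical
else
    status=different
fi
echo "$status"

# Dump both files byte by byte to see where they differ.
od -c "$orig"
od -c "$copy"

rm -f "$orig" "$copy"
```

On the real files, replace the two temporary names with the original file and the new document.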

I hope that this helps,
Robin

The original file had a mix of tab and space delimiters in it. Thank you very much :).
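
For anyone who runs into the same thing, one possible way to spot such lines is to count the tab-separated fields on each line and report the lines that do not match. This is only a minimal sketch: the sample file and the expected count of 3 are made up for illustration (the file in the question has 13 tab-separated columns, so want=13 would be the value to try there).

```shell
# Hypothetical sample: line 1 is tab-delimited, line 2 uses spaces instead.
tmp=$(mktemp)
printf 'PTPN11\t5781\tcoding\n' >  "$tmp"
printf 'PTPN11 5781 coding\n'   >> "$tmp"

# With a tab as the only field separator, a space-delimited line collapses
# into a single field, so its field count differs from the expected one.
bad_lines=$(awk -F'\t' -v want=3 'NF != want { print NR ": " NF " field(s)" }' "$tmp")
echo "$bad_lines"

rm -f "$tmp"
```

Each reported line number points at a line that is not delimited the way the awk script expects.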