I am trying to use awk to place the contents of a filename in $1 and $2, followed by the data in the text file. Basically, I want to put the filename inside the text file. There are over 1000 files in the directory; each file is currently saved with a unique name, but that name does not appear within the file. Thank you :).
Text file name:
LastName,FirstName_123456.txt.hg19_multianno
Desired output:
$1 $2 $3
LastName,FirstName 123456 data in files (the 24 columns)
Each text file contains 24 columns and multiple rows. I am trying to print LastName,FirstName in field 1 and 123456 in field 2 of row 1. Each file has a header in row 1. I will post a sample as soon as I can; I am on my BlackBerry and cannot right now. Thank you :).
which, with your attached sample data (stored in a file named LastName,FirstName_123456.txt.hg19_multianno), produces the following as the first five lines of its output:
ls *unw*
one_1234.unwanted_part three_3214.unwanted_part two_2314.unwanted_part
With the following content (header, rows, and columns):
cat *unw*
a b c d e f g h i j k l m n o p q r s t u v w x
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
a b c d e f g h i j k l m n o p q r s t u v w x
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
a b c d e f g h i j k l m n o p q r s t u v w x
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
Do you want an output like this?:
perl -07 -ne '@np=$ARGV =~/^([^_]*)_(\d+)\./ and print "@np\n$_"' *unw*
one 1234
a b c d e f g h i j k l m n o p q r s t u v w x
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
three 3214
a b c d e f g h i j k l m n o p q r s t u v w x
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
two 2314
a b c d e f g h i j k l m n o p q r s t u v w x
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
Or do you want this output?:
perl -07 -ne '@np=$ARGV =~/^([^_]*)_(\d+)\./ and print "@np $_"' *unw*
one 1234 a b c d e f g h i j k l m n o p q r s t u v w x
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
three 3214 a b c d e f g h i j k l m n o p q r s t u v w x
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
two 2314 a b c d e f g h i j k l m n o p q r s t u v w x
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
Maybe this output?:
perl -pe '@a=$ARGV =~/^([^_]*)_(\d+)\./; $_="@a $_"' *unw*
one 1234 a b c d e f g h i j k l m n o p q r s t u v w x
one 1234 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
one 1234 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
one 1234 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
three 3214 a b c d e f g h i j k l m n o p q r s t u v w x
three 3214 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
three 3214 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
three 3214 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
two 2314 a b c d e f g h i j k l m n o p q r s t u v w x
two 2314 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
two 2314 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
two 2314 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
@Aia the perl below is close, except that the 1 in the output should go under column a, and the output should be tab-delimited for Excel (that is how the original input files were). Thank you :).
perl -07 -ne '@np=$ARGV =~/^([^_]*)_(\d+)\./ and print "@np $_"' *unw*
one 1234 a b c d e f g h i j k l m n o p q r s t u v w x
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
three 3214 a b c d e f g h i j k l m n o p q r s t u v w x
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
two 2314 a b c d e f g h i j k l m n o p q r s t u v w x
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
---------- Post updated at 07:33 AM ---------- Previous update was at 07:10 AM ----------
@Don Cragun
the awk is almost perfect, except that the first two fields only need to appear in the header row.
LastName,FirstName 123456 Chr Start End Ref Alt Func.refGene Gene.refGene GeneDetail.refGene ExonicFunc.refGene AAChange.refGene PopFreqMax 1000G2012APR_ALL 1000G2012APR_AFR 1000G2012APR_AMR 1000G2012APR_ASN 1000G2012APR_EUR ESP6500si_ALL ESP6500si_AA ESP6500si_EA CG46 common clinvar clinvarsubmit clinvarreference
1 43394661 43394661 A exonic SLC2A1 nonsynonymous SNV SLC2A1:NM_006516.2:exon8:c.T1016C:p.I339T
one 1234 a b c d e f g h i j k l m n o p q r s t u v w x
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
three 3214 a b c d e f g h i j k l m n o p q r s t u v w x
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
two 2314 a b c d e f g h i j k l m n o p q r s t u v w x
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
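For the last request (name and number only on the header row, tab-delimited, data still lined up under its own column headings), something along these lines might work. It is only a sketch and makes a few assumptions that are not confirmed in this thread: the input files really are tab-separated, the part of the filename before the first underscore is the name, the part between that underscore and the next dot is the number, and the data rows should get two empty leading fields so that the 1 lands under column a. The sample file it creates is made up for the demonstration.

```shell
#!/bin/sh
# Work in a scratch directory so no real data is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical tab-separated sample: one header row and one data row.
printf 'Chr\tStart\tEnd\n1\t43394661\t43394661\n' \
    > 'LastName,FirstName_123456.txt.hg19_multianno'

result=$(awk '
BEGIN { FS = OFS = "\t" }            # assumption: tab-separated in and out
FNR == 1 {                           # first line of each file = header row
    base = FILENAME
    sub(/.*\//, "", base)            # drop any leading directory part
    split(base, part, "_")           # part[1] = name, part[2] = rest
    id = part[2]
    sub(/\..*/, "", id)              # keep only what precedes the first dot
    print part[1], id, $0
    next
}
{ print "", "", $0 }                 # pad data rows with two empty fields
' ./*.hg19_multianno)

printf '%s\n' "$result"
```

If the data rows should not be padded, replacing the last rule with a bare `1` would print them unchanged. To write one output file per input instead of one combined stream, the two `print` statements could be redirected to a name built from FILENAME, for example `> (FILENAME ".out")`.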