I have a CSV dataset like this:
C,rs18768
G,rs13785
GA,rs1065
G,rs1801279
T,rs9274407
A,rs730012
I'm thinking of using awk or sed to convert the dataset to this format (if the first field already has two characters, keep it the same):
CC,rs18768
GG,rs13785
GA,rs1065
GG,rs1801279
TT,rs9274407
AA,rs730012
Could anyone give me some clues?
Hello nengcheng,
It is always recommended to show in your post the efforts you have made to solve your own problem. Could you please try the following?
awk 'BEGIN{FS=OFS=","} length($1)==1{sub(/.*/,"&&",$1)} 1 ' Input_file
Output will be as follows.
CC,rs18768
GG,rs13785
GA,rs1065
GG,rs1801279
TT,rs9274407
AA,rs730012
2nd solution: using the value of $1 itself to double it.
awk 'BEGIN{FS=OFS=","} length($1)==1{$1=$1$1} 1' Input_file
Thanks,
R. Singh
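Since the question also mentions sed, here is a minimal sketch of the same idea with a single substitution. It assumes the first field is delimited by the first comma on each line, as in the sample data; the function name double_allele_sed is only an illustrative choice and does not come from the thread.

```sh
#!/bin/sh
# double_allele_sed is a hypothetical helper name used for illustration.
# ^\(.\),  matches exactly one character followed by a comma at line start
# \1\1,    writes that character twice and puts the comma back
# Lines whose first field has two or more characters do not match
# and are printed unchanged.
double_allele_sed() {
    sed 's/^\(.\),/\1\1,/' "$@"
}

# Demo on a few lines taken from the sample data.
printf '%s\n' 'C,rs18768' 'GA,rs1065' 'T,rs9274407' | double_allele_sed
```

With a file, the call would be double_allele_sed Input_file, since any arguments are passed straight through to sed.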
Cool solution, nez. How about a pure Bash one, just for fun:
while IFS=, read field1 field2
do
if [[ ${#field1} -eq 1 ]]
then
field1=${field1}${field1}
fi
echo "$field1,$field2"
done < "Input_file"
Thanks,
R. Singh
#!/bin/bash
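# Read the first two characters: "C," for a one-letter field, "GA" otherwise.
while read -n2 a; do
# Read the rest of the line.
read b
# If $a contains a comma, replace it with $a itself ("C," becomes "CC,").
echo ${a//,/$a}$b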
done < file
--- Post updated at 10:37 ---
#!/bin/bash
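while read a; do
# If the line starts with one character followed by a comma, replace that
# prefix with the first field (${a%,*}) written twice, plus the comma.
echo ${a/#?,/${a%,*}${a%,*},}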
done < file
Thank you Singh, I will try it next time.
--- Post updated at 03:55 AM ---
Thank you, nezabudka, it also worked for me.
Hi Ravinder...
Just removing one set of [] makes your version fully POSIX compliant:
#!/usr/local/bin/dash
echo 'C,rs18768
G,rs13785
GA,rs1065
G,rs1801279
T,rs9274407
A,rs730012' > /tmp/text
while IFS=, read field1 field2
do
if [ ${#field1} -eq 1 ]
then
field1=${field1}${field1}
fi
echo "$field1,$field2"
done < /tmp/text
Results, OSX 10.14.3, default bash terminal, calling dash:
Last login: Sun Apr 28 11:31:12 on ttys000
AMIGA:amiga~> cd desktop/Code/Shell
AMIGA:amiga~/desktop/Code/Shell> ./add_single_char.sh
CC,rs18768
GG,rs13785
GA,rs1065
GG,rs1801279
TT,rs9274407
AA,rs730012
AMIGA:amiga~/desktop/Code/Shell> _
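In the same POSIX spirit, the length test can also be written as a case pattern, which avoids splitting the line into fields at all. This is only a sketch under the assumption that every line has the form field,rest as in the sample data; double_allele_case is a made-up name for illustration.

```sh
#!/bin/sh
# double_allele_case is a hypothetical helper name used for illustration.
double_allele_case() {
    while IFS= read -r line
    do
        case $line in
            # One character, then a comma: prepend the first field again.
            # ${line%%,*} strips everything from the first comma onward.
            ?,*) line="${line%%,*}$line" ;;
        esac
        printf '%s\n' "$line"
    done
}

# Demo on a few lines taken from the sample data.
printf '%s\n' 'C,rs18768' 'GA,rs1065' 'A,rs730012' | double_allele_case
```

It reads standard input, so with the file from the script above it would be called as double_allele_case < /tmp/text.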
anbu23
$ awk -F, -v OFS=, ' { sub("^.$","&&",$1) } 1 ' file
CC,rs18768
GG,rs13785
GA,rs1065
GG,rs1801279
TT,rs9274407
AA,rs730012