Hi,
I have a file like this
>hg19_chr1_123_456_+
asndbansbdahsjdbfsjhfghjdsghjdghjdjhdghjjdkhfsdkjfhdsjkdkjghkjdhgfjkhjfkf
hasjgdhjsgfhjdsgfdsgfjhdgjhdjhdhjdfhjdfjgfdfbdghjbfjksdhfjsfdghjgdhjgfdjhgd
jhgdfj
>hg19_chr1_123_456_-
akjldshfuiewyruiewehbjhvbdcnmbfhdsjfjdbfhdbhjdbghjfdbghjbdfghjdfbjhbkk
jsdhfjdgjfdgjfdgjkfdhjkfhkjfhjkfjkhkjfskjdfhkjhgjkdgkjfdhgjkfhjkfhgkfhkfkjhgf
dshjghjdg
>hg19_chr2_234_456_+
skjfhdsjkfghdjkghdfjkhgjkfdghjkdfuiertytoierytuireyteiruytueriyteruytierutye
sjhdjashdjahjkdasjkhdajkshdkajshdkashdasruweyriweyrueiwryewrewurewuu
jdhfjkdshf
I want to find the lines that start with '>' and, for each one, grab the '>' plus the next 19 characters, or the whole line.
So, my output will be
>hg19_chr1_123_456_+
>hg19_chr1_123_456_-
>hg19_chr2_234_456_+
Like this:
sed -n '/^>/s:\(.\{20\}\).*:\1:p' infile
awk '/^>/{print substr($0,1,20)}' input-file
@neutron scott: It worked for me. The numbers in my headers can have more than 3 digits, so I used 25 instead of 20.
@elixir_sinari: Your command works too, but when I increase the 20 to 25, it omits some records. I don't know why.
Thanks to both of you.
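Yeah, the sed match there requires exactly 20 characters, rather than up to 20. With 25, any '>' line shorter than 25 characters fails the substitution, so it is never printed. I believe that'd need to be \{0,25\}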
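To make the difference concrete, here is a small sketch comparing the two interval forms. The sample data is made up for illustration (the second, longer header is hypothetical, not from the original file), and it assumes a POSIX sed:

```shell
# Hypothetical sample input: one 20-character header and one longer header.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
>hg19_chr1_123_456_+
asndbansbdahsjdbfsjhfghjdsghjdghjdjhdghjjdkhfsdkjfhdsjkdkjghkjdhgfjkhjfkf
>hg19_chr12_1234567_2345678_-
akjldshfuiewyruiewehbjhvbdcnmbfhdsjfjdbfhdbhjdbghjfdbghjbdfghjdfbjhbkk
EOF

# \{25\} demands exactly 25 characters in the group, so the 20-character
# header cannot match; the substitution fails and 'p' never prints it.
exact=$(sed -n '/^>/s:\(.\{25\}\).*:\1:p' "$tmp")

# \{0,25\} accepts anything from 0 to 25 characters, so short headers are
# printed whole and long headers are cut to 25 characters.
upto=$(sed -n '/^>/s:\(.\{0,25\}\).*:\1:p' "$tmp")

printf 'exactly 25:\n%s\n\nup to 25:\n%s\n' "$exact" "$upto"
rm -f "$tmp"
```

The awk version does not have this problem because substr() simply returns as much of the line as exists when the line is shorter than the requested length.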
You said you want the next 19 letters or the whole line.
You have been given two ways to get the '>' and the next 19 letters, but now you seem to want the next 24 characters.
If you want the entirety of lines starting with '>', just use:
grep '^>' file
otherwise, you need to explain how we are supposed to determine when you want 19 characters, when you want 24 characters, and when you want the whole line.
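For what it's worth, if the rule turns out to be "at most N characters of each '>' line", one possible sketch is to combine grep with cut. The limit n and the sample data here are assumptions for illustration only, not something the original poster has confirmed:

```shell
# Hypothetical limit; the thread has not settled on 20, 25, or the whole line.
n=25

# Hypothetical sample input with a short and a long header.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
>hg19_chr1_123_456_+
asndbansbdahsjdbfsjhfghjdsghjdghjdjhdghjjdkhfsdkjfhdsjkdkjghkjdhgfjkhjfkf
>hg19_chr12_1234567_2345678_-
akjldshfuiewyruiewehbjhvbdcnmbfhdsjfjdbfhdbhjdbghjfdbghjbdfghjdfbjhbkk
EOF

# Whole header lines, untouched.
whole=$(grep '^>' "$tmp")

# Header lines capped at n characters; cut leaves shorter lines as they are.
capped=$(grep '^>' "$tmp" | cut -c1-"$n")

printf 'whole:\n%s\n\ncapped at %s:\n%s\n' "$whole" "$n" "$capped"
rm -f "$tmp"
```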