Search for the two patterns and print everything in between

Hi all,
I have a file containing this data:

@HWUSI-EAS1727:19:6:1:3674:984:0:1#GTTAATA
NTTGGGTTTTCT
@HWUSI-EAS1727:19:6:1:3674:984:0:1#GTTA...
NTTGGGTTTTCT
@HWUSI-EAS1727:19:6:1:3674:984:0:1#.....CT
NTTGGGTTTTCT

I want to print everything from the # to the end of the line.
Can you please help me do that?
I am trying in Perl and can match the "#......." pattern, but I don't understand how to print it.

Thanks...

awk -F'#' 'NF > 1 { print "#"$2 }' inputfile > outputfile

If you are not on Linux, try nawk or gawk.
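Splitting on "#" and printing only `$2` drops anything after a second "#" on the same line. As a rough sketch of an alternative, assuming the goal is to keep everything from the first "#" to the end of the line, `index` and `substr` can be used instead. The function name `hash_to_eol` is made up for illustration and does not come from the thread.

```shell
# hash_to_eol: hypothetical helper; reads lines on stdin and prints
# the part of each line from the first "#" through the end of the line.
# Lines that contain no "#" are skipped.
hash_to_eol() {
    awk '{
        i = index($0, "#")            # 1-based position of first "#", 0 if absent
        if (i > 0) print substr($0, i)
    }'
}

# Usage: hash_to_eol < inputfile > outputfile
```
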


In sh, ksh, or bash:

while IFS="#" read -r a b
do
  echo "$b"
done <inputfile
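The read loop prints an empty line for every input line that has no "#". A possible variant, shown here only as a sketch, uses parameter expansion to strip the prefix and a `case` test to skip lines without a "#". The name `print_after_hash` is hypothetical.

```shell
# print_after_hash: hypothetical helper; reads lines on stdin and prints
# whatever follows the first "#" on each line that contains one.
print_after_hash() {
    while IFS= read -r line; do
        case $line in
            *"#"*)
                # ${line#*"#"} removes the shortest prefix ending in "#"
                printf '%s\n' "${line#*"#"}"
                ;;
        esac
    done
}

# Usage: print_after_hash < inputfile
```

Because the shortest prefix is removed, a line with several "#" characters is cut at the first one, which matches how `read a b` splits.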
sed -n 's/.*#//p' infile
GTTAATA
GTTA...
.....CT

or do you need to combine the next line?

sed -n 'N;s/\n//;s/.*#//p' infile
GTTAATANTTGGGTTTTCT
GTTA...NTTGGGTTTTCT
.....CTNTTGGGTTTTCT
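The question asked for output starting from the "#" itself, while the sed commands above drop it. If the "#" should be kept, one option (a sketch, with the made-up function name `from_hash`) is to substitute the matched prefix with a single "#" rather than with nothing.

```shell
# from_hash: hypothetical helper; reads lines on stdin and prints each
# line that contains "#", starting at the last "#" on that line.
from_hash() {
    # .*# is greedy, so it matches up to the LAST "#" on the line;
    # -n with the p flag prints only lines where the substitution happened.
    sed -n 's/.*#/#/p'
}

# Usage: from_hash < infile
```
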

In the end I used the code posted by Scrutinizer.
Thank you very much for the replies.

I didn't know sed could be so useful and easy. Now, along with Perl, I need to learn sed.

Using Perl, you could search for all text from "#" to end of line and replace the entire line with the result -

$
$ cat f31
@HWUSI-EAS1727:19:6:1:3674:984:0:1#GTTAATANTTGGGTTTTCT
@HWUSI-EAS1727:19:6:1:3674:984:0:1#GTTANTTGGGTTTTCT
@HWUSI-EAS1727:19:6:1:3674:984:0:1#CTNTTGGGTTTTCT
$
$
$ perl -lne 's/^.*#(.*)$/$1/ and print' f31
GTTAATANTTGGGTTTTCT
GTTANTTGGGTTTTCT
CTNTTGGGTTTTCT
$
$

Or you could remove everything up to and including the "#" character from each line -

$
$ perl -lne 's/^.*#// and print' f31
GTTAATANTTGGGTTTTCT
GTTANTTGGGTTTTCT
CTNTTGGGTTTTCT
$
$

Or you could split each line on "#" as the delimiter, assign the chunks to an array and print just the 2nd element of the array -

$
$ perl -lne '@x = split/#/ and print $x[1]' f31
GTTAATANTTGGGTTTTCT
GTTANTTGGGTTTTCT
CTNTTGGGTTTTCT
$
$
$ # Or more succinctly...
$
$ perl -lne 'print ((split/#/)[1])' f31
GTTAATANTTGGGTTTTCT
GTTANTTGGGTTTTCT
CTNTTGGGTTTTCT
$
$ # Or even more succinctly...
$
$ perl -plne '$_=(split/#/)[1]' f31
GTTAATANTTGGGTTTTCT
GTTANTTGGGTTTTCT
CTNTTGGGTTTTCT
$
$

Or you could even find the index of the "#" character in each line, extract the substring that starts just after that index, and then print that substring, like so -

$
$
$ perl -lne 'print substr($_,index($_,"#")+1)' f31
GTTAATANTTGGGTTTTCT
GTTANTTGGGTTTTCT
CTNTTGGGTTTTCT
$
$

tyler_durden
