Parsing a FASTA sequence with start and end coordinates

Hi. I have separate chromosome sequences and I want to extract some regions of a chromosome based on start and end positions. How can I achieve this?

For example, Chr 1 is in the following format:

The region 2-10 should give me AATTCCAAA,

and in a similar way 15-25 should give me AAGATTGCAT,

and 27-30 should give me AGTT.

How can I do this in Perl, BioPerl, awk, or any other way?

awk -v start=2 -v end=10 -v chr=chr1 '$0~chr{getline seq; print substr(seq,start,end-start+1)}' sequence
AATTCCAAA

awk -v start=15 -v end=25 -v chr=chr1 '$0~chr{getline seq; print substr(seq,start,end-start+1)}' sequence
AAGATTGCATC

Thanks for the reply. I am pretty new to awk programming. I have chromosome 1 in a FASTA file; where should I give it as input?

Using the cut command:

cut -c2-10 inputfile

The cut command is not working properly. It is cutting every line of the file down to a short fragment instead of returning one region.

I think it should be OK if you use the whole FASTA file as the input file:

awk '{CODE}' fasta 

No, it is not giving correct results. My FASTA file is 300,000 bp long, but I need the sequences for some specific sites. The awk code above only returns sequence from one line, no matter how long a region you ask for. Also, if the start site is after the first line, we get nothing back for it.
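
One possible way around the one-line limitation, as a minimal sketch: join all sequence lines of the wanted record first, then take the substring. This assumes the first word of the header is the ID (for example >chr1) and that coordinates are 1-based and inclusive. The function name extract_region is made up for illustration.

```shell
# extract_region FILE CHR START END
# Prints bases START..END (1-based, inclusive) of the record whose ID is CHR.
# Assumption: the header looks like ">chr1" or ">chr1 some description".
extract_region() {
    awk -v chr="$2" -v start="$3" -v end="$4" '
        /^>/ {
            # Header line: remember whether this record is the one we want.
            want = (substr($1, 2) == chr)
            next
        }
        want { seq = seq $0 }   # join every sequence line of the wanted record
        END  { print substr(seq, start, end - start + 1) }
    ' "$1"
}
```

It would be called as extract_region chr1.fa chr1 15 25, where chr1.fa stands in for your own file. Comparing the ID exactly, rather than using $0~chr, keeps chr1 from also matching chr10.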

Could you upload a small part of the FASTA file as an example, along with your expected output?

Yes. The given cut command is intended to work as you said: it takes the same character range from every line. You would need to change the start and end character indices accordingly. If you only need to extract characters from certain lines, please provide more information.

cut -c15-25 inputfile
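
If each chromosome is in its own FASTA file, cut can still be used once the header is dropped and the sequence lines are joined into a single line. A minimal sketch, assuming a single-record file and 1-based inclusive coordinates; region_with_cut is a hypothetical helper name:

```shell
# region_with_cut FILE START END
# Assumption: FILE holds exactly one FASTA record (one ">" header).
region_with_cut() {
    grep -v '^>' "$1" |   # drop the header line
        tr -d '\n' |      # join the wrapped sequence lines into one line
        cut -c"$2"-"$3"   # take characters START..END of that line
}
```

For example: region_with_cut chr1.fa 15 25 (chr1.fa is a placeholder file name).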