Extraction of upstream and downstream regions from a long sequence file

Hello, I am posting my query again, this time with modified input data files.
My query is:

I have two input files, file1 and file2.

file1 is smalldata.fasta:

>gi|546671471|gb|AWWX01449637.1| Bubalus bubalis breed Mediterranean WGS:AWWX01:contig449636, whole genome shotgun sequence
TATGGATGTGAGAGTTGGACTATGAAGAAGGCTGAGCGCTGAAGAATTGATGCTTTTGAACTGTGGTGTT
GGAGAAGACTCTTGAGAGTCCCTTGGACTGCAAGGAGATCCAACCAGTCCATTCTGAAGGAGATCAACCC
TGGGATTTCTTTGGAAGGAATGATGCTAAAGCTGAAACTCCAGTACTTTGGCCACCTCATGCAAAGAGTT
GACTCATTGGAAAAGACTCTGATGCTGGGAGGGATTGGGGGCAGGAGGAGAAGGGGACGACAGAGGATGA
GATGGCTGGATGGCATCACTGACTCGATGGACGTGAGTCTGAGTGAACTCCGGGAGTTGGTGATGGACAG
GGAGGCCTGGTGTGCTGCGATTCATGGGGTCGCAAAGAGTCGGACACGACTGAGTGACTGATCTGATCTG
ATGTGATCTGAGGGAGGCCTGGTGTACTGCAATTCATGGGGTCGCAAAGAGTTGGACACGACTGAGTGAC
TGATCTGATCTGATCTGATTACTCATTTGATTTTCCAGTTTTAAATGTCATTCATTGTATCTTCACTAGA
AAAGGTGATTTCACTCTTTCCCATTATACAGAAACATATTTCCTATCTCTTCAAATATAGTTACACTATT
TTATTTTAATTTGATTTGACTGTCTATTGTCTTTGAGGAGTGGGGTTGTACTGGGTCTTGGTTACAGGAT
CTTTAGTTGCAGCATGTGGGATCTAGATCCCTGTCCAGGGCCCTGAGTATGGGGAGCTCAGAGTCTTAGC
CACAAGACCACCAGGGAAGTTTCCAGTTACACGATCATTTTAGTTAGATAAATATTTTGTGTTTACATTA
TTACTGTATCAGTGATATTCACACTGAATTATACAATGTGATTTTTACACAGTAATTTTTTCTTTCTGGC
TTATTTTTGCGCTTTTCCTGAAATTCATCGTTGTCCTTGTTTTGTGTAGGTTTCTAAGAACTCAGCCCTA
GTTAAACTCCAGACTTCGTGTCAAGTGTATAAATCTCCATTCAAGATGTTCAGAAGCCTGTGGTGACCTA
CGAATTCTGTCTTTCTGGGAAGTCCCTGCTCCCAGCTGGACGATCCCCCGGAGTGCACGCCCATCACTGC
AGGATTCCTGACACGTGTCGCCTTGGTTAGTGTCCTGCTGTCTTCCTCTTTCTTGTTTCATCCTGTAGTT
TGGGGGACCACAGCCCCTGCCCCAGTAGCTTCCTGAGAAAAGATGCTTTAGAGGTAAATATTTTGACACT
TGAATACTTGGAAAACTCTTCATTCTATCCTTGCATTTTGTTGTTTGTCTGGGTGAGGAATTCTTAAGAT
TCATTACTGCTCTCAGGCTTTCTTTAATTGTGGAGAGTGGTGGCTATTCTCTAGTTGCAATGCATGGGCT
TCTCATTGCAAAGCATGGGCTCTAGGCGTGTGGGCTCAGTAGTTGTAGCACACGGGCTTAGTTGCTCCGA
GACACATAGGATCTTCCTGGACCAGGGATCGAACCAGTGTCCTTTGTATTGCAAGGTGGATTCTTAACTA
CTGTATCACCAGGGAAACCCTGGATAAGGAACTCTAGATTGAAAATTGTTTTCCTTTAGAATTTTGAAAA
TATTGCTCCATTGCCTTAAAAAAAAAAAAGTTACTGTTGAGAAACAGGAAACCATTTTGATCTCTGTTTT
CTTTGTCTGAAAACAGAATTTTTCTTTTAAATCTTTTCTTTCTGTCCCCAGTGTTCTGAAATTTCACAAT
GACCTGCCTTGATGTGGGTAATTTTTCATCTGCTTTGTTGGAATGGTCCCATTTAATGTAGAAACTTTCC
CATCAGGTCTGGGAAAGTTTCTTGAAATATTTCATTGATGATATCCTCTTGACAACAGCTTTTGTTTTTA
TTTTTTTTTCTAACACCCTCTGACTATACCATTTCTGGAAATACCATTTCTCTCACAATATAAAGCCAGA
TCTTATGGTCCTAGAATAAAATCAGGGAAGTAGTGCTGGGAAAAAAATGAACAAAGACATCCCATTAAGT
CTGTCTGGAGTAGGAAGGAAGGCATCTCTGACTTTGAAAAGGGAGCTCCGTGGTACCCTTTTCAGTCCCT
TCCGGGGTCCTTTATGTCAGCCCAGTGCCTGGAAGCTTGGGGATCTCACCCTGTCAGATTGTCTCTGGGC
AGTTCATCTAAGATAACATCAGTGACCCTGGCAGGAGGAGCCCTTTGAAAGGTGAAAACCTGTGACCCTT
GGCCCTCAAGAAGGCGTATCTGAAAGCTAGATCCTTGACCCCAGCCAGCCCTCTTCCTGGGGCTGCCTCC
CTCGAAAGACTGGATCGAATTAGATTCAACCGGTGTGGACGTAGGTGTGGACACCCACAAGGATGGGACA
GAGACAAAACCCAAGAAAACCAGTCTGTGACATCACACACCACTCCAGAAGGCCTGCGGATGGTGACCGC
AGCCACGAAATTCAAAGACGCTTGCTCCTCGGGAGAAAAGCGATGACCAACCTAGACAGCATATTAAAAA
GCAGAGACATTACTTTGCCAACAAAGGTCCATCTAGTCAAGGCTATGGTTTTTCCAGTGGTCATGTATGG
ATGTGAGAGTTGGACTATAAAGAAAGGTGAGCACTGAAGAATTGATGCTTTTGAGAGAAGCAAAAGACTT
CTTTTGAGAAGTCTTGGAGTGTTGGAGAAGACTCTTGAGAGTCCCTTGGACTGCAAGGAGATCCAACCAG
TCCATCCTAAAGGAGATCAGTCCTGAATATTCATTGGAAGGACTGATGCTGAAGCTGAAATTCTAATACT
TTGGCCACCTGATGCAAAGAGCTGACTCATTGGAAAAGACCCTGATGCTGGGAAAGATCGAAGGCAGGAG
GAAAAGGGGACGACAAAGGATGAGATGGTTGGATGGCATCATCGACTCCATGGGCATGAGTTTAAGTAAG
CTCCAGGAGTTGGTGATGGACAGGGAGGCCTGTCATGCTGCAGTCCATGGGGTCACAAAGAGTCAGACAC
GACTGAGCGACTGAACTGATCTGACATCACAGAGCAAAAGTGTTGGATGTTGCCGTGACTGGGGTGGCCT
ACTCCAGCACCGTGGCTTCTATGGGACTCCATGCAGTAGAAGTGTCCTTCCATCCTCACCAGAACTCGAG
AGGAGACTGGAGTTTCAGCAGCTACTATGGAGGCACAGAGTCAGATGCCTGATGTGCCTTCCTGACTAGT
AATCCCAGTACCCAGCACAACGTGAAATCTGCTGACTGGTAAGGGCGCCCTCATGTGTTACTACAGGGTA
ATGTCAACTTGGTCTTCGCAGCAGGGACACAATTCCTCTGGGTATATCTCTTCATCCTGCGTTTCTCTTC
TCCCTGCGTCTGTCTCCTTCTCACTTACCTGTTAGACCACGTATGTGCTCAGAGAAGAAACACAGGCAGG
GCTTCTGAGTGTGACTTCTCTAGCCAGACTGACCATTTCGTCCCTTCACATTCTACACCTATTTACACAC
TTAAAAATTATTGAGGATCCCAGTATGTGTATGTGGGTTGTACTTATTGATTTATACTACATGAGAAACT
GAAACTGAGAATTTTAAAATGTTTATTTATTAATTAATTTGGCCACCTGCTGCAAAGAGCTGATTCGTTG
AAAAAGATCTGGACGCTGGGAAAGATTGGAGGCAAAAGGAGAAGGGGGAAGCAGAGGACCAACTCAATGG
ACATGAGTTTGAGCAAACTCCAGGAAATAGTGGAAGACAGAGGAGCCCAGCATGCTACAGTCCCTGGGGT
TGCAAAGAGTCAGATACAACTTAGAGACTGAACAATAACACTTTATTAATTCACTTAGAATAGCCACAGC
AAACCCATATTAACACAATACTTGAATGAAAACTAACCAGGGTTTTAAAAACAAACAAAGAGTGAAAAGC
ATGACATGGTTTTAAACTTTTGCAAATCTCTTTAATGTCTGGCTTAGTAAAAAACAGCTAGAAAAGAGCT
AAATTTGCTTCTGCATTTGCTCTCTATCCATGTCCCATGTCACGTAACCTCTGGAAAACTCCACTGTACT
CTTATGGGAGAATGAGTGAAAAGGGCAATTAACATCTTAATATTACTATGAAAACACTTTTGACCTCAAT
ATCCCCCTGACAGAGACTTGGGAACCCCTAAAAGATCTCAGACCACTTTGAGAACTGCCAAATTAAGAAT
ATAGTCACAGCGTTACATATTTATGTCAGATCTTTAATATTACCCATAAATGTGTATGCATGCTTAGTTG
CTCAGTTGTATCTGATTCTTTGCAACCCCGTGGACTGTAGCTCACCAGGCTCCTCTGTCTGTGGAGTTTT
CCAGGCAAGAATACTGGAGTGGGTAGCTATTTCCTTCTGCAGGGGATGTTCCTGATCCAGGGATCAAACC
TGGGTCTTCTGCACTGCAGGCAGATTTTTTACCATCAGAGCTACCAGGAAAGCCCTTATAAATATGCATC
AACTATTTAATTAATTAGTGGTTTCTTTGCTTCCTAGTGGCTCAGATGGTAAAGAAACTGCTTGAAATGC
AGAAGGCCTGGGTTCAATCCCTGGGTCGGGAAGACTCCCCTAAAGAAGGGAATGGCAGTCGGCTTCAGTA
TTCTTGCCTGAAAAATCCCTTGAAGAGAGGAGCCTGGTGGGCTACAGTCCATGGAGTCGCAAAGAGTCGG
ACATGACTGAGTGACTAACACTTTCACTTTCACTTTTTAGTCCTTAAGGAAATCATATTTTATTGTTAAC
AAGTAACTTTGCTATGATATACATATGTTGTATGTACATCTGAAAAAGCAATCTATACAGCTTGACCATT
TGAATACTAAAATATTTCAACTTTGAGAACTGCCTAAAAATATACATATGTATAATCACATGGGATTTGC
CTGGTGGTCCAGTGGTTAGGACTCCAAGCTTCCACTGCAGGGAACACAGGTTCGATCCTTGGTTGGGGAA
CAAAGATCCTACATGCTGTGCAGCATAGCCAAAAAATAGAAAAAAGAAGAAGAAGAATATAGTCACATGA
ATCACAGGCCTGGCCATGGATCTAGAATCTAAACAAATTCTAATGGTAATTTTTTGAGGTTAAGGTTCCC
TTTGCTATTCTAGCCAAATAACTGAGGTTGCACTGAGAAGGGCAGGGTTCATGCTCCCATGATGTTCTGG
GCTCTCTGCTTTCTGCTTCCCCGGGCTGCCTATTCAAGTTCTGGAACCCATAGCTTCACCAGGATTTAAT
CACTGTTTGCCCATAACAGTGTCCTGCGATGCCTTATCCCTCAGAGAGATCTCTATTGATGGGGATTTAA
TATACATGGAAGAGCCACCAAAGGGACCTTTCAGGTCAGGAGATTGGGGTGTGTTCACAGCAGTCTTTGC
CCCCCTGGGGCTGACCCCATGACAGTGTCAACTGAGACTTCTGGGAAGGAAGGAGGTAATCCCTGGTAAT
GCCCTGGACCTCTGTGAAGTGGGTTCTAGAGTCTGGCTGGTGGGGTTCTGAGACAGCCACTTACTAGCTG
GGTGACCTTGGGACGTTCTCTTTGTCTTTCGCAACTTCAGTTTCCTCATCTATAAAACCGGACTAATTGT
ACCCACCTCCCTAGTTGCAGATAATAAATGCAAAGCACTGAGGCATTGACTGGGGTCTCAGTAGCCATGA
ACCCCAGCCATGTCACTTCCTGTGTGTGACCTGGGCAACTTTCTTTTAGGAGGTCTCGACCTTCCAAATA
TTTCTAGTCTCTGACCGCTTCTTGCATCTGTTCTGTTATCACCCTTGCCCCAGCTGTCACCATCCATCAG
TCCCTGCCATCTTACTCTGCTCGATTTTACTCACCACGCGGTGGTAGGTGCTCAATGTTTGTTGAATTTA
GTTGAACTGGATTATCTGAACTGTAACTTCTCCATCTTTTTTTTTTTTTGGTTGTGCCACCCAGCATGCA
TGTCTAGTTCCCAACCAGGGATTGAACCTGTGCCCCCTTTAGTGGACACATGGCATCCTAACCACTGGAC
CTCTGGGGAATTCCCCACTTTTCCATCTTTAAAATGAGAAAATTGGACTAGAATTCTCAGGGTCCCACTA
GCTCTAACATCCCATAACTTTCTTTATATATCTATGTCGGAAAATGACTTGTGAATGTATCTATAAATAT
GTTTTTCTGGTCATGTTTAATGAATGATTTAACAGCCCTGTTACAGTCTCCTGATTCAGCAGCCATCACA
GGAGCGGGACCCCTACCCTTCTCAGTGCCAGGCAAGTTCCTTTGGGTGAAATAAGAAAAGGGAACCTGAA
CTCCAGGAAGTAAGCCAGAAAGAAAAACACCAATACAGTATACTAACACATATATATGGAATTTAGAAAG
TTGGTAATGATAACCCTGTATGCGAGACAGCAAAAGAGACACAGATGTATAGAACAGTCTTTTGGACTCT
GTGGGAGAGGGAGAGGGTGGGATGATTTGGGAGAATGGCATTAAAACATGTATAATATCATATAAGAAAC
GAATCGACAGTCCAGGTTCAATGCAGGATACAGGAAGCTTGGGGCTGATGCACTGGGATAATCCAGAGGG
ACGGTATGGGGAAGGAGGTGGGGGGGGGTTCAGCATGGGGAACACGTGTACACCCGTGGCAGATGCATGT
TGATGTATGGCAAAACCAATACAATACTGTAAAGTAAAAATAAATAAATAAATAAATAATTTTCCACAGT
TTGTTGTGATCCACGCAGTCAAAGGCTTTTAGCATAGTCAACAAAGCAGATCTTTTTTGGAATATCCTTG
CTTTTTCTATGATCCAGCAGATGTTGGCAATTAGTTCTGGTTCCTCTGTCTTTTCTAAATTCAGCTTGTA
CATCTTAAAGTTCTCAATTCATGTACTCCTAAAGCCTAGCTTGGAAGAGTTTGAGGATTACCTTATAGCA
TGTGAAATGAGTGCAACTGTACAGTAATTTTAATATTCTTTGGCCTTGCCTTTCATTGGGATTGGAATAA
AAACTGACCTTTTCTAGTCCTGTGGCCACTGCTAAGTTTCCCAAATTTGCTGGCATATTGAGTGCAGCAC
TTTCACAGCATCATATTTTAGGATCTGAAATAGCTCAGCTGGAATTCCATCCCCTCCACCAGCTTTGTTC
GTAGTAATGCTCCTAAGACCCACTTGTCTTCGCACTCCAGAAAGTCTGGCTATAGATGAGTGATCACAGG
ATCATGATTATCTGGGTCATTAAGATCTTTTTTGTATAGATCTTTTTTGTATAGTTCTGTGTATTCTTGC
CACCTCTTCTTAATCTCTTCTGCTTCTGTTAGGTCCTTATTGTTTCTTTATTCCTTTATTGTGCCCACCT
TTGCCTGAAATGTTCCCTTGGAATCTCTGATTTCTTGAAGAGATCTCTAGTCTTTCCCATTCTATTGTTT
TCCTCTATTTCATTGCAATGATCACTGAGAAAGGCTTTCTTATCGCTCCTTGCTATTCTTTAGAACTCTG
CCTTCAGTTGGGTGTATCTTTTCCTTTCTCCTTTGCCTTTTGCTTTTCTTCTTTTCTCAGCTATTTGCAA
GGCCTCCTCATACAACCACTTTGTCTTACTGTTACATTTCTTTTCTTGGGGATGGTTTTGGTCACTACCT
CCTGTACAACACTATGAACCTCCATCCATAGTTCTTCAGGCATTCTGTCTACCAGAACTAATCCCTTGAA
TCTATTCATCACCTACACTGCATAATCATAAGGGATTTGATTTAGGTCATACCTGAATGGCCTCATGGTT
TTCCATACTTTCTTGTATTTTGCAATAAGGAGCAGATGGTCTGATCCATAGTTAGCTCCAGGTCTTGTTT
TTGCAGACGGCGTAGAGCTTCTCCATCTTTGGATGCAAAGAATATAATCAATCTGATTTTGGTACTGATG
ATGTCCACATGCAGAGTCGTCTCTTGTATTGTTGGAAGAGTGTGTTTGCTATACCAGTGTGTTCTTTTGG
CAAAACTGTGAGCTTTTGCCTTGCTTCATTTTATACTCCAAGGCCACACCTGCCTGTTACTCCAGGCTAT
CTTTTGACTTCCAACTTTTGTATTCCAGTCCCCTACGATGAAAAGGACATCTTTTTTTTTTTTTTTTTGG
TGTTAGTTCTAGAAGGCCTTGTAGGTCTTCAGAGAATCATTCCAAGTTCAGCTTCTTCAGCATTAGTGGT
TGGGGCACAGACTTGGATTACTGTGATGTTGAATGGTTTGCCTTGGAAA
>gi|546671514|gb|AWWX01449617.1| Bubalus bubalis breed Mediterranean WGS:AWWX01:contig449616, whole genome shotgun sequence
ATGTTATTATATTTTATTTGATTTTCCTGATAGGTATTTCTACCAAATTACTTCCTAAAGTTATTTTTCT
TTTTTTTTTTACTATTTCCCCCCTTTTTTTTTTTTTTTTTTACAAGCTGCATGAGTTTTGCGTAAGAAAT
ATCCTCTAATGCAAAACTCAAGCATTGACTTATTATTTCATTTAATAGCTATTTATTGTCTCTTAAGAAG
TCCAGAAATTGATGTAAACTGAAAGACCACTGACAAATGATCACCTCAAAGCAGCCGTGACTGCTTCCCC
GTCTAAATACTGAAACATGTCAGCATAAAATCCCAAGGAATTGCATAGGAAGCGAACCTAATTCCGTCTT
GTATGCTCTGGGAAGTTACAGAAAATGAAAATGTGGAATTTATGACCAGAAGGATGAATACCAATTAGCT
AAGCAAAGGGTTAGGGAGGAGGATGCAGACTAAAGGGATTACATTTTAAATGGACCAGATCAGAGAACGT
GAAATTGTGTTAAACAGATTCCCCTTGAAAATAGACTATGAAGGCTGGGGAAAGAGAAGCGTTAAACTTG
GTGGCATAGACAGAACCAGAAAGGACATTATATATCATACTAAATAATTTGGGCCTTTGTACCTCTGAAA
GATTTTAGGTAGGATAACTTTAAAGTCATATTAATGCTTCAAACTGATGACTATAAATGCAGTGTCTGTC
AAGACTGGAAGTTAGGGAATCAAGATAGTAGAGATGTAATTAAGAGAAGTATAAAAGTATAGAACTACTG
AAGAAGAAAAGCCAAAAGTACTGAACTATTTAATGGGAGATGTTGGGGAGAAGAAGGAGTAAGGGTCGAC
ACGTTGCTCATCTGGGAAAGTGAATATACAATGCTGTCATTCACAGGAATAGGAGAAAGGTCAGGAAGGA
CAGAAAATGAGTCCTGGCTTAAATAGGTCATTTTAGAATATTAGTTTATTTTCCCAATGACTTTAAAGCC
CAGGAACTTGAACTAGAACTTTGAAATGCTGAGCATGGAAAGTAAAATAAGAATTGAAGACAGTCATGGC
TTGAACTCTAAATAAACAAATATCAAAGAGAAACAGGAGCCCAGTAGAGCTGAAGATGGTGTGATCATCA
GTAAGAGGTGGAGAGAAAATCAACATAATATGATATTATAAAAAACAAGTTAAGGAAGCATTTGAGAAAG
GAGAGAGTGGTCGTCAGGATTAAAATCTGGAGACAAGAAAGGAAAAAAGTTTCAATACCTTTGAGGTAGA
ATGCAAATTGAAACGAATTGAATAATTGACTAAAAGTGGAGGAAATACAGGTAAAATGGCCTACAAGTTA
AGAAGCCTGATTCTCAAGAGGAAGAAAAAAATTGTGTGATAAAAGTGAATCATATGGGTTCAAAAATAGA
GGAAGTTTTGAAGCTGGAAAAGTTTTGAATTTCTATAGTGATAGTGAAGTCACTCAGTCGTGTCCGACTC
TCTGCAACCCCATGGACAGTAGCCAACCAATCTCCTCCGTCCATGGGATTTTCCAGGCAAGAGTACTGGA
GTGGGTTGCCATTTCCTTCTCCAGGGGATCTTCCTGACGCAGGGATCGAACCTGGGTCTCCTGCATTGTA
GGCAGACTCTTTACCGTCTGAGCCACCAGGGAAGGCCAGATTTCTATAAGGACCAATGAAATCTGTATTA
GTTTGTTAGGGCATCCATAACAAATTGCCATAGGTTGAGTAGCTTAAACAACAGCAACTTCTTTTCTCTC
AGTTCTAGAGGCTGCTGCTGCTGCTGCTACTAAGTCACTCCAGTCGTGTCTCACTCTATGGCAGCCCTCC
AGGCTCCCCCGTAGTGGGGGTAGGTTGCTCTGTCAAGACCAAGGGCCAATTATTTTCTTACCATGAAAAC
CAAGAAGAAGGTGACTACAGGTGATTCAACCTCTAACACATACACATGCACACACAACGTGGACACTCAG
AGAGTTGAGTTAAAGCATAACTATTTTACCTCCAAATTACTGCTAATGCTGAAAAGTACAGGTATTTATC
TAATGTGTTTCAGGGTCATGTGTGGGAAAATTGAATGTTAGGTGGAGTTTAATTTTACCAGCAGCTATTT
GAGATGAACTGTCAGAAAGAAACTTGCTTTTTAAAAAATATATTTGTCTTTATTTAGCCTCTCAAAGATC
AGGTCAGCAGGCTTCCTTGATGGCTCGGAGGCTAAGAATCCACCTGCACTGCAGGAGCTGCAGGAGACGC
AGGTTGGGTCCCTGGACTGGGAACATCTTCTGGAGTAGGAGACAGCAATTCACTCCAGTACTCTTGCCTG
GGAAATCCCATGGACAGAGGAGCCTGCTGGGCTATAGTCCTTGGGGTCATATAGGGTCGGACACAACTGA
GCACTCATAAACACCCACAGTCTCTCAAAATTCAGAAAGTTGCCCATTCTAAAGGATATTGTTCTATGTT
GAATATAATGAAATCCACGCTTTGTGCTCTGAGACAATGGGAATATCCTTAAATTTGGCTGATGGGGCGT
GAAATTCAATTACTGTTCATTCAAGTATTGTTTGAACTATTCCTTTGTTCTTTTATAAAATTTACCATAT
GGAAAGGAAATACAGTGTCTTCCAGTTGAAAGATATAATTGGCATTAATATCCTATAATTAAATTTACTT
TATTTTCCATTACTTTTTGAAGACTTATATCCAGAAAGACAGAAGAGAGAGAGAGAGAGAGAGAGAATAG
TAATGAGCAGAATAAACTTGAATAACTCGACTAGATGGCTTATATCAACTTGATGCTGAGCAGTACCCTA
TTGGACTTCCCGACATGTCCTGCATTTCTTGCAGGCAAGACTCCAGCCTCCATGACTTTTCCTGAGTTCC
AAGGGGCAGATCAGTTGCTAATCAAGGGAGCTCCAGCAACGAAACTACCTGGGGCAAGATTTAAAAGGCC
AGGGAAGCTCATTAAGATTAGAAGACTCACCACCCAAGACTCTGTGTTGTGTATTCAGTACTCAAGTCAT
GTCCAGCTCTTTGGGACCCTACGGACCGCATCACACCAGGCTTCCCTGTCCATCACTGTCTCCCAGAGCT
TACCCAAGTTCATGTCCATTGAATCGGTGATGCCATCCAACCATCTCATCCTCTGTTGCCCCCTTCTCCG
ACTCAGTCTTTCCCAGCATCAGGGTCTTTTCCAATGAATCAGCTGTTCAAATCAGGTGGCCAAAGTGTTG
GAGCTTCAACTTCAGTATCAGTCCTTCCAATGAGTATTCAGGGTTGATTTCCTTTAAGATGGACTGGTTT
GATCTCCTTGCTGTCCAAAGGACTGTCAAGAGTCTTCTCCAACACCTCAGTTCAAAAGTGTTAATTCTTC
AGTGCTCAGCCTTCTTTATGGTCCAGCTCTCACATCCATACATGACTCCTGGAAAGAACATAGCTTTGAC
TACACGGATCTTTGTCGGCAAAGTGAGGTCCTTGCTTTTTACTACACTGTCTGGGCTTGTCATAGCTTTC
CTGTCAAGAACCAATCGTCTTCTAACTCCAGGGCTGCAGTCACTGTCCGCAGTGATCCTAGAGCCCGAGA
CGCGGAGATCGGTCACCTCTTCCACCTTCTCCCCTTCCTTCTATTTGCTGTGAAGGGATGGGGCTGGATA
CCATGGGCTTAGTTTTTTTTAATGTTGAGTTTTAAGCCAACGTTTTCACTCTCCACTCACCCTCATCAAG
AGGCTTTTTAGATCCTCTTTGCTTTCTGCCATGTGAGTAGTATCATTTGCATATCTGAGGTTGTTGATAT
TTCTCCCAGCAATCTTGATTCCAGCTTGTAACTCATCCAGCCTGGCATTTTGCATGACTCTACACATTCT
AATCTTGTCAGCAACCTCTTCCTCTTGGAACTTTGCAGCAGAAAGAATGCAAAGTTACTGTTACTGCCTT
AATCCTTATCACATAACCTTGCCTGATTATATAACCTCTCCTTACTCACCAGGGAGAGGGGCACAGTTCT
TGGGGCGCTAGGCTACTGTGTTCCCTCTTTGCCTGGAAAGTAATAAAGCCATTCTTTCCTTCTTCTTTCT
GTTGCTGTATTTCTGTTTAGCAGTGATGCACAGAAAGCCAAGATGTTAGCAAAGAACTCAGCAAGTTTTT
CAATTTAATTTTTATAGCAATCATCTATAGTATAAATAAGTTAGCAATACATTTCATCATGAGATCATCT
CATTTAATTACTCTACCAAAGATAGAGCATCTCCTCTGAAGTATCTGTGTCAAATGATACATCATATTCT
CTTAAACCAGAATCTGCCTTTGTGTTAGTTTCTCTGGGATCCAAATAACTTCATCCTGCCTGCAAAAGTG
AAAATACTGGATCAATCTTCCCTTTGAAACCTCCAATTTCCACATGATGGATTACAGTTATTTTGGAGAC
TAGCTACAATTTTCTTATCAGCCAAGAAAAATATAGAATGTGTTATTGGTTTTCATGAAAGGACTGCAGC
ATGATTTTTATAAACCTGCAATTTCAACATCTATTAAAAATGCTGGGGCTTAATTTAGACTTCTCTTGCC
TTAGAATGCAAGCCATGAATCTTACAGAATTGAATTTATAAGCAGCCATTATTCATTAGAAGGTCTGTTG
CTAATGCTGAAGCTCCAATACTTTGGCCACCTGATGTGAAGAGATGACTCATTGGAAAAACCCCTGATGC
TGGGAAAGATTGAGGGCAGGAGGAGAAGGGGGTGACAGAGAATGAGATGGTTGGATATTGTCACTGACTC
AGTGAACTTGAGTTTCTGCAATCAATGGGAGATAGTGAAGGATGGGAGACTGGCGTGCTGCAGTCCGTGA
GGTCACAAAGAGTCCAAAACATAGTGACTGAAGAGCAAATTGTTTCACTTAGAAACTTCACAGATACCAA
GCACAATGTTTGCAGATGAACAGTAGAACCTTCCTTATAGCCTTATCAGGGAGTAGGTGTGGTCAATAGC
AATTAATTTCAGTTTGTTTTTTATGCTCAAAAAATAGTCAGGGTAAAAGACTGTTTTGAAGGAAAGCATA
AATAGTAGATAACTAAAACAAGAAAAATAAATAGTACATATGTTTTTATATCGATTATCTAAATTCACCT
GACTTTAAGTATGTTACAGTATCAGGCTTTTAGTTAAAACACAAAGTCAACATAAACCAAAGACCTCAGA
CTAGCCACTTACATGTGATTTAAATAACATAATTAAGAATCATTTCACACCCCAAAAGGGGTAATTGAAA
GACTGCTAAAACATTCTAATTTCAAATTGTGCCAAGTGTTAAATCCAATGCAAAAATAGCACCAAGCTAT
AATCCCAAATTTTCCTTATGTCAGTTCCCATTTTTTTTCTTGCATTCATCCTGTAGAGAAGAAAATAAGA
ATTTCTAAGTTCTCACACCAATGATGTTAGTTATATGAGTTTTGGTGCAATTCTGAATATTGATACTGCC
TCATATTTTCAGACAAGTCTTCTGAGTGCTCCCACTGTAAATTATCCATTCCTCCTGACCTGTAACTCTC
ATGTCACTGGGCTGTAATTTCTCAGAGCACTCAAAACTATCTGAAAGTGTAAAATTCATTTGATTATTGA
TGATTTTATTATCTGAACACCCTTCTCTAACGTCCCCATGAGAGTGAGGCTCTCAATTTATTCACTTCTG
TATCTCTGGTTCCCAAGATGTGATGGTGTATATTGTTGTTTAGTCACTAAGTCGTGTCTGACTCTTTCGT
GACCCTATGGACTGTAGCGCACCAGGATTTTCTGTCCTTAGGATTTCCCAGGCAAGAATACTGGAGTGAG
TTGTCATTTCCTCCTCCAAGGGATCTTGCCAACCTAGGGATTGAACCCACGCCTCCTGTATTGCGGGCAG
ATTCTTTACCACTGAGCCACCAGGGAAGCCCAATGGTATATGCTGTGCTGTGCTTAGTCGCCCAGTCGTG
TCTGACTCTTTGTGACCCAGAAGGAAATGGCAACCCACTCCAGTACTTTTGCCTGGGAAATCCCATGGAC
GGAGAATCCTGGTAGGCTACAGTCTATGGGGTTGCAAAGAGTCAGATACAACCAAATGACTTCACTTTGG
ACTGTTGCCCACTAGGCTTCTTTGTCCATGGGGATTCTCCAGGCAAGAAGATTGGAGTGGGTTGCTGTGC
CCTCCTCCCAATGTTATTTATTAAGTAGCTAATAATGATCTGGAATAAATCAATTATTGTTTTTTCCTTA
AAAACATTTACAGTCCTGAACTTCTTCGTCAGGACTCAAAATGTTTACCTATGAAATAATACCTTTAATC
ACAAAATAAAATTCTCCATCTAATGGAAACTGTTGACTCATATCAATTAGGGCACCGTGACCATGGGCCA
AGCTCAGAAGTAACTGAGTTGGCTCTACTCCACAAACTTCTTCCTCACACATCAATACCCTTAAAAATCA
AATTGAATAAGAACACTGAATAATAAAAGTGGAATCCATCAAAAGAAATTGCTATGTTACTTTTATTAGG
AGAAATCATCTAGAAAAGGGTGCTGTTTCTAAACATATGATGAATTTAAGTAAAGATGACTAATGACAAT
TAACAGCTGGCAACTCAAGAGTTATTTGATCTCAAAGATAAACATAGTACAGGAGCAAGTTTTCCGGAAA
TCTAATTCCAAGCAGTGACAACCATTAGAGATCACATTAAAAAAAAAAAAAAAAATCCCAAAATTACTTA
AAATTCAGATATGGCTTTAAAAGTGTCATAATATGTAAGACTTAATTTTTTTCCAGCAACCTATACCAGA
CACCTATCTTCTTAAAAAAGAGCAAAAGATATTGCTGATTTAGAAATAGAAAACATTGAGAATTATGTAT
TTTAGATGAAAGATTATATGACTAAATTTTTCTTGTTATACTTACTTTATTTTGCAGATGTAAATCATAA
ATCTGAGGTCAGAGTTTTGTTTTCTGCATCATTCTGTGTTGTTTTCCAAACACTGGAATTTTGTACAGAA
AATGGGAAGTAGATGGCACTCTTGGACTTGCAATCAGATGCGGGAAGCCTGTGCAGGATGGAATTCAAGC
ACCTAACTCGGCAAGACCCGGTCTCTCACTTCGTTTACCAAAACTATTTTTTTAACTAATAGCTTCTACT
CTATGCTGCTTCTCCCTCTTACAGTGTATGTATTAAGAGACAAAAGTACTTTTTAGCTATTGACAGTCCA
ACCTGTGCTTTTCCCTTTCGCAGTGTATTAAGAAAATCATATTGCTGAATCATTCATACAGGAATTTTTT
TTACAAATATAGCAAAATTATTAATAAAATTATTTAAAATGAAGTGGAGAAATTTACAATGGGTCAGGCG
GTACATGTTTTTCAGTTCTGGGGAACATATGTCTTGGCTTGGAGAGATCACATTTACAGAGAAAATACGC
TTTAGATGTGATTATCCAGGATTCCCACAATAAGAGAAAGCATTTATCCACACTCAATAATCCTCAGTTT
GTTCATAACACTCATTTTTTGATGCAGGCCATATATATTTAATTGCCTCTTTTGATTGAGATAACTGCAT
GTTTTATCCCATTTCTTCTCCTTACGTCAAAAACGCTGCCAAATTGTTAGCAGTCATGAGAAATTTCACC
TTAATCTCCAAGTCTTTTGGTACCTGTGACCCACACATTCTCTCCTGAAAGGAAATCTAATCATTTTCTT
CCTTTCCGCACAGGGACCTTCTGTGTCCTGCCATATCACACGTCACATCACAGTCATGTGTGAGCACGTG
TCATCCCCTCGCTAGATTCTCGAGAGCCACGGTCACGAAGCCTCAGTCTAGAGACCCCAGAACCAAAGCC
AGCCACCTGTCCTTCTTGGTCTTAGGTGAGAGGCCTCCCACCAAAGGCCCAGCAAGGACCTACCTGGAGA
GTGGAACAGTTTGCAGAGCTTCCATCTTCAGCCCCTTTCCCCCACCCGTGATCCCGTTCTCCTCTCTCCC
CCAGCAGGTGAGCAGCCTCTATCTCGCCCTGCCGTGTCACTTGGGCTGGTCCTCACTAGACAAGGTAAAC
ATATTTGTGTGTGCCGAGTTGTTCAGTCGTGTCTGACTCTTGGGACCCATAGATTGCAGCCTGCCAGGCT
CCTCTGTCCTAGGATTCTCCAGGCAACAAGACTACAGCAGGTAGCCTTTTCCTTCTTCAGAGGATTTTCC
CGACCCAGGAATCGAACCCTGGTCTCCTGCATTGGCAGGCGGATTCTTTACAGACTGAACTACATTGCTT
CTTGACCAAGGTAAATATATAATTATCTGTTTAAAGGCTTGAGAATAGAAAGATTACTACTAAAATACAC
AGGCCTGAAACTTGCATTAACAAACTGAATGCCATTTAGCAGGATAACACTTAACCTGAGATTCACATAT
GAGATGAAGAGAAGTTATTGGTCCTTTATCATACAGCCTCAGTTACCCACTACGTGCACACGTACTGAGC
TCCTCAGGGAGATATATTTCTATGAAATGATGGAAGATAGAGCGATTGAATTTTGTGTTTAGTTTTCTCT
TAACTGAGAAGAAGAACTGAATACAGAGATGAGAGCAGACTCTCTCTCCAAAGAGAAAATCTTCAGATTC
TCCCACCGCTAATCTGTAGGAAAGAGGCAGGAGAACCCACTTCCGGAGGTCACAGTATCTCTACCTGCTA
GAAAAAACAGCCCGATTGGAGATTTTACTGCCGATACATCCACAGGAACCCTGTTCTGTAGGATGGGAAC
CCTTATTCCTCTCTTGGAGGATGGTCCAAGGGGCACTGTGAAAGAAGCAACGATTTATGGTGTGAATTTG
GGTAAATTAGAAAAACCTCATTAACTTCATAGAGATTAATACCTTCATTAAGATCTGAAGGTCAAATGAA
ATAATGTTCCCACTCACTCATTCTCTCATTTATTTGATAAAAAATTAATAACTAATTGATTACTAAGTGC
TATTCACCATTCTAAGCATTAAGGATATTGTCTTTAAAAAAAAGAAAAAAAAACATTGGTATTAGTGTAC
TGGCTATTGAGTTACTAATGAATAGTATTCATTCTGTAAATAGCAGTGATTTATAGGCACACATTTGTTT
CCCTAGAATGCTTACATTTCTTCTAGTATATACAAAGTATAAATTCATAATCTTTACTTCATAAACCCAG
GACCACAGATTTAGTACAGGGGTAAATCCAAATTAAATCTTACACATATGAAAGAAAAGATAAATATTAT
TTTAGACATGGCCATTTTAAAGATATCTGGTTTTCAGTATTTTAGAAGCAACTTAAGATGATAAGAACAT
ATTATTACTCAGTCTGGTTTTTAACTGCTTCCTCAGAGCTGGGAAAAAAGCAGCCCCGCCTTCTATTACA
TATATATACTAAACTAAAACATGCCCTGTGCTTACTCAGTTCTTGATAGTACCAGGAGCAAGTGATATTT
TTTACTTTAAAAGAAGTCTTTCATATATTTATTTTTAATTATGAAGGATTTCAAACATGATAAAAGTACA
GAGAGCAATAAAGCAATGTGCTCACCACCCAGATTTTAACATCGCTTCACCATATTTACTTTGGGATTTT
ACATACACATTTATTATAAAATGTGAATATGTAAACAGTAATACACTCTATTATATAGATTTTTGAATAC
ATAAAAATAAAATTTATATAAAGTGTTACAAAAAGGATTAAAAATCTCTCTCTCACCATCGACCTGTCCC
CTCGTGAGGAAAGCATGCACATCCTTCTCATTATGATTTATAACCTTATACATATTCACTATCTATAAAT
AACAAACACAATTCCTCTGTGTATTTCTAAGATTTACACCATAATACGATTTACCTTCCTTTTTTAACCT
CAATGTTAGATTTTAAGATTTATCAGTTCTGAATGGAATTAATTAAATTTCTTTCTACTGTATAAAAGCG
CCATAGTAAGGTGTTCCTTGTGTATATTGATAAACATTTATTTTTTCACTGGTTTTTGCTTTTTCAAAAA
AGGCTAGAATGATCACGCTGGTATGCTCACCTGCACATATGTTCAGGTCTTTTAAATAGATTTTAATTCT
GTAAGTAGAATGGCAGGATCAAAACACAGGCAGAGCTCAGCTTCACTAGAAATTGCCAGCTGCTCCTCTA
AACAGTGCAAACATTTACATTCCTACAAGCCCAGTAGGAGAGGTTTTGATTCACCACGTTCTAGCTAAGA
CTTGACATTATCAGAGTTTTAAGGTATGCCGATCATGTGAAAATCATATTTTGCTGCTGTCGTGACTTCC
ATTTCCCTAATAAACAGTAAGATCAGGTGTTTTTCATGTGTCTATTGGCTATTGGGGTTCCCCTTCAGTG
AACGCCTGTGTTTTTTATTGATTTCATAATTGACTTACTTTCTCTTGCTAGTTTATCTGTACTTTTCTAA
GTTTTAAACTTTTACTTTTTACTTTCAGATTTTTAATGTTGTCTATGCTATTGAGAAATTTGTATGCAGG
TCAGGAAGCAACAGTTAGAACTGGACATGGAACAACAGACTGGTTCCAAATAGAAAAAGGAGTTCATCAA
GGCTGTATATTGTCACCCTGTTTATTTAACTTATATGCAGAGTACATCATGAGAAACGCTGGACTGGAAG
AAACACAAGCTGGAATCAAGATTGCCGGGAGACATATCAATAACCTCAGATATGCAGATGACACCACCCT
TATGGCAGAAAGTGAAGAGGAACTCAAAAGCCTCTTGATGAAAGTGAAAGTGGAGAGTGAAGAAGTTGGC
TTAAAGCTCAACATTCAGAAAACGAAGATCATGGCATCCGGTCCCA
>gi|546669842|gb|AWWX01450698.1| Bubalus bubalis breed Mediterranean WGS:AWWX01:contig450697, whole genome shotgun sequence
ACGGGGGACAGCACTTCCGCCTCTGCAGAGAAGGAGAAGGGGTCAGCGGGGCCTGGACCCTCCCCCCCGC
ACCCAACCAGGGGACGGGCCCGACTCACTTTGAGCGACCACCCTCACGGTGCCGACGGTCTTTCCTCCCT
TGGGGTGCTGGACTTCGCACACCAGGTAGTCATCTGGCCCTTGAAAGGCGCTTGAGGAGGGCAGGACCAC
CTGAGAGGAGGCCGACCACAAGCCGCCCCTCAGGACTTCGGGGAAGGTCCGGATTTTCTCACTGCTGACC
GTGCTGTTGTTGAACTTCCAGGAGAAGCTGACAGAATTGGGCACGAAGTCCCGGGCCAGGCAGCCCAGGG
CCACCGTGCTCTCATCAGACGGGGAGCTCACGCAGGACACCAGGGGGAAGACTCTTGGGAGCGATTCACC
TTCTGGGGACCCGAGAGAGGACACAGGAGAAGAGGGGGGTGAGAGGTGTCCTGCTGGTAGGGGGTGTGGG
CAGCTCCACCTTCTCTCTGGGACAGTGGAGCGGAGGGCACACTCAGCCCTGCCAGCCCACCCTCACTGTC
TGTGATTACCCACCTGGGGCCTGCCCTGGGGGTCTGGGGTCATCAATAAGACTGATACACACTCAGGCTC
CCAGTCCTCAGCACAACCAGATCACTGAGGTCAGCCCACTGTTGACCAGGACAGTCCAGTGCGGTCAGCT
CAGTCCATCTAGACCCACCAGCCTCAGTGGAGGTTAAATGCACCCAAAGCATCTCAACAATTTGCCCAAG
TCAAGCCTGCTCAGTGGGTTCACTTCTGTTGGCCCAGTCTCAGTGCACCATGGTTAACCCAGCATACCCC
AGTTAAGCCCAGGCTAGCCCAGACCAGCTCAGCCCAGCTCAGCTCAGTTCAATCCAGATCAGCCCAATCC
AGGCCAGCTCAACCCAGCTCAGTTCAGCTCAGCTCAGCTCAACCCTCTCAGCCCAGCTCACCTGCTCAGC
CAGCTAAGCCCAGTTCTGCCCAGCTCAGCTCAGCCCAGCTCATCCACTCTGCCCAGCTCAGCCCAGCTCA
GTTCAGTTCAGCCCAGCTCAGCCTAGCTCACCCACTCTGCCCAGCTCAACATAGCCCAGCTCAACCCAGC
TCAGCTCAACCCAGCTCAGCTAAGCCCAGCTCAGCTCAGCCCAGGCCAGCTCAGCCCAACTCAACTCAGC
TCAGTTCAGCCCAGCTCAGTCCAGCTCAACCCAGCTCAGCCTAGCTCACCCACTCTGCCCAGCTCAACAC
AGCCCTGCTCAACCCAGCTCAGCTCAGTTCAGCCCATCTCACCCACTCTGCCCAGCTCAGGCCAGCTCAA
CCCAGCTCAGGCCAGCTCAACCCAGCCCAGGCCAGCTCAACCCAGCCCAGCCCAGCTCACCCACTCTGCC
CAGCTCAGCCCAGCAAAGCTCAGCCGAGTTCAGCTCAGCTCAGCCCAGCAAAATTCAGCCCAGCTCAGCC
CAGCAAAGCTCAGCCCAGCTCAGCCCAGCTCACCCAAGCTCAGCTCAGCTCAGCCCAGCCCAGCCCAGCC
CAGCTCACCCACTCTGCCCAGCTCAGCCCAGCAAAGCTCAGCCAAGCTCAGCTCAGCTCAACAAAGCCCA
GCTCAGCTCAGCCCAGGTCAACCCAACTAAGCCCAGCCCAGCCCAGCCCAGCTCACTCATGCCACCCTGC
TCAGGCCAGCTCAACCCAGCTCAGGCCAGCTCAGCCCAGCTCAACCCAGCCCAGCCCAGCTCACCCACTC
TGCCCAGCTCAGCCCAGCAAAGCTCAGCCCAGCTCAGCCCAGCTCAACCCAAATCAGCCCAGCCCAGCCC
AACCCAGCCCAGCCCACACACTTGGCCCAGCTCAGCCCCCTTCAGCCCAGCTCAGCCACTCCATTCAGCT
CAGCCCAGCTCAACCCAGCTCAGCCCAGCTCACCCACTCCACCCAGCTCAGCCCAGCTCACCCACTCCAC
CCAGATCAGCCCAGCTCACCCACTCTGCCCAGCTCAACACAGCTCAGCTCAGCCCCCCTTAGCCCAGCTC
AGCCACGCCATTCAGCTCAGCCCAGCTCACCCCAGCCCGCTCAGCCTAGCCCAGCTCAGCTCAGCCTAGC
CCAGCTCAGCTCAGCCTAACCCAGCTTAGCCCAGCTCACCCACTCTGCCCAGCTCCGCTCAGCCCAGCTC
AGCCCAGCACAGCCCAGCTGAGCCTAGCTCAACTCAGCTCAACCCAGCTCAGCCCAGCTCAGCCCAGCAC
AGCGCAGCCCAGTGTAGCTCAGCCCAGCGCAGCTCACCCACTCTGCTCAGCTCTGCCCAGCCCAGCTCAG
CGCAGCCTAGCCCAATTCAGCTCAGCCCAGCTCACCCACTCTGCCCAGCTCCGCTCAGCCCAGCTCAGCC
CAGCCCAGCTCAGCTCAGCCTAACCCAGCTCAGCCCAGCTCACCCACTCCGCCCAGCTCCGCCCAGCTCA
GCCCAGCTCACCCACTCCGCCCAGCTCCGCTCAGCCCAGCCCAGCCCAGCCCAGCTCCGCTTAGCCCAGC
CCAGCCCAACCCAGCTCACCCACTCTGCCCAGCTCAGGGCAGCTCAACCCAGCTCAGGCCAGCTCAACCC
AGCCCAGCCCAGCTCACCCACTCTGTCCAGCTCAGCCCAGCAAAGCTCAGCCAAGCTCAGCCCAGCTCAA
CAAAGCCCAGCTAAGCTCAGCCCAGGTAAACCCAACTAAGCCCAGCTCAGCTCAGCTCAGCCCAGCCCAG
CCCAGCCCAGCCCAGCTCACTCATGCCACCCTGCTCAGGCCAGCTCAACCCTGCTCAGGCCAGCTCAACC
CAGCTCAGGCCAGCTCAGCCCAGCTCAACCCAGCCCAGCTCACCCACTCTGCCCAGCTCAGCCCAGAAAA
GCTCAGCCCAGCTCAACCCAAATCAGCCCAGCCTAGCCCAACCCAGCCCAGCCCATACACTCGGCCCAGC
TCAGCCCCACTCAGCCCAGCTCAGCCACTCCATTCAGCTCAGCCGAGCTCACCCACTCTGTCCAGCTCAA
CACAGCTCAGCTCAACCCCCCTTAGCCCAGCTCAGCCACGCCATTCAGCTCAGCCCAGCTCACCCCAGCC
CGCTCAGCCTAGCCCAGCTCAGCTCAGCCTAACCCAGCTCAGCCCAGCTCACCCACTCCGCCCAGCTCCG
CTCAGCCCAGCGCAGCCCAGCTGAGCCTAGCTCAACTCAGCTCAACCCAGCTCAGCCCAGCTCTGCTCAG
CCCAGCTCAGCCCAGCACAGCGCAGCCCAGAGCAGCTCAGTCCAGCGCAGCTCAGCCCAGAGCAGCTCAC
CCACTCCGCCCAGCTCCACCCAGCCCAGCTCAGCGCAGCCTAGCCCAATCCAGCTCAGCCCAGCTGACCC
ACTCTGTCCAGCTCCGCTCAGCCCAGCTCAGCCCAGCCCAGCGCAGCCCAGCTCAGCTCAGCCTAACCCA
GCTCAGCCCAGCTCACCCACTCCGCCCAGCTCCACCCAGCCCAGCTCAGCACAGCGCAGCACACCTCACC
CACTCTGCCCAGCTCCGCTCAGCCCAGCCCAGCCCAACCCAGCTCACCCACTCTGCCCAGCTCCGCTCAG
CCCAGCCCAGCCCAGCTCAGCCTAACCCAGCTCAGCCCAGCTCACCCACTCCTCCCAGCTCCGCTCAGCC
CAGCTCAGCCCAGCCCAGCCCAGCTAAGCACAGCCCAGCTGAGCCCAGCTCAACTCAGCCTAACCCAGCT
TAGCCCAGCCCAACCCAACCCAACCCAGCCCAGTGCAGCCCAGCTGAGCCCAGCCCAGCTCACCCACTCT
GCAGCTCAGCCCATCTGAGCCCAGTTAAACTCAGCCTAACCCAGCTCAGTTCAGCCCAGCTCAGCCTGGC
TCAGCCCAGCTCAACTCACCCACTCTGCCCAGCTCAACCCAGCTCAGCCCAGCTCAGTTCAGCCCAGCTC
AGCCTGGCTCAGCCCAGCTCTGCTCAGTCCAGCTCAGCCTGGCTCAACCTGGCTCAGCCCAGCTCAGCTC
AGACCAGCTCACCTGGTTGGCCCAACCCAGCTCAGTTCAGTTCAGTTCAGCCTGGCTCAACCTGGCTCAG
CCCAGCTCAGCTCACCCACTCCGCCCAGCTCAGCTCAGCCCAGCTCACCTGCTTGGCCCAACCCAGCTCA
GTTCAGCTCAGCTCAGCCCAGCCCAGTCCAGCACAGCTCACCTGCGGTTGGTGGCCCGGGCTGCCCTCAC
AGACGTGAAAGCCCAGTGGTCCTGACAAGAAAGGGTCAGATCCCGGACCCGTGGCCTCGGCTAAAGCCCC
TGGTCTGCAGACGCTGCCCAGCTGGGCTCACCCCTCCCAGCCTCTTCCCGCTTCTCCTGGGTGCCCGACG
CCTCCATCCCCACACCAGGCCCAGCTGGCCCTTCTCCCAGCCGTCAGTCACCACCACCCTCCACTCTGGG
TGAAAAGCATCGTGATGACTTTAGCTTCCCTAGAGCATCTCACAGGCTGAGACATGCTTGCCACCCTCAG
ACAGAGGCCCTGTCTCTGATAAGCAGGCAGCGCTACTTCTCTGGGAGAGGAGAACCTGGGCACACGTCCC
TGGGGCCTGGCCACGTGCCGAGGGCCTGAGATCCTGCCCCAAGTCTAAAACAGTCCTGGTGACTAACTGC
TCTCTGGCAAATGTCCTCATTAAAAACCACTGGAAATGCATCTTATCTGAACCTGCTCCCAATTCTGTCT
TTATCACAAAGTTCTGCTGAGAAAGAGGATACTCTGTAGCACAGAGCGACCATCTGAACCCCAAAGCTGC
ATTGAACACCTAAGTGTGGACGCGGCAAGTGGTCCCTGTGGATGTGAAGCACCCTGGCATCGCAGGCAGT
AGGTAAAGGCAGATTCCCTTTCAAGTAGAAACAAAAACAACTCGTAGAAACGTCCCGGGGCAGCGAGTCT
GGCTGCACCGGCTCCTGCCCCTCACAGCTCGGCGCCTGGTCCCTGGCACGTCCCGTGGGCTCTCTGACCT
GGGCGGATTCCTCCGAATCCCTTCGCTGCGCTAACTCGTGACCTGCCCGCTGGCCTGGCGGCAGAGGCCA
GGCCCACACGTCCCCAGGTGCGGGCGCTCCCAGGCCCCGCTGACTGCCACCCCACCGGGCATCCTCTCAG
TCCCCCAGCTAGTGGTGTAGCAGAGTGACTCATGACGAATGCCCCCGTTTCACCCAAGTCTGTCCTGAGA
TGGGTACCCGAAAGGCGGCCCTGAACATTCTGCAGTGAGGGAGCCGCACTGAGAAAGCTGCATCATTGCC
AGGCAGGAGCCGGCCAGCTACGATTGTGAGCACACTCAGTGCACACGGCATGCGCACGGTCTCAGCTTAA
CTACCTTGAAGGAGTAACTCATTAAAGAGTGTACCAATGCATTGATAAAGTGCACCTGAGACAAATTAAT
TTCTTAAACATCGACTTTGAAAATGAATATAAGTGAGCAGTTGATAGCCTCTGAAGGAAATACATTCCAA
CAGGTGCTGAGAACCCCCAGGAGCAGGGAACGGACTCCCCGTGGAGCCCCAGAAGGAGCCAGCTCTGCTG
ACACCTTGGCCCTGGGCCCTCCCTCACGCTGGAGAGAGCCAGCTCCTTTTGTTCACACCTGGCCTGTGCT
TCTTTGTCGTCATGGCCCTCAGACAAGCCCACAGGTCCTGACCTCAGCCCCTCAGCCTCCGTGCAGCCGT
CCCCCTCCCCTGCTGGAGGCACCCTGCCTGCCGTGGAGCCCCTCACCCAACATTCCCCTGCCTGATGGGT
TGGGCCGCAAAGGACAGCGTTTAACCAGAAATGCCTTCCAGGAGCCTCCTGCTGGGAGACGGCCTTCTCT
GGGGACCAGGTCCACTCCCACTCCCTTGGACAGTCACTGTCAGGCCCCTGACGGCCCTATGAGAGGCGTC
CTGGGAAGCCCCAGTCTCCTTCCTGCCCCTGAAATTGCCTCCCTGGAGAGCCAGATCACCCTTACCGAGC
TCCCTGCCCTGGCCCCCGGGGTGTCCTCTCCCGTCCCACCGCCCACCCTACCCTGGACCTCCCCGGGGCC
CGAGCGTGCCGGCGCCCCTGTCGGCCCCCACCTGGACCCCCGCAGCTTATCTCTGAGGGCTAATTCCCCT
GTCCCCTGTCCCGCTGCCAGCTGCCCCCTCTTTCCAGGCCTTTCCTCCGTGCCTCTCCAGTCCTGCACCT
CTCTGCAGCTTCACCTGACACTTCCTTTCACCCTCCAGGCACCGTCTTCTGGCCTGCAGGTGAGGTCTCG
TGCTCCCTCAGGGCATGGTGTGCTGCACACACACCGGCCTCCTCCCGAGTCCCTCCTGCACACACCACGC
GAACCCGAGGTTGACAAGCCCTGCCGTGGTTGGGGTTCCGGGAATGGCGGCAGAGAGGGACAGGGTGTCC
TTGGGGCTGGTGGCAGGGTCCTCCCGGATGCACACAGAGGCCCCAGCTCAGGCCACCTTGGGAAACCAGT
CCTGGGATCTGCAACTCGGCCATGTTCCTGCATCTGGACCAGCCCCAAGACACCACCCTGGCGTGGCGCC
ACTGGCCTGGGAGGAGACACACATCCCTTTCCCATCAGCAGTGGGTTCAGCGCTCAGGATATGCAGCCCA
CAGGAGTGTGGCTTGGGGGAAAAAAACCTTCACGAGGAAGCGGTTTCACAAGATTAAGTATACTTGTTTA
TTTCAAGGCCACAAATGCGACATTGCAAAGCAGGGCCAGGTGGAGCCTATAACTGCGGGCTCCATGCTTG
GCCATGGCACTCAGCTGCTCCGAGAAAAGCCAGTTTCTTACAGCTTGAAGCTGGAGAGGACACAGGGAAA
ATTTCAGTACAAGCAGACAAGCCACATAGGCCAGGGACCGGCCCCAAGGCAGCCCCTCATGTTCCGTCTC
TGGTGCCCCATGTTCTTGGCTGGCCATCACTTCACCTGCAGGTGACAGAGACAGTGTGAGTGGGTGCAGG
GGCGCTGGGGGTCTGCGGCCCCGGGGCTCTGTGGCCTCACCCCCTCCCCGGGGCCAGGCAGCCTACCTTG
AACAGGGTGACCGTGGTGCTGTAGAAGAGGCTCAGGAGGAAGAGCACGATGAAGGTGGAGGCCATGGTGT
TGAGGTTCTCGAAGCCTTCCTCCTCGGCGCTCACCTCCCCCTCTGTGGGAGATAGAGCACGGTGGTCAGC
ATGGCTAAGCTACCTGCAG
>gi|546669977|gb|AWWX01450566.1| Bubalus bubalis breed Mediterranean WGS:AWWX01:contig450565, whole genome shotgun sequence
AACCATGAAATTAAAAGATGCTTACTCCTTGGAAGGAAAGTTATGACCAACCTAGACAGCATATTAAAAA
GCAGAGACATTACTTTGTCAACAAAGATCCGTCTAGTCAAGGCTATGGTTTTTCCAGTAGTCATGTATGG
ATGTGAGAGTTGGACTATAAAGAAAGCTGAGCGCCGAAGAATTGATGCTTTTGAACTGTGGTGTTGGAGA
AGACTCTTGAGAGTCCCTTGGACTGCAAGGAGATCCAACCAGTCCATCCTAAAGATCAGTCCTGGGTGTT
GATTGGAAGTTGATTGGAAGTTGTTGAGTTGAAGCTGAAACTCCAATACTTTGGCCACCTGATGCGAAGA
GCTGATTCATCTGAAAAGACCCTGATGGTGGGAAAAATTGAGGGTGGGAGGAGAAGGGGACAACAGAGGA
TGAGATGGTTGGATGGCATCACCGACTCAATGGACATGGGTTTGGGTGGACTCCAGGAGTTGGTGATGGA
CAGGGAGGCCTGGTGTGCAGCTGTTCATGGGGTCGCAAAGAGTCAGACACAACTGAACGACTGAACTGAA
CTGAACTTATTCTCAGAGCCTGCATCATTCTCTTCCCAGACTGCTCTGGGCTTCCATGGTGGCTTGGATG
GTAAAGAATTTGCCTACAATGCAGAAGACTCGGGTTCAATCCCTGGGTTGGGAAGATCCCCTGGAGAAGG
GAATGGCTCCCTACTCCAGTATTCTTGCCGGGAGAAGCCCATGCACAGGGGAGTCTGGGGGACTACAGCC
CTTGGGGTTGCAAAGAATTGGACACAACTGAGCAGCTACCACCTTTTCACTTTCAGGTTGCTCTGGGGTG
CTTCCCAGGTCTTCAGAATGCTCCGTATACCCATTGGTATTTTTTCCTAATCATGGAACAGTTGTTGCTT
TGTTATTATTGTTGTTATAATCGTTTTCATAGGTTAGTTACCTAACAATTAAATTCCCATTAACACCAAT
TTTTTTTCTCATATAAGAATCCTAATTTCTTCAACTTTTTCTATACATACCCTCTTCTGCTTTTCTCCTG
ATGTCCATGAATATCCTCTAAGAGGCAGCTACACTCTTAGTCTTCTTCCTTTTGCCTCATCAAATGTAGG
GGGTGCTAGAAGGAAGAGGTACATCACACATTTGATGCAGACTTCACCTGCAGAGAAGTTTATTTGGAGC
TGCAATTGTGGCATTCAGAAAAGGCTTATCAGACACAGGTTTGCTATAAAACACTTGTTATGACACTTGA
TGTTTGATTTCTTCTCGAAACAGAGCTCCCGTGTCTCAGAAACCAAGCATAGTTTTCCATTTGGTCCTCC
ACCAAAGAAAATACTGTCAGTTCCAGGGAGTCAAATACAAGTCTGTCTGTCCCAGTCCATTAAAAAAAAA
AATTATTCAACTGCTATTATTAGTAGATGTCAGTTTCTAAGGGCAAATCTGAAGAGTGAAAAATAATGTG
TGCAGTTACTTTACATTGATATAGGCATGGCTATTTTTGTCTATCAAGGAGGGGAAAAAAGACTACTTCA
TGTATGTTCCATGCATGCACACGATGGGAGCCATATTTCTCTGTCAAGCAGTGAAGCCAGTTGGAAAGCA
AGGGCCAGTAATAGAACTGGAAAGTCATGCTGATGCTTTCTGTGCTTGTGAGATTCTAGCTCTGTGGACT
GGGATTTTACAATTCCCAATTGCAAAACTGTTCGTGTATCTGATGATCCATGCCATTTCAGGCAGGTGTT
TCTCTGTGGTCCAGTTATTCCACCTGTGAAATGAGTGGTTACAACAGGACTCCTGGGTAGTTTATTAGCA
AAATGGAGTCATTTTTGAGCAAGATATCTGCAGGAAAAGCTGCTAGGAACCTTGACTATTCTAGAAGTGA
AGCAAGTTTTTTCATTTTTATTTTTTAGGAGGCCCAAGGACAGTGTATTGTGGAGAAGGAGGTTATGGTT
CTCATGAACATTCTCTGTTTCCTTTGGACCTTTTTGTTCTTCTGGTTCTCAGAAACTTCGAGACTTGCTT
TCCTTTTTCCTCTCAGTGGAGGGCAATCTTCTCTGAGAATTAGCCAATACCAACACTAGATAGTTCTCTA
AACAGCATTGCCTTTTTATTGTGGGATCTTGCAGATATGAGTGTAAATATAGGAATTTCCAACCATAGAA
TGGCAGTTACAAGGGGCGGGGGAATTTCCAGTGCCTCATAATTGATCACTGATGCCATCCACTATGAAGG
CCAGTCTTGGACCCAAGAATATACTGAAAGATGGGGATGATTTCACCTGGAAGCATTAGCAAATTCTGGA
ATATTCTTGATGTCAGAATTAAAAGCAGATTCTAATTTTGGCTTCATGGGTTCGTGGGTCGTCCTGTTTG
GAACCTTCCCTGGAGATCCTTTATATGTTTACTTCTGTTCTGTTTCCCTGGCAAGGCGTGATGTAGATTT
CGTTACTGCTTTCTGAAGCTCTGGAGTAGCGGGAAAGGTTTCCTCTCCCTTTTGCAAACTCCTGTGAAGG
AGCTTAGTAGCAGTACCAGGGCATTTGTTTTCCTGTAGAGTGGATGGGGGTGGACAAGAGCGGAATGAAC
TTCCTCGCGTTCATCAGTGAGAGTCAAGTCCTCTCATCCTATTTACTATTTCTTACACTCTAAGCCATCA
TTTCTCATAAGAGATTTTTATGATATCAAAAGCAAGCACAAACCCGCAAATTGGTTGGGCATAAAGAATA
TGTATTACAAGGTTACTCCTAACTGTGAGAATCATTAAGCCTTTTTTTTCTATGAGATAATGTGGATGGT
CGCCTATGTATGGGGTTGGCCAAAAAGTTTGTTTGGGTTTTTCCACATGCTGGTATAGAAAACTTGAATA
CACTTTTTGGCCAACCCAGTAAGGGCTTTGCCTCATCTCTGTCTAGCCAAATTGCCACCTTCCCTGCTAA
GCTCCACATCCCAGAGTGATCACCTTCTAAATCCCTTCCTCCTATCAGATATCAGATACCTCGAACCTAG
TCATGTACTTATGTGAAGTTTGTGTTGTTACCTTTTTAAGCAGTTTACATTGTATTGGATACACATTGTA
ACTGCATGACATTTCTGCAGGGCTCTATTCTTCTGGTCAAACTGAAGATCTGACAGCTATGAAGCTTTCT
GGCGTCCCTGGTGGCTCAGATGGTAAAGAATCTGCCTGCAATGCAGGAGACCTGGGTTTGATCCCTGGGT
TGGGAAGATCCCCTGGAGAAGGGAATGGCAACCCACTCCAGTATTCTTGCCTGGGAAATTCTATGGACAG
AGGAGCCTGGCAGGCTACATACAGTCCATGGGGTCACAAAGAGTTGGAAGCGACTAAGCGACCAAGACAC
ACAACTCCTGAAGTTCCCTACTTGGCCTTCTGTTTGGTTCACTTAAATATGTTCGCAAGAGATATTTATA
ATATATAATGCAAGGCAGGAAACTGGTATTCATGCATAAGAAATAGTGTTGCTGCTGCTGCTGCTAAGTC
GCTTCAGTCATATCCGACTCTGTGCAACCCCATAGACGGCAGCCCACCAGGCTCCCCCGTCCCTGGGTTT
CTCCAGGCAAGAACGCTGGAGTGGGTTGCCATTTCCTTCTCCAATACATGAAAGTGAAAAGTGAAAGGGA
AGTCGCTCAGTCGTGTCCGACCCTCAGCGACCCCATGGACTCCAGCCTTCCTGGCTCCTCCATCCATGGG
ATTTTCCAGGCAAGAGTACTGGAGTGGGGTGCCATTGCCTTCTCCAGAAATAGTGTTAGGCATTATGTTA
TATGATTTCAAATTTCATTTCATTTTTGCAAGCATTACCTGAGATTAGATATTGTTATCTCTGTTATCTT
CAGAAGAGAAAAGCAAAGCTTTCTTTATTTTTTTTTTTTATCCATAGCTACCCTTGGGGAGTGTCAGGAC
CAGGGTTCAAAGTTATGACTGTAAACTTCAGAATCTGTTATTTCCCACTACTCCAACAAAAACACTGGAG
AAGTCACAAACGTTTACAATGCCAGACTTATCAAAGACATTTTTTAAGGTGCAAAATATATTTAGATAGA
CTCAGTAATCTTTAGTTAAAGAATAAAATCGATGTGTCAATAACATTACTTTTCTAAGCTCTTATGTAGT
ATTTATACCCTCTTCTCCCCACTTCTTCCCTCACACCCACCTTTGCCCCACTCAGGCTGTAGGAAGTTCC
CTAACTTTCTATTCATATTTGTCCCACTGAAATTCTTATCAATCCTGGACTTGTTTCCCACACACCTAGT
GGATCCGGTTTCCATTTGAGGTATATAGTTCTTAAACATAGAGCAGATTTTTTCTCTGAAGTAATTGAAC
AGGAGAAGTTGCAGACTCAGAGAAATGAGATATTATCATAGGTGAGGGTGAGGTAGTGGATTGGGAAAAG
GGTTTTGAGGGAAAGTTAGCTTAGCTGGTAAAGAATCCGCCTGCAATGCAGGAGACCCTGGTTTGATTCC
TAGGTTGGGAAGATCCCCTGGAGAAGGAATAGGCTACCCATTCCAGTATTCTTGGGCTTTTCTGGTGGCT
CAGATGGTAAAGAATCTGCCTACAATGCCGGAGACCTGGGTTCGATCCCTGGGTAGGGAAGATCCCCTGG
AGAAGGGAAGGGCTACCACTCCAGTATTCTGGCCTGGAGAATCCCCTTGGACTCCAAGAGTCCCTTGGAC
TGCAAAGAGATCCAACCAGTCAATTCTAAAGGAAATCAGCCCTAACTATTCATTGGAAGGACTGATGCTG
AAGCTGAAACTCCAATACTTTGGCCACCTGATGCGAAGAGCTGACTCATTTGAAAAGACCCTGATGCTGG
GAAAGATTGAAGGCAGGAGGAGAAGGAGACGACAGAGGATGAGATGGTTGGATGGCATCACTGACTCAAT
GGACATGAATTTGAGCAAGCTCCAGGAGTTGGTGATTGGCAGGGAGGCCTGGCATGCTGCAGTCCATGGG
GTCACAAAGAGTTGGACATGACTGAGTGACTTTCACTTTCTGCCTAATAATGCCTAGGACACAGGGCTAG
CTTTCCAGGCACAGTGCTCAGGAGGGTCCCGTGCTTTACTGAATGCCACCATCTTGAAATCCTTAATAAT
TTTATTTTGAACTTGTGTTTTGTAAGTGAAGCTCAGTGGGATGATGGAGCATGATCATGAGCAGAGGAGC
TAGGCATAATGTGCGTTTGAGAATAGTTTTTGTGATGCCCCAGGAGAACAGAATTCCAATGAACCCATAT
ATGTGGTAGTTGAGCAAGACTCAAAATTGTAATACAAGGTGAGCAGAGCACATTAGCCTATAAGAGAGGA
CATTGAGGCACATCCCAAGGGACTGTGCTTTCTGTTCAGATAAATCAGAACTTTCAAATGCAGAAAAAGG
CAGCTGCATTCTAAGAAACACAGCCACCAAGGAACCGTATCCTGCCCTTTCTTATTCCTGTTACTTTCCT
GTATTAGCCAGACCACCTACAATAAAGCTGAAGAAAGAAGAAAGGGGAAAGATTGGGCAAGGCAGGGTTC
CCTTTCAGTTCTATCTTACTCATCAGTGAGCTCAAGGTAGAGTATATGGCTACAGAATATCAAGAAGTGA
AATGAAAATAATCTTGTGTGTGTGTTAGACACTGTTTTGGCAAGAATGAAATATGTATATACAAGTACAC
TCCATGAAACAAAAATTATGTAATTTTGGTGATTCTGCCTAAGGGTTAAAGTTCTGATAATTGCATTTAG
ACTTGGCATGGTACAATATAAAGATGAACAGTAAAATTCATGCTCATAATGTAAAATTTTATTTATTTTT
TACTTAAAACAACTTTAAATAGCATCTTAAAAACACCATGACCAATTGAGAGAGATTACAGAAAAAAGGG
AAAAGCTTTCTATTTTGTTAACGTTAATGGCATTTTTCCTGCTTTTGGAACAAGGGGCCTAGGTTTTCAT
TTTGCACTGGGTTGTATAAGTCACCTAACTGGCCCTGGTGGGATGTTTGTCTCTCCCTGAGGTTCAGCTG
CCAGGATTTTACCCCGATGATTCTACCCTGGAGAAAAGAACTTCCATTAATTTAAGTGAAAGTGAAAGCT
GTTCAGTCATGTCCAACTCTTTGCAACCCCATGGACTATACAGTCCATGGAATTCTCTAGGCCAGAATAC
TGGAGTGAGTAGCCTTTCCCTTCTCCAGGGGATCTTCCCAACCCAGGGATCGAACCCCAGTCTTCCGCAT
TGCAGGCGGCTTCTTTACCAGCTGAGCCACAAGGGAAGCCCAACAATACTGGAGTGGGTAGCCTATCCCT
TCTCCAGTGGATCTTCTTGACCCAGGAATCGAACAAGGGCCAGGGTTTCCTGCATTGCAGGCGGATTCTT
TACCAACTGAGCTATCAGGGAAGCCCCCTCCATGAACTTAGGAGTTCACCAGTATGCAGCGGACCTTCCC
TGCACTCATTCTAGAGAGAAACCCACAAGGGTATGGGTCCTCCCACATGCACTGGATTTCTTATTGGTTT
TAGGCTCTCACTTTATGAACAGATAGGCAAGGACCTACAAACATCACAGCAAAATTTCCAGCCTTTTCAT
CATCCTGCTACTCTTCTCTGGCTTAGAAGATGCCCTGCTTCTCGATTCATTGCAGATACAGACTCAGATG
CAATTTTTGTAATTGCATTTCTTGTCTGAAAGTGGTCTTCATAGAAAATACAACTTCAGTGAGCTTCTCA
AGAGAGAAGCTGTGATCAGGTGAAAAAGTCAGTCTAAACTCTGCATTTAACACAGTTGATGAAGTCATTA
ATTCCAATTTTTGGAGAAATATGTTCAAGCTGGGCATAGAGGAAAAGAGGATTGAAACAGGTGGCTGTGA
TATTTGGGGGAAAGCCCGATTGATTTTAAATTGCAACTCGGAGGGAAGAACAGAATGTTGATACTAGCAG
TGTCTGGCAGGCAGCTCAGTGGAATATTAGGTCTTCCTAGAATCAGGCTTCATACTAAACTGACAAGCCT
CATTTCTGAAGCTTGGAAATCCACCAGAGGTGTGGGAAGTAGTATGTGCATGGCTATACCCTGAAGCTGG
CCAAAAAGAGTTCTTGACTGTCCACTGTCCTGGGCTGTCTGCGGGTGACTTGGGGGGTGTGTGTGGAGGA
GGTTAGCATATGAGAAGAAAAGCAGGAGGATTTACAATCAATTTAAGAGAAACGAGACATATTTCTTACT
CTTAAATAAGTTAAACAGAAGCTTTCTGGGAAGGAGGGCATCTTCTAAGTCAAATAATCCAGTTGGCTTT
CTGCCTTTTGAACCTTATTTGTCTCGCTGTACCCATTCGTCTGAGTCTATTCTGTGGGTCTACACAGAAA
AATGTAACCCTCCTCCCCATGATGGCCCTTCCCACTTTTGAAGGCAGCTACGGTGTGCTTTGTAATCCTC
TCTTCTGAGAATCAGAGAGCCCTCTTTCCTTCAGCCTTTCCTCATTGCAAAATTTCCAGCCCTTTCATCA
TCCTGCTACTCTTCTCTGATTCTGCTCACTTGTCAAAAATTTATTGTGCACCAACTATAGGCCAGGCATG
CCTAGAGCTAAAGTACCAACATAAATAAGATATAGACCCTCCCATGAAAGAGTTCTGGGGCTAGTGGGTG
AGACAAATGAAAAAACTGTATACGGTAATAAGTGCTATAATACAGATATGCATGGGAGATGTTATGGGAG
TATAATTTGCCTTTTTGTGTAGAAATAAGCAGTGTGTTTCACGTAGGGTTTGACTGGTGCAGAGCAGTGA
AGGTCTCTCATATCTTAGAATTTAAAAAATCTCATCTGTAACCCAACTTATAAGCCCCTTTTCATACCTA
ATGTTATTTTTTTATGACTTCCCTTTTGATCTCTTTTTCTGTATCATTGCTGTTGCTTAGTCGCTAAGTC
GTGTCTGACTCCCTGCAACCCCCTGGACTGTAGCCTGCCAGGCTCCTCTCTCCATGGGATTTCTCAGGCA
TGAATACTGGAGTAGGTTACCATTTCCTTCACCAAGGGATCTTCCTGACCCAGCGATCCAACCCACATCT
CCTATATTGGCAGGTGGATTCTTACCACTGAGCCACCAGGAAAGCCCTTTCTGTATCATACTCAACAATT
GTCTGTTTTATTGGTTCTTATTTTTCATTCACATATGCTTTAATTTTTATCAATTTTTCTACTTTAAAAA
ACTGAGTTATCATCTCTAGTTTTCAGCCTTGTTGAGCTCTAGCACATTCTTTTAAGTCGACAGATCTCCT
TTGGGTAGTGCTTTCACCACACTGCGTGTGTTTTGGTGGAGGAGGTTGTTCTGTGGTTGCTCAGTTATAA
ACGGTTTGTAAAGTTAGGTTTGTCTAACCCCAAAGTTCTCCCAATGACTCTCTTCAGATACTGTCGTGCA
GAAGCCCTGCACAGTCTGACGTCTCTGCTAAGAGGACATGGGGAGCGATCGCCCCTGCGGAGGGTCACTC
CTCTCACAGCATCCCTTTGCCTTCCTCACCACTTTGAGAGGAACAGCAACACCCTCCGCGCCCCAATTAG
GGCACACAACGAGGGGAGCTGGCAAATGGGCCTGTTTTTCTTAACTATGTTGCCAACATGCGAGAATGAG
TTCTGGAGAGAGACCCTTTGCCAGAACAACTAGTTAAAACAGGAATATGATTGTGTCTAGATTTCATCAC
AATAGGATGAGAAGCATCATTAAGCAAAGTGGGAAGGCGGTGGACTGGTGTAAATCGACTCTCCTACTCA
TTCAGAGCAGGAACTGTTCAGTGAGCTGCAGGTGAGCCTCATAAAGCTTTGTTTCACGTGGATGCCTGCT
GGGCTGGCGACTGAGAAAATAATGTAGCCCAGTTTGTTACTCTTAAAATAGCCTTTTATTATCATCCTTC
CTTCAGGACACAGTGGTTTCATGATCCTTTTCTCCTTTTCCCTATAAAAATGCACCCTTAAGAAAAGATC
CAGAAAATCCAAGATAGATAACAAGATAAAAATGATTGATAACAATATTCAACCTTACCCAGACTGACAT
TCATTTTAAAGGGACACAAAACCTTAGGCTGAAGGAAACCAGCATGGATTTGTAGACTTTTCATCATCAG
TAGTAGAAGGGACAGTTTGGATGGAGGTTCTAGTGTGGATGGGGCGTGGCTCGAGTCTTAAATGAGCCAA
GAGGACTGGGTATACGCAAGGCAGTAGTTCTCAAGCAAGGAATATCCGCCCCTCCCCGAACCCTAGGGGG
ACATTTGGCAATGTCTGGGGATGAAGGCAATGGCACCCCACTCCAGTACTGTTGCCTGGAAAATCCCATG
GATGGAGGAGCCTGGTGGGCTGTAGTCCATGGGGTCGCTAAGAGTTGGACACAACTGAGCGACTTCACTT
TCACTTTTCACTTTCATGCATTGGAGAAGGAAATGGCAACCCACTCCAGTGTTCTTGCCTGGAGAATCCC
ATGGATGGAGAAGCCTGGTAGGCTGCAGTCCATGGGGTTGCACAGAGTCGGACACGACTGAAGCGACTTA
GCAGCAGCAGCAGCAGCAGGGGATGTTTTTGGTTGTCTTAACTGCGGTGAAGGTTACTAGGGGCATCTAG
TAAGTAGAGGCTAAGGATTCTAGACAGGCCCCCACAACCAAGAATTATCTGACTGCAAAACTCTGTACAG
GAGTTCCAAGGCTGGGGATACCTATTGTAGATGGACACAGATTTGGTAGTGACTCTGGCCCTCAAGTGGG
AGGGGCAGCAGGAGAGCCTAGGGGTGCCAGCCGGAGACGTATGGCCTTCTCTGAATAAAGGAGTCCCCAC
TGCTTAGCAATCGAAAGTCACTGTCCTACGACCCAGTATAGACAGATCTGACTTCTTTGGAGAATTCTGA
AATCCGGATTTTTACATGAATGGCTACATTTACAAACACCAGCAATGAATCCAAATGAACAAGGTTGAGA
AAGCCACACAAAAGACGACCATGAGCTGCACTTGGCTTTTGGGTCGCTACCTACAAGCTGTCACCCAGAC
AGTCATGTCCTAAGAAGTATAGTGCAGGTTGCCAATCCAGGGCTGCAAGGTACCCGGGGAAGGGTCTGGC
CACTAGCCCTTCTGTCTGCAAAGGCCACGACTGTCCTACTGAGGCAGGAGGCACAGAGAGGTCTGTTTTA
GCTCTGTGGGGGAAAATAAAAAGGTATTGATGATCTAAAATTGCTCTATCATTATTCAACATAGTTTCCC
AAAGCAATTTCACCTTTTCCTCTCGTAAAATTCTCTTGGGCTCAATATGCACATTTTGGATTTTGGACTT
CTACTAATTCAAATCATGTCCAAAAGATTAAATAAGATAATACACAAGCAAGTTCAGAATGTAGTATTGT
CTGGCACTAAGAGAAATGTCAGTTGACCTTCACTATGGAGACACTGCTTAACTATGTGAACAGGACTGTA
GATACTGTTGTTTAACTGGATAGAATTAGTGACCAGCTCCTGGGATAATGAGCTGAGTTCACCCGCTTGA
ATTTGGAGGGCAAACACATGTTCTTTGTGTCTGGGAGGTATGCTTGAGGTTTCCTTTGTCTCTCCCTGCT
TTGGGGAAAAGCAAATTTGGCATGGCCTCTGAATGTCTGCCTCCTCCACGCTTTTTTTTTTTTTTTTTTA
GTATAGAAGCTTTGTAAATTCGCCTTGTTCAAAAACAAGCACCATCTTCATACCTGCTGAAAGGTAATTC
TGATCCAGAGGCTGTGTAATGAACAAAGGGGGCAGTAATACCAACGGGCGCTCAGACAGCTCAGCCCACG
CAGGTACACGGGCACGGCTGCACCGACCGGGGAGGGAGGCCCTTTGTAAAGATCCCTTTTTATTGTGACT
CTTTCACCTCCCCCAGCCCAGCCTGAGCCATGAGAAGAAGAGCGATGTCAATGAGACAGGCCAATTTTAC
CCATCTTACACAATCGTGCATTACAGACTTTCCAAGAAGACGCCATTCGGGGTGAATCACCAGATTCAGG
ATGAAAAGGTTTTGATTTGCAAAAGCGCTCAAGCATTTCTCTTTAGAGCCGGAGATGGCAGGGAAGGAAC
CTATGTGCATGCAGCTGCTTCACCTTTGGGCCAGGTGCACTGGCAAGCACTCCGTGGTGGAGCAGGGAGA
AGGCACCCAAGCATTTACAGGCGCCTTTGTGACACCCTGGGTGCCTGGGCCTGTGAATGCCTTATTTAAG
CCAGCAAAGAATGAAAGGGTGCTGGTGACGGTGGGGGTATTGAGACGTGTGGAACTGAGTCAAAGCATGA
ACTACTTGTTAACATCAACCGAAATTAATGTTTAAGCTTTCAGGTTTCAATTGGATTTCATCAACCCAGG
AAAGAGGTGATAAGAAAGTGCTAGATTGAGGATTCTATTATTTTGGATGATTTTGGCTACTGATAACAGA
AAATTTATGTCAAATGACTGAAACCCTTTATTACTCAAACTTTGGTCTCAGGACCGCCTTCACCATCACC
CGGGAGCTTGTCAGAATTGCAGAATCCACAGCTCCAGTCCAGACCTGAATCAAAGTCTGCATTTTAACAA
GACACACAGGTTATTTGTATGCACTTTATAATCAGAGAAGCACAGCCTTCAACAAGAACTCTTGTGTTTT
TCTTTTTCTAAATCTCAGAGACCAAGTCGGCTAGAGGTAGGCAGTTCCAGGGTTGGGTTAACTCAGCAAC
TCAAGGATGTCGAAAAGGACCTGGGTGTTCTTCATTCTTCTGCTCTGCTCTGCTAAGTGTGTTTACTTTT
ATACTTGGATTTATCTTCCTGTCGTCACAGGTGGCTGCAGAAGCTCCAGGCATTACCTTGCACAGCTGAG
TGAAGGGGTGGGAAAAAGGAAGCTTCCTCTCACACATATGCCTTTTTAACAGCAAAGAAACATTTTCCAG
AAGCCACTTAGTAGATTTCCTTTCATGTTATATGTCTCATATCTGGCTGTCCAGAATTGCAATTTCTTAC
CCATTCCTAAACCAATCATTGGCAAAGGATTAGAATAATAATCATTGGCTAAAGCCAATGAATATTTAAC
CTCTGGGACCAGGTAGGAGTCTGGGCTTCTTCGATGCACATGGCGATTAAATATATGAACAAAATGGATT
TCTGTAACTTGGAAGAAATGAGACTGCATAGGAGAGAGACGGTTGGTGGGAAAATCAACAGTGTGTGCGG
CAAGAATTTTTTAAAGGCTGGTCATGTATATTTTGGGACATTAGGAATTTCCTCGTGAGTGTTTGTTTGA
GAGTTAAAAAAGAACCATCTGTGAGGGTTCCCTTTATGTCTGAGCTAAGAGATAGTGTTCTAGTGACGTG
TATAGCCTTGGATGGTTTGTGCTCCAGGAAGTTCTTAATAATAGAATAATGCAATTATGTTTTCAAACGA
GACAGCTAGAAGGGGCATTTTAGACTACTTCGTGTGTCGTGCCCATACAGTTTATAAATGAACAAAAAGT
CCAAATGACCCATACCACACCTTGGTCAGTTGTGTGGAGACCTCGATAGTCTCCTTTTATGTGCGACCAT
CTTGATGGAGGTAGTTCATTCTTTCAATTTTACATCTGGACTTTAAAGAAGTACATCATTTTACTTTTCT
GGGGGAAAATGAATGATTATGTGGAGTGAATCACATAGAGGCCATTTTAAGACCATCTAAGAGAAATAGG
ATTTAATTAAGCCGGTCAGAGGATCCCAGCTGTCATGAAACCATGAGACCCCAGAGAGCAGGGTCTATCT
GATACAATCTGTAAGCACTTGGCAAGTGTTTGTAGAACAATTAATGGAGTGGATAAATGAATTAAGTGTA
ATTATATCTTTCTCTAAAGAAAGTTGTCATGTGACCTACCTTTACCATTTGCATTCTTATTTGCTGTATG
TTGTTTTTAATTGACTTTCTCTGAATAAATGTGATTGAACCAAAGGATTCCTGGGAATCCAGGATATTTT
CTTTCATACTTTTATTTCTGAGGGTCTGAAAGAGAACAGCAAATCCAACAGTAGGCTCCATTAATAAAGG
AGGTGAGAAAACTGAAGCTCAGAGAAATTAGTTCAAGGACAGCTACTTATCTGTGAAGTTGGGATTCAAA
AAGGGAACTGTTTGACTGAATCCATGCCTTTAACATATATAGTAGTTACTCTATCTAAGCATTATTATAT
GCTAGGTATTGTTCTGGGCTTCACAGATGGCTCAGTGATAAAGAAGCCACTTGCCAATGCAGATGTGGGT
TCAATCCCTGCGTCAGGAAGGTCCCCTGGAGAAGGAAATGGCAACACACTCCAATATTCTTGCCTGGGAA
ATCCCATGGACAGAAGGCCTGGGGGGCTACAGTTCATGGGGTCACAAAGAGTCAGACACAACTTAGCGAC
TCAACAACAACAATAGGCATTGTTCTAAGTGCATTATTCCTGTAACCCTTGCAAAAAGCCTATGAGGTGG
GAAATGTTTACTATTCCACTTTACAGGTAAGGAGACTAAAGCTTAGAAAGGCTGAGTGACCCATCTGTAT
CCCCGTAGCTACTCCTGTTAGCCCTGGGCATGAACTTATGCCTGTCTGATCCCAGAGCTCCAGGCCCCAC
TATGATATTATTGCTTACGGACGGTGGAGCACAATATCGTATCCTATGTCGCTCAAGAGAGCAGGATTAG
CATTTAGAAGAGCCTTTAAGAATTGGTCTGTGCCAACAGCAGACTTGTCTGCCTCAACTACAAATGAGCT
GGGGTGGCAAAAGTCACTGTCAAGAAGGCATAGGCAAATTTACTGCATGTAAGTAGAAGTAGGATTGGTA
ATTTCTATAGTCTTTTTCAAGAATAGAGTTTTATAGTATAAATTTTTTTAAGGCAATTTTGTTTTCAGTA
CTGAGCCATTCACTTAATGGGAAATTATTCTTTGCAAGGGAAAGAAATAGGTAGAAAAAGGCGACTTTCA
TCTCCTTGTCTTTTCTCTCCCACATGCCTACCCCACCTCCAGCTCTTACACTCAAACCCCTACGCTTGCA
GTTTATTGTAGATGACATTCTTTTCCACATCTGCAAAAGCAGTGCCCACACTTTCTTAAATAAGAATTGG
ACCAGTGTTTAGATATATTTACCACCTCTTCCCCAAACACCAATGTCTGATTATCTAAAAGCCCACGTCT
AAAACTTGTGAGCAGTAGTTGAAGCTTCCATCAATGCAAATCTTAGATAACAAACACTTCACTGCAGCGA
GTGGTTGCTTCATTTCAATGAGGTTAAGATGGTAGTCAGTGGAAGCTCTGATTGGTTTTGAGAAATGGAC
ACAAATGTTTCTTTTTTCAAATTGGGGTATAGTTGTTTTACAGTGTTGTGTTAGTTTCTGCTGTAAAAGG
AATGAGTCAGCTATATGTATACTTATAATCACCTCCTGGACTCACAAACCCCCCCAATCCTACCCATCTA
GGTCATCACAGAGCACCAAGCTGTGTTCCTCCTGCTAAACAGCAGGCTCCCACCAGCTATCTCCTTTACA
CATGGTAGTGTATGGAGAACAGTATGGAAGTTATGTAAAAAATCTAAAAATAGAACTACCATATGACCCA
GCAATCCTACTACTGGGCATATACCTAGGAAAACCATATTTCAAAAAGACACATGCACCCCAGTGTCCAC
GGAAGCAGTATTTATAATAATCAGGACATGGAAGCAACCTAAATGGCCACTGACGGTTGAATGGGTCAAG
AAAATATAGTACATATATACAATGGAATATTACTCAGCCATAAAAAGTTGTAGAGATGTGGACGGACCTA
GAGTCTGTCATACAGAGTGAAGTAAATCAGAAAGAGAAAAACGAATATTGTATATTAATGCATGTATGTG
GAATCTAGAAAAATGGTACAGATGAACCTATTTGCAGGGCAGGAATAGAGATCCAGACACAGAGAACCAA
TGTGTGGACCCAAGACAGTGGGGAAGGGGGCGAGTGTCGGATGAAATGGATACAAATTTAATATTTGCCC
AGAGAAATATTTTCAAACGTCTCAGTCTAGTTTGAGTCATCATTCTCCATCCTGGTGAATGACGCTTCAG
TAACTTGAAACCTTACTAAGTGACATCATCCTGTCTTCGAGATAGTGAGCTTGGGCTCAGAGGACAAGCT
GCTCAAGAGGATCCTGGGCACTTTCTGTTTTCTCTCTGTGTCTCTCTTGCCTTTTTCTTTGTCCTGCACT
CTCCTTCCTCCCATGGTCTACTTTTCACTTTGAAAATCACTGATTTCAAAAATGCATCTGTAGCTGTTAA
TTTTGCGTGTATCTGATTTTATGGGTTTGTGGAGCTCCTGAAGTTAAATTAATTTTAAATGGATTTTAAT
TACTAATGCTTTGGGCCATTGCCTTTAGCAAAGTCAACAAAACAATCAGGGTTACCTTAGTTTGTCTCTT
TGAGGAATGTGCTCACCCGTTTGGGCTCTTACTAAACAGAGCAGCTTTCTAACCAGAAACATAAATAAAT
CCATGGCTATATTTGAAGGTCTGAGTTTATTTATTTTTCAAAAACAGCCACCGTTGATTTAAGAAATAGG
ATTTTAAATGTGCACCAAAGGATTTTTGTTTGTCAATAAAACCTCATTATTTTTCAAAGGTGGAGAGGAT
CCTAGTCATAATAGTATGCTATGTAAATGCTAAAGCAGAGGCTAAATTATGACCCAAAGGTTTGAATTCT
CTTCTTAGCCCCACACAGAGTGCTTTGTATGTAACAGTAGAGAAATCAGGTTCCTTCTCAATACAGTATC
CCCTTGTAAATTAAGGGTAATCAACTCCTAGTTGAAAAGTGAAGGCATAATGAGATAATGTTCAAAATGG
CTTTTAAGTGTCTACCTATAGAGGTTGAAGGTGAAAAAAGAACATTGCCATGTCTGTGAATACCTGCCTC
CCAGGAGGGAAGGCTACTCAATGTTGTTCCCAGCGGAATGTTCTGTTCAATCAGGAGCCATCACTTGAGT
GGTTTGATAATAATAAGCTTAGATTGACTGCAGATAAAAATAGGAAAGGTTAATTAAGTCATTTAGTCCC
TTGACAATTTTGCAAACCAAGCGATTTCTGTTAAGGATCAGCCTTAGACTGTGTAGTACACAGTCCAGTC
GAATCACCGAGGTTCTAGTCTCCACCTTCCACATCTTCCCCCAGAAAAGAGCATAAGATTTTTACTATTT
TCTCTGAAAATTCATATCAAATTTCCTTCCTCTCTATATATGCTACTATTATTAATGACTCCTAGGTGGT
CTCTAATACTTCTTTTATCCTTCTTAGTTGAAAACGTGCTAAAATACACCCAGGTAATAGAAACACCTGA
GGTTGGCAGCTGGTTAATTCATTAATCAGTTAAATTCATTGTTTAGCTTGACAGATATAAGACTCTAAAC
AGCATGTTAAAGCTCTATGTCCAAATATATGAATATAGAAGATAATTAGAAATAAAATAGTGATTTAAAA
TTAAAACTCCAAACATAAACTTCCTGAGACAAGTACAAGAAGAGAAAAGGACCTTTTTTTCCCCTCCAGT
TTCTCTCTAGAATTGACTTTCCACCCCTATTTAATACAGCCCTCAATAATCCTGTTTAACTCACAGTGTA
TCCAAATGTGCAAAGCCAAATAAATATACAAGCAATTATTTATTAATAGCAAGAATGTGTGTGTCTATAC
TTTTACAGTTTCCAGAGTATTTTAGACTCATTCTTTAAAATAATATCTGCTTATTTTTCTCAAGCTTTAA
AAAATAATCTAGGTAATTGTTTCACCCTTATAATGAGATTCATCTTTCATTCATGTAATTCTCGTTTGGT
TCATGTGTACTCAAAAATGTAAAAAGTTGAGGATGTTTGAGAGCACCCCCAAAGTGCAGATTCCAGGGTT
TTGTATAAACACTGCCACGGGCACGCGCACACACACACACGCACACATTCTGCTTCTTTATCTTTGTCTT
GTTTGGTATTCCCGGTCAGAAATTACTTTCCTTTCAAATCCCCCTTGATAACTCAGATTCTATGTGATTG
CTTTCAATGGTTGCCCACCATTAAGAAGTCTTCTTTAAAGGCAAAGAGACTGGATAGCAAAAAAGATGAA
TCAGAGGAAGGAAAAAAAAATCTTTGAAGAAGAAAATATGCTTGGGTAAAACAGAAAGCAAATGACATCT
TAGTTATACAGGTACTTCTAAAACTTGTGCTATTATATATTATATATATAATTTAAGTTGCACGTGAATT
CTCCGTACTTTGGAATAAATTTTGGAGTAAGTTTTCTTGTTCTTATTCATCTTAAATATAGATCACTATT
TGAAAAAATAAGTGATAACTTGTTTTCTCAAGTAAACAAGAGTGAAACTGACCATAGAGGACTTCAAAAA
AGAAGCAGGGGAGAGGGGAATTTGCAAGATGAGGTGTTTGATAGAAACAAAAAAGTAAGCACTATTTAGG
CAGAGTACTTGTGATGACTAAAGCACAGGATTTGAGCAATGATGGGAACCAGCTCATCTTTAATTTCTTT
ATGCCTCATGGGTAATTTTAGGCACATAAAGGAGCCTCCAGTGAAAATGTCTTCATAATCATCAGCTCAG
TTCAGTCGCTCTGTCGTGTCTGACTCTTTGCAACCCCATGGACTGCAGCATGCCAGGCCTCCCTGTCCAT
CATCAACTTGAGTTTACTCAAACTCATGTCCATGGAGTCAGTGATGCCATCCAACCATCTCATCCTCTGT
TGTCCCCTTCTCCTCCTGCTTTCAATCTTTCCCAGCATCAGGGTCTTTTCCAGTGAGTCAGCTCTTCGCA
TCAGGTGGCCCCAGTATTGGAGTTTCAGCTTCAGCATCAGTCCTTCACAATCATGTTTTATTGTACCCCT
CCATTTTCTCATCGAGCCACACAGGCTTTTCTTTACTTCTCCAAACATTCAAGGTTCCCTCCTGTCTCAA
GACCTTTGCAAATGTTCTGCTTTGCTGAAAGAGTCTCCAGAAACAGCATGGATCTAGGACATCAGGATAA
CCCATGATACAGCTGGTTGTACAGATGTGGGTGTGACACACAGCATCCTAGCAGGGTCACTTCTGGGGGC
ACTAGAACTCACTGTCAGGGAGAGTGCAGGGCGAGGGGATGGGAATCTCAGTAACAGCCACACATCTAGA
GGAGGCATCCAACGAGAGTGAGAAATGGTCACGGAAATAGGTTCTGTCAGCTTAGGGCAATGTCTTGGGA
GCCAAAGGGAAAAACAAGTTTTAAGAAAACAGATACAAGCAGTATTGGACACAGTACTAAGCACAGGGTT
GGCAGGTAGAAAGCACTTCTTTTTTCACCCCCATTCATTCATCCATTCAACCATTCATTCTACAAATATG
CATGTATAGCCAAGTGCCAGTTGCTCATCTCATAGAAGTGAGGAAAACAAATACTATCCCTACTTCATCC
TAAAGCTTATATTCTGGGGGAAGAGGGAGTGGGGTGAAGAGAGACACAAAACAAATTCACAATAAGTATA
TAATATAATATCAAGTGGCAGTAAATGCTATGAAAAATGCAAAGCAAATTAAGGAAATGGAAAGATCCAG
TGGTGCTGCTTTGGGAAGGGGAGGCAGGGAGACCTCTCTGTTGAGGGGTGGGGTGATATCTCAGTGCAGA
CAAGAATGAAATGAGGCCTGAGATGCTAGCAGTTATGATGATTAACAAGATGACCTGAAGATCCTGGTGT
GAATGGTACCAAGGGATGAGAGAGTCCTAAGAAGGAGAAAGTTGTCAACAACTCAAATGCATTGAGATGT
CAAAGTAAAGGAGAGGAAACTGTTTATTGGATTGTTAGTTTTGACCATTAAGGAGGTTAGTGATCCTAAA
AAGAGCAGCTGCATTGACTTGTTTGAGAGACAAACTGCAGATGATTAAGGAGTAAGAAGGTAACAGGAAA
TGGATAGAGGGGAGATATTTTGAAACAGTTTAGCCATGAAGGAGAGTGGTTAAGGGAGTTGCTTAGTGGC
TAAGTTGTGTCCAACTCTTTGGCAATCCCACAGACTGTTGCCAACCAGGCTCCCCTGTCCGTGGATTTCC
CAGGCAAGAATACTAGAGTGGGTTGCCATTTCCTTCTCCAAAGGATCTTCCAGACCCAGAGATCAAACCC
ACGTCTCCTGCTTGGCAGATGGGTTCTTTACCACTGGGCAGATATAAATGATACAAAACTAAATTTTTGT
GCTTCTCTCCCTCCCCCCCGCCCCCCACCCCACCGCCTCGCCCAGACAAGGTTTTCTCAGCCTGGGCACT
GTGGACATTTGGAGCTGGATGATGCTTTGATTTGGGAGTGATAAGGGGTACTGCCCTATGCATTGTAGGG
TGTTCAACAGCATACCTAATCTCTACCTACTAGATGCCAGGAACACCCCTTCCAGCCAGTTGTGACAACC
GAAAATGTCTCAAAGTGTTGCTAAACGTCCCCTGGAAATTAAGTCTCCCCCAGTTGAGAACACATGCTCT
CTACCGAGTTAACCTGTATAGTTAGAAATGATGGATAATGAGCCCATAGAGAGGGAGAGAGAAAAGAGAT
TGAGAAGGGAGGGTCTTAACATGAGGCACCATAGTCAGGAAGTTCAAAACACCAGCAAAGTGAGAAAAAG
CAGAACTCTAAGGACTGCTAGTCAGTGCTGTGTTGTATGAGTTTAGGAACAAATGTTGTTACTCCAAATT
AAGAAGGGAGCATTGAAACTGAAAATCAGAAAAGAAAAAATACATTTGTTGTAACAATTTGGAAATCAGC
ATGCAAATTTTCAAGGAAAAGCAAGATTATTTTTAAAAAGAGAAACCATCAGTTATCTTTTTTACTTATT
AGGAGGATAAGACAAATGAACACACAGTTTGAGTTGGAAAGTAGATCACGTGGTTCATAAATTGCCCCTT
AGCCTGGGAAGTCTTTAACTTGGGATTGAAGACAGATGTTTCTGATGTGACTCATGTATTTTTACCACTT
TCTCATGGGCATGAACTGAGTTTATTCCTTTCTGTCTCCTGGTGATGTCAGTGTCTTGTTTACACAGGAT
GAGCCTGCCTAAGTCAAAACTAGCCCCAAGTCATGTGATGCATGGTAAAAATACAGAGGTGGGTTTCTCT
TTAGATAGAATATAGTTGCAGGCTCTAGGGGAATAGACTTGATTTCTGGGTTGAGTGCATGAATTTTAGA
TATTTGGGAATGTTCTAGAATAACAGAATTTCAGAGTAGAGAAAGATTGCCATGGTTATCTCCACAAACA
TTTTATAATTTCTCAAGGTGATATTTTAAATATTACACATTAAAAGCACCTGAAAGTTGTGCCTCTTTTA
GCGGTAGAAAAAATGTGAATATACTACTTGGCAATCAAGTGTTCTCCCCCCTAAATTAAAATTAAATATT
CATCTACACCTGAAAAGACCACTGGCATTTAATACCATATCAGAGAGAACTATTATAGTTATTTTATGAT
GTGATTTCCCTAAAGCCTCATGACAGTTTAGTGAGTCAGGTATTATTTCATGTGAAAACAAGAAAAACTA
GCAATTGTGCAGAAAATCATACCTAACAAGGTAGAAACAGCTGTTTTGTGTCCAGAATTGATCTTTGGAC
ATCCCTCCCTCAGTGATTTTTTTTTACCTGTTTGATGAGATCTCAACTTCTTTTTAATGGTAAATACACA
AACTAAACCACTTTGGGACTGATATTCGGGCAATTGATTTCATCCAGCCAGACTCTATCCAGGACAGCTG
AGTCAGCAGACATGATTTTTGTCCAGGTCCTGTCTAGCAGAGCACCCTTCAACCTCCCCTGGCCGACCTC
TCTAGCAACACACTCGGGAAATGATCCATCAAGGGTCATATCAGATGTTATGAAGAGGTAATTTTCTCTG
AGATTTAAATTAAGGATTTGGGTTTGGTCACTGGCCTCCATATGAGGAGAATACTTAATGATTTTTTAAT
CAAGTACAGTTTTTGTAAGCAAGCGTATTCTGTTGCAAGATTTTTGGAATTTATTACATCCTAAAGAAAA
AAAAAAACAGCCCTCAGAAAATCATTAATTTCTAACAAGTAACTTATTTCCATTACAAGGACCCTTTCTT
TGAATGGATAGTTCTGGTCTAATCTATATGCTTTCTTAACTTTCATGGTAACTGTTACTGTGCACTTGTT
ATGTATATGTGTGTGTATATATGTATGTATATATGAGTATATACACACACACACACACATCTTCTGCTCC
AAATGTTCTGTTTGTTTGCTTGTTTTCTTTTCATTGGGCTCTTAAGAGCCCCAACAACCCACAATAACTT
GCAGAATCTACTTAGCCCCAGGGAAAAAAAGAGCATCTCTGCTAAGCATCACTTCTCCCAAACATTATAA
GGGGATACACTGAAGTGACTTTAAAGAAAGAAATCTGTGGATAGAAGCTTCAGGATACTCAGTGTAAAAG
CAACCCCTGAGTCTCTAAGACAGCATCTCACCCAACAGAACCAGGATCTCTTCCCGGTGTGTGTGCAGAT
GAAATGATTGGGAAGGCACAGTTGGGTGGGCCAGTGCGGAGAAGATGGATGAGTGGCCTCTGTGTCTGTG
AGGTGGTCAGCGAGAGAGGCAAATCCCACCGCCCCCCACTCCCATAGCTGTATGTGCCACTCTTCCTCTA
TACAAAGACAATTTGCTTTTCATGTAGTAATGAAGAACAATGTTTAACAGACAAAGGAGCTTCTAGTTTA
TTATTGTGCTGTCTTTGACTAGAGCTCGATGCTGGGAAAAATGGTAATTCACAGTGTGAATGCTTCTTTT
CCAGGCTTACGGGTTATGGCTGAGCATGTACATTGTATATTGCAGATGGTACAATAGCGAACACTGTGAG
AACGAGGAAAGAATGAATTACAGCTTTCCCCCCGCCCCTTGCTAGCCAAGAAAACCAGGAATTTCTTCAC
AATGATTTTTATGTATGTTTCTGTGTGGATGGGGTATTCCTGAGCAATATACGCTACTAGATTCTGATTA
ATGATTTTAGGTTTTAGTGTTTTTAAACTAATGTACAAAATCTACCACGATGCTCTTAGAACACCTGTGT
TCTTCTGCTCCCCCCTCCCTCCGCCCCAGTAAACAGACAAGGCACAGGTGGTGCCTCATTAACTTATTTC
ATGCAAATAAAAACATATTGCATGATTTCTTCTCCTTAACTATATAATCACTGAAACATGATCGTGTTGA
TGAACGCAACAAACTTTGGAAGGGGAAGATTCAGCATCTTTTGTTGGTTCTCTTATGCTGAAGGGATTTG
CAGGCCTGGTAGTAAATCTGATTTAAACGGAAAAGTAGAGGGAATGTCTCTTTCCTCTGTCAGAAATCTA
TTTTCTTCTAGTCTTTTCTGCATGTCCTAGGACCCTCTGTAATATAAAACTGGACATATAGAATTGGAGA
AAAAAGGGAAGACCAAATTTCCCATGGAGGTAGAAGAATTGGGAACCCAGATGAAAAGCCATCACAAATA
AGCACGCTTAGGTGTCTTGATTCAGTGTTCAGATATTCTTATCAATGATTTCCAGGGAGTGACACTGATT
TGCTCTGGAAACAATGATAGAAAAATAGTGACTGTGTACTGGAATGCTTTTAGCATTCTAAAATCATATT
GGAAAATGTCCGAGCATTCCAACCCAACTGTAAACAGTACCATGCTGTATTTATGGGGGATTGCAAGTCT
GTTTTTAAAAGTCTTTTGAGATATTATTTCTGTTTTTAATTTATTGAATTAAAGCCAGATTTATTCAATT
AAGATAACACACTTCAAGGTTAACTGCCATTTGCTGTTCATTTAGGCAAGCTATTGTACAGTTATGTAAA
TTTCATGAACAGAGACACTACTTGAGTGTCTTAATTACAAGCAAAAAGTGTCATTTAAATATGAATTAAG
TGAATCGTTGCCAGTATTTCCTGTTCCTGTATTTTGTATCTTCTTAAAATCTAAGGGCTATGCTATGATT
GGTACTCTGTCCTAAGATTCTTTTTCCTATAAAAGAAAAACACTCGTACTATTAAATATTATTTTTCAAA
GTAGGAAAACATGTAAAAATGCCACGATCTCATTGATCATGAATCATAGAAATATCTGAATAAGAAGGTG
ACAATTTTTTCCAGGTGACCACTGACAGTCATCTTCACATTGTTACTGAAATTTAAAAAGAAAAAAAAAA
TTCCATCCTATACTAGGGGGATAATTTGCTACATGTGTCATTTGCTCAAATTTACCCTCAGTGTGCATTC
ATAGTTTAAAAAATACTCTCTTTTAAATCTACCCATTTATAATGGATCCAAACATTCAGTTCTCCAACTC
AAACACTGGGATGAGCACAATTTTACCACTTCTTTAACTTTGTTGAAAGGAACTTTTATTTTAAACTTAG
AACTGTCATGTCTATAGTTGGGCTTAGCAAAAGGGACCCACAATGACCTAATACTTTGTGGATTTCTGTG
AAATAATTCCCCGTGCATCTCGGACGGACACATTTTTTTGTGTGTCGCTGTATTTGTTGGATACTTCTTG
TGTATTAAAACGATCTAGAGACCAATCTGGCATTCATCAGGACTCGGAAGGGAGGATTCACAGTCCTGTT
CCTGAAAACCCCACCTCACTTCTGCGCAGTCGGTTCCCAGCTCCACTGCTGAAGCCTGGGAAGTTATTTC
CAGCCGCCCCCCGGCCCCCTCTCCAAGGTAGCTCCTCCTCGCTCCCCCAGGAGTGAGAAGTTCACTATTC
AACTCAGTGCTTGCATCATCCTTTCCAACATGTTCCCACTGATACATTAAAGGAATTAATTGCTCATAAT
GATCTCACCATCTGTTAACAGAGGTTCATACCTACAGGCTGGTGGCATGCGGTGGTTCTATTCAGGGATC
AATAGGCAATGCCGGGCGGACGCTTTTCCCACATTTGGCCGCCTCTGCCCGCTGAAAGGGGAGGAGAGCT
CCGTGCCGGCCATCTGCTCCCTGTCTTTCTGTGCGAGACCTTGATGCGGTCCAGCACAGCTCTGATCTGA
CTCCAGTCCGATTGGAATGTGGCTGATCTGAGAGCCTCTCAAAGCTGTGAAAGTGAGTCAAAAGGAGATG
CCCATTGTCCGGCAGGGCTGAAAGAAAATGATCACTTTAAAATTCATAACTCCCAGGTGCCTGCCATCTA
TTCATGGGAAAAACCCTCTGAACTTTCTCTTCATTAAAAGAGAGAGAGAGGGGAGTCTTGCTAAAAAGAG
AGAAAAGGAAGTAGTTTGTGGGAATGTGCAGGGGATTTTCAGGCTAAACAAATGAGGAATTTGGAATTTT
TGCACTTGTATCAAAAGGGCTGTGTAGGGAAAGGGGAAGCTCTTATTTTTAAAAAAGGGCAGGGAGTGGG
TTGGGTGGGGGGATTGGGGGGATGATTGGGGGGATTCCTCCCTGGGGCACTGTAGCAACTAATAATTTTT
TTTCTCCAGGAAAATTCAAAAATCACTCTGACCTTTTTACTTTCATTACCTGAAGCTGAGAAAAAAAAAA
TTTTTTTTTTATTCTGGACAGCATTCTTAGTTGATTTCTGTTCTATTCTGGATAACATTCTTAGTCGATT
TCTATTCTTATTCAACTGGGGTGCAGTTAGGGTGGGAGCTTAGGGAAGAATGAGAGTGGGAAATTCCTGA
ATAGGTACAAGACATCAATGGACACGCAAATGAAACTCAAATCTACCAGGTGTTTCTAGGGAGGGATTTA
TTCAGCAGGATTACCTGCGCCCACCAAGATATCTGTCAGGAGCAGAGCTCTCACTCAGCCTCAATCAAAA
GAGGAGCCATGAGGAAATAGTAAAGAAAAAAAGTTTTTAAATCTCTATGTCTAGGAAACTGAACATTCAG
CTAAGTTCCCGAACTGCAGGAACCGCAGGGCTGTTGTATTGACTGCTCTTGGCTATTGGTCATTGTAGAA
AACTGATATTGTATCAGAGACACTTAAGAATTCAGAGAGGATTTGTGGGTTAATGAATTGAGAGCCAAAC
AAATGTATAAACAGAAAGACCTAAGTTCTGTCTCTTGAAAATCTTTTATCTTACAGATACTGACAAGGCC
CCTAGAGGTTAGGTGGCACCAGTGGAAAAGAACTTGCCTGCCAATGAAGAGACATGAGAGATGCAGGTTT
GATCCCTGGGTCAGGAAGATCCCCTGGAGAAGGGCATGGCAACCCATTCCAGTATTCTTGTCTGAAGAAA
TCCATGGACAGAGGAGCATCAGGACTATAGTCCACAGGATCGAAAAGAGTCAGACATGACTGAGCAACTT
AGCACAGAGGTTAAGTGATAGGCTCAAAGTCATAAAATTTGGGCTATAACCTTACCTGTGCAATTTTCTT
TCTACTGCTTCATGCTACCTCTCTCAATGGGGAAAGTTTTAAATTATTTTTGCTCAACTTTCCCTAATTA
AAATATACATTAGTATTACCAAACTAAATAATATAATAGTTTTTTTTTAATAAATTCATGAGGTCATAGG
AAGATGATGAATACCAGTACTAATACATAAAGAATGGAATATTAATCAAGCAAGGCCTTAATATCATATG
GAAAATATAGCAAAGATGGCAGAACCTACTGGGGGAACAGGGTAGTAATGTCCTGTTGGACATCAATGAC
CAAATAGAGGAAGAAGGAGGCAATCGTCCATTCAGAGTGAAAAACCACCTCCTTAACAAGCCAAGAGTCA
AAAGAAGTGAATGGCAGAAAGGAGATGGAAGAACCTAGAAAAAATTCAAATTCTGAAAATGAGAGAAGAT
ATTTATATCAGCAATAATTGCATCTCTGATAAACCTCTTTGGCTATGATTTTGCCACTGTGCAAAGCTAG
GGAAAAGATATTTCTGACAGAATCTGTGTTGCTTTCCTAACAAAGAAGAGGCCACACCTTGACCATATTT
TTTTTTCCTAGAAATTTAGATTCAGTTTATTAAGAATCTTTGTTTTGCTTGACTTTTACTTTGTGGAAAC
GTATCTTTAATTTTTAGGGCAGTGGATGGGAACATAAAATACACTGAATAGAATTTTGACTTTTTTTTTT
TAGTATTTTGACATTGTTTTTATAGTAGGTTGAAGTGAGAAAAAGCCATCAAGGTCAGTTGTAGATTTGA
TCACTAATCCAATTTAATTCTTTTTTTTTTTTTCTTATTAAATTTAATTCTTAAGTTATTAACTTGTCTA
CTCTACAGTTTATGTGGATAAAGTCAGAATTAGTGAGATATGGTTTGGGAAATAAAACTCATGAGATTAC
TCCAGAAAAAGTAAAAAAAAAAAAAAAAAAAAAGGTAAACCAATAGAAGAAATGAATTAAAAAGTTACCC
ACAATCCCCTTACCAGGGAAAACTACTGATAATTTTCAGAGTGTATATTAAACAACAATAAAGTGAGATC
ATATTATTCACATTATCTTACATAGTATTTAAGTTGAAATTATAGACTTTTGACCATATTTGACCTGCTG
CTGCTGCTAAGTCGCTTCAGTCGTGTCCAACTCTGTGCAACCCCATAGACGGTGGCCC
>gi|546669925|gb|AWWX01450616.1| Bubalus bubalis breed Mediterranean WGS:AWWX01:contig450615, whole genome shotgun sequence
AGGTGGGAGGAGGGTTCAGGATGGAGAACACATGTACACCCGTGGCGGATGCATGTTGATATATGGCAAA
ACCAATACAATATTGTAAAGTAAAAATATATATATATATATTAAAAAAAATAAAATGTTAAAATGAAAAA
AAAAAAAAAAATCTCACCCAGAGAGGCACCAGGATTGGAGTCCAGAGAAAAAAGAAGAGAAAAAAAATCA
CTTGGGGACATAGCAAGAAGGTGGCCATCCTCAAACCAAGGAGAGAAGCCAGAAGAAACCAAACTTTCCA
ACACCTTGATCTTGGACTTCTAGCCTCCAGAACTGTGAGAAAATAAATTTCTGTAGAGTCACCCAGTCTG
TGGTATTTTGTTATGGCAGACCTAGCAGACTGATATGCTCCTTAAGGCAAGATGTTTGCTCCTCTGAAAT
CCAGCAGCATTCTGAGCATGTGCCGTTTTAGCACTTATCACAGCTATTAATATTTTACAGTCATCAGTTT
ACTTGTGCATCTCCCTAACTGGTTTGCAAACTTTTGATGGAATGATTCTATATATTCATCTGTTTCATTA
ACCTCTTGCAATTTGTTTGGAAATTTGTAAGAAATAATATAAAGGCCTTTAGAATGAGTTAATTTCTTGT
TCACAATGTGCAGCTTGCTGATAAATAAGCAATGAGGTTTTCCTTGTTGTCTAGCCTCCTGACATTTCTG
CCCAGGGTGCTAAACAGGAAGTATTGATTTTGATATTGGATTTCATTATTAGAGGCTTTGCCTTGAATTT
TTTTATATTGTTGTGTGAGTACTACTTAAAGATCAATTTTCTGTACCTTTACATATTTTCTAAAGTGCAG
AAGATGAAGATGGGTTTTACAACAAAGATAAGATCTTTCAGTAAGTCAGTGAGGTTGTTGTAGGTGGATT
GTTTTTTGTTTTTATCCACATTATAATAGCCTCCTTTTCCCAAGAAATACCCCTTTATGCAGGAGTAGGT
TTAACATAGTCTACATGCCATGTTTGGTTGGACTAAAGTTGTTACTAGATCCACATTTGACTAGTAAGAT
GCTATCCTCTCAATTTTGAATTAAGAGTGAGAGAAGTTAATCTTTGTCCCTTTTGAGTGACTCAAACCAA
GGAGAGTATAAACTCCAGAATTGTGAAACAGTCTTGTTCCTGGAGAAACAGAGACATCTGACTAAAGAGC
AGGATATATCACAAATACAAAAGATTTGCAGAGAAAAAGATTTTGTGTAGCTACAAAGATAGAGTGTTAG
AGAGAGAAAGGCATCAGTAATCCCAATTAGTTTTCATTTCCTCATTTCAGTTCCTTGTAAGGTCCTCTTG
CTTTCCTTGTTCTTGGATTCCATGATTTTCTACTGCTGGTACTGAGTGTAGAAGTCAAATGTTGCCTAGT
ATGTGATTTTGACGGGCATCCTGGGCAGCTCAGTGGTAAAGAATCCACTTGCCAATGCAGGAGACTCAGG
TTCGATCCCTGGGTCAGGAAGATCCCCTGGAGAAGGAAATGGCAATCCATTCCAGTATTATTGCCATGGA
CAGAGAACCCTGGCAGGCTACAGTCCATGGGTTGCAAAAGAGTTGGGCATGACTTAGGACTAAAAGCAGT
GTCATTTGGGGGAATATTTTTTTTTTTTTTTTACAGCCTTACCTGTTCCCTAGAACATTCTGCCAAATCA
GGGCTTTATCACCTGTGGATTTTATACTATTCCTTACCACCATGGCCAATCAGTTATAATGTCTGTCCTC
CAAATGCCATGCCCTAAGTAGTTTTCCTCCCCCGTCTCACTACCACTGTCTTAATGAGAACTCTCCCGTC
TGAACCAGTGTGATAGTCTCTTGATGGATTACCATGACTCTAGACTCTTTCTTCTCCAACTCATTTAACA
CAGCATAATGAGATACGTCTATCTAACTATGGTTTCTGGCTGTGATTGTCATACAAATTATGGATAACAT
ATGTAAGGGAACTAACACTGTGCATGGCATATCATTGAATATTAAATTCTCCACTACTGTGAAGATAGCA
CAACAATATTAACAGCGACTTTTTTTTTTTTAGTACTGACTATGTATTTGGTGCTTCACTTTATCTAATT
TACCTTTTAATAACTCTTTGAAGAACTATTACAAGTGAAGAACCCATCTGGGTTTGATTGCTGGGTTGGG
ACAATCCCCTGGAAGGCATGGCAACCCACTCCAGTATTCTTGCCTGGAGAATCCCATGGATAGAGGAGTC
TGGCAGGCTACAGTCCATGGCGTTGCAAAGAGTCAGACATGACTGAGTGACTAAGCACACAGCACAGTCA
TGTAATTACCTAGTTGCTGAGATTATATTCAAACCCAGGATTCCTAACTTCCAAGTATTTGCTTAACTTC
CATGCTATTATCTATTGCATCTCTGTATAATTCTTGTTTTCTTAGCTTGAAAACAAACTAAACATGGACC
CTTGTAGCTTGGAGAAACTAATAGAAGTGGGACACAAAATCTGATCCAAGAAAAAGCACCCCAACAGCCT
GCTGCAGATCCTGTACTTAATGTCCTTACAGCCTTGCAGTTTTGTGAAAAGCAACACTTCAGAGATTTTG
CTCTCCCTGAGAACTGTGAAACGTGGCCTTTGCTAGAGAATTGATGTTGTTTAGCCACTAAGTTGTGTCT
GACTCTTTTGCCACCCCATGGAATGTAGCCTGCCAGACTACTCTGTCCATGGGATTCTCTAGGCAAGAAT
ACTGGAGTGAGTTGCCATTTCCTTTTCCAGGGGATATTCCTAGCCTACGTCTCCTGCATCTCTTGCATTG
GCAGGCAGATTCTTTACTACTCAGCCACCTGGGAAGCAGAACGGCTGCAGTCAAATACAGACTGTGTCAC
TGCCTACCAAGTATGTGTCTATGCATATTAAATCCAGACAAAAGGATTTCAACAGTTGAAATTGGAGTGC
TGTCCAAAAACTTGGGGGTCACAGAACAATAGATCTTATACAATCTAATCTGATTTGACATAGGTTCAAA
TGTTTTTATATCAAAGTTTACATTATCATGCAGTGGTAAGACTGTATAGATCTGGCTATGTATTGCTATT
ATTCGCTTCATGTTAAATTAAAAAGAAAATATTCACAAAGAAGTAATCCCTTCTTCACAGAAAAAGCTAT
ATTAATCTTTTAAAATATATGATTTATAAAAGTCCATGAAAAACACAATTAATGGCCCTTTGAAAATCTT
ACTGTGTGGTGTGAAATGCACTTTTCCTATCATGGAGAAGGGATTACTGTTGTCCATATTTGCCTTGAAA
CCTCTATCCACAGTCCACTGTTGTTGATACCTCAGAAGGTATAAACTACTAAATATTATTGAATTTAGAT
GGAATGGATTCAGTAATGCAAAAATAAAGATTTCAAGTATACAGCCAACACTGGAAAGGGTCTTAGGTAA
AGGAGATCCTCAAATGACTTTCAGATATGTCATGATTTCTGTGGAGACAGTAGCCAATTATGGGGTGAGA
GAGAACTGAGAAGTAATCTTAGGATTGTACTGGGGTCTGCTTTTCCTAGAACCTTCATGGGTAGAGTTGC
TGAGGACCCACATTTGAATAATCAGACACTGCTATCTGACTGTTCCCATCAGAACCTATCCCCTTTCTCT
ATATATTTTTTCTTTACCTTGGATTTTTAAAAATTTATTTTAATTGGAGGCTAATTACTTTACAATATGG
TGGTGGTTTTTGCCATACATTGACATGATTCAGCCATGGGTGTCCATGTGTCCCCCATCCCAAGCCCCCC
TCCCACCTCTCTCCCCATCCCATCCTTCTGGGTTGTGCCAGTGCACTAGCTTTGAGTGCCCTGTTTCAAG
TGTCGAACTTGGACTGGCCATCTATTTCACATATGGTAACATACATGTTTCAATGCTATTCTCTCAAACT
ATCCCACCCTTGCCTTCTCCCACAGAGTCCAAAAGTCTGTTCTTTATATCTGTGTCTCTCTTGCTCTCTT
GCATATAGGGTCATTATTACCATCTTTCTAAATTCCATATATATGCATTAATATACAATATTGGTGTTTT
TCTTTCAGACTTACTTCACTCTGTATAATAGGCTCCAGTTTCATCCACCTCATTAGAACTGACTCAAACT
GGAGCCTATTATACAGAGTGAAGTAAGTCAGAAAGAAAAACACCAATATAGTATATTAATACATATATAT
GAAATTTGAGACATTACTTTGCCAACAAAGTTTCGTCTAGTCAAGGCTATGGATTGTTCCTGTGGTCATG
TATGGATGTGAGAGTTGGACTGTGAAGAAGGCTGAG

file2 is result.ods

subject id	 s. start	 s. end
gi|546669925|gb|AWWX01450616.1|	282	305
gi|546671471|gb|AWWX01449637.1|	771	790
gi|546669842|gb|AWWX01450698.1|	1523	1542
gi|546669842|gb|AWWX01450698.1|	1641	1660
gi|546671514|gb|AWWX01449617.1|	1926	1948
gi|546669842|gb|AWWX01450698.1|	2484	2503
gi|546669842|gb|AWWX01450698.1|	2720	2739
gi|546669842|gb|AWWX01450698.1|	2725	2744
gi|546669977|gb|AWWX01450566.1|	2822	2842

output:

I want to extract a region, for example 282-305, from the sequence gi|546669925|gb|AWWX01450616.1| in file1, i.e. smalldata.fasta.
The output should be a short 23-character string (305-282=23).

Moreover, I also want to extract the same region extended by 100 characters before 282 and 100 characters after 305,
i.e. the result should be a 100+23+100 = 223-character string.

The result file should be a separate file from the two input files.

I shall be thankful if your script works for these two files, i.e. file1=smalldata.fasta
and file2=result.ods.

Thank you :slight_smile:

Please also post an output file sample.

It should be an Excel file, and should look like this:

column1      column2                         column3
seq id          23_character_seq           223_character_sequence

I mean an output sample, not an output format. And please use code tags.

And make sure that the output you show us includes the (exact) output you want produced for at least the following file2 input lines:

subject id	 s. start	 s. end
gi|546669925|gb|AWWX01450616.1|	282	305
gi|546669842|gb|AWWX01450698.1|	1523	1542
gi|546669842|gb|AWWX01450698.1|	1641	1660
gi|546669842|gb|AWWX01450698.1|	2484	2503
gi|546669842|gb|AWWX01450698.1|	2720	2739
gi|546669842|gb|AWWX01450698.1|	2725	2744
gi|546669977|gb|AWWX01450566.1|	2822	2842

Note that the 1st line in this file2 is related to one entry from file1, the next 5 lines from this file2 are related to another entry from file1, and the last line from file2 is related to an entry that is not found in file1.

Is the output for the 5 lines related to the string gi|546669842|gb|AWWX01450698.1| supposed to generate 5 sets of output, OR is the output for those 5 lines supposed to be combined into 1 set of output duplicating some of the output (due to overlapping ranges), OR is the output for those 5 lines supposed to be combined into 1 set of output containing the non-overlapping regions of the requested ranges 1423 through 1760, 2384 through 2603, and 2620 through 2844 (where the start and stop points have been extended 100 characters in each direction and the five overlapping input regions in file2 have been combined into three non-overlapping output regions)?
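
For illustration only, the third alternative (extend each range by 100 characters, then merge the ranges that overlap) could be sketched roughly as below. The function name merge_ranges is hypothetical, and the sketch assumes the start/end pairs for one sequence id arrive sorted by start position, as they do in the sample file2.

```shell
# Extend each "start end" pair by a flank and merge the overlapping results.
# Assumption: input lines are sorted by start position.
merge_ranges() {  # usage: merge_ranges FLANK  < lines of "start end"
    awk -v f="$1" '
                 { s = $1 - f; e = $2 + f; if (s < 1) s = 1 }
    NR == 1      { cs = s; ce = e; next }          # first range opens a block
    s <= ce      { if (e > ce) ce = e; next }      # overlaps the current block
                 { print cs, ce; cs = s; ce = e }  # gap: emit block, open a new one
    END          { if (NR) print cs, ce }
    '
}

# The five ranges listed for gi|546669842|gb|AWWX01450698.1| in file2
printf '%s\n' '1523 1542' '1641 1660' '2484 2503' '2720 2739' '2725 2744' |
    merge_ranges 100
```

With a flank of 100 this prints the three combined ranges named in the question: 1423 1760, 2384 2603 and 2620 2844.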

And, for the last entry in file2, there is no entry in your sample file1. Is anything supposed to appear in the output for this case? If so, what?

And, just for the record, the number of characters specified by the range 282 through 305 is 24 characters; not 23. (If you don't see why that is true, take the simpler example where the range 282 through 282 is 1 character; not 0.)
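
The inclusive count can be checked with a minimal sketch like the one below. The helper name region and the alphabet string are made up for the demonstration; the only point is that a 1-based range from start to end holds end - start + 1 characters, which is also the length to pass to awk's substr().

```shell
# Inclusive length of a 1-based range, and the matching awk substr() call.
# The sample string is illustrative only.
seq="ABCDEFGHIJKLMNOPQRSTUVWXYZ"

region() {  # usage: region STRING START END
    awk -v s="$1" -v a="$2" -v b="$3" 'BEGIN { print substr(s, a, b - a + 1) }'
}

region "$seq" 3 3          # range 3 through 3 is one character: C
region "$seq" 3 7          # range 3 through 7 is five characters: CDEFG
echo $((305 - 282 + 1))    # range 282 through 305: 24
```

By the same arithmetic, the region with 100 characters added on each side is 100+24+100 = 224 characters long when the sequence is long enough on both sides.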

Sir, that is an Excel file; how can I post it here?
However, it is roughly like this:

sequence id	extracted region small	extracted region big upstream and downstream
gi|546669925|gb|AWWX01450616.1|	CACCTTGATCTTGGACTTCTAGC	"CCAGAGAAAAAAGAAGAGAAAAAAAATCACTTGGGGACATAGCAAGAAGGTGGCCATCCTCAAACCAAGG
AGAGAAGCCAGAAGAAACCAAACTTTCCAACACCTTGATCTTGGACTTCTAGCCTCCAGAACTGTGAGAA
AATAAATTTCTGTAGAGTCACCCAGTCTGTGGTATTTTGTTATGGCAGACCTAGCAGACTGATATGCTCC
TTAAGGCAAGA"

---------- Post updated at 03:30 AM ---------- Previous update was at 03:01 AM ----------

Cragun sir, yes, I want to generate 5 sets of output, one for each gi|546669842|gb|AWWX01450698.1| entry. Although the id occurs multiple times, the positions are different, so there should be 5 lines in the result file corresponding to the gi|546669842|gb|AWWX01450698.1| entries.

And the last entry is there in file1; see entry no. 4.

---------- Post updated at 04:27 AM ---------- Previous update was at 03:30 AM ----------

Is there still any problem, sir?

---------- Post updated 08-07-15 at 12:23 AM ---------- Previous update was 08-06-15 at 04:27 AM ----------

Hello, I am waiting for your answer, sir. :slight_smile:

Try this, based on the good work of Scrutinizer in this post:

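awk '
BEGIN           {print "sequence id\textracted region small\textracted region big upstream and downstream"
                }
# First file (file2): skip the header line, remember every start/end pair per sequence id
NR==FNR &&
FNR>1           {CNT[$1]++
                 S[$1,CNT[$1]]=$2
                 E[$1,CNT[$1]]=$3
                 next
                }
# Second file (file1): one record per FASTA entry; the id is the first word of the header line
                {split ($1, T, " ")
                }
T[1] in CNT     {i=T[1]
                 $1=x                   # drop the header; OFS= joins the sequence lines into one string
                 for (j=1; j<=CNT[T[1]]; j++)
                        print RS i "\t" substr ($0,S[i,j],E[i,j]-S[i,j]+1) "\t" substr ($0, S[i,j]-100, E[i,j]-S[i,j]+201)
                }
' file2 RS=\> FS='\n' OFS= file1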

But I want a new result file in the same folder as my other files, file1 and file2. In the terminal it shows nothing, and it does not create any new file either.

Kindly help me out; this is a major step of my research and I have been stuck here for 3 weeks :frowning:

I guess you are stuck because your specifications are a) unclear, and b) moving.

This is what I get from your samples above (lines chopped at 178 chars):

sequence id     extracted region small  extracted region big upstream and downstream
>gi|546671471|gb|AWWX01449637.1|        CACAAGACCACCAGGGAAGT    TGGGGTTGTACTGGGTCTTGGTTACAGGATCTTTAGTTGCAGCATGTGGGATCTAGATCCCTGTCCAGGGCCCTGAGTATGGGGAGCTCAGAGTCTTAGCCACAAGACCACCAG
>gi|546671514|gb|AWWX01449617.1|        ACACATACACATGCACACACAAC CCCCCGTAGTGGGGGTAGGTTGCTCTGTCAAGACCAAGGGCCAATTATTTTCTTACCATGAAAACCAAGAAGAAGGTGACTACAGGTGATTCAACCTCTAACACATACACATGC
>gi|546669842|gb|AWWX01450698.1|        GCCCAGCCCAGCCCAGCCCA    GCCGAGTTCAGCTCAGCTCAGCCCAGCAAAATTCAGCCCAGCTCAGCCCAGCAAAGCTCAGCCCAGCTCAGCCCAGCTCACCCAAGCTCAGCTCAGCTCAGCCCAGCCCAGCCC
>gi|546669842|gb|AWWX01450698.1|        GCCCAGCCCAGCCCAGCCCA    CAGCTCACCCACTCTGCCCAGCTCAGCCCAGCAAAGCTCAGCCAAGCTCAGCTCAGCTCAACAAAGCCCAGCTCAGCTCAGCCCAGGTCAACCCAACTAAGCCCAGCCCAGCCC
>gi|546669842|gb|AWWX01450698.1|        GCCCAGCCCAGCCCAGCCCA    CCCAGCTCAGCTCAGCCTAACCCAGCTCAGCCCAGCTCACCCACTCCGCCCAGCTCCGCCCAGCTCAGCCCAGCTCACCCACTCCGCCCAGCTCCGCTCAGCCCAGCCCAGCCC
>gi|546669842|gb|AWWX01450698.1|        GCCCAGCCCAGCCCAGCCCA    GCTCAGCCCAGCAAAGCTCAGCCAAGCTCAGCCCAGCTCAACAAAGCCCAGCTAAGCTCAGCCCAGGTAAACCCAACTAAGCCCAGCTCAGCTCAGCTCAGCCCAGCCCAGCCC
>gi|546669842|gb|AWWX01450698.1|        GCCCAGCCCAGCCCAGCCCA    GCCCAGCAAAGCTCAGCCAAGCTCAGCCCAGCTCAACAAAGCCCAGCTAAGCTCAGCCCAGGTAAACCCAACTAAGCCCAGCTCAGCTCAGCTCAGCCCAGCCCAGCCCAGCCC
>gi|546669977|gb|AWWX01450566.1|        AAAAAGTTTGTTTGGGTTTTT   TAAAGAATATGTATTACAAGGTTACTCCTAACTGTGAGAATCATTAAGCCTTTTTTTTCTATGAGATAATGTGGATGGTCGCCTATGTATGGGGTTGGCCAAAAAGTTTGTTTG
>gi|546669925|gb|AWWX01450616.1|        CACCTTGATCTTGGACTTCTAGCC        CCAGAGAAAAAAGAAGAGAAAAAAAATCACTTGGGGACATAGCAAGAAGGTGGCCATCCTCAAACCAAGGAGAGAAGCCAGAAGAAACCAAACTTTCCAACACCTT

Use full paths for your input files, and redirect the output to taste.

Yes, Rudic sir, I want exactly this format, but why is it chopping off the characters which I have marked with *?

TGGGGTTGTACTGGGTCTTGGTTACAGGATCTTTAGTTGCAGCATGTGGGATCTAGATCCCTGTCCAGGGCCCTGAGTATGGGGAGCTCAGAGTCTTAGCCACAAGACCACCAG*************************************************************************************

Actually I want the following characters too, up to 230, but it is showing just over 100 characters.

Can you please provide me the script which you are using? I shall be thankful to you.

Post#9 is the result of the script in #7, lines chopped at 178 chars to keep the post reasonably small. The real result has the desired -100 to +100 chars.

Yes, it seems to be working. Thank you very, very much, sir, indeed!

To get the output format requested in post #6 (no > at the start of the output lines, line breaks at 70 characters in the last field, and double quotes around the last field), you could also try:

#!/bin/ksh
results_file="${1:-result.ods}"
fasta_file="${2:-smalldata.fasta}"
awk '
BEGIN {	printf("%s\t%s\t%s\n", "sequence id", "extracted region small",
		"extracted region big upstream and downstream")
}
FNR == NR {
	# Process 1st input file.
	if(NR > 1) {
		# Skip header.
		beg[$1, ++si[$1]] = $2
		len[$1, si[$1]] = $3 - $2 + 1
		ubeg[$1, si[$1]] = ($2 > 100) ? $2 - 100 : 1
		ulen[$1, si[$1]] = $3 - ubeg[$1, si[$1]] + 101
	}
	next
}
/^>/ {	if(f) {	# Process previous entry and clear accumulated data.
		psi()
		b = 0
		d = ""
	}
	# Grab sequence ID from this line in the 2nd file.
	if(!((f = substr($1, 2)) in si))
		# This sequence ID is not in our list of those to be processed.
		f = ""
	next
}
f {	# We have a line in the 2nd file associated with a sequence ID to be
	# processed, gather data:
	d = d $0
}
function psi(	i, j) {
	# Produce output for each requested region of the sequence ID specified
	# by f.
	for(i = 1; i <= si[f]; i++) {
		printf("%s\t%s\t\"", f, substr(d, beg[f,i], len[f,i]))
		spot = ubeg[f, i]
		left = ulen[f, i]
		for(left = ulen[f, i]; left > 0; left -= 70) {
			printf("%s%s\n",
				substr(d, spot, (left > 70) ? 70 : left),
				(left > 70) ? "" : "\"")
			spot += 70
		}
	}
}
END {	if(f)	# Process last entry.
		psi()
}' "$results_file" "$fasta_file"

which, with the data provided in post #1 in this thread produces the output:

sequence id	extracted region small	extracted region big upstream and downstream
gi|546671471|gb|AWWX01449637.1|	CACAAGACCACCAGGGAAGT	"TGGGGTTGTACTGGGTCTTGGTTACAGGATCTTTAGTTGCAGCATGTGGGATCTAGATCCCTGTCCAGGG
CCCTGAGTATGGGGAGCTCAGAGTCTTAGCCACAAGACCACCAGGGAAGTTTCCAGTTACACGATCATTT
TAGTTAGATAAATATTTTGTGTTTACATTATTACTGTATCAGTGATATTCACACTGAATTATACAATGTG
ATTTTTACAC"
gi|546671514|gb|AWWX01449617.1|	ACACATACACATGCACACACAAC	"CCCCCGTAGTGGGGGTAGGTTGCTCTGTCAAGACCAAGGGCCAATTATTTTCTTACCATGAAAACCAAGA
AGAAGGTGACTACAGGTGATTCAACCTCTAACACATACACATGCACACACAACGTGGACACTCAGAGAGT
TGAGTTAAAGCATAACTATTTTACCTCCAAATTACTGCTAATGCTGAAAAGTACAGGTATTTATCTAATG
TGTTTCAGGGTCA"
gi|546669842|gb|AWWX01450698.1|	GCCCAGCCCAGCCCAGCCCA	"GCCGAGTTCAGCTCAGCTCAGCCCAGCAAAATTCAGCCCAGCTCAGCCCAGCAAAGCTCAGCCCAGCTCA
GCCCAGCTCACCCAAGCTCAGCTCAGCTCAGCCCAGCCCAGCCCAGCCCAGCTCACCCACTCTGCCCAGC
TCAGCCCAGCAAAGCTCAGCCAAGCTCAGCTCAGCTCAACAAAGCCCAGCTCAGCTCAGCCCAGGTCAAC
CCAACTAAGC"
gi|546669842|gb|AWWX01450698.1|	GCCCAGCCCAGCCCAGCCCA	"CAGCTCACCCACTCTGCCCAGCTCAGCCCAGCAAAGCTCAGCCAAGCTCAGCTCAGCTCAACAAAGCCCA
GCTCAGCTCAGCCCAGGTCAACCCAACTAAGCCCAGCCCAGCCCAGCCCAGCTCACTCATGCCACCCTGC
TCAGGCCAGCTCAACCCAGCTCAGGCCAGCTCAGCCCAGCTCAACCCAGCCCAGCCCAGCTCACCCACTC
TGCCCAGCTC"
gi|546669842|gb|AWWX01450698.1|	GCCCAGCCCAGCCCAGCCCA	"CCCAGCTCAGCTCAGCCTAACCCAGCTCAGCCCAGCTCACCCACTCCGCCCAGCTCCGCCCAGCTCAGCC
CAGCTCACCCACTCCGCCCAGCTCCGCTCAGCCCAGCCCAGCCCAGCCCAGCTCCGCTTAGCCCAGCCCA
GCCCAACCCAGCTCACCCACTCTGCCCAGCTCAGGGCAGCTCAACCCAGCTCAGGCCAGCTCAACCCAGC
CCAGCCCAGC"
gi|546669842|gb|AWWX01450698.1|	GCCCAGCCCAGCCCAGCCCA	"GCTCAGCCCAGCAAAGCTCAGCCAAGCTCAGCCCAGCTCAACAAAGCCCAGCTAAGCTCAGCCCAGGTAA
ACCCAACTAAGCCCAGCTCAGCTCAGCTCAGCCCAGCCCAGCCCAGCCCAGCCCAGCTCACTCATGCCAC
CCTGCTCAGGCCAGCTCAACCCTGCTCAGGCCAGCTCAACCCAGCTCAGGCCAGCTCAGCCCAGCTCAAC
CCAGCCCAGC"
gi|546669842|gb|AWWX01450698.1|	GCCCAGCCCAGCCCAGCCCA	"GCCCAGCAAAGCTCAGCCAAGCTCAGCCCAGCTCAACAAAGCCCAGCTAAGCTCAGCCCAGGTAAACCCA
ACTAAGCCCAGCTCAGCTCAGCTCAGCCCAGCCCAGCCCAGCCCAGCCCAGCTCACTCATGCCACCCTGC
TCAGGCCAGCTCAACCCTGCTCAGGCCAGCTCAACCCAGCTCAGGCCAGCTCAGCCCAGCTCAACCCAGC
CCAGCTCACC"
gi|546669977|gb|AWWX01450566.1|	AAAAAGTTTGTTTGGGTTTTT	"TAAAGAATATGTATTACAAGGTTACTCCTAACTGTGAGAATCATTAAGCCTTTTTTTTCTATGAGATAAT
GTGGATGGTCGCCTATGTATGGGGTTGGCCAAAAAGTTTGTTTGGGTTTTTCCACATGCTGGTATAGAAA
ACTTGAATACACTTTTTGGCCAACCCAGTAAGGGCTTTGCCTCATCTCTGTCTAGCCAAATTGCCACCTT
CCCTGCTAAGC"
gi|546669925|gb|AWWX01450616.1|	CACCTTGATCTTGGACTTCTAGCC	"CCAGAGAAAAAAGAAGAGAAAAAAAATCACTTGGGGACATAGCAAGAAGGTGGCCATCCTCAAACCAAGG
AGAGAAGCCAGAAGAAACCAAACTTTCCAACACCTTGATCTTGGACTTCTAGCCTCCAGAACTGTGAGAA
AATAAATTTCTGTAGAGTCACCCAGTCTGTGGTATTTTGTTATGGCAGACCTAGCAGACTGATATGCTCC
TTAAGGCAAGATGT"

I also have a version that will work with versions of awk that can't handle "long" strings, but it splits the last field on boundaries based on the uncombined input lines instead of the hard coded 70 character maximum segments used by the code above.

Note also that the above code will work even if the requested region starts before the 101st character. In that case it will truncate the leading context to start at character position 1. All of the code provided so far obviously truncates the trailing context if less than 100 characters of trailing context are present in the input.
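To make the clamping concrete, here is a minimal sketch of the same window arithmetic on its own. The three start/end pairs are made-up test values (only 771-790 comes from the sample data), and the variable names just mirror the ubeg/ulen lines in the script above.

```shell
# Hypothetical demo: compute the context window for a few (start, end) pairs.
windows=$(printf '%s\n' '771 790' '50 69' '1 20' |
awk '{
	s = $1 + 0; e = $2 + 0		# force numeric values
	ubeg = (s > 100) ? s - 100 : 1	# clamp the window start at position 1
	ulen = e - ubeg + 101		# window runs through position e + 100
	printf("%d-%d -> start %d, length %d\n", s, e, ubeg, ulen)
}')
printf '%s\n' "$windows"
```

A region that starts at or before position 100 gets a window shorter than the full 100 + region + 100 characters, because the leading context is cut at position 1.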

I don't understand how this output is useful when there are multiple outputs for a single sequence ID and nothing in the output identifies what range from the input sequence is included in the output; but this is what you said you wanted...

Cragun sir, how do I run this command from the terminal? I mean, do I need to save it with a .awk extension, or paste these lines directly into the terminal? I did the latter, but it goes into an infinite loop.

---------- Post updated at 12:31 AM ---------- Previous update was at 12:29 AM ----------

Your output format and output are excellent, just like Rudic sir's in posts 7 and 9, but I also want to run this one successfully.

You can save Don's suggestion to a file with a name of your liking, for example: /some_dir/fasta_extract

Then do the following to make it executable:

chmod +x /some_dir/fasta_extract

And then you should be able to run it like this:

/some_dir/fasta_extract /some_other_dir/result.ods /some_other_dir/smalldata.fasta 

If all files are in the same directory, and you are also in that same directory, then you can use:

./fasta_extract result.ods smalldata.fasta

And if the input files actually have these names, then you can run it as:

./fasta_extract

Since these are the default names that are used in the script.

With all these commands you can use redirection to put the data in a new file:

command > newfile

In addition to what Scrutinizer already said, note that the script and both of the input data files must be in UNIX text file format (with a single <newline> character as the line terminator); not Windows format (with <carriage-return> <newline> character pairs as the line terminator); and not text produced by some text formatting tool like Microsoft Word.
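If you are not sure which format a file is in, something along these lines can check for carriage returns and write a cleaned copy. This is only a sketch: the file names and the two-line sample are made up for the demo, so substitute your own script and data files.

```shell
# Scratch directory and sample file are hypothetical, for demonstration only.
crlf_tmp=$(mktemp -d)
printf 'subject id\ts. start\ts. end\r\nid1\t5\t9\r\n' > "$crlf_tmp/result_crlf.txt"

# Count lines that end in a carriage return (0 means UNIX format).
crlf_before=$(awk '/\r$/ { n++ } END { print n + 0 }' "$crlf_tmp/result_crlf.txt")

# Write a copy with every carriage return removed.
tr -d '\r' < "$crlf_tmp/result_crlf.txt" > "$crlf_tmp/result_unix.txt"
crlf_after=$(awk '/\r$/ { n++ } END { print n + 0 }' "$crlf_tmp/result_unix.txt")

echo "before: $crlf_before, after: $crlf_after"
```

Run the same check on the script file itself if it was edited or copied on Windows.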

I am trying to run it.

Hello sir, I am getting good results with this script, but what if I want to extract another column from file2, followed by the seq_id column?

---------- Post updated at 05:47 AM ---------- Previous update was at 05:29 AM ----------

I mean, how can I modify this script:

awk '
BEGIN           {print "query id\tsequence id\textracted region small\textracted region big upstream and downstream"
                }
NR==FNR &&
FNR>1           {CNT[$1]++
                 S[$1,CNT[$1]]=$2
                 E[$1,CNT[$1]]=$3
                 next
                }
                {split ($1, T, " ")
                }
T[1] in CNT     {i=T[1]
                 $1=x
                 for (j=1; j<=CNT[T[1]]; j++)
                        print RS i "\t" substr ($0,S[i,j],E[i,j]-S[i,j]+1) "\t" substr ($0, S[i,j]-100, E[i,j]-S[i,j]+201)
                }
' result.txt RS=\> FS='\n' OFS= 1.fasta >output_1

to extract one more column of data, namely column no. 4, from file2, i.e. result.xls?

Please use code tags as required by forum rules!

The better the spec, the better the solution, as you certainly learned. With what you show us (i.e. no input nor output sample), I'd propose to save the new column in an array (as you do with the other fields) when reading result.xls (or .txt, unclear to me), and then print it in the for loop together with the other relevant fields.

query_id  subject id                       s. start  s. end
3453      gi|546669925|gb|AWWX01450616.1|  282       305
5676      gi|546671471|gb|AWWX01449637.1|  771       790
8765      gi|546669842|gb|AWWX01450698.1|  1523      1542
6578      gi|546669842|gb|AWWX01450698.1|  1644      1660
9087      gi|546671514|gb|AWWX01449617.1|  1926      1948

For example, I want to extract the query id along with the subject id from this xls file.
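Following the suggestion above (save the new column in an array while reading the table, then print it in the for loop), a minimal sketch could look like this. It assumes the four-column layout just shown, with the query id in column 1, so the subject id, start and end move to columns 2, 3 and 4. The Q array, the scratch directory and the tiny seqA/seqB sample files are made up for illustration. The clamp on the window start is borrowed from the ksh script earlier in the thread, so that regions near the beginning of a sequence behave the same in every awk.

```shell
# Demo in a scratch directory; the file contents below are made-up sample data.
tmp=$(mktemp -d)

printf '%s\n' \
	'query_id subject_id s.start s.end' \
	'11 seqA 6 10' \
	'22 seqA 11 15' \
	'33 seqB 1 4' > "$tmp/result.txt"

printf '%s\n' \
	'>seqA demo sequence' \
	'AAAAACCCCC' \
	'GGGGGTTTTT' \
	'>seqB another demo sequence' \
	'ACGTACGTAC' > "$tmp/1.fasta"

awk '
BEGIN   {print "query id\tsequence id\textracted region small\textracted region big upstream and downstream"
        }
NR==FNR {if (FNR>1) {                   # skip the header line
                CNT[$2]++
                Q[$2,CNT[$2]]=$1        # new: remember the query id
                S[$2,CNT[$2]]=$3+0      # start, forced numeric
                E[$2,CNT[$2]]=$4+0      # end, forced numeric
         }
         next
        }
        {split ($1, T, " ")
        }
T[1] in CNT {
         i=T[1]
         $1=x
         for (j=1; j<=CNT[i]; j++) {
                b=(S[i,j]>100) ? S[i,j]-100 : 1   # clamp the window start at 1
                print Q[i,j] "\t" RS i "\t" substr ($0,S[i,j],E[i,j]-S[i,j]+1) "\t" substr ($0,b,E[i,j]-b+101)
         }
        }
' "$tmp/result.txt" RS=\> FS='\n' OFS= "$tmp/1.fasta" > "$tmp/output_1"

cat "$tmp/output_1"
```

For the real data, replace the two sample files with your result.txt and 1.fasta. The table is split on any run of blanks or tabs, so this only works while the query id and subject id themselves contain no spaces.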