If I have a long list of data inside a file, how can I split it into separate files?
I need three records in each file.
For example, my data source has 300 sequences.
I need to put 3 sequences in each file, so the desired output is 100 files that contain 3 sequences each.
Does anybody have an idea how to solve this?
Thanks a lot for your guidance.
---------- Post updated at 02:31 AM ---------- Previous update was at 01:57 AM ----------
shell equivalent:
c=2
while IFS= read -r line; do
  case $line in
    \>*) c=$((c+1));; # if a label is found, increase the counter
  esac
  if [ "$c" -eq 3 ]; then # if 3 labels have been found then
    exec >"${line#>}" # redirect output to file "label"
    c=0 # reset counter
  fi
  printf '%s\n' "$line" # print the line to the current output file
done <infile
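As a quick check of the loop above, here is a minimal sketch that runs the same logic on a small made-up input. The sample labels (seq1 to seq6), the function name split_by_three, and the temporary directory are assumptions for illustration only. The loop runs in a subshell so that the exec redirection does not affect the calling shell.

```shell
#!/bin/sh
# Hypothetical demo: all file names and labels below are made up.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# Build a 6-record sample: labels seq1..seq6, one data line each.
i=1
while [ "$i" -le 6 ]; do
    printf '>seq%s\nACGT%s\n' "$i" "$i"
    i=$((i + 1))
done > infile

split_by_three() {
    # Subshell: the exec redirection stays local to this function.
    (
        c=2
        while IFS= read -r line; do
            case $line in
                '>'*) c=$((c + 1)) ;;   # a label line starts a new record
            esac
            if [ "$c" -eq 3 ]; then     # every third label opens a new file
                exec > "${line#>}"      # named after the label, minus ">"
                c=0
            fi
            printf '%s\n' "$line"
        done < "$1"
    )
}

split_by_three infile
ls
```

With this sample the run should leave two output files, seq1 and seq4, each holding three records.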
Thanks a lot for your suggestion.
But it seems to put only two sequences in each file and to use the third sequence's header as the file name?
May I know what is going wrong?
Thanks again for your kind help and advice.
Yup.
You are right.
Some of my other sequence headers contain spaces, and some contain ":".
So I used sed to replace all of the spaces and ":" with "_".
Like:
sed 's/ /_/g' file_data > file_data.out
sed 's/:/_/g' file_data.out > file_data.final.txt
With the command you suggested, all of my file names end up with a "?" at the end.
Besides that, when I look at the contents of each file produced, it contains only two sequences instead of three.
Thanks a lot for your advice.
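A trailing "?" in a file name listing is often how ls displays a non-printable character. One possible cause, and this is only an assumption because the real input was not posted, is DOS-style line endings (CR LF), where the carriage return becomes part of the label and therefore of the file name. The sketch below uses a made-up sample to count lines containing a carriage return, strip them with tr, and apply both substitutions in a single sed call.

```shell
#!/bin/sh
# Assumption: the "?" shown by ls is a carriage return from CRLF line endings.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# Hypothetical sample with spaces, ":" and CRLF line endings.
printf '>seq 1:a\r\nACGT\r\n>seq 2:b\r\nTTGA\r\n' > file_data

# Count lines that contain a carriage return (0 means the file is clean).
cr=$(printf '\r')
grep -c "$cr" file_data

# Strip carriage returns, then replace spaces and ":" on label lines only.
tr -d '\r' < file_data | sed '/^>/ s/[ :]/_/g' > file_data.final.txt
cat file_data.final.txt
```

Unlike the two sed commands quoted above, this version restricts the substitution to lines starting with ">", so the sequence data itself is left untouched.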
Hi Scrutinizer, do you have any idea how I can get my desired output?
I tried replacing the spaces in the headers with "_" and then ran your suggested code.
Unfortunately, it still doesn't work.
Thanks a lot for your advice.
The problem is that I put random spaces and ":" characters inside the labels of the input examples you gave, and both scripts still work as expected. I have to assume your real-world data somehow does not match the input format you provided. Please take a small part (say 7 records) of an actual, anonymized file and run my scripts on it to see whether they also produce the strange results. Then post that example input file here, and also list the strange resulting file names and their contents, so I can have a look.
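One way to cut such a 7-record sample might look like the sketch below. It assumes that every record starts with a line beginning with ">"; the names infile and sample.txt and the generated stand-in data are hypothetical. The final od call dumps the first label byte by byte, which makes hidden characters such as \r visible.

```shell
#!/bin/sh
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# Hypothetical stand-in for the real data: 10 records, labels rec1..rec10.
i=1
while [ "$i" -le 10 ]; do
    printf '>rec%s\nACGT\nTTGA\n' "$i"
    i=$((i + 1))
done > infile

# Copy lines until the 8th label is seen, i.e. keep the first 7 records.
awk '/^>/ { n++ } n > 7 { exit } { print }' infile > sample.txt

# Number of records in the sample.
grep -c '^>' sample.txt

# Dump the first label byte by byte; odd bytes become visible.
head -n 1 sample.txt | od -c
```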