input2.bed (this is a binary file, so it cannot be displayed in the terminal).
However, there is a program in our field that takes input2.bed as its input and is invoked like this:
program input_file -chrom -start -end output_file
Now, my task is this:
Read each record of input1.bed.
Feed it to the program as shown below, so that the program runs once for each record in input1.bed and writes an output file named after that record:
program input2.bed -chrom=chr1 -start=100 -end=200 chr1_100_200_op.bed
program input2.bed -chrom=chr1 -start=120 -end=300 chr1_120_300_op.bed
program input2.bed -chrom=chr1 -start=145 -end=226 chr1_145_226_op.bed
program input2.bed -chrom=chr2 -start=567 -end=600 chr2_567_600_op.bed
Now, ignore the first three columns of the above output file and take the maximum value in the fourth column, which is 111.11. Then replace the entire contents of chr1_100_200_op.bed with a single line holding the file name (without the _op.bed suffix) and that maximum, like this:
cat chr1_100_200_op.bed
chr1_100_200 111.11
This is it. Please ask me as many questions as you have for a better solution. Thanks a ton for all your time.
while read -r CHROM START END NAME
do
    # Create the bed file
    OUT="${CHROM}_${START}_${END}_op.bed"
    program input2.bed -chrom="$CHROM" -start="$START" -end="$END" "$OUT"
    # Replace column 1 with the filename (minus the _op.bed suffix),
    # column 2 with the last column,
    # reduce it to 2 columns,
    # and print all lines.
    awk '{ $1=F ; $2=$NF; NF=2 } 1' F="${CHROM}_${START}_${END}" "$OUT" > /tmp/$$
    cat /tmp/$$ > "$OUT"
done < input1.bed
# Remove temporary file
rm -f /tmp/$$
For 3 and 4, you start with 3 lines and end with 1 line. Is this intended? I've assumed it's not, that you want 3 lines out for 3 lines in.
For 3 and 4, the output file usually has thousands of records. But I only want the maximum value of the fourth column, printed next to the file name.
So the original records are dropped and only one record remains, as in the example.
while read -r CHROM START END NAME
do
    # Create the bed file
    OUT="${CHROM}_${START}_${END}_op.bed"
    program input2.bed -chrom="$CHROM" -start="$START" -end="$END" "$OUT"
    # Track the maximum of the last column (M) across all lines,
    # then print a single line: the filename (minus _op.bed) and M.
    # NR==1 seeds M from the first line, so a zero or negative maximum is handled;
    # the +0 forces a numeric comparison while M keeps the original text.
    awk 'NR==1 || $NF+0 > M+0 { M=$NF } END { print F, M }' F="${CHROM}_${START}_${END}" "$OUT" > /tmp/$$
    cat /tmp/$$ > "$OUT"
done < input1.bed
# Remove temporary file
rm -f /tmp/$$
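If you want to check the max-of-last-column step on its own, without running program, something like the following may help. It is only a sketch: the function name max_last_col and the sample records are made up for illustration, and it assumes whitespace-separated fields with a numeric last column. It seeds the maximum from the first line instead of testing whether M is empty, so files whose maximum is zero or negative should also come out right.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script):
# print "<label> <max of last column>" for the given file.
max_last_col() {
    label=$1
    file=$2
    awk -v F="$label" '
        # Seed M from line 1, then keep the numerically larger value.
        # M stores the original field text, so no digits are lost on output.
        NR == 1 || $NF + 0 > M + 0 { M = $NF }
        # Print nothing at all for an empty file.
        END { if (NR) print F, M }
    ' "$file"
}

# Made-up records standing in for the output of "program".
sample=$(mktemp)
printf '%s\n' \
    'chr1 100 120 35.5' \
    'chr1 120 160 111.11' \
    'chr1 160 200 -4' > "$sample"

max_last_col chr1_100_200 "$sample"
rm -f "$sample"
```

Inside the loop, the awk line could then be swapped for a call such as max_last_col "${CHROM}_${START}_${END}" "${CHROM}_${START}_${END}_op.bed" > /tmp/$$, with the rest left unchanged.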