Cat files situation

Hello,
I am a PhD student (Biomedical Sciences) and very new to Linux. I need some help with the following task:

I have files whose names follow this format:

An_A1_nnn_R1.txt;  An_A1_nnm_R1.txt;  An_A1_nnoo_R1.txt
An_A2_nnn_R1.txt;  An_A2_nnm_R1.txt;  An_A2_nno_R1.txt

..................
An_A900_nnn_R1.txt;  An_A900_nnm_R1.txt;  An_A900_nno_R1.txt

Between An_A(1..900) and R1 the string can be anything.

I need to do this:

cat An_A1_nnn_R1.txt  An_A1_nnmm_R1.txt   An_A1_nnoo_R1.txt > An_A1.R1.txt


Obviously I could run the cat command 900 times, but I am sure there should be an easy way using a for loop or something like that.

any ideas?

Thanks in advance

Julio

If you have them all in one directory, you can easily do something like this:

cat *.txt >> your_big_file.text

Thanks Neo, maybe I did not explain it correctly.
What you suggest will cat all the files in the directory, but I need to cat them according to the number after "An_A". Since the file names go from An_A1* to An_A900*, I will end up with 900 cat results.

---------- Post updated at 05:18 PM ---------- Previous update was at 04:56 PM ----------

I thought of doing something like this:

-----------------------------------

for i in {1..900}
       do 
                 cat *A$i*R1.txt > A$i.R1.txt
      done

--------------------------------------------

but it does not work...

You should append instead:

cat *A${i}*R1.txt >> A${i}.R1.txt

*A$i*R1.txt would also match A$i.R1.txt, so A$i.R1.txt would be both read from and written to.

So instead try:

for i in {1..900}
do
  # ${i}_ needs braces: $i_ would be read as a variable named i_
  cat *_A${i}_*_R1.txt >> A${i}.R1.txt
done

or

for i in {1..900}
do
  cat *_A${i}_*_R1.txt > A${i}.R1.txt
done

if you want to empty the target files first..
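
One detail worth checking in patterns like these: when an underscore directly follows a variable, the shell treats it as part of the variable name, so the number has to be written as ${i}_ rather than $i_. A minimal sketch of the difference (the value 7 and the variable names plain and braced are only for illustration):

```shell
#!/bin/bash
# Demo of how the shell reads a variable name followed by an underscore.
i=7
unset i_                      # make sure no variable named i_ exists
plain="_A$i_"                 # parsed as ${i_}, which is unset, so this is "_A"
braced="_A${i}_"              # braces end the name at i, so this is "_A7_"
echo "plain=$plain braced=$braced"
```

With the unbraced form the number silently disappears from the pattern, so every loop iteration would match the files of all samples.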


Actually, using "trial and error" I just found that all three of these cat commands do the job:

----------------------

#!/bin/bash

for i in {1..900}
 do 

    #cat *A"$i"*R1* > A"$i".R1.txt
    #cat *A"$i"*R1* >> A"$i".R1.txt
    cat *A${i}*R1.txt >> A${i}.R1.txt

done

---------------------

thanks again Yoda and Neo
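
A possible catch with the loose pattern *A${i}*R1.txt: the * after the number can also match more digits, so for i=1 it would likely pick up A10, A100 and so on as well. The sketch below tries this in a scratch directory with made-up file names; the arrays loose and strict are just illustrative names.

```shell
#!/bin/bash
# Hypothetical demo in a scratch directory; the file names are made up.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch An_A1_nnn_R1.txt An_A10_nnn_R1.txt An_A100_nnm_R1.txt

i=1
loose=( *A${i}*R1.txt )       # * may match further digits, so A10 and A100 slip in
strict=( *_A${i}_*_R1.txt )   # underscores pin the number on both sides

echo "loose  matches ${#loose[@]} file(s): ${loose[*]}"
echo "strict matches ${#strict[@]} file(s): ${strict[*]}"
```

If the real file names always have an underscore on both sides of the number, as in the examples above, the stricter pattern should keep each sample separate.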

Hi Julio, I still see this as problematic. If you run the command again, the target files will already exist and match the pattern, so they will be concatenated onto themselves. I did a short test and ended up with a very big file...

Thanks,
I guess if I change the target name to something like A"$i".R1.fastq.txt, it should work, since the string fastq is not present in any of the files to be merged with cat. What do you think?

Yes, that should work too, as long as the output file names do not match the input file patterns.
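
Another way to keep outputs and inputs apart is to write the merged files into a separate directory. The following is only a sketch under a few assumptions: the function name merge_samples and the default directory name merged are hypothetical, the inputs are in the current directory, and the number is always surrounded by underscores.

```shell
#!/bin/bash
# Sketch only. merge_samples and the "merged" default are hypothetical names.
# Usage: merge_samples [max_number] [output_dir], run from the input directory.
merge_samples() {
    local max=${1:-900} outdir=${2:-merged} i inputs
    mkdir -p "$outdir" || return 1
    for (( i = 1; i <= max; i++ )); do
        # collect the inputs for this number; an unmatched glob stays literal,
        # so check that the first entry really exists before using the list
        inputs=( *_A"${i}"_*_R1.txt )
        [ -e "${inputs[0]}" ] || continue
        # ">" truncates, so re-running rebuilds each output from scratch
        cat "${inputs[@]}" > "$outdir/A${i}.R1.txt"
    done
}
```

Calling merge_samples 900 merged from the directory with the input files would then write merged/A1.R1.txt, merged/A2.R1.txt and so on. Because the pattern only looks at the current directory, the files inside merged are never read back in, and running it twice should give the same result.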
