Remove original file from directory after bash executes

cmccabe · August 23, 2016, 5:45pm

The bash script below works, except that I cannot seem to delete the original file $f from the directory. Thank you.

For example, after the bash executes there are 8 files in the directory:

123.txt (original file)
123_remove.txt
123_index.txt
123_final.txt
456.txt (original file)
456_remove.txt
456_index.txt
456_final.txt

Bash

for f in /home/cmccabe/Desktop/microarray/*.txt; do
     bname=$(basename "$f")
     pref=${bname%%.txt}
     sed '/^#/d' "$f" > /home/cmccabe/Desktop/microarray/"${pref}"_remove.txt # delete all lines starting with #
     awk -F'\t' -v OFS='\t' '{$0=((NR==1) ? "R_Index" : (NR - 1)) OFS $0} 1' /home/cmccabe/Desktop/microarray/"${pref}"_remove.txt > /home/cmccabe/Desktop/microarray/"${pref}"_index.txt # add R_Index
     awk -F'\t' 'NR==1{Q=NF;print} NR>1{for(i=1;i<=Q;i++){if($i==""){$i="."}};print}' OFS="\t" /home/cmccabe/Desktop/microarray/"${pref}"_index.txt > /home/cmccabe/Desktop/microarray/"${pref}"_final.txt # replace empty fields with . (a literal 0 is kept)
done

cd /home/cmccabe/Desktop/microarray
find . -type f -iname \*.txt -delete   # removes all .txt files
find . -type f -iname \*_remove.txt -delete
find . -type f -iname \*_index.txt -delete

desired output in directory

123_final.txt
456_final.txt

Don_Cragun · August 23, 2016, 7:04pm

You're missing the point. The two files you haven't removed with your script are named 123.txt (original file) and 456.txt (original file). Those names end with the string file), not with the string .txt.

To get rid of them, try:

rm -f *'.txt (original file)'

or:

rm -f *'.txt '*

cmccabe · August 23, 2016, 7:23pm

The file names are 123.txt and 456.txt; I only added the (original file) to show what was not being removed. Thank you :).

Don_Cragun · August 23, 2016, 7:33pm

Adding comments to what appears to be output from the command ls -1 (that is a digit 1, not a lowercase letter el) only confuses those of us trying to help you.

Please show us the output you get from the commands:

ls -1

and:

ls -1|od -bc
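As a rough illustration of why the od dump is worth asking for: a name with a trailing space looks identical to a clean one in plain ls output, but the octal byte line from od -b shows the extra 040. The scratch directory and the file name with the trailing space below are made up for the demonstration and are not from the poster's directory.

```shell
#!/bin/sh
# Hypothetical scratch directory; the trailing space in the second name is deliberate.
tmp=$(mktemp -d) || exit 1
trap 'rm -rf "$tmp"' EXIT
cd "$tmp" || exit 1
touch '123.txt' '456.txt '

ls -1            # both names look like ordinary .txt files here
ls -1 | od -bc   # the octal line shows 040 (space) before the final 012 (newline)
```

A name that really ends in .txt shows 164 170 164 immediately followed by 012 in the dump.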

Chubler_XL · August 23, 2016, 7:35pm

You should be careful of:

find . -type f -iname \*.txt -delete

This will also remove the 123_final.txt and 456_final.txt files, since they match the \*.txt pattern.

perhaps this is closer to what you are after:

find . -type f -iname \*.txt ! -iname \*_final.txt -delete
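One way to check a command like this before trusting it is to run it in a scratch directory with -print first, then with -delete. The sketch below assumes a find that supports -iname and -delete (GNU and BSD find do; they are not required by POSIX) and uses a temporary directory populated with the eight file names from post #1.

```shell
#!/bin/sh
# Hypothetical scratch directory holding empty files with the names from post #1.
tmp=$(mktemp -d) || exit 1
trap 'rm -rf "$tmp"' EXIT
cd "$tmp" || exit 1
for p in 123 456; do
    touch "$p.txt" "${p}_remove.txt" "${p}_index.txt" "${p}_final.txt"
done

# Dry run: -print lists what the tests match without removing anything.
find . -type f -iname '*.txt' ! -iname '*_final.txt' -print

# Same tests with -delete; only the _final files should be left.
find . -type f -iname '*.txt' ! -iname '*_final.txt' -delete
ls -1
```

Quoting the patterns with single quotes does the same job as the backslash in \*.txt: it keeps the shell from expanding the glob before find sees it.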

RudiC · August 24, 2016, 2:45am

As usual - what's the real result of that operation, and what are the error messages?

You may know - or suspect at least - that your approach is overly complicated?

cmccabe · August 24, 2016, 9:15am

Here is the output of the commands @Don Cragun. Thank you :).

ls -1

123_final.txt
123_index.txt
123_remove.txt
123.txt
456_final.txt
456_index.txt
456_remove.txt
456.txt

ls -1|od -bc

0000000 061 062 063 137 146 151 156 141 154 056 164 170 164 012 061 062
          1   2   3   _   f   i   n   a   l   .   t   x   t  \n   1   2
0000020 063 137 151 156 144 145 170 056 164 170 164 012 061 062 063 137
          3   _   i   n   d   e   x   .   t   x   t  \n   1   2   3   _
0000040 162 145 155 157 166 145 056 164 170 164 012 061 062 063 056 164
          r   e   m   o   v   e   .   t   x   t  \n   1   2   3   .   t
0000060 170 164 012 064 065 066 137 146 151 156 141 154 056 164 170 164
          x   t  \n   4   5   6   _   f   i   n   a   l   .   t   x   t
0000100 012 064 065 066 137 151 156 144 145 170 056 164 170 164 012 064
         \n   4   5   6   _   i   n   d   e   x   .   t   x   t  \n   4
0000120 065 066 137 162 145 155 157 166 145 056 164 170 164 012 064 065
          5   6   _   r   e   m   o   v   e   .   t   x   t  \n   4   5
0000140 066 056 164 170 164 012
          6   .   t   x   t  \n
0000146

I will try:

find . -type f -iname \*.txt ! -iname \*_final.txt -delete

Thank you @Chubler_XL

@RudiC there was no error: the script did execute, it just didn't remove all the desired files. I will try out the posted command. You are probably right that there is a less complicated, more efficient approach. As a scientist I still have much to learn about shell scripting, but I am learning. Thank you :).

RudiC · August 24, 2016, 10:22am

As Chubler_XL already pointed out, the

find . -type f -iname \*.txt -delete

will delete ALL .txt files INCLUDING the original ones, UNLESS e.g. permissions get in the way, which in turn would result in error messages. BTW, the find commands for "_remove" and "_index" are unnecessary when the above is run first.

RudiC · August 24, 2016, 10:48am

Try

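awk -F'\t' -v OFS='\t' '

FNR == 1        {if (FN) close (FN)             # new input file: close the previous output

                 print "echo rm " FILENAME      # dry-run delete command, executed by sh

                 FN = FILENAME
                 sub (/\.txt$/, "_final.txt", FN)
                 n = 0                          # lines written so far for this file
                }

/^#/            {next                           # skip comment lines
                }

                {if (n == 0) Q = NF             # the header line fixes the column count
                 for (i=1; i<=Q; i++) if ($i == "") $i="."
                 idx = (n == 0) ? "R_Index" : n
                 print idx, $0 > FN
                 n++
                }
' /home/cmccabe/Desktop/microarray/*.txt  | sh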

in lieu of your original bash script in post#1. Unfortunately, I can't test it, but it should create the desired output files (..._final...) with the desired index in front of every single line, and delete the original files (if the "echo" is removed).

Please report on success or failure...
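Before removing the echo, it may be worth checking the transformation on a throwaway copy. The sketch below chains the three steps from post #1 into one pipeline, so no _remove or _index files are written, and runs it on a small made-up sample. The scratch directory and sample rows are hypothetical, and it tests $i == "" on the assumption that a literal 0 should be kept rather than turned into a dot.

```shell
#!/bin/sh
# Hypothetical scratch directory and sample data; neither comes from the thread.
tmp=$(mktemp -d) || exit 1
trap 'rm -rf "$tmp"' EXIT
printf '# comment line\nGene\tA\tB\ng1\t\t3\ng2\t0\t5\n' > "$tmp/123.txt"

# Strip comment lines, prepend the index column, fill empty fields with "."
sed '/^#/d' "$tmp/123.txt" |
awk -F'\t' -v OFS='\t' '
    { $0 = ((NR == 1) ? "R_Index" : NR - 1) OFS $0 }   # header label or row number
    NR == 1 { Q = NF }                                 # column count from the header
    NR > 1  { for (i = 1; i <= Q; i++) if ($i == "") $i = "." }
    { print }
' > "$tmp/123_final.txt"

cat "$tmp/123_final.txt"
```

If the printed rows look right, the same check can be repeated on a copy of a real input file before any rm is allowed to run.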

cmccabe · August 24, 2016, 11:44am

@RudiC the awk works great, thank you :). I am more-or-less seeing how it works, much more efficient.

Don_Cragun · August 24, 2016, 2:19pm

Also untested, but with no need for find (unless you also want to process files in subdirectories) or awk :

for file in /home/cmccabe/Desktop/microarray/*.txt
do	[ "${file%_final.txt}" != "$file" ] && continue
	echo rm "$file"
done

or, more efficiently,

cd /home/cmccabe/Desktop/microarray/ &&
for file in *.txt
do	[ "${file%_final.txt}" != "$file" ] && continue
	echo rm "$file"
done

And, in both cases, remove the echo if the list of commands displayed matches your expectations.
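For anyone puzzling over the test inside those loops: ${file%_final.txt} removes the suffix _final.txt if the name ends with it, so the result differs from the original name only for the files to keep. A small sketch of that check on its own follows; is_final is a hypothetical helper name and the three file names are just strings, so nothing is touched on disk.

```shell
#!/bin/sh
# is_final is a hypothetical helper name, not part of the loops above.
is_final() {
    # Stripping the suffix changes the string only when the name ends in _final.txt.
    [ "${1%_final.txt}" != "$1" ]
}

for name in 123.txt 123_index.txt 123_final.txt; do
    if is_final "$name"; then
        echo "keep   $name"
    else
        echo "remove $name"
    fi
done
```

The expansion is handled by the shell itself, so no external command is started per file.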

cmccabe · August 25, 2016, 1:55pm

Thank you @Don Cragun for another learning approach, I appreciate it