Renaming Files Based on Contents

Hello everyone,

I have a situation that is making it hard for me to keep track of certain files. I will explain it as best I can.

I have a list of files as follows:

50_REPORT_1111 - file contains the word Car
50_REPORT_2222 - file contains the word House
50_REPORT_3333 - file contains the word Dog
50_REPORT_4444 - file contains the word Apple
50_REPORT_5555 - file contains the word Orange
50_REPORT_6666 - file contains the word Grape

As you can see, the part of the name between the two underscores is the same for every file, but each file contains different information. The files are all text.

So what I'd like is some sort of Unix script that goes through the files with "_REPORT_" in their names and reads each one. The first time it matches the word Car while reading a file, it would rename that file to 50_REPORT.CAR_1111, then carry on through the specified directory to the next file that contains _REPORT_, until none are left and all of the files have been renamed.

I am honestly not sure where to start with this, so if someone could at least point me in the right direction I might be able to figure it out from there.

I hope what I said makes sense.

Thank you for your time reading this.

You should have the list of words you're searching for.

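for F in *REPORT_*
do
    for W in $LIST_OF_WORDS
    do
        if grep -q "$W" "$F"
        then
            # Braces are needed: $W_ would be read as a variable named W_.
            # ${F:9} already starts with the underscore.
            NEW=${F:0:9}.${W}${F:9}
            echo "$NEW"
            # mv "$F" "$NEW" # Uncomment this to process
            break
        fi
    done
done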
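
A sketch of how the word list might be supplied, assuming it lives in a shell variable holding the words from the example files. The helper name first_match is made up for illustration, and nothing is renamed here:

```shell
#!/bin/sh
# Assumed word list, taken from the example files in the question.
LIST_OF_WORDS="Car House Dog Apple Orange Grape"

# first_match FILE: print the first word from LIST_OF_WORDS found in FILE.
# (Hypothetical helper name, for illustration only.)
first_match() {
    for W in $LIST_OF_WORDS
    do
        if grep -q "$W" "$1"
        then
            printf '%s\n' "$W"
            return 0
        fi
    done
    return 1
}

# Dry run: show which word would be used for each report file.
for F in *REPORT_*
do
    [ -f "$F" ] || continue      # skip the literal pattern when nothing matches
    if WORD=$(first_match "$F")
    then
        echo "$F contains $WORD"
    fi
done
```

The list is left unquoted in the for loop on purpose, so that the shell splits it into separate words.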

I understand the idea... I think.

You are setting F to the file name by listing every file in the directory that matches *REPORT_*.

With "for W in", you are saying to have a separate file with all of the words in it. I see you are using $LIST_OF_WORDS as the variable, but where do you bring in the actual file name?

Also, if the word or phrase in the file is CARS WITH RED PAINT, I may want to rename that file to 50_REPORT.CARSRED_1111. Based on what I understand from what you wrote, it uses the list of words to build the file name, so if I had CARS WITH RED PAINT in the list, it would make the report name 50_REPORT.CARS WITH RED PAINT_1111. Is there a way to have it use a different name for the file?

I apologize; I am trying to pick this up, but what you wrote is a little bit over my head.

You understand how it works correctly.
One thing you can do is make a file with the phrases and the corresponding name you want to insert in the filename, like:

"CARS WITH RED PAINT"    CARSRED
"CARS WITH BLUE PAINT"    CARSBLU
...

The script would become

for F in *REPORT_*
do
    while read W N
    do
        if grep "$W" "$F"
        then
            NEW=${F:0:9}.$N_${F:9}
            echo "$NEW"
            # mv "$F" "$NEW" # Uncomment this to process
            break
        fi
    done < file-with-names
done

OK, so to test this and make sure I am doing it right, I created the file "names" and placed

"CARS WITH RED PAINT"    CARSRED
"CARS WITH BLUE PAINT"    CARSBLU

in the file.

I then created two files, one containing nothing but CARS WITH RED PAINT and one containing nothing but CARS WITH BLUE PAINT. The files were named 99_REPORT_2222 and 99_REPORT_3333, and I placed them in the same directory. I then changed "file-with-names" in the script you wrote to "names" and saved it in a file called "testscript". I did a chmod 777 on testscript so I could execute it and then ran it. The result was:

root@testbox:/testscript # ls
99_REPORT_2222  99_REPORT_3333  names           testscript

So I assume I am doing something wrong or not understanding exactly how I am supposed to do it.

Any advice would be appreciated.

Thank you

After reading online for a bit I found "while read line" instead of "while read". I am not sure if it is applicable, but I did get a little further, or so it seems. Instead of coming straight back to a # prompt, it now shows:

root@testbox:/testscript # ./testscript
CARS WITH RED PAINT
./testscript[7]: NEW=${F:0:9}.$N_${F:9}: bad substitution
root@testbox:/testscript #

Side note: as far as I can tell, it is not picking up the $N value, which I assume is supposed to be the new name of the file.

OK, I found what went wrong.
It's in the read, which returns something like

W="CARS
N=WITH RED PAINT"    CARSRED

You will have to put the name before the string, and there is no need to quote it.
names:

CARSRED    CARS WITH RED PAINT
CARSBLU    CARS WITH BLUE PAINT

testscript:

#!/bin/bash
for F in *REPORT_*
do
    while read N W
    do
        if grep -q "$W" "$F"
        then
            NEW=${F:0:9}.${N}${F:9}
            echo "$NEW"
            # mv "$F" "$NEW" # Uncomment this to process
            break
        fi
    done < names
done
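
To see why the order of the two columns matters, the way read splits a line can be checked on its own. This is only a sketch; split_line is a hypothetical helper and not part of the script above:

```shell
#!/bin/sh
# split_line LINE: feed LINE to "read N W" and show what each variable receives.
# read puts the first whitespace-separated field in N and the rest of the
# line in W. It does not interpret quotes. -r keeps backslashes literal.
split_line() {
    printf '%s\n' "$1" | {
        read -r N W
        printf 'N=[%s] W=[%s]\n' "$N" "$W"
    }
}

split_line 'CARSRED    CARS WITH RED PAINT'
split_line '"CARS WITH RED PAINT"    CARSRED'
```

With the name first, N receives CARSRED and W receives the whole phrase. With the quoted phrase first, N receives only "CARS, because read splits on whitespace and treats the quote as an ordinary character.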

I made the modifications that you mentioned but I am still stuck on this error.

./testscript[7]: NEW=${F:0:9}.${N}${F:9}: bad substitution

Any ideas where this one might be coming from?

Strange. I'd like to know what the values of F and N are just before that, so insert this line before the problematic one:

echo "F='$F' - N='$N'"

F='99_REPORT_2222' - N='CARSBLU'
./testscript[8]: NEW=${F:0:9}.${N}${F:9}: bad substitution

That's right, and I don't have the same problem. What we can do is find which expression gives you that error, so comment out that line:

# NEW=${F:0:9}.${N}${F:9}

and put these under it to see what happens:

echo ${F:0:9}
echo ${F:9}

What shell are you using? Is it csh? In frans' posts it seems to be either bash or ksh.

I am using ksh according to the command ps -p $$. This is an AIX 5.3 TL 11 box; not sure if that helps with anything.

NEW=${F:0:9}.${N}${F:9}

Try wrapping this in quotes like:

NEW="${F:0:9}.${N}${F:9}"

root@testbox:/testscript # ./testscript
F='99_REPORT_2222' - N='CARSBLU'
./testscript[8]: NEW="${F:0:9}.${N}${F:9}": bad substitution

I put it in quotes and received that. However, if anyone doesn't mind, could they explain what exactly the 0:9 is doing? I might be able to play with it a bit more if I understood that part of it.

Thanks

---------- Post updated at 11:29 PM ---------- Previous update was at 11:24 PM ----------

I just saw your post. I tried
echo ${F:0:9} and
echo ${F:9} one at a time, but they both give the bad substitution error.

It just seems to end the script once it hits that error.

${F:0:9} # extracts 9 characters from $F, starting at position 0
${F:9} # extracts from position 9 to the end of $F

If this doesn't work, you should use a sed or awk expression to do that.
Something like

NEW=$(echo "$F" | sed "s/REPORT/REPORT.$N/")
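
Putting the sed expression together with the names file, a dry run might look like the sketch below. It assumes the names file has the short name first and the phrase after it, as in the earlier example. plan_renames is a hypothetical function name, and it only prints the mv commands rather than running them:

```shell
#!/bin/sh
# plan_renames NAMES_FILE: for every *REPORT_* file in the current directory,
# print the mv command that would rename it. Nothing is actually renamed.
# Assumption: the short names contain no "/" or "&", which sed would
# treat specially in the replacement text.
plan_renames() {
    for F in *REPORT_*
    do
        [ -f "$F" ] || continue              # nothing matched the pattern
        while read -r N W
        do
            [ -n "$W" ] || continue          # skip blank or incomplete lines
            if grep -q "$W" "$F"
            then
                NEW=$(printf '%s\n' "$F" | sed "s/REPORT/REPORT.$N/")
                echo "mv $F $NEW"
                break
            fi
        done < "$1"
    done
}

# Usage, from the directory holding the reports:
#   plan_renames names
```

Once the printed commands look right, the echo line can be swapped for a real mv "$F" "$NEW".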

Well, I am not really sure why the other command didn't work (I figured you were pulling character positions with the 0 and 9), but sed seems to work.

I was reading online about ksh versions last night and ran into a couple of articles about ksh93 and its support for substrings. The version currently on the system is M-11/16/88f, which I assume is something like ksh88, but I had to get to bed last night so I didn't finish my research.

Well, thank you guys for all of your help. I guess I am going to use sed. If anyone has any ideas as to why it may not have worked, though, feel free to share.

Thanks again
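
On why it failed: the version string M-11/16/88f does identify ksh88, and the ${F:0:9} offset and length form arrived with ksh93 (bash has it as well), so ksh88 rejects it with "bad substitution". The older pattern-stripping expansions with % and # should be available in ksh88. A sketch, assuming the file names always look like prefix_REPORT_suffix; the variable names HEAD and TAIL are made up for this example:

```shell
#!/bin/sh
F=99_REPORT_2222
N=CARSRED

HEAD=${F%%_REPORT_*}    # drop the longest trailing match of "_REPORT_*", leaving 99
TAIL=${F#*_REPORT_}     # drop the shortest leading match of "*_REPORT_", leaving 2222
NEW="${HEAD}_REPORT.${N}_${TAIL}"

echo "$NEW"             # prints 99_REPORT.CARSRED_2222
```

This does not depend on the prefix being exactly two characters long, which the fixed positions 0 and 9 did.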