BASH: File name part to list reference problem.

I've made a habit of including a four-letter "tail" on image file names I download from the Web, so I can both match them with IPTC Transmission References of my own making and rename them later using either a GUI renamer or a script I've written myself. Now I want to automate the process of writing the TRs to the files by way of Exiv2, leaving just the renaming stage of my routine before moving on to "filing them away" in categorized subfolders, CDs, etc.

Using OpenOffice Calc, I was able to create a list of these four-letter suffixes and the TRs to which they correspond, sort by the former and output to a text file. I added an extra field of IPTC Categories (also of my own making -- doesn't seem to matter when Categories is in the process of being dropped from the IIM). The script I have works with one file at a time, as my line-by-line command-line tests in a terminal emulator have proven, but something goes haywire when I apply it to a whole folder of files and the complete list of suffixes, TRs and categories all at once.

I doubt I'm using the right loop types to process this data, and I'm not at all sure that the loops I do have are nested correctly in the script. The output I've got so far happens to be the "natural" name of the last suffix in the list. What I want is the "natural" name corresponding to the suffix of the file being "looked at" by the script.

Here's the script as it reads so far:

for a in $(ls *.jpg); # Find files ending in "jpg" in the current directory
do
	bargirl=$(echo $a)
	while read 'line';
	do
		souse=$(echo $line)
		drunk=${souse%:*}
		verydrunk=$(echo $souse | cut -d":" -f2)
		firstdrink=$(echo $verydrunk | cut -d, -f1)
		seconddrink=$(echo $verydrunk | cut -d, -f2)
		thirddrink=$(echo $verydrunk | cut -d, -f3)
done<downloads-xreference.txt # The text file with the three columns (suffix, natural name and Category). Terminate the 'while read' loop
jackdaniels=$(echo ${bargirl:(-4)}) # Take the ".jpg" off the file name	
singlemalt=$(echo $bargirl | cut -d'.' -f1) # Ditto
puregrain=$(echo ${singlemalt%????}) # Now strip the suffix off -- comes from the renaming script
jimbeam=$(echo ${singlemalt:(-4)}) # "What was that suffix code again?"
if [[ $jimbeam -eq $seconddrink ]]; # Does it exist in column 3 of the text file? Does it match the one on the file we're looking at? 
then
	chaser=$firstdrink # Variable "chaser" should be the matched 4-letter suffix.
	echo $chaser # "Show me the money."
fi
done #Terminates the 'for' loop

BZT

Yes, it looks like some of your problem is your loop structure. The while loop is capturing the last record from your text file and comparing that against the file name component. I would guess that you want the filename, or some part of the filename, compared to the information in each line of the text file until you find a match.
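
To see why only the last record survives, here is a minimal sketch using a made-up three-line file (the data and the variable names are hypothetical, not taken from your list). The loop body only assigns, so each pass overwrites the previous value, and anything tested after "done" sees just the final line.

```shell
# Build a tiny three-line sample file (hypothetical data).
sample=$(mktemp)
printf '%s\n' 'aaaa:first' 'bbbb:second' 'cccc:third' > "$sample"

# The loop body only assigns; nothing is compared until the loop has finished.
while read -r line
do
    key=${line%%:*}    # text before the first colon
done < "$sample"

# By now every earlier value of key has been overwritten.
printf 'key after loop: %s\n' "$key"
rm -f "$sample"
```

This should print "key after loop: cccc", whichever file name the outer loop happens to be looking at.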

It's not efficient, but the logic below might do what you need.



for f in $(ls *.jpg)
do
        suss off filename data from f

        while read buf
        do
                if filename data == desired field in buf
                then
                        echo what you need
                        break                   # exit while
                fi
        done <data_file
done
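
The pseudocode above could be fleshed out roughly as follows. This is only a sketch under assumptions, since no sample of the text file has been posted: it assumes each line looks like "suffix:natural name,category", that the suffix is the last four characters before ".jpg", and the function names lookup_name and report_matches are made up for illustration. It also uses a glob instead of $(ls *.jpg), which avoids trouble with file names containing spaces.

```shell
#!/usr/bin/env bash
# Assumed data format (hypothetical): suffix:natural name,category

# Print the natural name for a 4-letter suffix; return 1 if no line matches.
lookup_name() {
    local suffix="$1" data_file="$2" key rest
    while IFS=: read -r key rest
    do
        if [ "$key" = "$suffix" ]
        then
            printf '%s\n' "${rest%%,*}"   # text before the first comma
            return 0                      # stop at the first match
        fi
    done < "$data_file"
    return 1
}

# For each .jpg in a directory, print "file -> natural name" when a match exists.
report_matches() {
    local dir="$1" data_file="$2" f base suffix name
    for f in "$dir"/*.jpg
    do
        [ -e "$f" ] || continue           # glob matched nothing
        base=${f##*/}                     # drop the directory part
        base=${base%.jpg}                 # drop the extension
        suffix=${base#"${base%????}"}     # last four characters of what is left
        if name=$(lookup_name "$suffix" "$data_file")
        then
            printf '%s -> %s\n' "${f##*/}" "$name"
        fi
    done
}
```

It would be called as: report_matches . downloads-xreference.txt. The comparison now happens inside the while loop, once per line, and the loop stops as soon as a line matches.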

I couldn't tell from your code whether the filenames had the suffix tacked onto the end (picture.jpgxxxx) or picture.xxxx.jpg. The ls command implies the latter, but you seem to be stripping off xxxx as the trailing 4 characters. You might be getting tripped up with this too. If you post a sample of filenames and a sample of your text file, it'd make giving suggestions a bit easier.

Nitpicking now.... the statement

souse=$(echo $line)

can be written more simply:

souse="$line"

This is easier to read and, depending on the shell, it is more efficient. There might also be ways to make sussing the field data from the text file more efficient; using external processes like 'cut' introduces overhead that can eat your lunch as far as performance is concerned.
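
One way to cut that overhead is to let read do the splitting itself. The sketch below assumes a layout of "key:field1,field2,field3" (a guess, as the real file has not been shown), and show_fields is a hypothetical name. Setting IFS on the read command alone changes the separators for that one command only, so no echo or cut processes are started per line.

```shell
# Assumed layout (hypothetical): key:field1,field2,field3
# read splits each line on ':' and ',' by itself; IFS is altered for read only.
show_fields() {
    local key first second third
    while IFS=':,' read -r key first second third
    do
        printf 'key=%s first=%s second=%s third=%s\n' \
            "$key" "$first" "$second" "$third"
    done < "$1"
}
```

Called as: show_fields downloads-xreference.txt. Note that this breaks if a field itself contains a colon or a comma.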

They were more like this (a current example right from the "victim" directory):

gae72-7201-077-005biki.jpg

My rename script, which I run once these have been annotated according to the four letters preceding the ".", chops off that four-letter substring with what you saw in my script in this thread as the 'puregrain' variable (in the other script, it's an m.)
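
For a name of that shape, the split can be sketched with parameter expansion alone, with no echo or cut involved. The variable names here (base, stem, suffix) are only for illustration and are not the ones in either script.

```shell
f=gae72-7201-077-005biki.jpg

base=${f%.jpg}            # strip the extension
stem=${base%????}         # strip the trailing four characters
suffix=${base#"$stem"}    # what remains after removing the stem
                          # (bash also accepts ${base: -4} for this)

printf '%s %s\n' "$stem" "$suffix"
```

This should print "gae72-7201-077-005 biki".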

I asked about performance and efficiency in simplification on one thread I started over on the LQ forum. The person there who was giving me advice (and has done so, before and since) was rather vague about it. If it's a shell-by-shell or build/version-by-build/version thing, I understand why now. Trial and error -- don't mind it so long as my bash will still fork commands in the morning :)

BZT