#!/bin/bash
SRC=$1
function main {
validate
run
}
function validate {
if [[ -z $SRC ]]; then
printf "You need to supply a source.\n";
exit 1;
elif [[ -f $SRC ]]; then
printf "This is a file! You need to specify a directory.\n";
exit 1;
fi
}
function run {
FILES=`find $SRC -type f |sed 's_ _\ _g' | head`
for FILE in $FILES
do
printf $FILE"\n";
done
}
main
As you can see, the file names break up into separate lines due to spaces and special characters. I've tried to use sed on line 24 to replace the spaces with escape characters, but that didn't work.
See Useless Use of Backticks. Never ever use variables/backticks for open-ended lists. It's generally pointless, likely to truncate your data, and splits where it pleases instead of where you expect.
Since you're putting it in a loop anyway, you might as well save yourself the trouble and make it more direct.
The usual way to do this is
find ... | while read LINE
do
...
done
If you need to use the same results more than once, you could save find's output into a temporary file and read from that.
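A minimal sketch of that approach, assuming bash. The directory and file names are made up for the demonstration, and the IFS= and -r on read are additions that stop it from trimming blanks or eating backslashes.

```shell
#!/bin/bash
# Throwaway directory with a file name containing spaces (hypothetical data).
demo_dir=$(mktemp -d)
touch "$demo_dir/a song with spaces.mp3" "$demo_dir/plain.mp3"

# Save find's output once so it can be read as often as needed.
list_file=$(mktemp)
find "$demo_dir" -type f > "$list_file"

count=0
# IFS= keeps leading/trailing blanks; -r keeps backslashes literal.
while IFS= read -r LINE
do
    count=$((count + 1))
    printf '%s\n' "$LINE"
done < "$list_file"

# Clean up the demonstration files.
rm -rf "$demo_dir" "$list_file"
```

Each name arrives in LINE whole, spaces included. A name containing a newline would still be split, since the list is one name per line.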
That wouldn't work for me. I'm planning on processing the files in the script. So, I need to keep the name as the original.
---------- Post updated at 03:31 PM ---------- Previous update was at 03:14 PM ----------
So, would LINE be the "per entry" variable that I process? Like an i in:
for i in $LIST
I was never good at loops.
Would this be more "memory efficient"? Since it's in a file, that would be one less thing for it to stuff into memory? The targets can generate a very large list and the "server" doesn't really have that much RAM; 256MB.
Yes, exactly, except you don't need to have the entire $LIST in a variable like that (a bad idea for the reasons given above).
It'd be far more efficient than cramming it into a variable. Depending on your shell, it's quite questionable whether you can cram 256 megs of text into a variable and expect it to even work no matter how much RAM you have.
I'm one of those "BASH Punks". 256MB is how much memory I have in the "server". It's an old POS that I use to process things on instead of my workstation.
Yeah, I guess I'll go this route with my script then...
Either use mktemp to generate a name for the temporary file, or use files like /tmp/$$-appname . This will allow several instances of the script to run without stomping over each other's temp files.
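A short sketch of the mktemp variant, assuming bash. The trap line is an addition not mentioned above; it removes the file when the script exits. The two entries written to the file are placeholder data.

```shell
#!/bin/bash
# mktemp creates the file and prints its unique name.
TMP=$(mktemp) || exit 1

# Addition for the sketch: delete the temp file when the script exits.
trap 'rm -f "$TMP"' EXIT

# Placeholder data standing in for find's output.
printf '%s\n' "first entry" "second entry" > "$TMP"

# Count the lines that were written.
lines=$(grep -c '' "$TMP")
printf 'wrote %s lines to %s\n' "$lines" "$TMP"
```

Because every run gets its own name from mktemp, two copies of the script can run at once without touching each other's list.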
Well, I got other stuff going on too. It's also my media server, NFS server, Bittorrent seedbox, and a backup server to copy stuff off of my Dreamhost servers. "That's CRON. He fights for the Sys Ops."
Let's try putting double quotes round string variables and avoiding for, because it cannot deal with lists containing space characters. Also remove the many extraneous semicolons and make sure we run Test [ ] rather than a Conditional Expression [[ ]] . Test positively for a directory rather than assuming (wrongly) that everything which is not a file is a directory.
Not sure what the sed is for, so I left it out.
#!/bin/bash
SRC="${1}" # Start Directory
function main {
validate
run
}
function validate {
# Is parameter missing?
if [ -z "${SRC}" ]
then
printf "You need to supply a source.\n"
exit 1
fi
# Is parameter a directory?
if [ ! -d "${SRC}" ]
then
printf "This is not a directory! You need to specify a directory.\n"
exit 1
fi
}
function run {
find "${SRC}" -type f | while read filename
do
printf "${filename}\n"
done
}
main
Ps. I must have missed the bit in post #1 which says what the script is meant to do!
Well, eventually the script is going to go through my music library, $SRC, and analyze each file; they are all MP3s. All the files that contain ID3v2 tags with a specific genre will be copied into another directory in preparation for an rsync to a remote server. The new directory structure is going to be created according to the tags: Library => Artist => Album => Track. (I'll probably end up putting up another post on proper ASCII-safe filename conversion later. But that's off-topic.)
---------- Post updated at 09:33 PM ---------- Previous update was at 09:32 PM ----------
None of my filenames start with a dash; I know that for certain. If anything, the dash will likely be somewhere in the middle of the filename.
You may as well use printf correctly anyway. Aside from possible issues with a leading dash with GNU coreutils and bash printf implementations, there would be problems if anything in the filename looks like a format specifier or escape sequence.
Could you possibly educate me on how then? Feel free to PM me so we don't go off topic.
---------- Post updated at 02:34 AM ---------- Previous update was at 02:31 AM ----------
I believe in coding that type of flaw out. Hence the post.
---------- Post updated at 05:58 AM ---------- Previous update was at 02:34 AM ----------
My updated script... still having some issues
#!/bin/bash
SRC=$1
TMP=`mktemp`
function main {
validate
run
}
function validate {
if [[ -z $SRC ]]; then
printf "You need to supply a source.\n";
exit 1;
elif [[ -f $SRC ]]; then
printf "This is a file! You need to specify a directory.\n";
exit 1;
fi
}
function run {
printf "Please wait...\n";
find $SRC -type f -iname "*.mp3" > $TMP
while read FILE
do
analyze_mp3
done < $TMP
}
function analyze_mp3 {
GENRE=`id3v2 -l "$FILE" | grep TCON | awk -F: '{printf $2"\n"}' | sed -e 's_^[[:space:]]*__g' | sed 's_([0-9][0-9][0-9])$__' | sed 's_([0-9][0-9])$__' | sed 's_([0-9])$__'`
printf "FILE:\t$FILE\n";
printf "GENRE:\t$GENRE\n";
}
main
I keep getting errors like this.
FILE: /media/Data_Bucket/Audio/Alex_M.O.R.P.H._&_Woody_van_Eyden/hardenergy.lv_(Disc_3)/03_-_A_State_Of_Trance_400_Pre-Reocrded_Guestmix_(18-04-09).mp3
GENRE:
./sync-indus.bash: line 38: printf: `_': invalid format character
FILE: /media/Data_Bucket/Audio/Alex_Reece/100GENRE:
FILE: /media/Data_Bucket/Audio/Alex_Twister/We_Will_Rock_You_(Remix)_[256_KBPS]_[Alex_Twister].mp3
GENRE:
./sync-indus.bash: line 38: printf: `_': invalid format character
FILE: /media/Data_Bucket/Audio/Acetate/100GENRE:
FILE: /media/Data_Bucket/Audio/Aceyalone_Chairman_Hahn/Reanimation/11_-_WTH_You.mp3
GENRE:
FILE: /media/Data_Bucket/Audio/Act,_The/Too_Late_at_20/01_-_Too_Late_At_20.mp3
GENRE: Powerpop
I'm thinking it might have something to do with the % in the file/directory names.
binary@bitslip:/media/Data_Bucket/Audio/Alex_Reece$ ls -l
total 4
drwx------ 1 binary binary 4096 2011-07-16 00:05 100%_Drum_&_Bass_(Disc_1)
binary@bitslip:/media/Data_Bucket/Audio/Alex_Reece$ cd ../Acetate/
binary@bitslip:/media/Data_Bucket/Audio/Acetate$ ls -l
total 0
drwx------ 1 binary binary 0 2011-07-16 00:01 100%_Drum_&_Bass_(Disc_2)
Is that something that is best dealt with using sed?
Not only is it not off-topic, incorrect use of printf is the root of your problem.
The first argument to printf is a format string. A conversion specifier (aka format specifier) is a sequence of characters within a format string which begins with a % .
What will be the format of the output of those printf commands? It's impossible to say. You have handed over control of the format string to external sources. How printf will behave and how many arguments it will require depend on the type and number of conversion specifiers in the format string. The type and number of specifiers in turn depends on the variable values $FILE and $GENRE.
You want to be very careful about what you allow into that first argument to printf.
You are correct: the % in the file and directory names is what triggers it.
No, this is not a job for sed. There's no need to mangle the file names just to print them out. What you need to do is not allow arbitrary data into your format string.
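A minimal sketch of the two printf calls with the data moved out of the format string, assuming bash. The FILE and GENRE values are hypothetical, modelled on the directory names in the listing above.

```shell
#!/bin/bash
# Hypothetical values; the track name 01.mp3 is invented for the demo.
FILE='/media/Data_Bucket/Audio/Alex_Reece/100%_Drum_&_Bass_(Disc_1)/01.mp3'
GENRE='Drum & Bass'

# The format string is a constant; the data is handed to %s as an argument,
# so a % inside the value is printed literally.
file_line=$(printf 'FILE:\t%s\n' "$FILE")
genre_line=$(printf 'GENRE:\t%s\n' "$GENRE")

printf '%s\n%s\n' "$file_line" "$genre_line"
```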
Note how the format string is now invariant. Whatever the value of the variables, the format string never changes (a point driven home by switching to strong single-quotes).
The same bug lurks in your awk one-liner:
Make $2 an argument and in the format string replace it with an appropriate conversion specifier (%s in this case).
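As a sketch of that change, assuming bash and any POSIX awk. The sample line is hypothetical; it only imitates the "TAG (description): value" shape that the pipeline in the script splits on the colon.

```shell
#!/bin/bash
# Hypothetical input line containing a % in the genre text.
sample='TCON (Content type): 100% Pure Trance (31)'

# $2 is passed as an argument to %s rather than used as the format itself.
genre=$(printf '%s\n' "$sample" | awk -F: '{ printf("%s\n", $2) }')

printf '%s\n' "$genre"
```

The field still carries its leading space and the bracketed number at this point; only the format-string problem has been dealt with.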
In shell scripting, this type of error is typically nothing worse than garbled output, but a format string bug in a language like C can be a major security issue. For more info, see Uncontrolled format string.
You can't use single quotes inside double quotes. They don't nest -- it's treated as the end of the single quote.
Awk doesn't use single quotes anyway. Use double quotes.
You can't use variables inside quotes in awk. Get rid of those quotes around $2.
You forgot the comma after the first argument, and the brackets.
You can also get rid of that grep by putting the regex inside awk itself. That's done really easily.
awk -F: '/TCON/ { printf("%s\n", $2) }'
In fact, you can replace that entire enormous pipe-chain with it. awk is a whole programming language, not a glorified cut. And you can match the numbers in one regex instead of three by using ?, a specifier like * that means "zero or one of the previous character".
awk -F: '/TCON/ {
gsub(/[ \t]*/, "", $2); # Strip whitespace
gsub(/_[0-9]?[0-9]?[0-9]$/, "__", $2); # Replace _123 at the end with __
printf("%s\n", $2); }'
@Corona688 Think he wanted to strip leading whitespace, and a bracketed string of up to 3 digits from end of field 2 (he was using _ as the sed delimiter).
This slight change should cover it:
GENRE=`id3v2 -l "$FILE" | awk -F: '/TCON/ {
gsub(/^[ \t]*/, "", $2); # Strip leading whitespace
gsub(/\([0-9]?[0-9]?[0-9]\)$/, "", $2); # Remove bracketed string up to 3 digits from the end
printf("%s", $2); }'`