file name transformation

I've got a multitude of text data files that carry exactly the same kind of data. Unfortunately, some of them have a different filename format.

some are: 'category'_'month'-'year'_act.txt

an example being: daf_Apr-1961_act.txt

and some are: 'category'_'year'-'month'_act.txt

an example being: daf_1961-04_act.txt

any suggestions for transforming the former to the latter?

Shell script is the only option here. Try using case. Something like the following can be tried:

case $name in
    Jan ) month=01
        ;;
    Feb ) month=02
        ;;
    # ....... and so on
esac
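A fuller sketch of the case idea, for what it's worth. To get from Apr-1961 to 1961-04 the lookup has to go from month name to two-digit number. The function name month_to_number is made up for this example, and it assumes English three-letter month names spelled exactly as in the file names.

```shell
#!/bin/sh
# Hypothetical helper: print the two-digit number for a three-letter month name.
month_to_number() {
  case $1 in
    Jan) echo 01 ;;
    Feb) echo 02 ;;
    Mar) echo 03 ;;
    Apr) echo 04 ;;
    May) echo 05 ;;
    Jun) echo 06 ;;
    Jul) echo 07 ;;
    Aug) echo 08 ;;
    Sep) echo 09 ;;
    Oct) echo 10 ;;
    Nov) echo 11 ;;
    Dec) echo 12 ;;
    *)   echo "unknown month: $1" >&2; return 1 ;;   # anything else is an error
  esac
}

month_to_number Apr   # prints 04
```

The catch-all branch returns a non-zero status, so a calling script can skip names it does not recognise instead of producing a half-converted file name.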

Thanks Nua7

What about rearranging the sequence within the file name, i.e. swapping the month and the year around? I was more concerned about whether that was possible.

Yes, that should be possible. Split the file name into variables such as $category, $date and $month, and rearrange them as you wish.

Thanks!
nua7
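A rough sketch of that splitting idea, using only the shell's own parameter expansion. The function name swap_month_year and the variable names are made up for illustration, and it assumes the category part contains no underscore and the date part exactly one hyphen.

```shell
#!/bin/sh
# Hypothetical helper: rebuild category_A-B_suffix as category_B-A_suffix.
swap_month_year() {
  name=$1
  category=${name%%_*}   # text before the first underscore, e.g. daf
  rest=${name#*_}        # everything after it, e.g. Apr-1961_act.txt
  date=${rest%%_*}       # the date field, e.g. Apr-1961
  suffix=${rest#*_}      # the remainder, e.g. act.txt
  month=${date%%-*}      # part before the hyphen, e.g. Apr
  year=${date#*-}        # part after the hyphen, e.g. 1961
  echo "${category}_${year}-${month}_${suffix}"
}

swap_month_year daf_Apr-1961_act.txt   # prints daf_1961-Apr_act.txt
```

This only swaps the two fields around; the month is still a name rather than a number.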

Something like this?

$ echo "daf_1961-04_act.txt"|sed 's/\(.*_\)\(.*\)-\(.*\)\(_.*\)/\1\3-\2\4/'
daf_04-1961_act.txt

$ echo "daf_Apr-1961_act.txt"|sed 's/\(.*_\)\(.*\)-\(.*\)\(_.*\)/\1\3-\2\4/'
daf_1961-Apr_act.txt

Regards

This is Fantastic Franklin52..!!

vrms , that should work..

Thanks guys this is great stuff!!

However, this might be me being very dumb, but is there a way of 'saving' this new file name to the original file? And could this then be automated to do more than one file at a time? Sorry, I'm a bit new to this.

Use a loop, for instance:

for file in *.txt
do
  sed 's/\(.*_\)\(.*\)-\(.*\)\(_.*\)/\1\3-\2\4/' "$file" > "$file"_new
  mv "$file"_new "$file"
done

Regards

Hi guys the above script looks perfect (even to my untrained eye)

and the previous one

$ echo "daf_1961-04_act.txt"|sed 's/\(.*_\)\(.*\)-\(.*\)\(_.*\)/\1\3-\2\4/'
daf_04-1961_act.txt

has shown me that the sed script works

however when I run it in a loop

I see the new files being created (they are quite big, so they take a while).

However, the file names produced as a result of the loop aren't changed (they come out exactly as they went in). It's really making me scratch my head. I think the problem's 99% sorted but something's wrong.

Thanks

remove the move (mv) command and you will get the desired result.

I'm afraid that hasn't had the desired effect rana_d

But thanks for the suggestion

vrms

I'm sorry for the misunderstanding, this should give the command to move the files:

echo "daf_1961-04_act.txt"|sed 's/\(.*_\)\(.*\)-\(.*\)\(_.*\)/mv & \1\3-\2\4/'

Try it in a loop to see if you get the desired output to mv the selected files:

for file in *.txt
do
  sed 's/\(.*_\)\(.*\)-\(.*\)\(_.*\)/mv & \1\3-\2\4/' "$file"
done

If the output is correct you can pipe the output of the sed command to sh to mv the files in the loop as follows:

for file in *.txt
do
  sed 's/\(.*_\)\(.*\)-\(.*\)\(_.*\)/mv & \1\3-\2\4/' "$file" | sh
done

Regards

When I run it, it sends the whole content of the files in the directory to the screen.

When I run the version that pipes the output to sh, this comes up and the filenames are unchanged:

sh: llllll: not found
sh: l: not found
sh: lllllllllllllllllll: not found
sh: rrrrrrrrrrrrrrrr: not found
sh: rrrrrrrrrrrrrrrrr: not found
sh: rrrrrrrrrrrrrrrrr: not found
sh: rrrrrrrrrrrrrr: not found

Where llllllllllllllll and rrrrrrrrrrrrrrrrrr etc are the contents of some dummy files I'm using to verify the scripts

So I'm afraid something is not quite right again

Thanks

vrms
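One likely explanation for that output: when sed is given "$file" as an argument it reads the contents of that file, not its name, so every line of the file goes through the substitution and, with | sh on the end, gets run as a command. Below is a minimal sketch of a dry run that feeds the name itself to sed and only prints the mv command it would use. The function name show_rename and the glob pattern are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the mv command for one file name.
show_rename() {
  # The name, not the file's contents, is what goes into sed here.
  new=$(printf '%s\n' "$1" | sed 's/\(.*_\)\(.*\)-\(.*\)\(_.*\)/\1\3-\2\4/')
  printf 'mv %s %s\n' "$1" "$new"
}

for file in *_*-*_*.txt
do
  [ -e "$file" ] || continue   # skip the literal pattern when nothing matches
  show_rename "$file"
done
```

Nothing is renamed by this; it only shows what would happen, which makes it safe to try on the dummy files first.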

I was assuming that you have only .txt files with names in the same format.
To select filenames with a format like "daf_1961-04_act.txt" you can do something like:

for file in `ls | grep ".*_....-.._.*"`
do
  sed 's/\(.*_\)\(.*\)-\(.*\)\(_.*\)/mv & \1\3-\2\4/' "$file"
done

Regards

When I do the above on a directory containing:

daf_2001-03_act.txt
daf_2001-04_act.txt

The content of the files is put on the screen as before.
When I do an ls I get the following files

daf_2001-03_act.txt daf_2001-03_act.txt~ daf_2001-04_act.txt daf_2001-04_act.txt~

As you can see, there are two new files with a ~ following them. But the filenames are the same.

Hmmm. This is a tough cookie isn't it!

Those files look like temporary files created by a program.
If you want to exclude those files you can extend the grep command e.g.:

for file in `ls | grep ".*_....-.._.*[^~]$"`
do
  sed 's/\(.*_\)\(.*\)-\(.*\)\(_.*\)/mv & \1\3-\2\4/' "$file"
done

Regards

Firstly thanks to everyone who posted on this thread.
I've finally got it working correctly!!

The code I used is as follows:

for file in $(ls -1)
do
  newfile="$(echo $file | sed 's/\(.*_\)\(.*\)-\(.*\)\(_.*\)/\1\3-\2\4/')"
  mv $file $newfile
done
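For anyone reusing this, here is a slightly more defensive variant, offered as a sketch rather than a tested drop-in. It globs instead of reading ls output, only touches names shaped like category_Mon-YYYY_..., so files already in year-first form are not swapped back, and it refuses to overwrite an existing file. The function name rename_one and the glob pattern are assumptions for this example, and the [A-Z] and [a-z] ranges can behave differently in some locales.

```shell
#!/bin/sh
# Hypothetical helper: rename one file, swapping the two fields around the hyphen.
rename_one() {
  file=$1
  newfile=$(printf '%s\n' "$file" | sed 's/\(.*_\)\(.*\)-\(.*\)\(_.*\)/\1\3-\2\4/')
  [ "$file" = "$newfile" ] && return 0    # pattern did not match, nothing to do
  if [ -e "$newfile" ]; then
    echo "skipping $file: $newfile already exists" >&2
    return 1
  fi
  mv -- "$file" "$newfile"
}

# Only names that look like category_Mon-YYYY_... are picked up.
for file in *_[A-Z][a-z][a-z]-[0-9][0-9][0-9][0-9]_*
do
  [ -e "$file" ] || continue    # the glob matched nothing
  rename_one "$file"
done
```

Like the loop above, this leaves the month as a name (daf_1961-Apr_act.txt); turning Apr into 04 still needs a lookup such as a case statement.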
