rename files Ax based on strings found in files Bx

Hi,

I'm not very experienced in shell scripting and that's probably why I came across the following problem:

I have several hundred pairs of text files (PF00x.spl and PF00x.shd), where the first file (PF00x.spl) needs to be renamed according to a string contained in the second file (PF00x.shd).

Example:

PF00x.shd contains a string (its length varies from file to file; an example would be: AB09098765 7 6 5) between the following keywords: "bviwacsbviwacs" and "Adb"

The goal is to rename PF00x.spl to AB09098765 7 6 5.spl and so on for the remaining files.

I hope my description makes sense... any suggestions how I could solve that problem?

Thanks a lot!

Give some example lines from the second file rather than describing them.

Here you go (extract from PF001.shd):

�2�H&��a �2�H&� b v i w a c s b v i w a c s s m u A B 4 6 2 6 6 7 _ 2 0 0 9 0 8 1 4 7 3 9 4 3 7 1 0 - B e r i c h t A d b P L F A d b \ \ P C 2 6 7 I W A C S

And I'm looking for the string marked bold.

Thanks,

This works in ksh when run in the same directory as the target files: -

ls *.shd | while read FILE_NAME
do
        if [[ -w ${FILE_NAME%.*}.spl ]] ## Ignore any file with no .spl file
        then
                NEW_FILE_NAME=$( nawk ' BEGIN {
                        ## Spaces in target strings ????
                        first = "b v i w a c s b v i w a c s"   
                        last = "A d b"
                }
                ( $0 ~ first ) && ( $0 ~ last ) {
                        startm = index( $0, first ) + length( first )
                        endm = index( $0, last ) - 1
                        file_name = substr( $0, startm, endm - startm )
                } END {
                        ## Remove spaces from new file name
                        gsub( / /, "", file_name )              
                        print file_name
                } ' $FILE_NAME).spl

                ## Copy. change cp to mv if required
                cp ${FILE_NAME%.*}.spl $NEW_FILE_NAME           
        else
                echo "File ${FILE_NAME} has no matching .spl file"
        fi
done

Your first post had no spaces in the enclosing strings, but your second did, and it also contained unprintable characters. So I created a test file with spaces in the keywords as a worst case. Running the code as a script called "inch" with these files: -

TX5XN:/home/brad/forum/inch>ls -l
total 8
-rwxrwxrwx 1 brad root 682 2009-10-07 20:07 inch
-rw-r--r-- 1 brad root 143 2009-10-07 18:50 PF00x.shd
-rw-r--r-- 1 brad root   0 2009-10-07 19:37 PF00x.spl
-rw-r--r-- 1 brad root   0 2009-10-07 19:59 PF01x.shd

Gave this output: -

TX5XN:/home/brad/forum/inch>inch
File PF01x.shd has no matching .spl file

TX5XN:/home/brad/forum/inch>ls -l
total 8
-rwxrwxrwx 1 brad root 682 2009-10-07 20:07 inch
-rw-r--r-- 1 brad root 143 2009-10-07 18:50 PF00x.shd
-rw-r--r-- 1 brad root   0 2009-10-07 19:37 PF00x.spl
-rw-r--r-- 1 brad root   0 2009-10-07 19:59 PF01x.shd
-rw-r--r-- 1 brad root   0 2009-10-07 20:11 smuAB462667_2009081473943710-Bericht.spl

The UNIX and Linux Forums

@steadyonabix

Thanks a lot! I'm going to try to adapt your code...

Regards,

@steadyonabix thanks a lot for your code!

I tried to adapt your code but I always get only one file named ".SPL" even when I change the keywords to a single letter for testing purposes...
Seems like I'm doing something the wrong way.

Additional background information: I currently have to use cygwin with bash on WinXP...
Therefore I amended the code as listed at the end of this post.
The keywords are b v i w a c s b v i w a c s (at least when I'm looking at them in bash) and A d o b e

The result I'm looking for would be smuAB462691_200908148 19 27 296 - Bericht.SPL

I've attached two files (one spl and one original shd) for illustration. It would be great if you could take a short look at them.

Thanks again for your help!

-------
current code:
-------

ls *.SHD | while read FILE_NAME
do
        if [[ -w ${FILE_NAME%.*}.SPL ]] ## Ignore any file with no .spl file
        then
                NEW_FILE_NAME=$( awk ' BEGIN {
                        ## Spaces in target strings ????
                        first = "b v i w a c s   b v i w a c s"   
                        last = "A d o b e"
                }
                ( $0 ~ first ) && ( $0 ~ last ) {
                        startm = index( $0, first ) + length( first )
                        endm = index( $0, last ) - 1
                        file_name = substr( $0, startm, endm - startm )
                } END {
                        ## Remove spaces from new file name
                        gsub( / /, "", file_name )              
                        print file_name
                } ' $FILE_NAME).SPL

                ## Copy. change cp to mv if required
                cp ${FILE_NAME%.*}.SPL $NEW_FILE_NAME           
        else
                echo "File ${FILE_NAME} has no matching .spl file"
        fi
done

Hi

I am going out tonight so will not be able to look at this before tomorrow.

I don't use Cygwin so can't promise much.

There is no substitute for experimenting with it yourself though in the meantime.

Good luck

Hi Inch

Well this turned out to be more interesting than I originally thought it would be: -

When I got your file I found it would not behave with awk or sed or any of the usual utilities so I took a look at its internal structure by doing an octal dump of the contents.
Here is the part containing your start string with the individual letters highlighted in bold: -

TX5XN:/home/brad/forum/inch>od -c FP00000.SHD | pg                                                                                                                   

0003440   H   & 375 032 001 002  \0  \0   b  \0   v  \0   i  \0   w  \0
0003460   a  \0   c  \0   s  \0  \0  \0   b  \0   v  \0   i  \0   w  \0
0003500   a  \0   c  \0   s  \0  \0  \0   s  \0   m  \0   u  \0   A  \0

As you can see, each letter is followed by a null byte (\0), which is consistent with 16-bit text as written by Windows (probably UTF-16LE, though I have not verified that). Tools like awk and sed expect ordinary text, so a search for the contiguous keyword never matches: the letters are not adjacent in the file.
This is why your original paste of the file showed so many unprintable characters and a space between each letter of your target string.
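To make the effect of the interleaved nulls concrete, here is a minimal sketch. The sample file is hypothetical: it is built with printf to mimic the od dump above and is not the real FP00000.SHD. It assumes a POSIX shell with mktemp, od and grep available.

```shell
#!/bin/sh
# Hypothetical sample: "bviwacs" with a null byte after every letter,
# imitating the layout shown in the od -c dump (not the real .SHD file).
sample=$(mktemp)
printf 'b\000v\000i\000w\000a\000c\000s\000' > "$sample"

# 7 letters plus 7 null bytes are stored on disk.
raw_bytes=$(wc -c < "$sample" | tr -d ' ')

# Count lines containing the contiguous keyword. grep exits non-zero
# when nothing matches, hence the "|| true".
raw_hits=$(grep -c 'bviwacs' "$sample" || true)

od -c "$sample"
echo "bytes: $raw_bytes, lines matching bviwacs: $raw_hits"
```

The byte count is twice the number of letters, and the search finds nothing because a null byte sits between every pair of letters.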

Having established the problem, the fix is simple: remove the nulls before manipulating the strings: -

TX5XN:/home/brad/forum/inch>tr -d "\000" < FP00000.SHD | strings
Adobe PDF
PRIV
EBDA
Standard
bviwacsbviwacssmuAB462691_200908148 19 27 296 - BerichtAdobe PDFAdobe PDF ConverterWinPrintNT EMF 1.008\\PC267IWACS

So adding the stripping of the nulls to the tool gives us the ability to correctly process the string we want to turn into a file name: -

TX5XN:/home/brad/forum/inch>inch
TX5XN:/home/brad/forum/inch>ls -l
total 168
-rw-r--r-- 1 brad root  2080 2009-08-14 20:10 FP00000.SHD
-rw-r--r-- 1 brad root 75368 2009-10-08 11:46 FP00000.SPL
-rwxrwxrwx 1 brad root   696 2009-10-09 17:48 inch
-rw-r--r-- 1 brad root 75368 2009-10-09 17:48 smuAB462691_2009081481927296-Berich.SPL

Here is the modified code: -

ls *.SHD | while read FILE_NAME
do
    if [[ -w "${FILE_NAME%.*}.SPL" ]]    ## Ignore any file with no .SPL file
    then
        NEW_FILE_NAME=$( tr -d "\000" < "$FILE_NAME" | strings | nawk ' BEGIN {
            ## Keywords are contiguous once the nulls are stripped
            first = "viwacsbviwacs"
            last = "Adobe"
        }
        ( $0 ~ first ) && ( $0 ~ last ) {
            startm = index( $0, first ) + length( first )
            ## index() returns the position of the first character of last,
            ## so the length is endm - startm; subtracting 1 here drops
            ## the final character of the name (Berich instead of Bericht)
            endm = index( $0, last )
            file_name = substr( $0, startm, endm - startm )
        } END {
            ## Remove spaces from new file name
            gsub( / /, "", file_name )
            print file_name
        } ' ).SPL

        ## Copy. change cp to mv if required
        cp "${FILE_NAME%.*}.SPL" "$NEW_FILE_NAME"
    else
        echo "File ${FILE_NAME} has no matching .SPL file"
    fi
done
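The directory listing earlier in this post shows a new file name ending in "Berich" rather than "Bericht". The following sketch walks through the index/substr arithmetic on the stripped sample line to show where that last character goes. The extract function and its adjust parameter are hypothetical names used only for this illustration; plain awk is assumed in place of nawk.

```shell
#!/bin/sh
# Sample text copied from the stripped output shown earlier in this post.
line='bviwacsbviwacssmuAB462691_200908148 19 27 296 - BerichtAdobe PDFAdobe PDF Converter'

# extract ADJUST: hypothetical helper; ADJUST is added to the substr length
# so both variants of the arithmetic can be compared.
extract() {
    printf '%s\n' "$line" | awk -v adjust="$1" '
    BEGIN { first = "bviwacsbviwacs"; last = "Adobe" }
    {
        startm = index($0, first) + length(first)  # first char after start keyword
        endm   = index($0, last)                   # first char of end keyword
        name   = substr($0, startm, endm - startm + adjust)
        gsub(/ /, "", name)                        # same space removal as the script
        print name
    }'
}

with_minus_one=$(extract -1)   # same as endm = index(...) - 1
without=$(extract 0)           # length is exactly endm - startm

echo "$with_minus_one"
echo "$without"
```

With the extra -1 the name loses its final "t". In the earlier test file the letters were separated by spaces, so the character that was dropped there was a space and the problem did not show.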

Note that there is no real error checking for existing files etc.; I will leave you to add that yourself.
You should also note that it needs to be run in the target directory, so you will need to modify it if you want to handle multiple directories.
Until you add this kind of validation, I would copy the files into a work directory and process them there first to avoid any mishaps or lost data.
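As one possible starting point for that validation, here is a minimal sketch. The safe_copy function is a hypothetical helper, not part of the script above: it refuses to copy when no name was extracted or when the destination already exists. The demonstration runs in a scratch directory created with mktemp, so no real spool files are touched.

```shell
#!/bin/sh
# safe_copy SRC NEW_BASENAME: hypothetical helper for illustration only.
# Copies SRC to NEW_BASENAME.<ext of SRC> in the same directory, unless
# the new name is empty or the destination already exists.
safe_copy() {
    src=$1
    new_base=$2
    ext=${src##*.}
    if [ -z "$new_base" ]; then
        echo "No name extracted for $src, skipping" >&2
        return 1
    fi
    dest="$(dirname "$src")/$new_base.$ext"
    if [ -e "$dest" ]; then
        echo "$dest already exists, skipping" >&2
        return 1
    fi
    cp -- "$src" "$dest"
}

# Demonstration in a scratch directory with an empty stand-in file.
work=$(mktemp -d)
: > "$work/FP00000.SPL"

safe_copy "$work/FP00000.SPL" "smuAB462691" && first_try=ok || first_try=skipped
safe_copy "$work/FP00000.SPL" "smuAB462691" && second_try=ok || second_try=skipped
safe_copy "$work/FP00000.SPL" "" && empty_try=ok || empty_try=skipped

echo "$first_try $second_try $empty_try"
```

The first call copies the file, the second is refused because the destination now exists, and the third is refused because the extracted name is empty. A name containing a slash would still need separate handling.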

Hope it is useful.

cool - thanks a lot!
Now it's working perfectly.