Need to find a string, check the next line, and if it matches certain criteria, replace it with a s

midniteslice · November 16, 2009, 7:39am

Hey Fellas.

I am new to scripting. I have searched through the forums and found a lot of good info, but I can't seem to get any of it to work together. I am trying to find a particular string in a file, and if the next line matches certain criteria, replace it with a string from a csv file. This needs to happen roughly 1500 times in the file. I have been trying to do this with awk/sed, but I haven't had the best of luck.
Example:
Find the line that starts with data7.
If the next line contains the word prisoner, replace it with the string in line one of the csv file.
The next instance of the pattern would be replaced by the string in line 2 of the csv file, and so on.
The word prisoner appears multiple times in the file, but I only want to replace the ones that directly follow the line containing "data7".
Does this make sense?

When I say I'm brand new to scripting I mean BRAND new. I'm taking some classes, but I need to get this done as soon as possible and what I have learned so far hasn't gotten me much closer. Any help y'all can give would be greatly appreciated.
Thanks!

ghostdog74 · November 16, 2009, 7:49am

show your code, and your input csv files.

midniteslice · November 16, 2009, 8:15am

I don't have my linux box handy, and that's where all my code snippets are. I'll post them when I get home tonight. Here are some examples of the files:
File 1:

Type: SystemDeclarationData;7           #In this example we'll use "JDLM". The first
1BN22IN JDLM CLASS I IV VIII SP         #string I would need to find would be the
                                        #"SystemDeclarationData" string. The next
                                        #line contains "JDLM", so I would need to
                                        #replace it. Notice that the term "JDLM"
                                        #appears elsewhere in the file. These
                                        #instances need to stay as is.




U0010000003

JDLM CLASS I IV VIII SP

















1
1
25
-1
0
0
0
0
0
Type: SupplyLoad;4
0
0
end;

end;

Type: SystemDeclarationData;7
1BN22IN JDLM CLASS V SP





U0010000004

JDLM CLASS V SP

















1
1
26
-1
0
0
0
0
0
Type: SupplyLoad;4
0
0
end;

end;

I don't have a copy of the csv file yet. As far as I know it will only consist of two columns:

NAME  
last, first mi    

NUMBER
this will be a 5-10 digit number

There will be about 1500 instances that will need to be changed, if not more.
I appreciate the quick replies. I'll post the code that I have as soon as I get home tonight.
Thanks again

frans · November 16, 2009, 10:05am

Try this (but a sample could help !)

#!/bin/bash
Input="data.csv"
I=1
while read LINE
do    #~ find the string that starts with data7
    if [ "${LINE:0:5}" = "data7" ]
    then
        read LINE    #~ if the string on the next line has the word prisoner then replace it with the string in line 1.
        if $(echo "$LINE" | grep -q prisoner)
        then
            head -n$I "$Input" | tail -n1
            (( I ++ ))    #~ the next instance of the pattern would be replaced by the string in line 2 and so on.
        else echo "$LINE"
        fi
    else echo "$LINE"
    fi
done < "$Input" > OutputFile

midniteslice · November 16, 2009, 1:03pm

Thanks FRANS! This looks a lot simpler than I thought it would be. I'll try it out as soon as I get on my linux box.
For the most part I understand what's happening here. Could you explain the purpose of the ":0:5" in this line? I found what the curly brackets do, but as of yet, not the colons.

 if [ "${LINE:0:5}" = "data7" ]

Thanks for all the help!

---------- Post updated at 12:05 PM ---------- Previous update was at 10:43 AM ----------

As far as samples go, I have a file (format unknown). There are about 15000 lines or so that contain the word "prisoner". Each prisoner will be assigned a number that will be pulled from a .csv file. The input file looks like this (the real one is right around 44000 lines long):

Type: SystemDeclarationData;7    
1BN22IN prisoner



U0010000003

prisoner VIII SP

















1
1
25
-1
0
0
0
0
0
Type: SupplyLoad;4
0
0
end;

end;

Type: SystemDeclarationData;7
1BN22IN prisoner CLASS V SP





U0010000004

prisoner CLASS V SP

















1
1
26
-1
0
0
0
0
0
Type: SupplyLoad;4
0
0
end;

end;

The output would then look something like this:

Type: SystemDeclarationData;7    
100000056



U0010000003

prisoner VIII SP

















1
1
25
-1
0
0
0
0
0
Type: SupplyLoad;4
0
0
end;

end;

Type: SystemDeclarationData;7
20000056





U0010000004

prisoner CLASS V SP

















1
1
26
-1
0
0
0
0
0
Type: SupplyLoad;4
0
0
end;

end;

Notice that only the "prisoner" string underneath the "Type: SystemDeclarationData;7" string changed.
The .csv file will end up being either a one or two column file depending on whether or not they want names along with the numbers.
I don't know if that helps or not. I don't have much to contribute as far as a real sample goes at the moment. They pretty much gave me a basic structure, a goal, and said have fun. heh.
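
If the .csv does end up with a name column in front of the number, the number would have to be pulled out of each row before it is used as a replacement. A minimal sketch of one way to do that follows. It assumes the number is always the last column and contains no commas or surrounding spaces; last_field is a made-up helper name, not something from the thread.

```shell
#!/bin/bash
# last_field is a hypothetical helper: print the last comma-separated
# field of every non-empty row in the file given as $1.
# Assumption: the number is the final column and has no spaces around it.
last_field() {
    local row
    # tr strips carriage returns in case the csv was saved on Windows.
    tr -d '\r' < "$1" | while IFS= read -r row; do
        [ -n "$row" ] || continue        # skip blank rows
        printf '%s\n' "${row##*,}"       # drop everything up to the last comma
    done
}
```

Taking the last field rather than the second one matters here because a name written as "last, first mi" contains a comma itself. Something like last_field names_and_numbers.csv > numbers_only.csv (both file names are placeholders) would then give a one-column file.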

---------- Post updated at 01:03 PM ---------- Previous update was at 12:05 PM ----------

Frans:
We are on the right track!! I modified the script that you posted a little bit:

#!/bin/bash
changefile="test_csv.csv"         #this is the file the new string is pulled from
Input="test.fplan"    #this is the file that needs changed
I=1

while read LINE
do    #~ find the string that starts with data7
    if [[ "${LINE}" =~ "SystemDeclarationData" ]]
    then
        read LINE    #~ if the string on the next line has the word prisoner then replace it with the string in line 1.
        if [[ "${LINE}" =~ "JDLM" ]]
        then
            head -n$I "$changefile" | tail -n1
            (( I ++ ))    #~ the next instance of the pattern would be replaced by the string in line 2 and so on.
        else echo "$LINE"
        fi
    else echo "$LINE"
    fi
done < "$Input" > /home/bj/Desktop/test_complete

The only problem I'm having is that it's deleting the line that contains "SystemDeclarationData". Otherwise it's working great. Can you tell me what I'm doing wrong?

frans · November 16, 2009, 1:24pm

midniteslice:

Thanks FRANS! This looks a lot simpler than I thought it would be. I'll try it out as soon as i get on my linux box.
I for the most part understand whats happening here. Could you explain the purpose of the ":0:5" in this line? I found what the curly brackets do, but as of yet, not the colons.
 if [ "${LINE:0:5}" = "data7" ]

That is bash substring expansion: ${LINE:0:5} extracts 5 characters starting from position 0 (the first character).
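
A quick illustration of that syntax, as a minimal sketch. The sample value of LINE and the variable names first_five and rest are made up for the demo.

```shell
#!/bin/bash
# ${VAR:offset:length} is a bash feature; plain POSIX sh does not have it.
LINE="data7 some other text"    # hypothetical sample line

first_five="${LINE:0:5}"        # 5 characters starting at offset 0
rest="${LINE:6}"                # no length given: everything from offset 6 on

echo "$first_five"              # prints: data7
echo "$rest"                    # prints: some other text

if [ "$first_five" = "data7" ]; then
    echo "line starts with data7"
fi
```

Because only the first 5 characters are compared, this test can only ever match a 5-character string such as "data7". To look for a longer word anywhere in the line, a pattern match like [[ "$LINE" =~ "SystemDeclarationData" ]] is the better fit.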

The script is wrong; I'll modify it to work properly.
In a couple of minutes, OK?
The script:

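#!/bin/bash
Input="test.fplan"            #~ the file that needs to be changed
changefile="test_csv.csv"     #~ the file the new strings are pulled from
I=1
while IFS= read -r LINE
do    #~ find the line that contains SystemDeclarationData
    echo "$LINE"    #~ always print the current line, so the marker line is kept
    [[ "$LINE" =~ "SystemDeclarationData" ]] || continue #~ continue if the condition is not satisfied
    IFS= read -r LINE || break    #~ stop if the file ends right after a marker line
    if echo "$LINE" | grep -q prisoner
    then    head -n$I "$changefile" | tail -n1    #~ print line I of the csv instead
        (( I ++ ))
    else echo "$LINE"
    fi
done < "$Input" > test_complete    #~ output file; never write to the csv itself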
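
With around 1500 replacements, calling head and tail for each one rereads the csv from the top every time. A possible variation, offered only as a rough sketch, keeps the csv open on a separate file descriptor and reads one line from it per replacement. The function name replace_after_marker and its argument order are made up for illustration; the marker and keyword values come from the samples in this thread.

```shell
#!/bin/bash
# replace_after_marker is a hypothetical helper, not from the thread.
# Usage: replace_after_marker INPUT REPLACEMENTS MARKER KEYWORD > OUTPUT
replace_after_marker() {
    local input=$1 replacements=$2 marker=$3 keyword=$4
    local line next new
    exec 3< "$replacements"                 # fd 3 walks the csv exactly once
    while IFS= read -r line; do
        printf '%s\n' "$line"               # always keep the current line
        case $line in
            *"$marker"*) ;;                 # marker found: inspect the next line
            *) continue ;;
        esac
        IFS= read -r next || break          # input ended right after a marker
        case $next in
            *"$keyword"*)
                if IFS= read -r new <&3; then
                    printf '%s\n' "$new"    # substitute the next csv line
                else
                    printf '%s\n' "$next"   # csv exhausted: keep the original
                fi ;;
            *) printf '%s\n' "$next" ;;     # no keyword: leave the line alone
        esac
    done < "$input"
    exec 3<&-                               # close the csv
}
```

It could be called as replace_after_marker test.fplan test_csv.csv "SystemDeclarationData" "prisoner" > test_complete. Printing each line before checking it is what keeps the marker line in the output. The earlier version dropped that line because the second read overwrote LINE before anything had echoed it.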

midniteslice · November 16, 2009, 2:08pm

I got it working with an echo command. Essentially, I guess both are doing the same thing, just outputting in a different format? Yours is a whole lot cleaner!! heh. I can't thank you enough for the help. Credit goes to you, my friend!