Need to replace the date inside a node of several rdf files

Hi,
I have a zip archive that contains several *.rdf files.
I need to replace the date (which is different in each rdf file) inside the node "Date_de_Publication_Periodique" of these files.
e.g.,

 
awk '/Date_de_Publication_Periodique/ && /XMLSchema#date/' MM_NN-A1B1C1_ABC.rdf

gives:

 
 <j.1:Date_de_Publication_Periodique rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2013-10-30</j.1:Date_de_Publication_Periodique>

I need to replace this date: 2013-10-30 with my desired date.
In a shell script, I want to unzip the archive, find the line for this node in each rdf file, replace its date with one fixed date of my choosing, and then zip the files back up.
Following are my failed attempts:

awk '/Date_de_Publication_Periodique/ && /XMLSchema#date/' MM_NN-A1B1C1_ABC.rdf | awk '{gsub(/\d{4}\-\d{2}-\d{2}/,"JJJJJJ"); print}'
 
awk '/Date_de_Publication_Periodique/ && /XMLSchema#date/' MM_NN-A1B1C1_ABC.rdf | awk '{sub(/\d{4}\-\d{2}-\d{2}/,"JJJJJJ"); print }'

Need suggestions.

Try:

awk '/Date_de_Publication_Periodique/ && /XMLSchema#date/ {sub(/[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}/,"JJJJJJ"); print}' MM_NN-A1B1C1_ABC.rdf

Thanks for the reply but your code is not working:

 
awk '/Date_de_Publication_Periodique/ && /XMLSchema#date/ {sub(/[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}/,"JJJJJJ"); print}' MM_NN-A1B1C1_ABC.rdf
 
        <j.1:Date_de_Publication_Periodique rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2013-10-30</j.1:Date_de_Publication_Periodique>
 

Chubler_XL's solution works for me on AIX & Debian.

What's your OS? If you're running SunOS/Solaris, then try with /usr/xpg4/bin/awk.
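If the awk on your system does not accept interval expressions such as {4} (some older gawk releases need --re-interval for them, and some mawk builds lack them; that is only a guess about your setup, nothing in this thread confirms it), a sketch that spells out each digit sidesteps the problem. Note also that \d from the first attempts is a Perl-style shorthand that awk regular expressions do not define. The sample file and the date 2014-01-01 below are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample file, modelled on the line shown in the question.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.rdf"
cat > "$sample" <<'EOF'
<rdf:Description>
 <j.1:Date_de_Publication_Periodique rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2013-10-30</j.1:Date_de_Publication_Periodique>
 <j.1:Other>2013-10-30</j.1:Other>
</rdf:Description>
EOF

# On the matching line only, replace the first YYYY-MM-DD date.
# Each digit is written out, so no {n} interval support is needed.
# The final { print } block prints every line, changed or not.
result=$(awk -v newdate="2014-01-01" '
/Date_de_Publication_Periodique/ && /XMLSchema#date/ {
    sub(/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/, newdate)
}
{ print }' "$sample")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Because every line is printed, the output can be written to a new file and used in place of the original.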

I'm using Linux and running the command in a bash shell.

Try with sed:

sed '/Date_de_Publication_Periodique.*XMLSchema#date/s/[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}/JJJJJJ/' MM_NN-A1B1C1_ABC.rdf

Thank you subbeh and chubler_xl and carlom.
I figured out the working solution:

I forgot to mention another requirement: I need to pick out only the node that has both "Date_de_Publication_Periodique" and "XMLSchema#date", and then replace the date inside it with the desired date.
While the awk command is returning only that line, the sed command was returning multiple lines.

The sed command will return every line in the file, with the desired one modified. If you want only that one line, make it sed -n '/.../p' file
But - didn't you want to zip back the files after modification?
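To make the difference concrete, here is a small sketch that runs the sed command from earlier in the thread both ways: once as posted, and once with -n plus the p flag on the substitution. The sample file is made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample file, modelled on the line shown in the question.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.rdf"
cat > "$sample" <<'EOF'
<rdf:Description>
 <j.1:Date_de_Publication_Periodique rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2013-10-30</j.1:Date_de_Publication_Periodique>
 <j.1:Other>2013-10-30</j.1:Other>
</rdf:Description>
EOF

# Default behaviour: every input line is printed, the matching one modified.
all=$(sed '/Date_de_Publication_Periodique.*XMLSchema#date/s/[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}/JJJJJJ/' "$sample")

# With -n, nothing is printed automatically; the p flag on s///
# prints only the lines where a substitution actually happened.
only=$(sed -n '/Date_de_Publication_Periodique.*XMLSchema#date/s/[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}/JJJJJJ/p' "$sample")

echo "--- all lines ---"
printf '%s\n' "$all"
echo "--- modified line only ---"
printf '%s\n' "$only"
rm -rf "$tmpdir"
```

The first form is the one to use when the result has to replace the original file, since the second form drops the rest of the document.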

Are you sure you want to output only those lines? How do you intend to change these in the original file and add it back to the zip archive?

Yes, sorry! I didn't think about that. I first thought of using sed -i 's//g' file. But that is not possible.

---------- Post updated at 09:32 AM ---------- Previous update was at 09:10 AM ----------

I ran the command suggested by Subbeh and checked the file content this time. The desired line has the date replaced by the required one.

 
<j.1:Date_de_Publication_Periodique rdf:datatype="http://www.w3.org/2001/XMLSchema#date">JJJJJJ</j.1:Date_de_Publication_Periodique>

So I will have to redirect the result to the file with the same name before zipping it.

Don't! Redirect to a temporary file, and then rename that to the original file. Redirecting to the original file makes the shell empty it before sed gets a chance to read it.
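A minimal sketch of that temp-file approach, applied to every rdf file in a directory, could look like this. The function name replace_dates, the directory name work, and the archive names in the usage comments are all hypothetical; the sed command is the one from earlier in the thread. It assumes the new date contains no / or & characters, since those are special in a sed replacement.

```shell
#!/bin/sh
# replace_dates DIR NEWDATE  (hypothetical helper name)
# Rewrites each *.rdf file in DIR via a temporary file, so the
# original is never truncated while sed is still reading it.
replace_dates() {
    dir=$1
    newdate=$2
    for f in "$dir"/*.rdf; do
        [ -f "$f" ] || continue            # skip if the glob matched nothing
        tmp=$(mktemp "$f.XXXXXX") || return 1
        if sed "/Date_de_Publication_Periodique.*XMLSchema#date/s/[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}/$newdate/" "$f" > "$tmp"; then
            mv "$tmp" "$f"                 # replace the original only on success
        else
            rm -f "$tmp"
            return 1
        fi
    done
}

# Usage sketch (archive and directory names are assumptions):
#   unzip -q rdf_files.zip -d work
#   replace_dates work 2014-01-01
#   (cd work && zip -q -r ../rdf_files_new.zip .)
```

Note that mktemp creates the temporary file with restrictive permissions, so the rewritten files may need a chmod before zipping if the original modes matter.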

Yes, I did the same.

Thank you! :)