How do you write each line of file1 (line n) into its own new file?

web_developer · June 28, 2009, 12:26am

File 1
<html>ta da....unique file name I want to give file=>343...</html>
<html>da ta 234 </html>
<html>pa da 542 </html>
and so on...

File 2
343
234
542
and so on; each line in File 1 corresponds to the line at the same position in File 2

I have tried several grep, sed, and while ... read ... do ... done scripts, to no avail.

I need a ksh script that reads File 1 one line at a time (with read, awk, or something else) and redirects (>) each line into a file named after the corresponding unique identifier in the HTML code. The unique identifier is not at the beginning or the end of the line; it is in the middle. Each line has <html> at the beginning and </html> at the end.

Any example scripts would be great...

Thanks

kshji · June 28, 2009, 4:33am

You can do this many ways; here is one example using parameter expansion.

#!/bin/ksh
# read lines from stdin
while read line
do
        # remove begin of line including <html>
        a1=${line#*<html>}
        # remove end of line including </html>
        a2=${a1%</html>*}
        # remove all char except numbers (replace not numbers with nothing)
        a3=${a2//[^0-9]/}
        print $a3
done

And then run it

chmod a+x thisfile
cat file1 | ./thisfile > file2
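
To make the three steps above concrete, here is a minimal sketch that runs them on a single line. The sample line is made up, and the digit filter uses tr instead of ${a2//[^0-9]/}; that swap is an assumption made so the sketch also runs in shells that lack that expansion. It also assumes the identifier is the only number on the line, otherwise all digits on the line get joined together.

```sh
# Sample line in the shape of File 1 (the text around the number is made up).
line='<html>ta da 343 more text</html>'

# Remove the shortest leading match of *<html>, i.e. everything up to and including <html>.
a1=${line#*"<html>"}
# Remove the shortest trailing match of </html>*, i.e. </html> and anything after it.
a2=${a1%"</html>"*}
# Keep only the digits: tr -cd deletes every character that is not 0-9.
id=$(printf '%s' "$a2" | tr -cd '0-9')

printf '%s\n' "$a1"   # ta da 343 more text</html>
printf '%s\n' "$a2"   # ta da 343 more text
printf '%s\n' "$id"   # 343
```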

web_developer · June 28, 2009, 11:40am

In summary, see below.

Line 1 of file1 (<html>...unique identifier23432..</html>) needs to be redirected (>) into a new file named after the identifier on line 1 of file2, i.e. 23432.html, so that each record gets its own new file named after its unique identifier.

---------- Post updated at 08:51 AM ---------- Previous update was at 08:36 AM ----------

#!/bin/ksh

#create counter
cnt=0
# read lines from stdin
while read line
do
# remove begin of line including <html>
a1=${line#<html>}
# remove end of line including </html>
a2=${a1%</html>}
# remove all char except numbers (replace not numbers with nothing)
a3=${a2//[^0-9]/}
print $a3
#increment cnt for testing creation of new unique identifier
cnt=$(($cnt+1))
done > $cnt.html

$ ksh test3.ksh

test3.ksh[17]: : bad substitution
$
Only this one file is created:
0 Jun 28 08:41 0.html
It is empty, and there is no 1.html, 2.html, 3.html and so forth.

Any other ideas?
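
Two things may explain the result above. First, in "done > $cnt.html" the redirection belongs to the loop as a whole, so the shell expands $cnt once, before the first iteration, while it is still 0; that gives a single 0.html. Second, the ${a2//[^0-9]/} form is a ksh93 feature, and an older ksh (ksh88 or pdksh) typically answers it with "bad substitution"; which ksh is installed here is an assumption, and piping through tr -cd '0-9' is a portable alternative. A minimal sketch of the counter version with the redirection moved inside the loop, using made-up sample data in a scratch directory:

```sh
# Work in a scratch directory so the demo does not touch real files.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# Made-up sample input in the shape of File 1.
cat > file1 <<'EOF'
<html>ta da 343</html>
<html>da ta 234 </html>
<html>pa da 542 </html>
EOF

cnt=0
while IFS= read -r line
do
    cnt=$((cnt + 1))
    # This redirection is evaluated on every pass, so each line gets its own file.
    printf '%s\n' "$line" > "$cnt.html"
done < file1

# List what was created.
ls
```

With the redirection on the printf instead of on done, the file name is recomputed for every line.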

kshji:

You can do this many ways; here is one example using parameter expansion.

#!/bin/ksh
# read lines from stdin
while read line
do
   # remove begin of line including <html>
   a1=${line#*<html>}
   # remove end of line including </html>
   a2=${a1%</html>*}
   # remove all char except numbers (replace not numbers with nothing)
   a3=${a2//[^0-9]/}
   print $a3
done

And then run it

chmod a+x thisfile
cat file1 | ./thisfile > file2

Thank you, but unfortunately this will not create what I need.

kshji · June 28, 2009, 12:00pm

I'm not sure what you are trying to do. Please post a short example of the input file and an example of the result you would like.

while ...
do
     # a3 is the key value; see the first example script
     > $a3.html
done

cnt=1
while read line
do
    # create/overwrite an empty file named after some variable value
    > $cnt.html
    # or write something to the file
    print something > $cnt.html
    ((cnt+=1))
done
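
Putting the two fragments above together, a sketch of the whole job might look like this: read each line, pull out the key, and write the line to a file named after that key. The sample data and the scratch directory are made up, tr stands in for ${a2//[^0-9]/} so that the sketch does not depend on ksh93, and it assumes the key is the only number on each line.

```sh
# Work in a scratch directory so the demo does not touch real files.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# Made-up sample input: one complete <html>...</html> page per line.
cat > test1 <<'EOF'
<html>ta da 343</html>
<html>da ta 234 </html>
<html>pa da 542 </html>
EOF

while IFS= read -r line
do
    # Strip the wrapping tags, then keep only the digits as the key.
    body=${line#*"<html>"}
    body=${body%"</html>"*}
    id=$(printf '%s' "$body" | tr -cd '0-9')

    # Skip lines without digits instead of writing to a file named ".html".
    [ -n "$id" ] || continue

    # Write the whole original line to its own file.
    printf '%s\n' "$line" > "$id.html"
done < test1
```

If two lines share the same key, the later one overwrites the earlier file, so this only fits when the keys really are unique.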

web_developer · June 28, 2009, 12:07pm

My input file is a list of HTML code for products, each with a unique key (its id number) in the description in the code.

test1
<html>(the code for product #####)</html> <== a complete webpage

I wanted to use the cnt value as a unique way of creating a new, different file for each line in the test1 file. In essence, the script should create a new HTML file for each line. I have tested and verified that the lines are separated by a carriage return and that there are no tabs or carriage returns within a line itself.

filename.txt is another possible input file. I tried to use a mv script to rename the cnt.html files created by the first script.
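
If filename.txt holds one name per line in the same order as test1 (an assumption, since its format is not shown here), the rename step can be skipped by reading both files in lockstep on separate file descriptors. A minimal sketch with made-up sample data in a scratch directory:

```sh
# Work in a scratch directory so the demo does not touch real files.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# Made-up sample inputs: line n of filename.txt names the page on line n of test1.
cat > test1 <<'EOF'
<html>ta da 343</html>
<html>da ta 234 </html>
<html>pa da 542 </html>
EOF
cat > filename.txt <<'EOF'
343
234
542
EOF

# Descriptor 3 reads test1 and descriptor 4 reads filename.txt.
# The loop ends as soon as either file runs out of lines.
while IFS= read -r line <&3 && read -r name <&4
do
    # read without IFS= trims blanks around the name; the page line is kept verbatim.
    printf '%s\n' "$line" > "$name.html"
done 3< test1 4< filename.txt
```

This way each line is written straight to its final name, and no counter files or mv pass are needed.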