Using sed or awk to replace digits in files

duke0001 · October 4, 2019, 11:47am

Hello;

I am not good at file and stream editing. I need to replace a few digits in two files. The lines in the files look like this:

Line in the first file,

/dw300/data/obe/2019273.L800JR.1909.273

Line in second file,

1|2019273.L800JR.1909.273

I will write a function that connects to the database and selects the Last Julian Business Day and the Julian Date Digits, giving two output values like this:

Last Julian Business Day      Last Julian Date Digits
2019304                       1910.304

Then I need to use the first number, 2019304, to replace 2019273 in both files, and to replace 1909.273 with 1910.304 in both files as well.

After the replacement, the lines in the two files should look like this:
Line in the first file,

/dw300/data/obe/2019304.L800JR.1910.304

Line in second file,

1|2019304.L800JR.1910.304

The function output can come from one select statement or from separate select statements. The output values will be used in a sed or awk command to replace the old digits.

I am learning to use sed and awk to manipulate and edit files. Please help me figure out how to use either sed or awk to replace these digits in the files. If you can include the logic or a simple explanation of your command, it will be greatly appreciated.

My Unix environment is Solaris 11 and ksh shell.

Thanks for your advice and help.

duke0001 · October 4, 2019, 4:02pm

Hi, All;

I also tested using sed to replace the digits in one line:

sed 's/[0-9]\{7\}/2019304/' test_file1.txt

(The replacement value is typed in by hand in this case.)

I got output as:

 /dw300/data/obe/2019304.L800JR.1909.273

Now the problem is that I couldn't replace 1909.273 with the same command logic. I need to figure out how sed can match a pattern of the form digits.digits. How should I write the dot in the pattern? Please provide your advice. Thanks.

Scrutinizer · October 4, 2019, 4:31pm

Try [0-9]\{4\}\.[0-9]\{3\}
The dot needs to be escaped with a backslash, since it is a special character in regular expressions (unescaped, it matches any single character).
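
To illustrate the point about the dot, here is a minimal sketch. The sample string is made up for the illustration (the `1909x273` token does not occur in the files above); it only shows where an unescaped dot would match something it should not.

```shell
# Made-up sample: the first token has an "x" where the dot should be.
sample='1909x273 1909.273'

# Unescaped dot: matches ANY single character, so the first token is replaced.
unescaped=$(printf '%s\n' "$sample" | sed 's/[0-9]\{4\}.[0-9]\{3\}/1910.304/')

# Escaped dot: matches only a literal ".", so the second token is replaced.
escaped=$(printf '%s\n' "$sample" | sed 's/[0-9]\{4\}\.[0-9]\{3\}/1910.304/')

printf '%s\n' "$unescaped"   # 1910.304 1909.273
printf '%s\n' "$escaped"     # 1909x273 1910.304
```

On the lines from the first post the unescaped form happens to give the same result, but only because nothing else on the line fits the looser pattern.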

duke0001 · October 5, 2019, 12:23pm

Scrutinizer:

Thanks for the advice. I tested your suggestion with a command like this:

sed 's/[0-9]\{7\}/2019304/; s/[0-9]\{4\}\.[0-9]\{3\}/1910.304/' test_file1.txt

The output is what I need:

/dw300/data/obe/2019304.L800JR.1910.304

For now, the replacement digits are entered manually. In the real situation, I will use variables for the replacement digits.

Thanks a lot for your help. Other experts are welcome to comment on my command and to suggest better ways to do this work.

Scrutinizer · October 5, 2019, 6:44pm

You could try an alternative approach like this, piping the output of your database command into an awk script, which reads it from stdin:

some_database_command |             
awk '
  NR==FNR {                                    # When reading the first file (stdin in this case)
    if(FNR==2) {                               # When we encounter the second line
      business_day=$1                          # Save the values
      date_digits=$2
    }
    next                                       # Do not process the rest in case of the first file.
  }

  /L800JR/ {                                   # For the two input files: if the line contains "L800JR"
    split($NF,F,".")                           # split the last field on the dot character
    $NF=business_day "." F[2] "." date_digits  # Recreate the last field using the second split field
  }

  {
    print > (FILENAME ".new")                  # print the two input files to "filename".new
  }
' - FS=/ OFS=/ file1 FS=\| OFS=\| file2        # Read stdin (-) as the first "file" and use "/" and
                                               # "|" as field separators for the two files respectively
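
To try the script above without a database, the following sketch may help. It assumes a scratch directory created with `mktemp -d`, and it uses a hypothetical shell function `fake_db_output` as a stand-in for the real database command, printing a header line followed by the two values. The sample lines are the ones from the first post.

```shell
# Work in a scratch directory so that no real file is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample input files, using the lines from the first post.
printf '%s\n' '/dw300/data/obe/2019273.L800JR.1909.273' > file1
printf '%s\n' '1|2019273.L800JR.1909.273' > file2

# Hypothetical stand-in for the database command: a header line, then the values.
fake_db_output() {
  printf '%s\n' 'Last Julian Business Day      Last Julian Date Digits' \
                '2019304                       1910.304'
}

fake_db_output |
awk '
  NR==FNR {                      # stdin is the first "file"
    if (FNR==2) {                # the values are on the second line
      business_day=$1
      date_digits=$2
    }
    next
  }
  /L800JR/ {
    split($NF, F, ".")           # F[2] holds the middle part, e.g. "L800JR"
    $NF = business_day "." F[2] "." date_digits
  }
  { print > (FILENAME ".new") }
' - FS=/ OFS=/ file1 FS=\| OFS=\| file2

new1=$(cat file1.new)
new2=$(cat file2.new)
printf '%s\n%s\n' "$new1" "$new2"
```

The results land in file1.new and file2.new, and the original files are left unchanged.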

duke0001 · October 6, 2019, 12:43am

Scrutinizer:

I used the SQL queries below to fetch business_day and date_digits into the shell variables jdate1 and jdate2, and then used sed to replace them. It worked well.
How would my SQL queries work with your awk code? My queries look like this:

jdate1=`sqlplus -s /nolog <<EOF
connect / as sysdba
set pagesize 0 feedback off heading off
SELECT TO_CHAR(DW_ADHOC.F_FIND_LAST_BUSINESS_DAY, 'YYYYDDD') "Last_Business_Day" from dual;
exit
EOF`

The output is: 2019304

jdate2=`sqlplus -s /nolog <<EOF
connect / as sysdba
set pagesize 0 feedback off heading off
SELECT TO_CHAR(DW_ADHOC.F_FIND_LAST_BUSINESS_DAY, 'YY')||to_char(to_date(TO_CHAR(DW_ADHOC.F_FIND_LAST_BUSINESS_DAY, 'DDD'), 'j'), 'MM')
||'.'||TO_CHAR(DW_ADHOC.F_FIND_LAST_BUSINESS_DAY, 'DDD') "Julian Date Digits"  FROM dual;
exit
EOF`

The output is: 1910.304

Then I used:

sed 's/[0-9]\{7\}/'$jdate1'/; s/[0-9]\{4\}\.[0-9]\{3\}/'$jdate2'/' test_file1.txt > test_file1_new.txt

sed 's/[0-9]\{7\}/'$jdate1'/; s/[0-9]\{4\}\.[0-9]\{3\}/'$jdate2'/' test_file2.txt > test_file2_new.txt

The digits in the two files have been replaced with the correct Julian date digits from the variables. I redirect the output to a new file because sed on Solaris does not support -i for editing in place. I can then use mv to overwrite the original file, so that another application can use it under its original name. I would like to learn from you how to use your awk code to do this work. Thanks.
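
The write-to-a-new-file-then-mv step described above could be sketched as a loop like the one below. This is only an illustration under some assumptions: the values of jdate1 and jdate2 are hard-coded instead of coming from sqlplus, they contain no whitespace or "/" characters, and the sample files are created in a scratch directory made with `mktemp -d`.

```shell
# Values that would normally come from the sqlplus queries (hard-coded for the sketch).
jdate1=2019304
jdate2=1910.304

# Scratch copies of the two files, using the lines from the first post.
workdir=$(mktemp -d)
printf '%s\n' '/dw300/data/obe/2019273.L800JR.1909.273' > "$workdir/test_file1.txt"
printf '%s\n' '1|2019273.L800JR.1909.273' > "$workdir/test_file2.txt"

for f in "$workdir/test_file1.txt" "$workdir/test_file2.txt"
do
  # Write to a temporary file; replace the original only if sed succeeded.
  sed "s/[0-9]\{7\}/$jdate1/; s/[0-9]\{4\}\.[0-9]\{3\}/$jdate2/" "$f" > "$f.tmp" &&
    mv "$f.tmp" "$f"
done

result1=$(cat "$workdir/test_file1.txt")
result2=$(cat "$workdir/test_file2.txt")
printf '%s\n%s\n' "$result1" "$result2"
```

Chaining mv after sed with && means a failed sed run leaves the original file as it was.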

duke0001 · October 7, 2019, 5:39pm

Scrutinizer:

I posted some feedback for you to review. Thanks for your advice and help.

Scrutinizer · October 7, 2019, 10:55pm

@duke0001, You are welcome...

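In order for the awk approach to work, your DB query would need to produce output like the one you specified in post #1, that is, a header line followed by one line holding both values:

Last Julian Business Day      Last Julian Date Digits
2019304                       1910.304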

Since you are using Solaris 11, you would probably need to use /usr/xpg4/bin/awk.

You can vary the condition /L800JR/, or leave it out entirely if it is not needed.
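
If the two values are already in the shell variables jdate1 and jdate2, as in the sqlplus post, a different way to hand them to awk is the -v option, so that nothing needs to be read from stdin. This is a variation on the script above, not the same method. The sketch hard-codes the two values and creates the sample files in a scratch directory; on Solaris the awk command would probably have to be /usr/xpg4/bin/awk.

```shell
# Values that would normally come from the sqlplus queries (hard-coded for the sketch).
jdate1=2019304
jdate2=1910.304

# Scratch directory with the sample lines from the first post.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '/dw300/data/obe/2019273.L800JR.1909.273' > file1
printf '%s\n' '1|2019273.L800JR.1909.273' > file2

# -v passes each shell variable in as an awk variable before any input is read.
awk -v business_day="$jdate1" -v date_digits="$jdate2" '
  /L800JR/ {
    split($NF, F, ".")                          # F[2] is the middle part, e.g. "L800JR"
    $NF = business_day "." F[2] "." date_digits
  }
  { print > (FILENAME ".new") }
' FS=/ OFS=/ file1 FS=\| OFS=\| file2

out1=$(cat file1.new)
out2=$(cat file2.new)
printf '%s\n%s\n' "$out1" "$out2"
```

As before, the output goes to file1.new and file2.new, which can then be moved over the originals with mv.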