How to extract string between two specified characters and end of line?

Hi All,

I am trying to extract a string between two characters in a file and then look up that string in a separate file. E.g. first file is ABC.txt and its contents are

abc.$$Date
xyz.$$Year.dat
abc.xyz.$$Unit

I want to extract Date from the first line, Year from the second line and Unit from the third line, and look each one up in a separate file, xyz.txt. I tried the sed command but I am missing something. Any help is appreciated.

Could you show what you've tried and what you think is "missing"?

I used this

sed 's/.*\$//' test.txt >> aaa.txt

This command correctly fetches the data from the $$ to the end of the line, but it does not handle the case where the data should stop at the next . (dot).

Simply use two s commands:

sed 's/.*\$\$//; s/\..*//' test.txt

In the case there is no $$ it will still remove everything after a dot.
The following would avoid that:

sed '/.*\$\$/ { s///; s/\..*//; }' test.txt

The / / is a selector that works like an if. If true the code in the { } braces is run. The first s command substitutes what matched in the previous / / .
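
To make the difference between the two commands visible, here is a small sketch that runs both on the same input. The sample lines, in particular the third one without $$, are made up for illustration.

```shell
#!/bin/sh
# Sample input (hypothetical): two lines with $$ and one without.
sample='abc.$$Date
xyz.$$Year.dat
plain.name.txt'

# Unconditional: the second s command also truncates the line without $$.
unconditional=$(printf '%s\n' "$sample" | sed 's/.*\$\$//; s/\..*//')

# Conditional: the block only runs on lines that match /.*\$\$/.
# The empty regex in s/// reuses the last regex that was matched.
conditional=$(printf '%s\n' "$sample" | sed '/.*\$\$/ { s///; s/\..*//; }')

printf 'unconditional:\n%s\n' "$unconditional"
printf 'conditional:\n%s\n' "$conditional"
```

With the first command the third line is cut down to plain, while the second command leaves it as plain.name.txt.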

Thanks a lot @MadeInGermany. Now, if I want to look up these values in a separate file and substitute the params with their actual values, what would the code be? My other file has values for all these params, like

$$Date=20200303
$$Year=2020
$$Unit=201

The final file will look like this

abc.20200303
xyz.2020.dat
abc.xyz.201

--- Post updated at 04:33 PM ---

I am trying with this but getting error.

#!/bin/bash
BASEDIR=/home/wip4599/
IFILEEXTN=bbb.txt
 key= $(sed 's/\(.*\)\=.*/\1/' bbb.txt)
  value=$(sed 's/.*\=//g' bbb.txt)
  sed 's/$key/$value/g' test.txt
done < $input"

When you say "I am trying with this but getting error", what exactly does it mean?
What error?
What have you done to investigate the "error"?
Have you considered changing sed 's/$key/$value/g' to sed "s/$key/$value/g"?

The s command in sed takes one $key value and one $value value.
Apparently you need a loop that in each cycle gives one $key/$value pair.

The below code is throwing error too

#!/bin/bash
while read line
do
key= $(sed 's/\(.*\)\=.*/\1/' $line)
value=$(sed 's/.*\=//g' $line)
sed "s/$key/$value/g" test.txt
done <  bbb.txt

Hi

key= $(sed 's/\(.*\)\=.*/\1/' $line)

There should be no space after the equal sign

key=$(sed 's/\(.*\)\=.*/\1/' $line)

Please put the error messages in the post too.

Furthermore, sed gets the $line as a filename.
If it should get it as input then do

echo "$line" | sed ...

Some shells (bash, zsh) also accept a here-string:

sed ... <<< "$line"
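
As a rough illustration of the file name versus input distinction, the sketch below tries both forms on one line of the parameter file. The variable as_file is only there for the demonstration, and the sketch assumes that no file named $$Date=20200303 exists in the current directory.

```shell
#!/bin/sh
# One line from the parameter file, used as sample input.
line='$$Date=20200303'

# Passing "$line" as an argument makes sed look for a file with that name.
if sed 's/=.*//' "$line" >/dev/null 2>&1; then
  as_file='ok'
else
  as_file='failed'
fi

# Piping the text in makes sed read it from standard input instead.
key=$(printf '%s\n' "$line" | sed 's/=.*//')

printf 'as file name: %s\nas input: %s\n' "$as_file" "$key"
```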

You can leave the $start or $end arguments empty and it will use the start or end of the string.

echo get_string_between("Hello my name is bob", "my", ""); // output: " name is bob"

function get_string_between($string, $start, $end) {
    if ($start != '') {
        // Find $start; return '' if it does not occur in the string
        $ini = strpos($string, $start);
        if ($ini === false) return '';
        $ini += strlen($start);
    } else {
        // If $start is empty, use the start of the string
        $ini = 0;
    }

    if ($end == '') {
        // If $end is empty, use the end of the string
        return substr($string, $ini);
    }

    // Find $end after $start; return '' if it does not occur
    $pos = strpos($string, $end, $ini);
    if ($pos === false) return '';
    return substr($string, $ini, $pos - $ini);
}

Thanks for the suggestion, that worked. But now the last part, replacing it in a different file, is not working. I tried the two scripts below.

#!/bin/bash
while read line
do
key= echo "$line" | sed 's/\(.*\)\=.*/\1/' 
value= echo "$line" | sed 's/.*\=//'
sed 's/$key/$value/g' test.txt
done < bbb.txt

#!/bin/bash
input=/home/wip4599/test.txt
while read line
do
key= echo "$line" | sed 's/\(.*\)\=.*/\1/' 
value= echo "$line" | sed 's/.*\=//'
sed 's/$key/$value/g' $input
done < bbb.txt

--- Post updated at 08:04 PM ---

@MadeInGermany

--- Post updated at 08:42 PM ---

used the below format and it worked

key=$(echo "$line" | sed 's/\(.*\)\=.*/\1/')

Yes that is the correct format.
And it should be sed "s/$key/$value/g" with normal (double-)quotes, so the shell can expand $key and $value.

A more efficient way is to let the read command split the line into fields and assign them directly to distinct variables.
We can set the "input field separator" IFS to = (by default it is whitespace). By prefixing IFS="=" we set this environment variable only for the following read command.

input=/home/wip4599/test.txt
while IFS="=" read key value
do
  sed "s/$key/$value/g" $input
done < bbb.txt
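
Note that this loop runs sed once per key/value pair, so the whole input file is printed once for every pair, each time with only that one key replaced. One possible variation, sketched here under assumptions, is to collect all the s commands first and run sed a single time. The scratch directory and the names script and esc_key are made up for this sketch, and it assumes that the values contain no / or & characters.

```shell
#!/bin/sh
# Work in a scratch directory so no real files are touched.
tmp=$(mktemp -d)
cat > "$tmp/bbb.txt" <<'EOF'
$$Date=20200303
$$Year=2020
$$Unit=201
EOF
cat > "$tmp/test.txt" <<'EOF'
abc.$$Date
xyz.$$Year.dat
abc.xyz.$$Unit
EOF

# Build one sed s command per key/value pair.
script=''
while IFS="=" read -r key value
do
  # Escape each "$" so sed matches it literally.
  esc_key=$(printf '%s\n' "$key" | sed 's/\$/\\$/g')
  script="${script}s/${esc_key}/${value}/g
"
done < "$tmp/bbb.txt"

# Apply all substitutions in a single pass over the target file.
result=$(sed "$script" "$tmp/test.txt")
printf '%s\n' "$result"

rm -rf "$tmp"
```

For the sample files from this thread the output has three lines: abc.20200303, xyz.2020.dat and abc.xyz.201.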

Both scripts replace every matching pattern. If I have $$Date and $$Date_1, they are replaced with 20200305 and 20200305_1, but I have a different value for $$Date_1. How can I skip that? If I try this echo statement it works, but not when I use the variables key and value.

echo $(sed -e 's/^$$Date$/20200305/g' test.txt)

sed -n "s/$key/$value/gp" $input

only prints lines where a substitution occurred.

But it won't print any existing lines that have not been substituted.
The following bash-4 script is a universal fix:

#!/bin/bash
# bash 4+ or ksh 93+ required

input=test.txt

# aa[] is an associative (string-indexed) array
typeset -A aa
#declare -A aa # is bash-only

# loop through the key/value file, store in aa[]
while IFS="=" read -r key value
do
  aa[$key]=$value
done < bbb.txt

# loop through the target file, do not split on whitespace
while IFS= read -r line
do
  # for each $line
  # loop through the array(-indices)
  for key in "${!aa[@]}"
  do
    # get the corresponding value
    value=${aa[$key]}
    # do the substitutions $key -> $value
    line=${line//$key/$value}
  done
  # printf is more robust than echo
  printf "%s\n" "$line"
done < "$input"

I have put some comments that explain how it works.
Compared to the sed solution, which processes the input file many times, this solution causes less I/O (but uses more memory: the whole bbb.txt file must fit into memory).
It also avoids the sed restriction that the values must not contain a / character, which would clash with the / delimiters in the s command.
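
One caveat for keys such as $$Date and $$Date_1: bash does not define the order in which the indices of an associative array are listed, so the script above may replace $$Date before it gets to $$Date_1. A possible workaround, sketched here as an assumption and not taken from the thread, is to walk through the keys sorted by length, longest first. The sample values and the array name keys are hypothetical.

```shell
#!/bin/bash
# bash 4+ assumed (associative arrays, mapfile)

# Hypothetical parameters: one key is a prefix of the other.
declare -A aa
aa['$$Date']=20200305
aa['$$Date_1']=20200101

# Sort the keys by length, longest first, so $$Date_1 is handled before $$Date.
mapfile -t keys < <(
  for k in "${!aa[@]}"; do printf '%d\t%s\n' "${#k}" "$k"; done | sort -rn | cut -f2-
)

line='abc.$$Date.$$Date_1.dat'
for key in "${keys[@]}"
do
  # Quoting "$key" makes the shell treat it as literal text, not as a pattern.
  line=${line//"$key"/${aa[$key]}}
done
printf '%s\n' "$line"
```

With these sample values the line becomes abc.20200305.20200101.dat.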

Thanks @MadeInGermany . It did the trick and gave the expected output.

Hi,

The code works for $$ values that are followed by a . or by the end of the line. If I have a record like the one below, it is not replaced correctly.

$$WF_PRM_TARGET_extract_$$WF_PRM_SOURCE.$$WF_PRM_DIVISION.dat

Here only $$WF_PRM_DIVISION is getting replaced.

I tried this:

sed '/.*\$\$/ { s///; s/\..*//; s/\._*//; }' test.txt > aaa.txt

--- Post updated at 06:29 PM ---

<code>
sed '/.\$\$/ { s///; s/\..//; s/\._*//; }' test.txt > aaa.txt
</code>