How to remove character with sed and work with the rest?

JameEnder · June 26, 2020, 9:15pm

I'm working on a script that converts a Markdown file into an HTML file. The md format looks like this:

# This is my heading
This is my text 
- This is a list item

I would like to have a function (right now I have something called "convert") that I could use like this:
convert "char" "tag" "file"
For example

convert "#" "h1" "/temp/file1.txt"
convert "-" "ul" "/temp/file1.txt"
convert "" "p" "/temp/file1.txt"

And get an output of

<h1>This is my heading</h1>
<p>This is my text</p>
<ul>This is a list item</ul>

I would love to use something pretty lightweight; I'm not sure if sed is the right choice.
Thank you in advance

RavinderSingh13 · June 27, 2020, 2:07am

Hello @JameEnder, could you please try the following?

awk '
/^#/{
  sub(/^# +/,"")
  print "<h1>" $0  "</h1>"
  next
}
/^-/{
  sub(/^- +/,"")
  print "<ul>" $0 "</ul>"
  next
}
!/^#/ && !/^-/{
  print "<p>" $0 "</p>"
}
' Input_file

Written on mobile, so I couldn't test it, but it should work.

Thanks,
R. Singh

JameEnder · June 27, 2020, 11:42am

Works very well! Could I somehow get an explanation of what's happening? It's a bunch of gibberish to me.

MadeInGermany · June 27, 2020, 1:54pm

The awk code loops over each input line. Outside a { } you can use a selector that works like an if condition for the following { code block }
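As a small illustration of that selector idea (the function name show_selectors and the sample input are made up for this demo): the first block runs only for lines that start with #, while the second block has no selector and so runs for every line.

```shell
# Hypothetical demo helper; reads lines on standard input.
show_selectors() {
  awk '
  /^#/ { print "selector matched: " $0 }   # runs only when the line starts with #
       { print "every line: " $0 }         # no selector, so it runs for each line
  '
}

printf '%s\n' '# heading' 'plain text' | show_selectors
```

The heading line is printed by both blocks, the plain line only by the second one.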

If you are more familiar with shell code - the following shell code is quite similar

while IFS= read -r line   # -r keeps backslashes in the input literal
do
  case $line in
  ("#"*)
    # chop the leading character then a space
    line=${line#?}
    line=${line#" "} 
    echo "<h1>$line</h1>"
  ;;
  ("-"*)
    line=${line#?}
    line=${line#" "} 
    echo "<ul>$line</ul>"
  ;;
  (*) # The * matches everything so this is like an "else" 
    echo "<p>$line</p>"
  ;;
  esac
done < Input_file

JameEnder · June 27, 2020, 2:28pm

Awesome, thank you, this looks a bit less daunting to me

RavinderSingh13 · June 27, 2020, 2:30pm

@JameEnder, sure here you go.

Explanation: I am checking three conditions here. First, if a line starts with #, the leading # and the spaces that follow it are replaced with nothing (null), and the line is printed between <h1> and </h1> tags. Second, if a line starts with -, the leading - and the spaces that follow it are removed in the same way, and the line is printed between <ul> and </ul> tags.

Finally, the third condition checks whether the line starts with neither # nor -. If so, <p> is added before and </p> after the current line, and the result is printed.

Thanks,
R. Singh

MadeInGermany · June 27, 2020, 7:18pm

The next statement jumps to the next input cycle, skipping the following code.
The 3rd condition is not really needed, because lines that start with # or - have already been handled and never get past next.
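A minimal sketch of the shortened version, assuming the three sample lines from the question are fed on standard input (md_to_html is only a hypothetical name for illustration, not something from the posts above):

```shell
# Hypothetical helper name; reads markdown-like lines on standard input.
md_to_html() {
  awk '
  /^#/ { sub(/^# +/, ""); print "<h1>" $0 "</h1>"; next }
  /^-/ { sub(/^- +/, ""); print "<ul>" $0 "</ul>"; next }
  # No condition needed here: lines handled above never reach this point.
  { print "<p>" $0 "</p>" }
  '
}

printf '%s\n' '# This is my heading' 'This is my text' '- This is a list item' | md_to_html
```

The last block has no selector, so it acts as the "else" branch, much like the (*) case in the shell version.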

mr16ga1 · June 28, 2020, 5:50am

Hello JameEnder, I have done this kind of edit before. I always make a copy of the file to work on; then, after my script works, I make one more copy of the original file (file.orig) and copy the good working file to where it needs to be located. This procedure has saved me more than once. From your question it sounded like you might be new to sed and awk.
