How to remove a character with sed and work with the rest?

I'm working on a script that converts a Markdown file into an HTML file. The Markdown format looks like this:

# This is my heading
This is my text 
- This is a list item

I would like to have a function (right now I have something called "convert") that I could use like this:
convert "char" "tag" "file"
For example:

convert "#" "h1" "/temp/file1.txt"
convert "-" "ul" "/temp/file1.txt"
convert "" "p" "/temp/file1.txt"

and get this output:

<h1>This is my heading</h1>
<p>This is my text</p>
<ul>This is a list item</ul>

I would love to use something pretty lightweight; I'm not sure if sed is the right choice.
Thank you in advance.

Hello @JameEnder, could you please try the following?

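awk '
# Heading: strip the leading "# " and wrap the rest in <h1>
/^#/{
  sub(/^# +/,"")
  print "<h1>" $0  "</h1>"
  next
}
# List item: strip the leading "- " and wrap the rest in <ul>
/^-/{
  sub(/^- +/,"")
  print "<ul>" $0 "</ul>"
  next
}
# Anything else becomes a paragraph
!/^#/ && !/^-/{
  print "<p>" $0 "</p>"
}
' Input_file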

Written on mobile, so I couldn't test it, but it should work.

Thanks,
R. Singh


Works very well! Could I get an explanation of what's happening? It's a bunch of gibberish to me.

The awk code loops over each input line. Outside a { } block you can put a selector (awk calls it a pattern) that works like an if condition for the { code block } that follows it: the block runs only for lines that match the selector.
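
As a small illustration of the selector idea, here is a minimal sketch. The function name demo_patterns and the "heading:" / "other:" labels are made up for this example and are not part of the answer above.

```shell
# demo_patterns is a hypothetical name used only for this illustration.
# It reads lines on stdin and labels them.
demo_patterns() {
  awk '
  # Selector /^#/ : this block runs only for lines that start with #
  /^#/ { print "heading: " $0; next }
  # No selector: this block runs for every line that gets this far
       { print "other: " $0 }
  '
}

printf '%s\n' '# Title' 'plain text' | demo_patterns
```

The first line is labelled by the first block, and because of next it never reaches the second one; the plain line matches no selector and falls through to the last block.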

If you are more familiar with shell code, the following shell loop is quite similar:

while IFS= read -r line
do
  case $line in
  ("#"*)
    # chop the leading character then a space
    line=${line#?}
    line=${line#" "} 
    echo "<h1>$line</h1>"
  ;;
  ("-"*)
    line=${line#?}
    line=${line#" "} 
    echo "<ul>$line</ul>"
  ;;
  (*) # The * matches everything so this is like an "else" 
    echo "<p>$line</p>"
  ;;
  esac
done < Input_file
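
Since the question asked about sed, the same per-line logic can probably be expressed there too. This is only a sketch: the function name md_to_html_sed is made up, and it assumes the three line types from the question and nothing more of Markdown.

```shell
# md_to_html_sed is a hypothetical helper name; it reads the markdown on stdin.
# A bare "t" jumps to the end of the script when the previous s command
# changed the line, so each line gets wrapped by at most one rule.
md_to_html_sed() {
  sed -e 's|^# *\(.*\)$|<h1>\1</h1>|' -e t \
      -e 's|^- *\(.*\)$|<ul>\1</ul>|' -e t \
      -e 's|^.*$|<p>&</p>|'
}

printf '%s\n' '# This is my heading' 'This is my text' '- This is a list item' | md_to_html_sed
```

With a real file it would be called as md_to_html_sed < Input_file. The "t" plays the role of the ";;" in the case statement: without it, a heading would also be wrapped in a paragraph tag by the last rule.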

Awesome, thank you, this looks a bit less daunting to me.

@JameEnder, sure, here you go.

Explanation: I am checking 3 conditions here. 1st, if a line starts with #, remove the leading # together with the one or more spaces after it, then print the line with an h1 tag before and after it. 2nd, if a line starts with -, remove the leading - together with the one or more spaces after it, then print the line with a ul tag before and after it.

Finally, the 3rd condition: if the line starts with neither # nor -, print it with a <p> tag before and a </p> tag after it.

Thanks,
R. Singh

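The next statement jumps to the next input line, skipping the rest of the code for the current line.
Because of that, the 3rd condition is not really needed: a line only gets that far if neither of the first two conditions matched.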
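
A sketch of the same awk program with the last selector dropped, wrapped in a function so it can read from a pipe. The name md_to_html_awk is made up for this example.

```shell
# md_to_html_awk is a hypothetical helper name; it reads the markdown on stdin.
md_to_html_awk() {
  awk '
  /^#/ { sub(/^# +/, ""); print "<h1>" $0 "</h1>"; next }
  /^-/ { sub(/^- +/, ""); print "<ul>" $0 "</ul>"; next }
  # No selector needed here: next already skipped headings and list items
       { print "<p>" $0 "</p>" }
  '
}

printf '%s\n' '# This is my heading' 'This is my text' '- This is a list item' | md_to_html_awk
```

For a file, call it as md_to_html_awk < Input_file.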


Hello JameEnder, I have done this kind of edit before. I always make a copy of the file to work on. Once my script works, I make one more copy of the original file (file.orig) and then copy the good, converted file to where it needs to be located. This procedure has saved me more than once. From your question, it sounded like you might be new to sed and awk.
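
Roughly, the last two steps could look like the sketch below. The function name backup_then_replace and its two arguments are made up for illustration; only the .orig suffix comes from the description above.

```shell
# backup_then_replace is a hypothetical helper.
#   $1 = the original file
#   $2 = the converted file that has already been checked
backup_then_replace() {
  original=$1
  converted=$2
  # Keep an untouched copy of the original next to it (file.orig)
  cp -p "$original" "$original.orig" || return 1
  # Only then put the good result where it needs to be
  cp "$converted" "$original"
}
```

It would be called as backup_then_replace page.md page.new, where both file names are placeholders. If the first cp fails, the function stops before overwriting anything.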
