Select a DOM element by CSS class and echo it in the terminal

I want to select a DOM element by CSS class and echo its textContent in the terminal.

I could download the webpage with curl or view it with lynx but I have no idea how to proceed from there.

a sample webpage file, the desired "CSS class" and the desired output would help.

This is the URL (I can't post a full link, I get an error saying that I can't put a link to that host):

  • packagist.org/packages/drupal/core

The element I want to get its textContent is the package version element.

As you can find, the CSS class of this element is version-number.

The output should be 10.2.2.

Without squinting too hard...something like this - YMMV:

$ curl -s https://packagist.org/packages/drupal/core | awk -F'[<>]' '$2~/^span.*version-number/ {print $3}'
10.2.2
1 Like

Thanks

Do you know another way without awk?

I never learned awk, only sed and can't learn awk soon and would generally avoid working with something I didn't learn its elements yet.

This sed does about the same:

sed -n 's/^[^<]*<span class="version-number">\([^<]*\)<.*/\1/p'

EDIT: put " around version-number, thanks to the following comment.

1 Like
curl -s https://packagist.org/packages/drupal/core | sed -nr '/.*span.*version-number.*/s/[^0-9.]*([0-9.]*).*/\1/p'

version-number is in between ", btw

2 Likes

using grep and cut:

curl -s https://packagist.org/packages/drupal/core | grep span.*version-number | cut -d'>' -f2 | cut -d'<' -f1
1 Like