Thanks.
Yeah I got that far, but I'm hoping to be able to dynamically chain multiple greps/seds/awks etc together.
The original intent was to parse webpages - using grep/sed etc is actually simple for this.
Here's the intent - I automate getting stuff - like parsing through comics and getting them for me.
I'm looking to make it more dynamic.
e.g. here's how I'm getting one of my comics:
If I want to define all data for a comic, I specify a url, and filter rules.
rooturl:
http://www.mangahere.co/manga/kingdom/
Chaps:
grep "a class" | grep kingdom | sed 's@<.*href="@@g' | sed 's@".*>@@g' | sed 's-\ --g'
Page:
grep "option value" | sed 's-^.*value="--g' | sed 's-"\ .*--g'
Image:
grep jpg | grep "render(this)" | sed 's-<img.*src="--g' | sed 's-".*--g'
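To make the template idea concrete, here is a rough sketch of how such a definition might be read into bash. It assumes a layout of a label line ending in ":" followed by one value line, and bash 4+ for associative arrays. The array name tmpl and the loop are illustrative only, not something I already have working.

```shell
#!/usr/bin/env bash
# Sketch only: read "Label:" / value line pairs into an associative array.
# Assumes bash 4+ and the hypothetical two-line-per-entry layout shown below.
declare -A tmpl
key=""
while IFS= read -r line; do
    if [[ $line == *: ]]; then
        key=${line%:}          # label line, e.g. "Page:" -> "Page"
    elif [[ -n $key ]]; then
        tmpl[$key]=$line       # value line belongs to the last label seen
        key=""
    fi
done <<'EOF'
rooturl:
http://www.mangahere.co/manga/kingdom/
Page:
grep "option value" | sed 's-^.*value="--g' | sed 's-"\ .*--g'
EOF

printf '%s\n' "${tmpl[rooturl]}"
printf '%s\n' "${tmpl[Page]}"
```

The quoted heredoc and read -r keep the backslashes and quotes in the filter exactly as written, so the stored string is the same text I would type at the prompt.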
Chapters (lists pages) - filter: grep "a class" | grep kingdom | sed 's@<.*href="@@g' | sed 's@".*>@@g' | sed 's-\ --g'
beomagi@Ganymede ~
$ curl -s http://www.mangahere.co/manga/kingdom/ | grep "a class" | grep kingdom | sed 's@<.*href="@@g' | sed 's@".*>@@g' | sed 's-\ --g'
http://www.mangahere.co/manga/kingdom/v37/c423/
http://www.mangahere.co/manga/kingdom/v37/c422/
http://www.mangahere.co/manga/kingdom/v37/c421/
http://www.mangahere.co/manga/kingdom/v37/c420/
http://www.mangahere.co/manga/kingdom/v37/c419/
http://www.mangahere.co/manga/kingdom/v37/c418/
http://www.mangahere.co/manga/kingdom/v37/c417/
http://www.mangahere.co/manga/kingdom/v37/c416/
http://www.mangahere.co/manga/kingdom/v37/c415/
http://www.mangahere.co/manga/kingdom/v37/c414/
Pages (has link to image) - filter: grep "option value" | sed 's-^.*value="--g' | sed 's-"\ .*--g'
beomagi@Ganymede ~
$ curl -s http://www.mangahere.co/manga/kingdom/v37/c422/ | grep "option value" | sed 's-^.*value="--g' | sed 's-"\ .*--g'
http://www.mangahere.co/manga/kingdom/v37/c422/
http://www.mangahere.co/manga/kingdom/v37/c422/2.html
http://www.mangahere.co/manga/kingdom/v37/c422/3.html
http://www.mangahere.co/manga/kingdom/v37/c422/4.html
http://www.mangahere.co/manga/kingdom/v37/c422/5.html
http://www.mangahere.co/manga/kingdom/v37/c422/6.html
http://www.mangahere.co/manga/kingdom/v37/c422/7.html
http://www.mangahere.co/manga/kingdom/v37/c422/8.html
http://www.mangahere.co/manga/kingdom/v37/c422/9.html
And lastly from here
Image filter: grep jpg | grep "render(this)" | sed 's-<img.*src="--g' | sed 's-".*--g'
beomagi@Ganymede ~
$ curl -s http://www.mangahere.co/manga/kingdom/v37/c422/4.html | grep jpg | grep "render(this)" | sed 's-<img.*src="--g' | sed 's-".*--g'
http://z.mhcdn.net/store/manga/8198/37-422.0/compressed/g004.jpg?v=11425389828
So the idea is to create templates for a script to follow for various comics. If I want to add a comic, I just have to come up with a URL to start from, and rules to define the pages. The grep/sed etc. would need to be dynamic. I've done it in other languages, but bash is kinda convenient, and more than anything it's now irking me that I can't figure out why I can't make a system call that echoes a variable and pipes it to grep.
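For reference, here is a minimal sketch of the dynamic part, assuming the whole filter is held in a variable as a plain string. Bash parses the "|" characters before it expands variables, so a bare $filter hands "|" to the first grep as a literal argument. Passing the string through eval makes the shell parse it again as a pipeline. The function name apply_filter and the sample HTML line are made up for illustration, and eval should only ever see templates I wrote myself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a filter pipeline stored as a string against stdin.
# eval re-parses the string, so the "|" characters become real pipes.
apply_filter() {
    local filter="$1"
    eval "$filter"
}

# Read the filter verbatim; the quoted heredoc avoids nested-quote escaping.
read -r image_filter <<'EOF'
grep jpg | grep "render(this)" | sed 's-<img.*src="--g' | sed 's-".*--g'
EOF

# Made-up sample line standing in for the curl output of a page.
sample='<img src="http://example.com/g004.jpg" onload="render(this)" />'

result=$(printf '%s\n' "$sample" | apply_filter "$image_filter")
echo "$result"   # prints http://example.com/g004.jpg
```

With real pages the call would look like curl -s "$url" | apply_filter "$image_filter", with the filter string coming from whichever template is being followed.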