Working with grep and Bash

Hi, I am currently working on a Bash shell script that

  • Downloads a webpage, in this case youtube.com
  • Extracts the number of views, the video title, the user who uploaded it, and the duration, then outputs all of this in columns.

To me this sounds like craziness. I'm very new to Bash and to the grep/cut commands, and so far this is all I've got. I decided to store the output of my commands in variables to try to build a table and then print it, but it's very messy. What I am trying to do is

make tables for

Views | Title | User | Time

0000 blah me 0:10

etc., but I have no idea how to cut the information out of the HTML and organize it.

#! /bin/bash

wget -O Link.html youtube.com

echo "[Views]            [User]"

x=$(grep views Link.html | grep -v 'div' | cut -d ' ' -f5 | sort -n | tr -d ',')
y=$(grep title Link.html | grep -v 'div' | grep -v span |cut -d '>' -f1 | grep -v class | grep -v '<')


# IFS="'echo'" would split on the characters ', e, c, h and o, so read line by line instead
while IFS= read -r line
        do
                echo "$line"
        done <<< "$x"

#echo $y

It's probably really messy, but I'm lost on how to extract the correct titles, which are usually inside double quotes, e.g. title="Harlem Shake (Matt and Kim Edition)"

Also, I have no idea how to organize or sort the output so that each title matches the correct number of views. I'm not looking for a finished answer; I'm looking for the approach, so I can do the same for the other parts of this script myself.

Once again, I am very new to this stuff lol.

The best I can get is the views showing up organized. The titles are all over the place.
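For the quoted-title part, one possible approach is to let grep print only the matching attribute and then cut out what sits between the quotes. This is a minimal sketch, not a tested solution for the real page: the function name extract_attr and the sample line are made up for illustration, it assumes values are double-quoted with no embedded double quotes, and grep -o is a GNU/BSD extension rather than POSIX.

```shell
#!/bin/bash
# Hypothetical helper: print the value of every attr="..." found on stdin.
# Assumes double-quoted values that contain no embedded double quotes.
extract_attr() {
    local attr=$1
    # grep -o prints only the matched text, one match per line;
    # cut then keeps the text between the first pair of double quotes.
    grep -o "${attr}=\"[^\"]*\"" | cut -d '"' -f2
}

# Try it on an inline sample instead of the downloaded page.
sample='<a href="/watch?v=abc" title="Harlem Shake (Matt and Kim Edition)">'
printf '%s\n' "$sample" | extract_attr title
```

Because the pattern is a plain substring match, title= also matches attributes whose names merely end in "title", so the pattern may need tightening for the real HTML.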

I don't think using Bash to extract data from a downloaded file is going to be an easy task.

I suggest using awk instead, which is designed specifically for data extraction and reporting.

On top of bipinajith's statement, it would be helpful if you posted (edited) samples of what you've got, e.g. a few meaningful lines of your downloads - DON'T post the videos! It's difficult to guess the line layout from the greps you've posted above.

Hi, sorry guys for the late response. I managed to figure out how to extract all my information and store it in variables. It may be sloppy, but right now that doesn't matter.

What I was looking for is this: let's say I have some lines like this

<div class="feed-item-content-wrapper clearfix context-data-item" data-context-item-title="Mike Chang's Superbowl Workout - Part II" data-context-item-user="sixpackshortcuts" data-context-item-views="114,925 views" data-context-item-id="V1l1TGUUaj0" data-context-item-type="video" data-context-item-time="8:44" data-context-item-actionuser="sixpackshortcuts">


I need to extract the views, duration (time), username, and title, which I did using these simple commands, storing each result in a variable:

#! /bin/bash

#wget -O web2.html youtube.com

echo "[Views]            [time/Duration]"
echo ----------------------------------
echo    

views=$(grep views web2.html | grep -v 'div' | cut -d ' ' -f5 | tr -d ',')

users=$(grep item-user web2.html | cut -d '"' -f6)

duration=$(grep item-time web2.html | grep -o '"[^"]*"' | grep : | grep -vi '[a-z]' | tr -d '"')

title=$(grep item-title web2.html | cut -d '"' -f4)

So let's say I echo "$views"; it will show something like

11
2244
2423532
2342
2324

echo "$title"

I'm hungry
Pewdiepie rocks

My issue now is that, as you can see, the code is incomplete: I have to construct a table and display all my data in columns. I have no idea whatsoever where to start. I need it to show up like

[Views]---------[Title]--------------------[user]

12 ------------Pewdiepie--------------Pewdiepie
1212 --------IDONTCARE!--------Somedeadguy
1212 --------BARRELS!!-----------idonthavewifi
1 --------------Poop--------------------etc

(minus the '-', which is only there to make the example clearer)
and so on with the time duration. How can I go about outputting the contents of the variables in such a way?
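One way to line the variables up side by side, staying with basic tools, is paste plus printf. The sketch below uses stand-in data taken from the example table rather than real pipeline output, and the function name print_table and the column widths are arbitrary choices. It assumes every variable holds the same number of lines in the same order, and that no value is empty or contains a tab.

```shell
#!/bin/bash
# Stand-in data; in the real script these come from the grep/cut pipelines.
views=$'12\n1212\n1212'
title=$'Pewdiepie\nIDONTCARE!\nBARRELS!!'
users=$'Pewdiepie\nSomedeadguy\nidonthavewifi'

# Hypothetical helper: print a header, then one aligned row per video.
print_table() {
    printf '%-10s %-25s %s\n' '[Views]' '[Title]' '[User]'
    # paste joins line N of each input with tabs; read splits them again.
    paste <(printf '%s\n' "$views") \
          <(printf '%s\n' "$title") \
          <(printf '%s\n' "$users") |
    while IFS=$'\t' read -r v t u; do
        # %-10s pads the value to 10 characters, left-aligned.
        printf '%-10s %-25s %s\n' "$v" "$t" "$u"
    done
}

print_table
```

The widths in the printf format decide where each column starts, so they need to be at least as wide as the longest value in that column.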

You might want to consider using awk. As a starting point that works for your sample, try this and then adapt it to your needs:

awk     '/item-(user|views|time|title)/ {getline Ar[++i]}
         END {for (n = 1; n <= i; n++) printf "%15s ", Ar[n]
              printf "\n"}
        ' RS="[=\"]*" file
Mike Chang's Superbowl Workout - Part II sixpackshortcuts   114,925 views            8:44 

Not every awk implementation will accept the above RS construct (a regular expression as the record separator), so you might need to experiment a bit.
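If the regular-expression RS is not accepted, one alternative that avoids it (a different technique from the command above, offered only as a sketch) is to split each line on double quotes with -F and look at the field in front of each value. The function name extract_items and the column widths are made up for illustration, and the sketch assumes each video's attributes all sit on a single line, as in the posted sample.

```shell
#!/bin/bash
# Hypothetical helper: reads HTML on stdin, prints one row per matching line.
extract_items() {
    awk -F '"' '
    /data-context-item-title=/ {
        title = user = views = dur = ""
        # With " as separator, odd fields end in the attribute name
        # and the following even field holds that attribute value.
        for (i = 1; i < NF; i += 2) {
            if      ($i ~ /item-title=$/) title = $(i + 1)
            else if ($i ~ /item-user=$/)  user  = $(i + 1)
            else if ($i ~ /item-views=$/) views = $(i + 1)
            else if ($i ~ /item-time=$/)  dur   = $(i + 1)
        }
        printf "%-12s %-45s %-20s %s\n", views, title, user, dur
    }'
}

# Try it on the sample line from this thread.
extract_items <<'EOF'
<div class="feed-item-content-wrapper clearfix context-data-item" data-context-item-title="Mike Chang's Superbowl Workout - Part II" data-context-item-user="sixpackshortcuts" data-context-item-views="114,925 views" data-context-item-id="V1l1TGUUaj0" data-context-item-type="video" data-context-item-time="8:44" data-context-item-actionuser="sixpackshortcuts">
EOF
```

Anchoring each pattern with $ keeps item-actionuser from being mistaken for item-user.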

In my case I can't use awk yet; if I get a quiz or exam in the next few days or weeks, it will call for quick and dirty methods with grep, cut, expr, etc.

Is it even possible to read my $views, $title, $users, etc. into arrays that list them going down, or will I have to find another method of doing this?

Oh, this is homework? Then please follow the forum rules (#6).
And yes, it is possible, although not efficient.
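As a rough illustration of the array idea, and not a complete solution, the sketch below reads each newline-separated variable into a Bash array and walks both arrays by index. The data is the stand-in output shown earlier in the thread, the function name print_rows is made up, and mapfile needs Bash 4 or newer.

```shell
#!/bin/bash
# Stand-in data; in the real script these come from the grep/cut pipelines.
views=$'11\n2244'
title=$'I\'m hungry\nPewdiepie rocks'

# Hypothetical helper: print views and title for each index, one row per line.
print_rows() {
    local view_arr title_arr i
    # mapfile -t stores each input line as one array element (Bash 4+).
    mapfile -t view_arr  <<< "$views"
    mapfile -t title_arr <<< "$title"
    # ${!view_arr[@]} expands to the indices 0, 1, 2, ...
    for i in "${!view_arr[@]}"; do
        printf '%-10s %s\n' "${view_arr[i]}" "${title_arr[i]}"
    done
}

print_rows
```

This only lines up correctly if element N of every array describes the same video, so the extraction pipelines must keep their output in page order (no sort on one column alone).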