Need to modify csv-file with bash script

Hi Guys,

I need to write a script, that exports the "moz_places" table of the "places.sqlite"-file (firefox browser history) into a csv-file. That part works. After the export, my csv looks like this:

...
4429;http://www.sqlite.org/sqlite.html;"Command Line Shell For SQLite";gro.etilqs.www.;3;0;0;98;410;1265905218764118
4562;http://www.unix.com/shell-programming-scripting/127699-question-about-function-calls.html#post302386693;"Question about Function calls - The UNIX and Linux Forums";moc.xinu.www.;1;0;0;182;47;1263646391422534
...

Now I want to extract just the visited URL and the date of this visit for all visits.

I did this like this:

awk -F ';' '{print $10, $2}' temp.csv

this works and the result looks like this

1263646291226002 http://www.unix.com/cfmgoogle.php?cx=partner-pub-6323928554267084%3Absye1r5tx7j&cof=FORID%3A10&q=xml&sa=Grep&siteurl=www.unix.com%2F
1263646297314065 http://www.unix.com/cfmgoogle.php?cx=partner-pub-6323928554267084%3Absye1r5tx7j&cof=FORID%3A10&q=xml&sa=Grep&siteurl=www.unix.com%2F#1333
1263646362729683 http://www.unix.com/cfmgoogle.php?cx=partner-pub-6323928554267084%3Absye1r5tx7j&cof=FORID%3A10&q=xml+shell&sa=Grep&siteurl=www.unix.com%252F

Okay, and here's my problem. I don't need a timestamp in front of the visited URL, I need a date. It should look somehow like this:

...
Do 11. Feb 19:00:23 CET 2010 www.google.com
...

(Sorry, my date is in german)

I know that I have the reduce the timestamp which I get of firefox to the first 10 characters, because the rest doesn't belong to the date. I already found out how to do this with awk:

awk -F ';' '{print((substr($10,1,10)))}' < temp.csv

but how can I convert this

1265911223 www.google.com

into this

Do 11. Feb 19:00:23 CET 2010 www.google.com

I know how I can convert one single date in commandline

date -d @1265911223

but how can i do this for the whole csv. I tried to store the timestamp into a variable, but i don't know how to do this for line after line. As I tried, it stored all timestamps in a long sting in the variable. I thought it would be easier to do this... :frowning: Maybe someone of you can help:confused:

Greetings

Sebi

echo "1265911223 www.google.com"|awk '{str="date -d @"$1; print system(str)}'

Hey,

thanks for the answer. That works, for one single line and for a right formated "timestamp". I think, my document has about 2000 lines, so how can I transfer your idea to my document? And another problem is the " timestamp. Firefox stores the last_visit_date-entry in a string which looks like this

1263818137802359

but the timestamp is only the first 10 characters of this sting

1263818137

I don't know what the other 6 characters are for, and that does not matter for me at the moment. So, I know that I can reduce the string to the first 10 characters with the substr command, for example

substr($1,1,10)

which will do exactly what I discribed. The Problem ist the csv, that you can see in the "Question-post" at the beginning of this thread. I export a sqlite-database to csv. This csv, hast 10 values, separated by ";" in each line. But I only need the second and 10th value of each line. The 10th value is this (somehow) strange timestamp, that must be reduced to the first ten characters and changed into a real date. After all I want to store the date followed by the visited URL in a new document that should look like this

"date" "URL"
"another date" "another URL"
...
...
...

How can I do this?:confused::confused::confused:

Try this:

awk '{
  "date -d @ "substr($10,1,10) | getline date
  print date, $2
}' temp.csv

What about this:

$ sqlite3 places.sqlite << EOF | while read datum url; do echo $(date -d @$datum) $url; done
> .header off
> .mode csv
> .separator " "
> select substr(h.visit_date,0,10),p.url from moz_places as p, moz_historyvisits as h \
> where p.id=h.place_id order by h.visit_date limit 10;
> EOF

If it works as intended, remove the line limit, and save the SQL statements to a file, sourcing that instead.

@Franklin52

Thanks for this fast answer,

but I get a very long error message, which has something to do with the "date" command. At the beginning it says

sh: Syntax error: Unterminated quoted string

sh: Syntax error: Unterminated quoted string

The rest of the message is in german, so I will try to translate one of the error massages (they have all the same content, just the argument is different):

date: The argument ""1263646291"" hasn't got the leading "+".
When an option is given, to specify the date, every argument, that isn't an option, must be an formatstring, that begins with an "+".
"date --help" gives more Information. 

Hope you can understand this translation... :wink:

Sorry, I forgot the fieldseparator:

awk -F ";" '{
  "date -d @ "substr($10,1,10) | getline date
  print date, $2
}' temp.csv

BTW: Did you get the correct output with the command provided by linuxpenguin?

echo "1265911223 www.google.com"|awk '{str="date -d @"$1; print system(str)}'

@Franklin52
I already saw that the fieldseparator was missing...

But, even if I use the fieldseparator, there's still the probleme which i tried to translate. It seems that there's a problem with the date syntax, or something like that.

BTW:
If I write this at the commandline

echo "1265911223 www.google.com"|awk '{str="date -d @"$1; print system(str)}'
this is the result

Do 11. Feb 19:00:23 CET 2010
0

@pludi

Hey pludi,

also an interesting way. But also in your example there's a problem modifying the date. If I try your code, the result is this

date: ung�ltiges Datum �@126364422;http://www.google.de/�

date: ung�ltiges Datum �@126364535;http://www.google.de/�

date: ung�ltiges Datum �@126364535;http://www.google.de/#hl=de&source=hp&q=unix+forum&btnG=Google-Suche&meta=&aq=f&oq=unix+forum&fp=749e75dee4f1c52f�

date: ung�ltiges Datum �@126364536;http://www.unix.com/�

date: ung�ltiges Datum �@126364538;http://www.unix.com/shell-programming-scripting/�

date: ung�ltiges Datum �@126364547;http://www.google.de/�

date: ung�ltiges Datum �@126364556;http://www.google.de/�

By the way. It seems that you know the firefox databases. Do you know where exactly the visit date is stored. I've searched a long time, but I didn't get the right answer. In my moz_places in places.sqlite is a value called last_visit_date. In your example, you used the visit_date from the moz_historyvisit table. Is there an difference between these two values? The problem is, that I didn't found any information about last_visit_date at all, but I found information about the value you used (visit_date in moz_historyvisit). If yours is the right one, I would prefer your way to extract the date and URL...

Unfortunely I can't test it out at the moment without the GNU date....

I'll take a look again if you don't have the solution till tonight (EU).

It's important to define a single space as a separator in SQLite, as that's the easiest way to use it with the while read-loop. If you need a different separator, add it in the echo instead.

And no, I don't really know the databases, I just used a table that seemed right. Better check if the values you get are/could be correct.

@Franklin52

Hey Franklin,

thanks for your help. I found the mistake... In your post, there's a blank between @ "substr($10,1,10), so the timestamp did not stand right behind the @-Symbol... Again, thanks for your help... :slight_smile:

@pludi

I also wanna thank you, maybe I can use your version somehow... :slight_smile:

I couldn't test the command, but glad to know you get it working now. :slight_smile:

Regards