I have large files of URLs, each ending in "|<number>", where the number is the Page Rank of the website, as in this example: http://www.machinokairo.com/2012/05/post-39.html|2
I am using "grep" to filter the URLs in a particular way: first, remove all those ending in "|0" and write the output to a file, then remove all those ending in "|1" and write the output to a new file, and so on up to "|5". Each pass removes a certain PR and leaves the rest in a separate file. For now I use the following commands to do that:
Thank you for your input.
The command works a bit differently from what I need: it
creates output files from sitelist_PR .txt to sitelist_PR9 .txt. I actually only needed up to PR6, but that is fine.
There are two things, however, that differ from the desired output:
each file contains only URLs with the same PR as in the filename, i.e. sitelist_PR1 .txt contains URLs with PR1 only; my goal was to remove those URLs and keep all the rest, higher than PR1, in this file;
when I look at the file names, I see a blank space before .txt.
---------- Post updated at 02:33 PM ---------- Previous update was at 02:29 PM ----------
Thank you for your efforts, but I really need a script or a one-line command. I have already tried the following:
#!/bin/bash
for PR in {0..5} ; do
    grep --invert-match "|${PR}$" sitelist.txt > sitelist_PR${PR}.txt
done
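
Note that the loop above drops only the lines ending in exactly "|${PR}", so sitelist_PR1.txt would still contain the PR0 URLs. If the goal is for each file to keep only URLs with a PR strictly higher than the number in its name, one possible variation is to exclude the whole range 0..PR with a bracket expression. This is only a sketch: the function name split_by_pr and its two arguments are made up for illustration, and it assumes the thresholds are single digits (0 to 9).

```shell
#!/bin/bash
# Sketch only: split_by_pr and its arguments are illustrative names,
# not something from the thread.
#   $1 = input file, one "url|PR" entry per line
#   $2 = highest PR to strip (single digit, defaults to 5)
split_by_pr() {
    local input=$1 max_pr=${2:-5}
    local pr
    for (( pr = 0; pr <= max_pr; pr++ )); do
        # "|[0-N]$" matches lines ending in |0, |1, ... |N, so inverting
        # the match keeps only entries ranked above N.
        # grep exits with status 1 when no line is left; that is not an
        # error here, so only statuses above 1 are treated as failures.
        grep --invert-match "|[0-${pr}]$" "$input" > "sitelist_PR${pr}.txt" \
            || [ "$?" -eq 1 ]
    done
}
```

Calling it as `split_by_pr sitelist.txt 5` should write sitelist_PR0.txt through sitelist_PR5.txt into the current directory, with no space before ".txt" because the file name is built inside a single quoted string.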