As you can see, there are many lines in the file (in this case 30000). Because of that, I'm using a trick to download many URLs simultaneously with this:
cat url.list | xargs -n 1 -P 10 <<MAGIC COMMAND THAT WILL SAVE ME>>
The problem is that I'd like to rename each output file with the value of the name field (1.html, 2.html, ..., 30000.html, etc.) and use curl to limit the size of each file to 50KB. So the curl command should be something like:
curl -r 0-50000 -L $URL -o $filename.html -a $filename.log
How can I get this done?
I can parse the output of the pipe with echo $URL | sed -n -e 's/^.*name=//p', but I don't know how to use this on the same line to capture the output of a pipe in two variables ($URL and $filename).
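To illustrate what I mean, here is the sed extraction on a single sample line (the URL layout is an assumption: the name= field is the last query parameter):

```shell
#!/bin/sh
# Illustrative line from url.list; the real file has 30000 of these.
# Assumption: the name= field is the last query parameter of the URL.
URL='http://domain.com/teste.php?x=a&y=b&name=1'
filename=$(printf '%s\n' "$URL" | sed -n -e 's/^.*name=//p')
echo "$filename"   # prints: 1
```

What I can't figure out is how to do this and the curl call in one pipeline fed by xargs.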
while IFS="?&=" read URL X X X X X FN REST; do echo $FN, $URL; done <url.list
1, http://domain.com/teste.php
2, http://domain.com/teste.php
, ...
30000, http://domain.com/teste.php
The Xes are dummy variables. Instead of the echo, put in your magic command. There have been threads here on "parallel" execution with some tricks; use the forum's search function.
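For example, the echo could give way to a curl invocation like the sketch below. Note that the IFS split strips the query string from $URL (as the sample output above shows), so this version keeps the whole line and extracts the name with a parameter expansion instead, assuming name= is the last query parameter as in your samples. The echo in front of curl makes it a dry run; remove it to download for real.

```shell
#!/bin/sh
# Sample input matching the thread's URL shape (illustrative data).
cat > url.list <<'EOF'
http://domain.com/teste.php?x=a&y=b&name=1
http://domain.com/teste.php?x=a&y=b&name=2
EOF

while read -r line; do
    FN=${line##*name=}          # value of the trailing name= field
    # Dry run: prints the command. Drop "echo" to really fetch.
    echo curl -s -r 0-50000 -L "$line" -o "$FN.html"
done < url.list
```

Run sequentially this is slow for 30000 URLs, which is where the xargs tricks from the other threads come in.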
Thank you so much for your help @RudiC , I'll try your tips tonight and post back here. I made some progress yesterday with this code and figured out how to use xargs to "parallelize" the curl jobs:
xargs -n 1 -P 10 curl -s -r 0-50000 -O < url.list
But the problem is that I can't rename the files the way I want. So what I did was cd into my destination directory and run the code above there. However, I noticed that when different URLs produce the same filename, the first file is overwritten by the last one. Because of that, if I want to keep a single destination directory, being able to rename the output is mandatory.
Thank you for your reply! I could not make this work. I tried writing a script just for this function and calling it, and tried putting the command "inline" inside a screen session, and it's always the same error:
xargs: fetch_urlxargs: fetch_urlxargs: fetch_url: No such file or directory: No such file or directory
export -f is a bash feature and I use it here to ensure the internal function fetch_url is exported to subshells. This is needed because xargs is a command external to the shell and runs the assembled commands in new shells.
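Here is a minimal, self-contained illustration of the pattern (the function body is a stand-in; the real fetch_url would call curl). Since export -f is bash-only, the demo is written to a file and run explicitly under bash:

```shell
#!/bin/sh
# Demo of the export -f pattern; export -f is a bash feature, so the
# snippet is saved to a file and executed with bash explicitly.
cat > demo.sh <<'EOF'
fetch_url() {
    # stand-in body; the real function would call curl here
    echo "would fetch: $1"
}
export -f fetch_url

# xargs starts a fresh bash per argument; the exported function is
# visible there, and the argument arrives as $0 of the -c script.
printf '%s\n' url1 url2 | xargs -n 1 -P 2 bash -c 'fetch_url "$0"'
EOF
bash demo.sh
```

Without the export -f line, each bash that xargs spawns has never seen fetch_url, and you get exactly the "No such file or directory" error you quoted.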
I assumed, since you were using GNU xargs (the -P feature is a GNU extension), that you were also using the bash shell. I've updated my original post to specify the required shell, and this may be all you need to do to get your version working.
However, if you do not wish to use bash, you could put your function in an external script so that it can be called from xargs, for example:
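A sketch of that external-script approach (the script name and the name= extraction are illustrative and should be adapted to your URL layout):

```shell
#!/bin/sh
# Create an external fetch_url.sh that any POSIX shell can run;
# no exported function is involved, so export -f is not needed.
cat > fetch_url.sh <<'EOF'
#!/bin/sh
URL=$1
# Assumes name= is the last query parameter, as in the thread's samples.
FN=$(printf '%s\n' "$URL" | sed -n -e 's/^.*name=//p')
curl -s -r 0-50000 -L "$URL" -o "$FN.html"
EOF
chmod +x fetch_url.sh

# xargs can then invoke the script directly, e.g.:
#   xargs -n 1 -P 10 ./fetch_url.sh < url.list
```

Because the script is a real file on disk, the fresh shells that xargs spawns can find it by path, which avoids the export -f requirement entirely.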