I'm trying to crudely hack my way through some data processing.
I have file.txt with around 17,000 lines like this:
ACYPI002690-PA.aa.afa.afa.trim_phyml_tree_fullnames_fullhomolog.txt 3 72 71
ACYPI002690-PA.aa.afa.afa.trim_phyml_tree_fullnames_fullhomolog.txt 97 111 71
ACYPI003779-PA.aa.afa.afa.trim_phyml_tree_fullnames_fullhomolog.txt 7 66 66
The first column corresponds to file names, the second to line numbers. I am running this code to generate the text of a command that I need to execute for each of these lines (I don't think the code matters for this question but just in case...).
awk '{print $1,$2}' file.txt | sed 's/\(A.*txt\) \([0-9][0-9]*\)/cat \1 | awk "NR==\2" | grep -Eo "comp[0-9]{1,6}_c[0-9]{1,2}" | sort | uniq | wc -l/'
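To make that concrete, here is a small sketch of what the awk/sed step generates for the first sample line above. The variable names line and generated are only for this illustration and are not part of my actual script.

```shell
# First sample line from file.txt (copied from above).
line='ACYPI002690-PA.aa.afa.afa.trim_phyml_tree_fullnames_fullhomolog.txt 3 72 71'

# Same awk/sed step as above, applied to that single line.
generated=$(printf '%s\n' "$line" \
  | awk '{print $1,$2}' \
  | sed 's/\(A.*txt\) \([0-9][0-9]*\)/cat \1 | awk "NR==\2" | grep -Eo "comp[0-9]{1,6}_c[0-9]{1,2}" | sort | uniq | wc -l/')

printf '%s\n' "$generated"
# prints:
# cat ACYPI002690-PA.aa.afa.afa.trim_phyml_tree_fullnames_fullhomolog.txt | awk "NR==3" | grep -Eo "comp[0-9]{1,6}_c[0-9]{1,2}" | sort | uniq | wc -l
```

So each generated line is a complete pipeline stored as plain text, not a single program name with arguments.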
When I try to run these commands using xargs
awk '{ if ($4 >= 10) print $1,$2}' file.txt | head -5 | sed 's/\(A.*txt\) \([0-9][0-9]*\)/cat \1 | awk "NR==\2" | grep -Eo "comp[0-9]{1,6}_c[0-9]{1,2}" | sort | uniq | wc -l/' | xargs -0 -s 5000 bash -c
it works for a small number of lines (here limited by head -5). But when I run it on the entire file, I get this error:
awk '{print $1,$2}' file.txt | sed 's/\(A.*txt\) \([0-9][0-9]*\)/cat \1 | awk "NR==\2" | grep -Eo "comp[0-9]{1,6}_c[0-9]{1,2}" | sort | uniq | wc -l/' | xargs -0 -s 5000 bash -c
xargs: insufficient space for argument
I also tried saving the commands to a file (commands.txt) using the above script, then executing them with a while loop:
while read -r line; do command " $line"; done <commands.txt
But I get an error like this for each command:
-bash: cat ACYPI002749-PA.aa.afa.afa.trim_phyml_tree_fullnames_fullhomolog.txt | awk "NR==3" | grep -Eo "comp[0-9]{1,6}_c[0-9]{1,2}" | sort | uniq | wc -l: command not found
Any ideas how I can get this done?
EDIT: I realized I can get this done using source but I'm still interested to know what's wrong with the above approaches.
source commands.txt
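In case it helps anyone reproduce this, here is a minimal sketch of the last two observations. It uses a made-up file demo.txt in a temporary directory (not one of my real files), and the names tmpdir, loop_status, loop_errors.txt and source_output are just for this example. I use ". ./commands.txt" here, which as far as I know is the portable spelling of source.

```shell
# Work in a throwaway directory so nothing real is touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical stand-in for one tree file: line 2 holds three IDs, two distinct.
printf 'x\ncomp1_c0 comp2_c1 comp1_c0\n' > demo.txt

# One generated command, in the same shape as the lines in commands.txt.
printf '%s\n' 'cat demo.txt | awk "NR==2" | grep -Eo "comp[0-9]{1,6}_c[0-9]{1,2}" | sort | uniq | wc -l' > commands.txt

# Attempt 1: the while loop from above. The whole line is passed to
# `command` as one word, so the shell looks for a program with that name.
while read -r line; do command " $line"; done < commands.txt 2> loop_errors.txt
loop_status=$?   # exit status of the last `command` call

# Attempt 2: sourcing the file, which parses each line as shell code.
source_output=$(. ./commands.txt)

printf 'loop exit status: %s\n' "$loop_status"
printf 'sourced output: %s\n' "$source_output"
```

On my understanding, the loop ends with exit status 127 (command not found), while sourcing the file runs the pipeline and prints the count of distinct IDs.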