Piping commands using xargs

Need help in piping commands using xargs

I have several .tar.gz files, and I need to list the contents of one subdirectory inside each of them.
For example,

a.tar.gz
b.tar.gz
c.tar.gz

The following command works great for each .tar.gz file but it's a pain to run the tar command for each file.

tar -tf a.tar.gz|grep folder1/folder2

I tried this command and failed.

ls *.tar.gz|xargs tar -tf|grep folder1/folder2

I got the error message "tar: a.tar.gz: Not found in archive" for each .gz file.

What do I need to fix to run this command successfully?
Or do I need to use a different command to process all the .tar.gz files?

Thank you.

I'm afraid there is more than one erroneous approach in the above:

  • the files seem to be gzipped (judging by their ending) - gunzip them before presenting them to tar. In fact, I'm slightly surprised that your single manual command runs error free...
  • Shouldn't there be a message like "tar: This does not look like a tar archive"?
  • the -f option takes just one single file (except for a multi-volume archive which I doubt you have here)

You might want to try the -n1 option to xargs and see how far you get.

Perhaps something like this might do it:-

for targz in *.tar.gz
do
   tar -tf "$targz"
done | grep folder1/folder2

The problem you are seeing is that xargs passes all the names to a single tar, which reads them as -f first_file_listed item1_to_extract item2_to_extract, so you might get away with adding -n 1 to xargs like this:-

ls *.tar.gz|xargs -n 1 tar -tf|grep folder1/folder2

This will at least run tar for each file separately. I'm not sure how it will handle the pipe to grep, but it should work.
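
To see why -n 1 matters, it can help to put echo in front of tar so that xargs prints the command lines it would build instead of running them. This is only a minimal sketch: the three file names from the question are fed in as literal text, so no archives need to exist.

```shell
# The three names from the question, one per line. echo stands in for
# actually running tar, so this only prints the command lines xargs builds.
names='a.tar.gz
b.tar.gz
c.tar.gz'

# Without -n: a single tar invocation, so b.tar.gz and c.tar.gz would be
# taken as member names to look up inside a.tar.gz.
grouped=$(printf '%s\n' "$names" | xargs echo tar -tf)

# With -n 1: one tar invocation per archive.
separate=$(printf '%s\n' "$names" | xargs -n 1 echo tar -tf)

printf 'without -n 1:\n%s\n\nwith -n 1:\n%s\n' "$grouped" "$separate"
```

The first form prints one command line with all three names on it; the second prints three command lines with one name each.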

I hope that these help. If not, please run it with debug on your shell (i.e. set -x first) and paste the output in CODE tags. It would help if you have a small set of small (few members) tar files to keep the output manageable.

Kind regards,
Robin

Thank you, Robin.

I tried both examples you provided and they both list the folder content. The only issue is both examples only print out the results without displaying the input file name so I don't know which input file produces the results.

What needs to be modified to print out the input file name and the results?

Thank you again for your help.

You could try:-

for targz in *.tar.gz
do
   echo "$targz" >&2                    # Write to STDERR so it shows up on the screen
   tar -tf "$targz"
done | grep folder1/folder2

...or...

ls *.tar.gz|xargs -tn 1 tar -tf|grep folder1/folder2

The -t flag for xargs shows you what it is executing each time.
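
A small sketch of where the -t trace ends up, with echo standing in for tar and a scratch file (created only for this illustration) capturing STDERR. The trace goes to STDERR, so it never passes through a grep further down the pipe; the exact formatting of the trace lines may differ between xargs implementations.

```shell
# xargs -t writes each command line it runs to STDERR; the command's own
# output still goes to STDOUT. echo stands in for tar here.
trace=$(mktemp)    # scratch file, used only to capture STDERR

out=$(printf 'a.tar.gz\nb.tar.gz\n' | xargs -t -n 1 echo listing 2>"$trace")
trace_text=$(cat "$trace")
rm -f "$trace"

printf 'STDOUT (what grep would see):\n%s\n\n' "$out"
printf 'STDERR (the -t trace):\n%s\n' "$trace_text"
```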

Do either of these help?

Robin

Hi Robin,
With pipe buffering, I don't think there is any guarantee that the output from the echo or from xargs -t sent to STDERR won't appear on the screen before some output from grep of the previous archive. And the same thing could happen if you try to capture both STDOUT and STDERR and redirect them to a single output file.

To get the name of the archive being processed in the standard output stream and survive the grep in the pipeline, one might try something more like:

for targz in *.tar.gz
do
   echo "folder1/folder2 files in $targz..."   # Write to STDOUT so script output can be redirected
   tar -tf "$targz"
done | grep folder1/folder2

As RudiC mentioned, some versions of tar will get lost if given a zipped archive. But, assuming that the tar on april's system gunzips a file automatically if its name ends in .gz, this should work.


I tried both methods from Robin and they both worked. I see the file names and the grep results following each file name.

I tried Don's suggestion:

echo "folder1/folder2 files in $targz..."

That did not give me the file names. I only saw the grep results.

Thank you all for your help.

Note that if you change the pattern you're looking for in the grep, you also have to make a corresponding change to the first part of the string in the echo.
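
One way to avoid tying the echo text to the grep pattern is to run grep inside the loop and prefix each matching line with the archive name. This is a sketch rather than anything posted in the thread: the scratch directory, the sample archives and their member names are all made up, and it assumes a tar that decompresses gzip automatically when listing, as the commands above do.

```shell
# Build two tiny sample archives in a scratch directory (every name here
# is invented for the demonstration).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir -p src/folder1/folder2 src/other
: > src/folder1/folder2/wanted.txt
: > src/other/skipped.txt
tar -czf a.tar.gz -C src folder1 other
tar -czf b.tar.gz -C src other

# Prefix every matching member with the archive it came from, so the name
# travels with the data on STDOUT and survives later sed or grep stages.
result=$(
  for targz in *.tar.gz
  do
     tar -tf "$targz" | grep 'folder1/folder2' |
        awk -v name="$targz" '{ print name ": " $0 }'
  done
)
printf '%s\n' "$result"

cd / && rm -rf "$workdir"
```

Only a.tar.gz contains folder1/folder2, so every output line starts with that archive's name and b.tar.gz contributes nothing.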

I want to thank you all again for the replies. When I first read your comments about the standard output issues, I really didn't understand what you were talking about. Today I had to take the grep results and manipulate them with sed. I noticed that after the sed, the file names were not lined up in the right place. I re-read your replies many times and did a lot of debugging. I finally understand!!!

Don't know if it's the best way to fix the problem but it does the job:

for targz in *.tar.gz
do
   echo "$targz" >&1
   tar -tf "$targz" | grep folder1/folder2 >&1
done | sed ...

I wouldn't know to do any of these if you did not make the suggestions.
Thank you very much!!!

Glad you benefited from these forums, and hope you will in the future.

Two comments, though:

  • the >&1 is pointless, as it means "duplicate fd1 (file descriptor) from fd1", and echo writes to fd1 anyhow.
  • as sed has "grepping" capabilities by default, why don't you do everything needed in one sed command? If you need help with this, please post sample data and specific requirements.
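
As an illustration of the second point, here is a minimal sketch of sed doing grep's job. The listing is made-up text standing in for tar -tf output, and the second command is just one example of filtering and editing in a single step, not the poster's actual sed command.

```shell
# Sample listing standing in for `tar -tf` output (made-up member names).
listing='folder1/
folder1/folder2/
folder1/folder2/wanted.txt
other/skipped.txt'

# grep-style filtering: -n suppresses automatic printing, p prints matches.
# \|...| uses | as the regex delimiter, so the / in the path needs no escaping.
matches=$(printf '%s\n' "$listing" | sed -n '\|folder1/folder2|p')

# Filter and edit in one sed command: print only members below
# folder1/folder2/, with that leading path stripped.
stripped=$(printf '%s\n' "$listing" | sed -n 's|^folder1/folder2/\(..*\)|\1|p')

printf 'matches:\n%s\n\nstripped:\n%s\n' "$matches" "$stripped"
```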

When I started testing, the grep ... was placed after the done. As a result, I only saw the grep results without the filenames, so it appeared the filenames were "missing". I added the >&1 to make sure the filenames went to standard output because I didn't know why the names were "missing" at that time. I finally moved the |grep ... inside the loop and the names showed up in the output.

After reading your comments, I removed the >&1 and I still got the same output. Cool. Thank you.

Whoa, sed has grep capabilities??? Let me read about it first and if I still need help, I will post a new thread. Thank you!

Note that xargs reads items from standard input, separated by blanks or newlines, whether that input is redirected from a file or arrives through a pipe. The input could be a file you created or edited, or the output of an ls or find command. The first argument to xargs is the command you want to run, and any further arguments are fixed parameters of that command. The items read from standard input are then appended to that command line as additional arguments.

For example, if myfile.txt contains:
file1
file2
file3

you could look for the word sample in all three files, ignoring case with the line

xargs <myfile.txt grep -i sample
   --- and what would be executed is ---
grep -i sample file1 file2 file3
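
A runnable version of this example, as a sketch: it works in a scratch directory, and the contents of file1, file2 and file3 are invented so that two of the three files match.

```shell
# Recreate the example in a scratch directory (file contents are made up).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'A Sample line\n'  > file1
printf 'nothing here\n'   > file2
printf 'another SAMPLE\n' > file3
printf 'file1\nfile2\nfile3\n' > myfile.txt

# xargs appends the names read from myfile.txt to the grep command line,
# so grep runs once as: grep -i sample file1 file2 file3
found=$(xargs <myfile.txt grep -i sample)
printf '%s\n' "$found"

cd / && rm -rf "$workdir"
```

Because grep receives several file names at once, it prefixes each match with the file it came from, so this prints file1:A Sample line and file3:another SAMPLE.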


HTH