tar --exclude with curly braces

I'm having trouble understanding the exclude option in tar. From some web sites, it seems one is able to exclude several strings by enclosing them in curly brackets. However it seems to be "random" what gets excluded when using the curlies.

I've been using the --exclude-from=myfile option in a script, but I would like to inline myfile into a variable, e.g. exclude={this,that,the_other}, and pass it with --exclude=$exclude to keep the script self-contained.

Simple examples to illustrate:

Say I have directories C and D and want to make them into a tar file.

tar -cvf mytar.tar C D

This works.

Now for the sake of argument, I want to exclude C.

tar -cvf mytar.tar C D --exclude=C

This works as expected (C is excluded).

Now I bring in the curlies...

tar -cvf mytar.tar C D --exclude={C}

This does not exclude C, which is not what I expected.

If I try to exclude a subdirectory of C, it also doesn't work.

tar -cvf mytar.tar C D --exclude={C/subdir}

Nothing is excluded.

If I exclude both C and D, then it does work (both are excluded).

tar -cvf mytar.tar C D --exclude={C,D}

Any idea how I can get the same functionality as --exclude-from with --exclude?

Reading through the GNU tar (version 1.26) source, I can't see any intended support for the {} matching.

I'd say this is probably an unsupported "feature". You should probably stick with the documented method of using a separate --exclude argument for each pattern to exclude.

---------- Post updated at 11:32 AM ---------- Previous update was at 11:17 AM ----------

Thought: Could the shell be expanding your parameters?

$ echo {C}
{C}
$ echo {C,D}
C D

Chubler_XL's thought is right on the mark. The shell running the command supports brace expansion and it is generating multiple --exclude options.

When there's no comma between the braces, there's no brace expansion. When there's no brace expansion, the pattern passed to tar includes the braces themselves. Unless there's a filename with literal braces, that exclude will not match anything.

To confirm: set -x; tar .... ; set +x

Regards,
Alister
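
To make the expansion concrete, here is a minimal sketch, assuming bash. show_args is a made-up helper that stands in for tar and prints each argument it receives on its own line, so you can see what tar would actually be handed.

```shell
#!/bin/bash
# Hypothetical stand-in for tar: print every argument on its own line.
show_args() { printf '<%s>\n' "$@"; }

# No comma inside the braces: no brace expansion, braces stay literal.
single=$(show_args --exclude={C})

# Comma inside the braces: the shell generates one word per alternative.
multi=$(show_args --exclude={C,D})

printf '%s\n' "$single" "$multi"
# prints:
# <--exclude={C}>
# <--exclude=C>
# <--exclude=D>
```

So tar never sees the braces in the {C,D} case, and in the {C} case it sees them as part of the pattern.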

Thanks, at least I'm not going crazy :wink: Searching around a little more, it seems there is a way to include multiple arguments. Basically you create a temp file with the patterns to exclude and point --exclude-from (or -X) to that file. See below.

#!/bin/bash

include="C D"
exclude="C xyz"

tar -cvf mytar.tar $include -X <(for i in ${exclude}; do echo $i; done)

---------- Post updated at 03:20 PM ---------- Previous update was at 02:23 PM ----------

My mistake, this script has problems with wildcard expansion. E.g. if you wanted to exclude *.tar then it would expand the asterisk prematurely rather than passing it to the tar command. Here is a workaround - not too pretty but could be worse.

include="C D"
exclude=("C" "*.tar") # note this is an array now

mycommand="tar -cvf mytar.tar $include"
for i in `seq 1 ${#exclude[*]}`; do mycommand=$mycommand\ --exclude="${exclude[$i-1]}"; done

eval $mycommand

What is the eval for?

Eval is to be avoided. If someone puts `rm -Rf ~/` in your filenames, eval will execute that.

Sure, just use multiple --exclude options.

It doesn't make much sense to resort to a dynamic, command-building process when the "variable" is constant data hardcoded into the script. Why not just add one --exclude pattern per pattern? Your script will be simpler, more readable, more efficient, and safer.
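
As a rough sketch of the one-option-per-pattern approach (the directory tree and file names are invented for the demonstration, and the --exclude options are placed before the file names simply as a conservative choice):

```shell
#!/bin/bash
# Build a throwaway tree to archive; all names here are made up.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p C D/sub
touch C/c.txt D/keep.txt D/old.tar D/sub/deep.txt

# One --exclude per pattern. The wildcard is quoted so the shell
# passes it to tar untouched instead of expanding it.
tar -cf mytar.tar --exclude=C --exclude='*.tar' C D

# List what actually went into the archive.
listing=$(tar -tf mytar.tar)
printf '%s\n' "$listing"
```

The listing should contain only entries under D, without D/old.tar.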

Still, if you're determined to store exclusion patterns in a variable, the following is much less complicated:

include='C D'
exclude='C
*.tar'

tar -cvf mytar.tar $include -X <(cat <<EOX
$exclude
EOX
)

This does not work with the only version of GNU tar that I have (an elderly 1.16, circa 2006). strace shows tar closing all file descriptors in the range 3 to 1023, sabotaging the shell's (bash) attempt to pass the cat process substitution on fd 63.

However, although I have not tested it, it should work with newer versions of GNU tar. A pair of 2007 commits removed both the function call and the function definition implementing the fd closures.

tar.git - GNU Tar - src/tar.c - (main): Don't call closeopen

tar.git - GNU Tar - src/misc.c - Don't include <sys/time.h>, <sys/resource.h>; no longer needed.

Regards,
Alister
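
A variation on the same -X idea, as a sketch only: it assumes bash and a tar that leaves inherited file descriptors open (see the caveat about old GNU tar above). The patterns sit in a bash array and printf writes them one per line; the sample tree is invented.

```shell
#!/bin/bash
# Throwaway tree for the demonstration; names are made up.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p C D
touch C/c.txt D/keep.txt D/old.tar

# Each array element is one pattern; quoting keeps wildcards unexpanded.
exclude=('C' '*.tar')

# printf emits one pattern per line; tar reads that list through -X
# from the file name the process substitution expands to.
tar -cf mytar.tar -X <(printf '%s\n' "${exclude[@]}") C D

listing=$(tar -tf mytar.tar)
printf '%s\n' "$listing"
```

Users who edit the script only have to touch the exclude=( ... ) line.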

The reason for using eval $mycommand is so that other users can edit the top of the script where exclude and include are defined rather than having to read through the code. Your second construct is elegant - but does not work on OpenSuse 11.4.

But why are you using it? What function does it perform that tar by itself couldn't? You don't even need eval to expand wildcards; the shell does that when evaluating any unquoted variable.

Try $mycommand without the eval.

The problem I encountered was that wildcards *were* expanded when I didn't want them to be. If exclude="C *.txt" then "for i in ${exclude}" expands the asterisk in *.txt to a list of .txt files in the current directory, rather than passing the string "*.txt" to tar. I couldn't see an easy way around it, hence the eval. Happy now or still hate it? :rolleyes:

As it is, "eval" won't help you avoid this problem; in fact it makes it worse. Without "eval" your command line is interpreted by the shell once, with it twice. I don't think "eval" should be avoided in general (as you seem to believe Corona688 thinks), but in this case it is simply the wrong tool, and probably a very dangerous one, as Corona688 correctly pointed out.

If you want to avoid interpretation of a certain string, use quoting; that is what it is for. For instance:

exclude="*.txt"

$ echo $exclude
a.txt b.txt c.txt
$ echo "$exclude"
*.txt
$ echo '$exclude'
$exclude
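
The same effect can be checked by counting arguments. This is only a sketch: count_args is a made-up helper, and the two .txt files are created just for the demonstration.

```shell
#!/bin/bash
# Work in an empty scratch directory with exactly two .txt files.
work=$(mktemp -d)
cd "$work" || exit 1
touch a.txt b.txt

exclude='*.txt'

# Hypothetical helper: report how many arguments it was given.
count_args() { echo "$#"; }

unquoted=$(count_args $exclude)    # glob expanded: one argument per file
quoted=$(count_args "$exclude")    # passed through as one literal argument

echo "unquoted: $unquoted, quoted: $quoted"
# prints: unquoted: 2, quoted: 1
```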

Now, let us apply this to your problem. I use ":" as a delimiter here, like in the PATH variable, because commas or similar characters could appear in filenames and i want to avoid that:

#! /bin/ksh

excludelist="foo:*.bar:baz.*"

exclude=""

while [ -n "$excludelist" ] ; do
     exclude="$exclude --exclude \"${excludelist%%:*}\""
     if [ "${excludelist%%:*}" = "${excludelist}" ] ; then
          excludelist=""
     else
          excludelist="${excludelist#*:}"
     fi
     # this is just to show how it works, can be removed:
     print - "excludelist: \"$excludelist\" args to tar: \"$exclude\""
done

tar <options> $exclude <more options>

I would like to say for the record that i don't think the contents of the variable excludelist should be changed by editing the script's text at all. Instead, a configuration file containing all the exclusions should be passed to the script. How to construct a syntax for such a configuration file and how to parse it i have described here, but the way of feeding the options to tar will still be as laid out here.

I hope this helps.

bakunin

After giving the problem some more thought, i think i know how the "eval" came to pass:

If you want to use wildcards with a special meaning and pass these to a utility without having them interpreted by the shell you have to use quotation:

opt="*"
utility $opt     # won't work: the shell expands the wildcard
utility "$opt"   # will work without expansion by shell

If you try to bind together several options into a single variable and put this variable into a quotation it will count as a single parameter:

opt="-a * -b ??"
utility $opt     # will become expanded, therefore useless
utility "$opt"  # will not become expanded, but will count as only *one* option

There is a typical gotcha, which i found in some IBM software not too long ago: mysteriously, the IBM utility made "/dev" run out of disk space, which is a very bad thing under AIX. How did that come about? Easy: first there was some line like:

tgt="/dev/null"
something > "$tgt"

But "something" also produced output on stderr, and someone, very economically, changed "$tgt" accordingly:

tgt="/dev/null 2>&1"
something > "$tgt"

This will not work any more, because now the output goes to a file named "null 2>&1", which is a regular file in "/dev", and so "/dev" will eventually run out of space.
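
The effect is easy to reproduce harmlessly in a scratch directory. In this sketch the file name out.log is made up and stands in for /dev/null:

```shell
#!/bin/bash
# Use a scratch directory instead of /dev.
work=$(mktemp -d)
cd "$work" || exit 1

# The redirection operator is stored inside the variable by mistake.
tgt='out.log 2>&1'

# The quoted expansion is a single word, so it is taken as a file name;
# the "2>&1" inside it is never interpreted as a redirection.
echo hello > "$tgt"

ls
# prints: out.log 2>&1
```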

If you now use "eval", this will restart the evaluation process and hence make the argument count correct again. Alas, this also restarts the evaluation process by the shell again and therefore wildcards will be expanded again too.

The only solution is to build your own option line in an evaluation-proof way, as i have done in my script sketch: i surround the delicate wildcards with quotation marks of my own to protect them when i send the command line unquoted to the shell (see the "\"", which creates literal quotation marks in the output).

I hope this helps.

bakunin
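
For bash users, one alternative worth sketching (it is not part of the ksh script above, and it assumes bash arrays) is to keep every option as its own array element. Each pattern then stays a separate argument and no wildcard is expanded, without eval and without embedded quotation marks. The sample tree and the names patterns and args are invented for the demonstration.

```shell
#!/bin/bash
# Throwaway tree for the demonstration; names are made up.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p C D
touch C/c.txt D/keep.txt D/old.tar D/notes.bar

# Colon-separated patterns, in the style of the excludelist variable.
excludelist='C:*.tar:*.bar'

# Split on ":" into an array; read does not perform filename expansion.
IFS=: read -r -a patterns <<< "$excludelist"

# One --exclude=PATTERN word per pattern, each a separate array element.
args=()
for p in "${patterns[@]}"; do
    args+=(--exclude="$p")
done

# "${args[@]}" expands to exactly one argument per element.
tar -cf mytar.tar "${args[@]}" C D

listing=$(tar -tf mytar.tar)
printf '%s\n' "$listing"
```

The listing should show only D/ and D/keep.txt.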

You're using eval to avoid expansion? That's the oddest thing I've heard this week.

It's not that I "hate it". It's that it's "such a bad idea that you're going to kick yourself for it when something bad happens".

I believe tar has a --excludefile option or something like it which would be a much more elegant solution without putting self-modifying code into eval and hoping nothing bad happens.

It does.

From post #1: "I've been using the exclude-from=myfile option in a script"

Regards,
Alister