Understanding Xargs

I'm struggling to understand the man page entry about xargs in conjunction with the option -I. It states:

And the description reads:

Am I correct in interpreting that to mean that, unless specified otherwise, xargs -I can pass to the command following it no more than 5 arguments (a.k.a. "strings"), each of them containing an unlimited number of occurrences of the data to be replaced using that command?

Furthermore what is a string in UNIX?

Example:
pass a list of highest level objects of current directory to some command

ls . | xargs some_command

Does ls return one string comprising multiple lines? How many arguments does that count as? Or is one line = 1 string? 1 argument?

A string in *nix is a sequence of (not necessarily printable) characters terminated by a null (0x00) character.

xargs usually collects lines read from stdin into one or more long (influenced by several options) parameter lists and executes the command / utility one or more times with the respective parameter list. With the -I option you can define where in that command execution the data from stdin show up. I'd recommend you do some exercises and testing with some innocuous commands and the -t (verbose) option.
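As a minimal sketch of the kind of harmless exercise suggested above (the fruit names and the "fruit:" label are made up for illustration), the following compares how xargs groups the same three input lines by default, with -n 1, and with -I:

```shell
# Three input lines, fed to xargs on stdin.
input=$(printf 'banana\napple\npear\n')

# Default: all items are collected into one argument list, so echo runs once.
all_at_once=$(printf '%s\n' "$input" | xargs echo)

# -n 1: at most one item per invocation, so echo runs three times.
one_per_run=$(printf '%s\n' "$input" | xargs -n 1 echo)

# -I {}: one invocation per input line, with the line placed where {} appears.
placed=$(printf '%s\n' "$input" | xargs -I {} echo "fruit: {}")

printf '%s\n---\n%s\n---\n%s\n' "$all_at_once" "$one_per_run" "$placed"
```

Adding -t to any of these xargs calls prints each constructed command to standard error before it is run, which makes the grouping visible.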

If you need to filter files based on information only provided by "ls -l" (long list), you can do so but then remove everything through the last space with a sed command. Its output can then be passed to xargs, and any command can then do whatever needs to be done to the files.

I'm sorry, I haven't been here for a very long time, so I've misremembered many things, and the forum's very abstract, disorganized, cold interface confused me as to where to find a relevant section to post to. Sorry, anyway.

---------- Post updated at 11:11 AM ---------- Previous update was at 11:02 AM ----------

1. But do these lines form a string?

2. The problem for me lies in the distinction between the notions "string", "line", "argument", "occurrence" as per the cited manual page.

From xargs manual:

That said, what are the "strings as arguments" exactly? Is that like a list containing items? If yes, what are those items? Are they lines?

If yes, are occurrences contained in these lines?

I'd like to construct in my mind a sort of "object model" for any of the terms.

Now, these are very basic questions from quite a broad range of IT, and I'm not sure I can cover that to satisfaction. On top of that, there may be a language barrier, e.g. with "occurrence". I'd advise using a dictionary, as - to me - the meaning became immediately clear when seeing the translations. Please also consult introductory text books and / or man pages.

It might be worthwhile to internalize the concept of a string which you will encounter everywhere in IT (tools, databases, documents, files, ...) when needing to represent text. It can come in a variety of shapes, like fixed or varying length strings, zero terminated or with a leading length indicator, string constants, substrings, string concatenations, and what have you, and there are many tools, libraries, functions to handle them. Different digital items (numbers, logical values) can be output to screen as readable text representations only, not as the individual items themselves.

Then, there are text files, a loosely structured collection of (mostly) printable characters. In *nix systems, those consist of lines of characters terminated by a <new line> (\n, ^J, 0x0A) character. But this is not the only possible text representation. When reading a line from a file, you can put it into a single string variable, or split it into several substrings. If you do so by applying spaces and / or punctuation chars for separating, the substrings will be words. But any other separation is possible albeit not necessarily sensible. So, a line is sort of a superset of (a group of) strings.

A "command line" specifies a collection of a command name (perhaps including a path), zero or more options (with possible arguments), and zero or more parameters. Any of those is a (possibly one character) string, analysed by the command interpreter, and then supplied to the program being executed. Please be aware that the terms "argument" and "parameter" are not strictly distinguished and both are used loosely and interchangeably. (I neglected possible local variable assignments and redirections to avoid overcomplication.)
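To make "each argument is a string" concrete, here is a small sketch. show_args is a hypothetical helper written only for this illustration; it reports how many arguments the shell handed over and what each one contains:

```shell
# A tiny stand-in "program" (hypothetical helper): it reports how many
# arguments it received and prints each one in brackets.
show_args() {
    printf '%s arguments:' "$#"
    for arg in "$@"; do
        printf ' [%s]' "$arg"
    done
    printf '\n'
}

# Unquoted words are split on whitespace: three separate argument strings.
unquoted=$(show_args one two three)

# Quoting keeps "one two" together as a single argument string.
quoted=$(show_args "one two" three)

printf '%s\n%s\n' "$unquoted" "$quoted"
```

The same characters on the command line can therefore end up as a different number of arguments depending on how the command interpreter splits them.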

I'm not sure I got you right: there's absolutely no alleged "language barrier"; why the word "occurrence" caught your eye I didn't get either. That was not some kind of mistyping (which would be a common thing among both native and non-native English speakers even then and certainly would not be bad enough to be considered intolerable). I use dictionaries well enough when I deem it necessary. Thank you for your advice.

I think I made it unambiguously clear when I referenced man pages in all of my earlier posts. I suppose you have a solid tech background so I guess for you the language of man pages is anything but non-perceptible. However, for many people lacking that level of expertise, some phrases, terms, word choices, word combinations and the overall jargon those pages are written in feel very painful, and so do those "introductory books" written by the extra-class specialists, apparently aimed at individuals sharing the same level of knowledge, these specialists failing to render the material in a way more acceptable to a general audience; on the other hand, those few that attempt to do that often fall victim to an incoherent, inconsistent, over-simplified, too rushed manner of unfolding the instructional material. I suppose these forums were not designed to serve the concerns of a tiny stratum of élite, expertise-bearing IT specialists only. Everyone can read any manual or watch a tutorial but not everyone is guaranteed to understand it. That's why forums exist: for these persons to get the opportunity to raise their issues and expect their questions to be addressed. And if not, then what's the point of forums such as this one at all?

I thought (and I really meant to) I started my thread in "Beginners" section of this forum but for some reason it was moved to Shell and Programming though the very fact of someone asking about xargs by no means implies the person asking it is a shell programmer and neither am I. My level is basic as opposed to Advanced obviously suggested by that title.

That's the thing, what I asked was not to "cover to full satisfaction" but to give a short outline, a compact account of the notion of the string. I know what string is in AppleScript and the authors were smart enough to explain that, but in all the tutorials on Unix (applicable to bash shell on the Mac) I watched, their authors simply skip this step as overly obvious (for their taste) and thus redundant (for their taste).

Ok, that basically conforms to my understanding; however, some clarification is needed regarding standard input and output first and foremost:

  1. Lines are delimited by using one of those new line characters.
  2. Strings are determined by splitting one string of the read data (which to put it simply is an entire output of a read command - in this case standard output) to many subsets/substrings.
  3. Any punctuation, whitespace char, save new line/zero, sets new string which is a word (because these are obviously the default delimiters).
  4. Any other separator also sets a new string.

That's interesting for I could see it conversely, based on points 1-4: a string is a superset of lines since a line of characters delimited by an appropriate symbol constructs a string because every non-separated character or/and word is a part of a bigger string item (that could be turned into a substring), especially if reading a text file returns none other than string. No? I'm thinking in terms of stdin and stdout.

Returning to the main topic:

I understand that xargs only passes arguments to another command. What do they mean by replstr and replacement arguments (no more than 5 of replacement arguments if we use -I option without specifying that -R sub option)? What is replaced with what and where? Can I replace certain text items of input? Wouldn't it be wiser to use sed?

I even saw this (read a file whose contents are a list of fruits and prepend some words (always the same, like "I like") to every line containing the name of a fruit):

cat fruits.txt | xargs -I {} echo "I like {}"

, where {} is some placeholder (and in maths and AppleScript "{}" actually means a multitude and a list respectively and these two are related notions).

The result (stdout):

I like banana
I like apple
I like pear
..etc (10 more fruit names)

What's the technique? I see that -R is not specified, what does that tell us? "Specify max number of arguments that -I will do replacements in": yeah, but what's the class of a value R takes?

In the way the description is put it's not immediately clear for me how to use these options.

P.S. The first 2 screenshots of xargs man pages as shown in Dash for Mac.

A single sentence in a man page doesn't stand alone. If you look at the SYNOPSIS section of the xargs man page on macOS you'll see:

SYNOPSIS
     xargs [-0opt] [-E eofstr] [-I replstr [-R replacements]] [-J replstr]
           [-L number] [-n number [-x]] [-P maxprocs] [-s size]
           [utility [argument ...]]

Every time you see "replstr" on this man page as underlined text, it will be referring to the option-argument given to a -I or a -J option. Every time you see "replacements" on this man page as underlined text, it will be referring to the option-argument to the -R option. As noted on the description of the -I option, if no -R option is present on the command line, the value of "replacements" defaults to 5.

In the command line:

cat fruits.txt | xargs -I {} echo "I like {}"

or the better command line (using fewer system resources, running faster, and producing the same output):

xargs -I {} echo "I like {}" < fruits.txt

there are two occurrences of the string {} : the first occurrence is as the option-argument to the -I option and the second is in the quoted argument to the echo command. With the -I {} option, each time {} appears (or occurs) in the command line arguments that xargs will be executing, it will be replaced by the contents of the current line of input read from standard input. So, if you get:

I like banana
I like apple
I like pear
...

from one of the above command lines, you'll get:

I like banana, I really like banana
I like apple, I really like apple
I like pear, I really like pear
...

from the command line:

xargs -I {} echo "I like {}, I really like {}" < fruits.txt

in other words, each occurrence of the -I option-argument in the command line arguments will be replaced by a copy of the current input line.

Does this help?

That's the problem: I know that instead of {} we could use any other symbol which would stand for the same thing. I know that xargs is essentially a loop that collects arguments and makes a utility execute with them until the original input is exhausted. But what does {} in the command line actually stand for? Is it empty space? If yes, is that the empty space that xargs is watching for in input? Or maybe it's a variable?

Also.

1. Could it be that in this case using xargs is equivalent to Find-And-Replace? Can we replace any word of the input with whatever comes to mind?

2. Does -R mean the number of arguments in which replacement occurs?

3. Can we replace any input with whatever we want with -I ?

Could you please give an example of using xargs with both -I and -R options?

I don't understand your question. You chose to use the string {} when you used it as the option-argument to the -I option. You could have chosen any string you wanted. You chose {} .
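A short sketch of that point: the placeholder is just a string you pick. FRUIT below is an arbitrary name chosen for this illustration, and the fruit list is made-up sample input:

```shell
# Sample input: one fruit name per line.
fruits=$(printf 'banana\napple\npear\n')

# Placeholder chosen as {} ...
with_braces=$(printf '%s\n' "$fruits" | xargs -I {} echo "I like {}")

# ... and the same command with the placeholder chosen as FRUIT instead.
with_word=$(printf '%s\n' "$fruits" | xargs -I FRUIT echo "I like FRUIT")

printf '%s\n---\n%s\n' "$with_braces" "$with_word"
```

Both runs produce the same three lines, because xargs only looks for whatever string was given to -I.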

No! There is no Find operation here. You are constructing a command line to be executed by xargs for each line it reads from its standard input. In each of the first replacements arguments (given to the command in your invocation of xargs) that contain replstr ( {} in your example), every occurrence of replstr will be replaced by the contents of one line read from standard input.

The -R option-argument specifies the maximum number of arguments containing replstr in which replstr will be replaced by the contents of an input line.

NO! The xargs utility doesn't replace any input! It replaces the replstr -I option-argument string found in command line arguments being used to construct commands to be invoked by xargs in up to replacements arguments that contain replstr.
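To contrast the two ideas, here is a sketch using made-up sample input: sed changes the text of the input itself, while xargs passes each input line through untouched and only fills in the placeholder in the command it builds.

```shell
# Sample input: one fruit name per line.
fruits=$(printf 'banana\napple\npear\n')

# sed edits the input text itself: "apple" becomes "kiwi".
edited=$(printf '%s\n' "$fruits" | sed 's/apple/kiwi/')

# xargs leaves every input line as it is and substitutes it for {} in the
# echo command line.
built=$(printf '%s\n' "$fruits" | xargs -I {} echo "I like {}")

printf '%s\n---\n%s\n' "$edited" "$built"
```

If the goal is to change words inside the input, sed (or a similar tool) is the right choice; xargs is for building and running commands from the input.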

If the file in.txt contains:

a
b c d
e

then the output produced by the command:

xargs -IX -R3 echo 1 X 2 XX 3 "X X" 4 X < in.txt

will be:

1 a 2 aa 3 a a 4 X
1 b c d 2 b c db c d 3 b c d b c d 4 X
1 e 2 ee 3 e e 4 X

Note that with -R3 , the X in the last argument to be passed to echo is not replaced by the contents of an input line because all occurrences of X have already been replaced in three earlier arguments. With the command:

xargs -IX -R1 echo 1 X 2 XX 3 "X X" 4 X < in.txt

the output would be:

1 a 2 XX 3 X X 4 X
1 b c d 2 XX 3 X X 4 X
1 e 2 XX 3 X X 4 X

replacing replstr ( X in this case) only in the first argument that contains replstr.

I think there's a simpler way to solve this problem.

The -I option exists to fill a gap in xargs's features.

xargs lets you do this:

echo a b c | xargs echo -option # xargs runs echo -option a b c

But what if you wanted to run echo a b c -option ? The basic syntax has no way to do that: it always puts the input after the fixed arguments, never before them.

But the -I syntax lets you do that:

echo a b c | xargs -I %% echo %% -option # Replaces %% with the input line, so xargs runs echo "a b c" -option

That is all -I does.
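The difference in argument placement can be checked with a sketch like this one. END is an arbitrary marker used here in place of -option, so that echo does not try to interpret it as an option:

```shell
# Basic syntax: the input is added after the fixed argument END.
appended=$(echo a b c | xargs echo END)

# With -I, the input goes wherever the placeholder %% sits, here before END.
positioned=$(echo a b c | xargs -I %% echo %% END)

printf '%s\n%s\n' "$appended" "$positioned"
```

The first command prints the marker followed by the input; the second prints the input followed by the marker.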

As for what a line means to xargs? Usually, nothing. It splits text apart on any whitespace and only cares about lines in particular if you tell it to (for example with -I or -L).
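A small sketch of that behaviour, using one made-up input line that contains spaces. The square brackets are only there to make the argument boundaries visible:

```shell
# One input line containing three whitespace-separated words.
line='b c d'

# Default splitting: each word is a separate item, so with -n 1 echo runs
# once per word.
split_on_whitespace=$(printf '%s\n' "$line" | xargs -n 1 echo)

# With -I, the whole line is treated as one unit and substituted for {}.
whole_line=$(printf '%s\n' "$line" | xargs -I {} echo "[{}]")

printf '%s\n---\n%s\n' "$split_on_whitespace" "$whole_line"
```

Without -I the line boundary plays no role; with -I the line is the unit that gets substituted.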

That's how I ultimately came to the understanding: