Copy specific files and count them - not as easy as it seems!

Hi all:

Here's my dilemma: I need to identify files of a specific type, copy them to a new location while preserving the original file attributes (date, time, full path, etc.), and at the same time capture the number of files found in a variable for later reporting.

Here's where I am so far:

find . -type f -iname '*.xls*' | tee wc -l | cpio -dumpv /destination_path

I know that to capture the output of wc -l as a variable, I need to enclose the command in backticks:

xlscnt=`find . -type f -iname '*.xls*'| wc -l`

The problem comes when I try to tee the output of the find command into both wc -l and cpio -dumpv. I can do one or the other, but not both.

Sure, I can do a separate find on the output files and run a wc on that, but that's duplicated effort. There's got to be a way to do this!

The OS is Mac OS X, though that shouldn't matter; I'm doing this in bash.

I now defer to people much smarter than I am, and look forward to your assistance!

Bash, and ksh on some systems, can do this:

find . -type f -iname '*.xls*' | tee >(
  cpio -dumpv /destination_path
 ) | wc -l | read cnt
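
One caveat, offered as a hedged sketch rather than a correction: by default bash runs every stage of a pipeline in a subshell, so a variable set by a trailing read is gone once the pipeline ends, whereas ksh runs the last stage in the current shell. The sample file names below are made up for illustration. Command substitution is the usual way to keep the count in bash.

```shell
#!/bin/bash
# Made-up sample names stand in for the output of find.
cnt=unset

# In bash the trailing read runs in a subshell, so this assignment
# does not survive past the end of the pipeline.
printf '%s\n' a.xls b.xlsx c.XLS | wc -l | read cnt
echo "after pipe-read: $cnt"

# Command substitution captures the count in the current shell.
cnt=$(printf '%s\n' a.xls b.xlsx c.XLS | wc -l)
cnt=$((cnt))   # strip the padding some wc implementations add
echo "after command substitution: $cnt"
```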

Thanks for the assist! It got me a bit further along.

I ended up with the following:

xlscnt=`find . -type f -iname '*.xls*' | tee >(cpio -dumpv $outpath) |wc -l`

From a command line, this copies out the xls files and stores the count from wc -l in the variable xlscnt.

The problem now is, this works from a command line prompt, but it doesn't work in a bash script. What I get in the bash script (named myscript.sh) is the following:

myscript.sh: command substitution: line 20: syntax error near unexpected token `('
myscript.sh: command substitution: line 20: `find . -type f -iname \'*.xls*\'|tee >(cpio -dumpv $outpath)|wc -l'

Line 20 of myscript.sh is as follows:

xlscnt=`find . -type f -iname '*.xls*'|tee >(cpio -dumpv $outpath)|wc -l`

I want to capture the variable xlscnt for inclusion in a report later.

One step closer.........

#!/bin/bash
# tested on bash 4
shopt -s globstar
shopt -s nullglob
numfiles=0
for files in **/*.xls
do
    cp -p "$files" /destination
    ((numfiles++))
done
echo $numfiles


Thanks, bash-o-logist!
I tried your solution, but got an error on line 3, the shopt command.

My overall script asks the user for a source path and a destination path, then reads files from the source and writes them to the destination. I tried to use cp before, but since several of the source paths contain blank spaces, I was getting errors because cp couldn't interpret the space.

Ultimately, I'll be expanding my script to allow for capturing other types of files too, with a count for each file type. That's mainly why I was trying to accomplish this using the method I posted as line 20 previously.
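
For what it's worth, here is a minimal sketch of one way to get a per-type count with a single find pass, without globstar (which needs bash 4; the bash shipped with Mac OS X is most likely 3.2, which would explain the shopt error). The function name copy_and_count, the variable copied, and the demo file names are all hypothetical, not something from the posts above. It uses cp -p rather than cpio, and quotes every expansion so that paths with blanks stay in one piece.

```shell
#!/bin/bash
# Hypothetical helper (not from the thread): copy files whose names match
# a pattern from the current directory tree into dest, keeping relative
# paths and timestamps, and leave the number copied in $copied.
copy_and_count() {
    pattern=$1
    dest=$2
    copied=0
    list=$(mktemp) || return 1
    # Run find once; the saved list drives both the copy and the count.
    find . -type f -iname "$pattern" > "$list"
    while IFS= read -r file; do
        # Double quotes keep a name containing blanks as one argument.
        mkdir -p "$dest/$(dirname "$file")" &&
            cp -p "$file" "$dest/$file" &&
            copied=$((copied + 1))
    done < "$list"
    rm -f "$list"
}

# Demo on throwaway directories with made-up file names.
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir -p "$src/my docs"
touch "$src/my docs/a.xls" "$src/b.XLSX" "$src/notes.txt"

cd "$src" || exit 1
copy_and_count '*.xls*' "$dst"
xlscnt=$copied
echo "xls files copied: $xlscnt"
```

Calling the function again with another pattern (say '*.doc*') and saving $copied under a different name would give one count per file type. The loop is fed by a redirect rather than a pipe, so the counter is updated in the current shell. File names containing newlines are not handled by this sketch.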

If it works on the bash command line but not in a bash script, is the script really being interpreted by bash and not sh? The script must be made executable with chmod, and its first line must be #! followed by the correct path to bash; see man execvp.
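
As a rough illustration of that point, the sketch below writes a tiny script (the name demo.sh and its contents are made up) that uses process substitution, makes it executable, and runs it directly so that the #! line picks the interpreter. It assumes bash lives at /bin/bash. Running the same file as "sh demo.sh" ignores the #! line, and an sh that is not bash, or an older bash started in POSIX mode, may then reject the >( ... ) syntax.

```shell
#!/bin/bash
# Create a throwaway script that relies on a bash-only feature.
demo=$(mktemp -d)/demo.sh
cat > "$demo" <<'EOF'
#!/bin/bash
# >( ... ) is process substitution, a bash extension.
count=$(printf '%s\n' a.xls b.xls | tee >(cat > /dev/null) | wc -l)
echo $((count))
EOF

# Without the execute bit, running the file directly fails with
# "Permission denied".
chmod +x "$demo"

# Executed directly: the kernel reads the #! line and starts /bin/bash.
result=$("$demo")
echo "direct execution counted: $result"
```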

I prefer "script_chain | read var_name" over "var_name=$( script_chain )" over "var_name=`script_chain`"; it just flows left to right, with no unnecessary nesting.


That was the final step I needed, DGPickett! Making the script executable took care of it.

Thanks for the assist, I really appreciate it!

This is very central to the whole UNIX thing, and many false rumors float around it. A script is a pretty fully qualified executable in UNIX, compared to other languages. You are allowed one argument on the #! line, BTW, for interpreters like sed and awk that need -f or similar to accept commands from a file.

@ DGPickett, could you explain what this construct does,

Relative to the code posted. I just don't get it.

TIA.

There are three ways to get stdout strings into a variable. Back-quotes are the oldest, but I like them least, as there is no symmetry check in vi and no easy nesting. So, unless I am writing for Bourne sh, I never use '`'.

The $(command) introduced with ksh allows vi % symmetry checking and allows nesting. (Be careful to use balanced paren pairs (...) in 'case $var in' cases!)
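
A small sketch of the nesting difference, using a made-up path: with back-quotes the inner pair has to be escaped, while $( ) nests as written.

```shell
#!/bin/bash
# Hypothetical path, for illustration only.
path='/data/reports/2011/budget.xls'

# Back-quotes: the inner pair must be escaped with backslashes.
parent_old=`basename \`dirname "$path"\``

# $( ): nests without escaping, and vi's % can match the parentheses.
parent_new=$(basename "$(dirname "$path")")

echo "$parent_old $parent_new"
```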

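However, the simplest thing for simple values (single-line, with no need to preserve whitespace) is just '| read var', which introduces no nesting. Note that this relies on the last pipeline stage running in the current shell, as in ksh; bash by default runs it in a subshell, so the variable is lost when the pipeline ends.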

To turn a list on stdin into a list of command-line arguments: '| xargs -n999 command'

To process each list item individually: '| while read var ; do ... ; done'
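
A minimal sketch of the difference between those two list idioms, with made-up names standing in for real find output and echo standing in for the real command:

```shell
#!/bin/bash
# Made-up list; in practice this would be the output of find.
names() { printf '%s\n' one.xls two.xls three.xls; }

# Batch: xargs appends up to 999 items to a single command invocation.
batch=$(names | xargs -n999 echo copying)
echo "$batch"

# Per item: the loop body runs once for every line of input.
per_item=$(names | while read -r name; do echo "copying $name"; done)
echo "$per_item"
```

The xargs form starts far fewer processes, while the while read form allows arbitrary per-file logic. Plain xargs splits on whitespace, so names containing blanks need extra care there.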