Hi, Robert
Good that you have a solution. I am going to present another solution, one that does not use awk. I definitely recommend that you learn awk, especially if you are going to deal with columns (I usually call them fields).

If you wish to use only the command line, here is one approach. First, we ignore files that do not have exactly 6 lines. Then we ignore any remaining file that does not have a total of 24 fields, assuming 4 fields on each of the 6 lines. We check this, in turn, by noting that a line with 4 fields has 3 commas (separators), so 6 x 3 -> 18 commas for the whole file. The utilities we will use are wc, to count lines and characters, and tr, which can delete (and transform) characters. So consider a file that has your 6 lines, with 4 fields per line: we delete everything except the commas, then count those commas. Any file with a count other than 18 is ignored. Any filename that remains is collected into a text string (variable).
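To see the counting idea in isolation before the full script, here is a minimal sketch; it builds a throwaway 6-line, 4-field file with mktemp (a generated name, not one of your data files) and counts its lines and commas:

```shell
#!/usr/bin/env bash
# Minimal sketch: count lines with wc -l, count commas with tr + wc -c.
f=$(mktemp)
printf '%s\n' 1.1,1.2,1.3,1.4 2.1,2.2,2.3,2.4 3.1,3.2,3.3,3.4 \
              4.1,4.2,4.3,4.4 5.1,5.2,5.3,5.4 6.1,6.2,6.3,6.4 > "$f"
lines=$(wc -l < "$f")                   # expect 6 lines
commas=$(tr -d -c ',' < "$f" | wc -c)   # delete all but commas; expect 6 x 3 = 18
echo "lines=$lines commas=$commas"
rm -f "$f"
```

A well-formed file yields lines=6 and commas=18; any other pair of counts marks the file to be ignored.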
This script has a lot of other stuff in it to show the data files, results, etc., such as a debug function db. (To see what the results look like without debugging output, simply reverse the order of the 2 lines defining the db function.) Just concentrate on the ideas above and look at the files on which wc and tr operate. Ready?
#!/usr/bin/env bash
# @(#) s2 Demonstrate counting lines and fields, wc and tr.
# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
em() { pe "$*" >&2 ; }
db() { : ; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
C=$HOME/bin/context && [ -f "$C" ] && "$C" head wc tr
FILE=${1-data?}
pl " Input data files $FILE:"
head $FILE
pl " Line counts for data files $FILE:"
wc -l $FILE
pl " Results:"
l6x4=""
for item in $FILE
do
  if [ $(wc -l < "$item") != 6 ]
  then
    db " ignore file $item, not 6 lines"
    continue
  fi
  if [ $(tr -d -c ',' < "$item" | wc -c) != 18 ]
  then
    db " ignore file $item, not 18 commas"
    continue
  fi
  l6x4="$l6x4 $item"
done
db " list of files to copy: $l6x4"
pe " sample: cp $l6x4 some-directory"
exit 0
producing:
$ ./s2
Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 3.16.0-7-amd64, x86_64
Distribution : Debian 8.11 (jessie)
bash GNU bash 4.3.30
head (GNU coreutils) 8.23
wc (GNU coreutils) 8.23
tr (GNU coreutils) 8.23
-----
Input data files data?:
==> data1 <==
1.1,1.2,1.3,1.4
2.1,2.2,2.3,2.4
==> data2 <==
1.1,1.2,1.3,1.4
2.1,2.2,2.3,2.4
3.1
4.1
5.1
6.1
7.1
==> data3 <==
1.1,1.2,1.3,1.4
2.1,2.2,2.3,2.4
3.1,3.2,3.3,3.4,3.5
==> data4 <==
1.1,1.2,1.3,1.4
2.1,2.2,2.3,2.4
==> data5 <==
1.1,1.2,1.3,1.4
2.1,2.2,2.3,2.4
3.1,3.2,3.3,3.4
4.1,4.2,4.3,4.4
5.1,5.2,5.3,5.4
6.1,6.2,6.3,6.4
==> data6 <==
1.1,1.2,1.3,1.4
2.1,2.2,2.3,2.4
3.1,3.2,3.3,3.4
4.1,4.2,4.3,4.4
5.1,5.2,5.3,5.4
6.1,6.2,6.3;6.4 <- semi-colon is not a comma
==> data7 <==
1.1,1.2,1.3,1.4
2.1,2.2,2.3,2.4
3.1,3.2,3.3,3.4
4.1,4.2,4.3,4.4
5.1,5.2,5.3,5.4
6.1,6.2,6.3,6.4
==> data8 <==
1.1,1.2,1.3,1.4
2.1,2.2,2.3,2.4
3.1,3.2,3.3,3.4
4.1,4.2,4.3,4.4
5.1,5.2,5.3,5.4
6.1,6.2,6.3,6.4,6.5
==> data9 <==
1.1,1.2,1.3,1.4
2.1,2.2,2.3,2.4
3.1,3.2,3.3,3.4
4.1,4.2,4.3,4.4
5.1,5.2,5.3 <- only 3 fields: so 2 commas
6.1,6.2,6.3,6.4,6.5 <- but 5 fields: so 4 commas
-----
Line counts for data files data?:
2 data1
7 data2
3 data3
6 data4
6 data5
6 data6
6 data7
6 data8
6 data9
48 total
-----
Results:
db, ignore file data1, not 6 lines
db, ignore file data2, not 6 lines
db, ignore file data3, not 6 lines
db, ignore file data4, not 18 commas
db, ignore file data6, not 18 commas
db, ignore file data8, not 18 commas
db, list of files to copy: data5 data7 data9
sample: cp data5 data7 data9 some-directory
OK, so what's good here? It's fairly straightforward command-line stuff, although some of it, like the comparisons, may be new. That means you don't need to learn another language like awk or perl (but, of course, you probably should). Just simple utilities are used.
So what's bad? First, awk and perl can do all the work in a single process, whereas here each file causes a number of separate executions of wc and tr. For a small number of files, say a few hundred, that probably would not matter much, especially if you can get this script running quickly compared with the time it may take to get an awk script (or especially a perl script) running correctly. Secondly, we assumed your files are all syntactically correct, in that each line has 4 fields. If a file is not, like data9, the total comma count can still be 18, and that may cause an error down the line.
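For comparison, the same selection can be sketched in a single awk pass, so one process handles all the files; the sample files created here (a well-formed one and a short one, with made-up names) are just for illustration. Note it shares the caveat above: it totals fields per file, so a data9-style file still slips through.

```shell
#!/usr/bin/env bash
# Sketch: select files with exactly 6 lines and 24 comma-separated fields
# in one awk invocation, instead of running wc and tr per file.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 1,2,3,4 2,3,4,5 3,4,5,6 4,5,6,7 5,6,7,8 6,7,8,9 > good
printf '%s\n' 1,2,3,4 2,3,4,5 > short
keep=$(awk -F, '
  { lines[FILENAME]++; fields[FILENAME] += NF }
  END { for (f in lines)
          if (lines[f] == 6 && fields[f] == 24) printf "%s ", f }
' good short)
echo "files to copy: $keep"
```

Only the file named good survives the test, so keep holds just that name, ready for a cp as in the sample above.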
The latter issue could be addressed by looking at each line. Another utility (hefty though it might be) is datamash, which you can use to verify the number of fields in a file. For example:
$ datamash -t, check 4 fields <data9
line 5 (3 fields):
5.1,5.2,5.3 <- only 3 fields: so 2 commas
datamash: check failed: line 5 has 3 fields (expecting 4)
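If datamash is not available, a per-line check can also be sketched in plain shell: read each line, count its commas, and complain when the count is not 3. The file below recreates the data9 contents as a temporary file for illustration.

```shell
#!/usr/bin/env bash
# Sketch: verify every line has exactly 4 fields (3 commas), line by line.
f=$(mktemp)
printf '%s\n' 1.1,1.2,1.3,1.4 2.1,2.2,2.3,2.4 3.1,3.2,3.3,3.4 \
              4.1,4.2,4.3,4.4 5.1,5.2,5.3 6.1,6.2,6.3,6.4,6.5 > "$f"
ok=yes
n=0
while IFS= read -r line
do
  n=$((n + 1))
  c=$(printf '%s' "$line" | tr -d -c ',' | wc -c)
  [ "$c" -eq 3 ] || { echo "line $n has $((c + 1)) fields (expecting 4)"; ok=no; }
done < "$f"
echo "file ok: $ok"
rm -f "$f"
```

Run on this data it reports lines 5 and 6, much as datamash did, and the total-is-still-18 case no longer slips through.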
Because it is another tool that deals with fields, you may wish to look at it:
datamash command-line calculations (man)
Path : /usr/local/bin/datamash
Version : 1.2
Type : ELF 64-bit LSB executable, x86-64, version 1 (SYS ...)
Help : probably available with -h,--help
Repo : Debian 8.11 (jessie)
Home : https://savannah.gnu.org/projects/datamash/ (pm)
Home : http://www.gnu.org/software/datamash (doc)
The datamash package was in my repository for a machine like:
OS, ker|rel, machine: Linux, 3.16.0-7-amd64, x86_64
Distribution : Debian 8.11 (jessie)
datamash (GNU datamash) 1.2
Best wishes ... cheers, drl