Recursive Cat?

extnic · August 31, 2012, 12:49pm

I have a directory, say "DirA" with a bunch of subfolders 'Sub1, Sub2... etc.' Each subfolder has a number of csv files.

I want to delete the top ten rows of each csv file, then concatenate them and save the output as a csv file with the same name as the subfolder.

My code currently looks like this, but I really don't know what I'm doing at all.

for dir in ./; do 
	cat **/*.csv > $dir.csv
done

Thanks,
extnic

Corona688 · August 31, 2012, 1:06pm

What do you mean by 'delete the top ten rows'? Do you mean physically remove the rows from the file, or just not put them in your output?

* matches anything, so two *'s in a row is redundant (unless your shell has a recursive ** option, such as bash's globstar, turned on).

If there's few enough files, it could be as simple as

tail -n +11 folder/*/*.csv > output

jim_mcnamara · August 31, 2012, 1:07pm

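cd DirA
# List only the immediate subfolders; a plain "find . -type d" would also
# list "." itself (-mindepth/-maxdepth are GNU/BSD find extensions)
find . -mindepth 1 -maxdepth 1 -type d > /tmp/mydirs.txt
while read -r dir
do
     cd "$dir" || continue
     for csv in *.csv
     do
           # FNR restarts at 1 for every file, so this skips its first 10 lines
           awk 'FNR>10' "$csv" >> big.csv
     done
     mv big.csv "${dir}.csv"
     cd ..
done < /tmp/mydirs.txt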

This just builds the combined csv files, leaving out the top 10 lines of each input; it does not whack those lines off the original csv's.
The combined csv for each subfolder ends up inside that subfolder.
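
As a side note on the awk part: awk can take all the files in one call, because FNR counts lines per file. A minimal sketch, where the function name concat_skip10 is made up for illustration:

```shell
# Hypothetical helper: concatenate the given files, dropping the first
# 10 lines of each one.
concat_skip10() {
    # FNR is the line number within the current file and restarts at 1
    # for every file; NR would keep counting across all files.
    awk 'FNR > 10' "$@"
}
```

Called as concat_skip10 Sub1/*.csv > Sub1.csv, it would write the files in the order the shell expands the glob.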

extnic · August 31, 2012, 1:25pm

Thanks! But the output file looks like it is just listing the names of the csv files. I have 150 subdirectories, so I would love to be able to automate it instead of doing this for each of the subdirectories.

Corona688 · August 31, 2012, 1:35pm

If it prints no lines, then the files don't have more than 10 lines. You can give it -q to tell it to omit the file names.
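
To make the -q part concrete, here is a small sketch. The function name tail_skip10 is only for illustration, and -q is assumed to be available (it is in GNU and BSD tail, but it is not in POSIX):

```shell
# Hypothetical helper: print each file from line 11 onward.
tail_skip10() {
    # Without -q, tail prints a "==> name <==" banner before each file
    # when given more than one; -q suppresses those banners.
    tail -q -n +11 "$@"
}
```

If your tail has no -q, awk 'FNR>10' file1 file2 ... should give the same lines without any banners.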

Automate what? It's already selecting everything under folder/whatever/whatever.csv, what other folders do you want? You seemed to imply they were all at the same depth, folder/folder2/file.csv. If they're not, you need to use find and use its output to feed the tail command. xargs can turn a list of files on stdout into a list of arguments, as long as the filenames and folder names contain no spaces.

find /path/to/folderholdinghundredsoffolders/ -name '*.csv' | xargs tail -q -n +11 > output
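
If what you want is one output per subfolder, named after it, a loop along these lines might do it. This is only a sketch under a few assumptions: concat_subfolders is a made-up name, the outputs are written into the top folder (DirA/Sub1.csv and so on) rather than into each subfolder, and the order in which find hands over the files is not guaranteed.

```shell
# Hypothetical helper: for each immediate subfolder of $1, write
# <subfolder>.csv into $1, holding every csv found under that subfolder
# minus its first 10 lines.
concat_subfolders() {
    top=$1
    for dir in "$top"/*/; do
        [ -d "$dir" ] || continue       # glob matched nothing
        name=$(basename "$dir")
        # -exec ... {} + passes file names as arguments, so spaces in names
        # are safe. The output lives in $top, outside $dir, so it is never
        # picked up as input.
        find "$dir" -type f -name '*.csv' -exec awk 'FNR>10' {} + > "$top/$name.csv"
    done
}
```

Usage would be concat_subfolders DirA. If the order of the files matters, the find output would have to be sorted first.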