I've written the script below to merge all the .txt files in a single directory into one huge .txt file, ignoring files with other extensions.
The result is one huge .txt file with the contents of all the other .txt files.
How can I add the file name as a comment before each file's content?
I think I should specify the path in the system variable, because I want to merge each directory's .txt files into a separate file. Meaning:
Directory A: has many files; I want to merge all the .txt files in this directory only.
Directory B: same thing; merge all the .txt files that exist in this directory only.
And so forth.
When I executed the script you posted, I got this error:
ls: cannot access /path/to/my/directories/*.txt: No such file or directory
#!/bin/ksh
system='/path/to/my/directories/DirectoryA'
for txtfile in $(ls $system/*.txt)
do
echo " #FileName : $txtfile"
cat $txtfile >> outputFileA.txt
done
When I removed system='/path/to/my/directory/DirectoryA', I got this error:
ls: cannot access /*.txt: No such file or directory
From those errors, it sounds like the script is having trouble finding either the files or the directories. I would verify that your path is correct. You could add a test for the directory to the script to confirm this:
#!/bin/ksh
system='/path/to/my/directories/DirectoryA'
if [ -d "$system" ]    # If the directory exists...
then
    for txtfile in $(ls $system/*.txt)
    do
        echo " #FileName : $txtfile"
        cat $txtfile >> outputFileA.txt
    done
else
    echo "Sorry, but that directory doesn't exist."
fi
As for wanting to repeat this on more directories, you could wrap a for loop around the above code like this:
for mydir in dirA dirB dirC
do
    echo "Now processing $mydir ..."
    # ... code from above ...
done
This is where I would define a function and pass the directory you want to process to the function as an argument. I hope that helps you.
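For instance, a function-based version might look like the following. This is only a sketch: the directory names (dirA, dirB, dirC) and the per-directory output naming are assumptions, not part of your actual setup.

```shell
#!/bin/ksh
# Sketch: merge_txt takes one directory and appends each .txt file in it,
# preceded by a filename header, to an output file named after the directory.
merge_txt() {
    dir=$1
    out="${dir##*/}.txt"            # e.g. DirectoryA -> DirectoryA.txt
    for txtfile in "$dir"/*.txt; do
        [ -f "$txtfile" ] || continue   # glob matched nothing; skip
        echo "#FileName : $txtfile" >> "$out"
        cat "$txtfile" >> "$out"
    done
}

# Example directory names -- substitute your own.
for mydir in dirA dirB dirC; do
    echo "Now processing $mydir ..."
    merge_txt "$mydir"
done
```

Note the glob with a -f guard instead of $(ls ...); the guard skips the literal, unexpanded pattern that is left behind when a directory contains no .txt files.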
Thank you for your response.
I think I didn't explain what I need clearly.
I have 145 directories, each with many files.
For example, one of the directories is ClassA, which contains grade.txt, subjects.txt, courses.txt, and description.xml.
I want to end up with one file called ClassA.txt that contains all the contents of grade.txt, subjects.txt, and courses.txt,
and each part in ClassA.txt should be preceded by a '//' comment with the file name, i.e.:
ClassA.txt would look like:
//grade.txt
[content of grade.txt]
//subjects.txt
[content of subjects.txt]
and so forth.
What I've managed to do so far is:
#!/bin/sh
system='/home/path/to/first/directory'
for txtfile in `find ${system} | grep "\.txt"'$'` ; do
#echo $txtfile
cat $txtfile | `find ${system} -name '*.txt'` > ClassA.txt
done
I don't want to display the path of $txtfile as shown in the code above; rather, I would like to append the value of $txtfile before each .txt file's contents.
Unfortunately, the code above isn't working; I'm getting a permission denied error!
That $(ls $system/*.txt) construct is rather pointless. It's inefficient (it requires a fork-exec to create a subshell and run ls), prone to breakage if any of the file names contain an IFS character (by default: space, tab, or newline), and is bound by the system's exec() limit (ARG_MAX).
A simpler, safer, more efficient alternative: for txtfile in "$system"/*.txt
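In context, the earlier loop would become something like the sketch below (the path is still a placeholder). The -f test skips the literal, unexpanded pattern that remains when the directory holds no .txt files.

```shell
#!/bin/ksh
# Sketch: glob expansion instead of command substitution around ls.
# Handles filenames containing spaces and forks no subshell.
system='/path/to/my/directories/DirectoryA'   # placeholder path
for txtfile in "$system"/*.txt; do
    [ -f "$txtfile" ] || continue   # no match: the pattern is left as-is
    echo " #FileName : $txtfile"
    cat "$txtfile" >> outputFileA.txt
done
```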
Thank you all for your efforts; unfortunately, none of the suggested solutions has worked for me.
I managed to get the file name; the question is how to insert the file name before the concatenation happens.
Unless I missed it, you never made it clear whether the output filename varies with the directory and, if so, how it is chosen. To collect the desired concatenation of all the .txt files in a directory, with each file's contents preceded by its filename, the solution that follows creates a file named ALL-TEXT-FILES.txt in each directory.
You mention that you have 145 directories, but you haven't explained how the code is expected to visit them. Do you have a list to feed the script, either via a pipe or command-line arguments? Or are they all in a hierarchy that can simply be traversed with find from a single root location? I will assume the latter; the following script takes a single optional argument, the location of the starting directory. If it is absent, the current working directory is assumed.
#!/bin/sh
find "${1:-.}" -type d -exec sh -c '
    for d; do
        out=$d/ALL-TEXT-FILES.txt
        for f in "$d"/*.txt; do
            { [ -f "$f" ] && [ -r "$f" ]; } || continue
            printf "//%s\n" "${f##*/}" >> "$out"
            cat "$f" >> "$out"
        done
    done
' sh {} +
I tested it and it works as intended.
However, there is a bug in this code (one that is also present in some of the other suggestions). It's unlikely to be triggered, but it's lurking ... sleeping ... hoping.
In case anyone would prefer to find it themselves...
***** CAUTION: SPOILERS AHEAD *****
If a directory happens to contain a file whose name is identical to the output file's, cat will enter an infinite loop of reading from and writing to itself until the machine explodes. The non-lazy solution is to use a unique temp file (or at least a filename that is guaranteed to fall outside the traversal).
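A sketch of that fix, applied to the script above: each directory's output is first written to a hidden name that the *.txt glob can never match, then renamed into place. The temp name .ALL-TEXT-FILES.tmp is my own choice, not anything standard.

```shell
#!/bin/sh
# Sketch: same traversal as before, but each directory's output goes to a
# dot-file first, so the "$d"/*.txt glob cannot pick it up mid-write.
find "${1:-.}" -type d -exec sh -c '
    for d; do
        tmp=$d/.ALL-TEXT-FILES.tmp      # hidden: not matched by *.txt
        : > "$tmp"                      # start from an empty file
        for f in "$d"/*.txt; do
            { [ -f "$f" ] && [ -r "$f" ]; } || continue
            printf "//%s\n" "${f##*/}" >> "$tmp"
            cat "$f" >> "$tmp"
        done
        if [ -s "$tmp" ]; then
            mv "$tmp" "$d/ALL-TEXT-FILES.txt"
        else
            rm -f "$tmp"                # no .txt files here; leave no litter
        fi
    done
' sh {} +
```

A leftover ALL-TEXT-FILES.txt from a previous run would still be swept into the new output, since it matches *.txt; deleting or renaming old outputs first avoids that.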