Hi,
I have data stored under timestamp folders, each of which has subfolders. I'm planning to write a script that takes a timestamp, processes all of its subfolders, and then moves on to the next timestamp and processes its subfolders as well.
Does anyone have an idea how to do that? (In the script below, $param1 is the timestamp that needs to be passed in.)
/zz/xx/20101001/a1
/zz/xx/20101001/a2
/zz/xx/20101002/a1
/zz/xx/20101002/a2
.....
.....
a.txt
1,2,3,4,5
for i in $(tr ',' '\n' < /xx/a.txt); do
    echo "$i"
    input="/zz/xx/$param1/$i/*.z"   # glob is passed to Hadoop, so keep it quoted
    output="/zz/xx/$param1/$i"      # note: Hadoop fails if the output dir already exists
    sudo hadoop jar /xx/hadoop-mapreduce/hadoop-streaming.jar \
        -D mapreduce.task.io.sort.factor=2 -D mapred.output.compress=true \
        -input "$input" -output "$output"
    rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "hadoop streaming job failed for $input" >&2
        exit 1
    fi
done
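One way to automate the outer loop over timestamps is to glob the timestamp directories under the base path instead of passing $param1 by hand, and run the existing subfolder loop inside it. The sketch below is a pattern, not a drop-in script: it builds a temporary directory tree in place of /zz/xx, assumes a.txt lists subfolder names, and replaces the hadoop invocation with an echo (commented where it would go) so the loop logic stays runnable on its own.

```shell
#!/bin/bash
# Sketch: discover every timestamp folder, then process its subfolders.
# The layout mimics the question; paths and a.txt contents are illustrative.
base=$(mktemp -d)                       # stand-in for /zz/xx in this demo
mkdir -p "$base"/20101001/{a1,a2} "$base"/20101002/{a1,a2}
printf 'a1,a2\n' > "$base/a.txt"        # subfolder names, comma-separated

processed=()
for ts_dir in "$base"/[0-9]*/; do       # each timestamp folder, in sorted order
    ts=$(basename "$ts_dir")
    while IFS= read -r sub; do          # subfolders listed in a.txt
        input="$base/$ts/$sub/*.z"
        output="$base/$ts/$sub"
        echo "would run: hadoop streaming -input $input -output $output"
        # sudo hadoop jar /xx/hadoop-mapreduce/hadoop-streaming.jar ... \
        #     -input "$input" -output "$output" || exit 1
        processed+=("$ts/$sub")
    done < <(tr ',' '\n' < "$base/a.txt")
done
rm -rf "$base"                          # clean up the demo tree
```

To adapt it, set base=/zz/xx, point the read loop at /xx/a.txt, and uncomment the hadoop line (exiting on failure keeps the original script's behavior).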