I am currently using the command below to read all compressed .gz files from the source folder and append them, uncompressed, to a single target txt file.
The target txt file is getting too large; it is currently almost 350 GB.
hadoop fs -text /user/hive/warehouse/stage.db/CLINICAL_EVENT/CLINICAL_EVENT* | hadoop fs -put - /user/hive/warehouse/stage.db/Clinical_event/final/clinical_event.txt
Is there a way to create multiple output files when executing the above -text command, keeping each file to a maximum size of about 5 GB while appending?
As long as all the files are in the final folder, Hadoop will read them automatically.
Is there a way to name the files like:
clinical_event_1.txt
clinical_event_2.txt
clinical_event_3.txt
and so on?
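One approach I have been considering, but have not yet verified on my cluster, is piping the -text output through GNU split with --filter so each chunk is streamed straight to HDFS instead of landing on the local disk (this assumes a reasonably recent GNU coreutils split on the edge node; the prefix clinical_event_ and the 5G size are just placeholders, and -d produces numeric suffixes starting at 00 rather than 1):

# -C 5G caps each chunk at ~5 GB without splitting a record mid-line;
# -d adds numeric suffixes (clinical_event_00.txt, clinical_event_01.txt, ...);
# --filter pipes each chunk to hadoop fs -put instead of writing a local file.
hadoop fs -text /user/hive/warehouse/stage.db/CLINICAL_EVENT/CLINICAL_EVENT* \
  | split -d -C 5G --additional-suffix=.txt \
      --filter='hadoop fs -put - /user/hive/warehouse/stage.db/Clinical_event/final/$FILE' \
      - clinical_event_

Note the single quotes around the --filter argument: $FILE must be expanded by split for each chunk, not by the outer shell. Would something like this work, or is there a more standard way to do it?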
Thanks a lot in advance for any help.