To check file size in Hadoop

Hi Guys,

I am writing a script to check if a file size is less than a limit, say 10KB, and then exit.
I need to check in the Hadoop file system. Can someone advise?

Minsize=10
for file in `hadoop fs -ls path/*`
do
    Actualsize=$(du -k "$file" | cut -f 1)
    if [ $Actualsize -lt $Minsize ]; then
        echo "File generated incorrectly for $file : Filesize - $Actualsize KB"
        echo "Exiting from script, file size found less than 10KB"
        exit 1
    fi
done

Would this thread be a useful start?

http://www.unix.com/unix-for-beginners-questions-and-answers/269617-file-name-its-count.html

Robin

Sorry, that didn't help. I have made some changes, like below, but I am getting an error:

Minsize=10
for Actualsize in `hadoop fs -du -h /path | cut -d" " -f1`
do
    if [ $Actualsize -lt $Minsize ]; then
        echo "File generated incorrectly for $file : Filesize - $Actualsize KB"
        echo "Exiting from script, file size found less than 10KB"
        exit 1
    fi
done

integer expression expected

Well, what value does $Actualsize actually end up being?

It's giving me a decimal value, e.g. 1.5

As indicated by that error message, you seem to be running bash - although you failed to mention that explicitly, by the way. bash cannot process decimal numbers: its arithmetic and the [ -lt ] test work on integers only.
So - why don't you use a command / tool that IS capable of calculating with floating-point numbers, such as awk or bc?
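
For instance, awk compares floating-point values natively. A minimal sketch along the lines of your loop - assuming the first column of "hadoop fs -du -h" is the size, and ignoring the K/M/G unit suffixes for the moment:

Minsize=10
for Actualsize in $(hadoop fs -du -h /path | cut -d" " -f1)
do
    # awk exits 0 (success) when the size is below the limit,
    # comparing both values as floating-point numbers
    if awk -v s="$Actualsize" -v m="$Minsize" 'BEGIN { exit !(s+0 < m+0) }'; then
        echo "File size $Actualsize is below the ${Minsize}KB limit"
        exit 1
    fi
done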

Hi,

Thanks for your suggestion. Is there a way to round the decimal value and compare - perhaps something like the sketch below?
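
A minimal sketch of that idea - assuming the values really are plain decimals, and the K/M/G unit suffixes from -h are handled separately:

Minsize=10
for Actualsize in $(hadoop fs -du -h /path | cut -d" " -f1)
do
    # printf rounds the decimal string to the nearest whole number,
    # giving the integer that [ -lt ] expects
    Rounded=$(printf '%.0f' "$Actualsize")
    if [ "$Rounded" -lt "$Minsize" ]; then
        echo "File size approx ${Rounded}KB is below the ${Minsize}KB limit"
        exit 1
    fi
done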

Does your hadoop fs allow for the -b option to output byte sizes?
If not: your problem has been solved umpteen times in these forums; try searching in here...
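
In any case, plain "hadoop fs -du" without -h reports sizes in bytes, which bash can compare as integers. A rough sketch - the exact columns vary between Hadoop versions, but the first one should be the size in bytes:

Minsize=10240   # the 10KB limit, expressed in bytes

# Process substitution keeps the loop in the current shell,
# so "exit 1" really terminates the script
while read -r size rest
do
    # "rest" holds the remaining columns (the path, and on some
    # versions the replicated size before it)
    if [ "$size" -lt "$Minsize" ]; then
        echo "File generated incorrectly for $rest : Filesize - $size bytes"
        echo "Exiting from script, file size found less than 10KB"
        exit 1
    fi
done < <(hadoop fs -du /path)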
