Assign value to variable

Hi Guys,

I need to assign the row count from the output below to a variable. Can you advise how to do that?

hive --orcfiledump /hdfs_path/ | grep  "Rows"
Rows: 131554

I need to assign just this row count to a Unix variable, but my attempt below stores the whole matching line:

count=$(hive --orcfiledump /hdfs_path/ | grep  "Rows")

Expected
count=131554

Even if I find multiple values like Rows:131554 Rows:131554, I need to assign only one value to the count variable.

Are those multiple values identical, i.e. always e.g. Rows:131554? If not, which one should be selected? If they are identical, and your shell (which alas you fail to mention) is (a recent) bash, try

read X count X < <(hive --orcfiledump /hdfs_path/ | grep -i rows | sort -u)

Those counts will always be identical, so will it work?

You tell me.

With sed

count=$(hive --orcfiledump /hdfs_path/ | sed -n 's/Rows: //p')

If you want to make sure that only one value is printed (like an implicit | uniq), you need to quit after the first match:

sed -n '/^Rows:/ {;s/^Rows:[^0-9]*\([0-9][0-9]*\).*/\1/p;q;}'

This prints the number from the first line starting with "Rows:" and quits sed right after printing it. I have tried to improve the regexp so that any (variable number of) blanks before and after the number get deleted too.
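
To try this without a Hive installation, here is a minimal sketch that feeds made-up sample text to the same sed script. The function sample_dump and the lines it prints are assumptions for illustration only; real orcfiledump output will contain other lines around the "Rows:" ones.

```shell
# Hypothetical stand-in for: hive --orcfiledump /hdfs_path/
# The non-"Rows:" lines are placeholders, not real hive output.
sample_dump() {
  printf '%s\n' 'some header line' 'Rows: 131554' 'some other line' 'Rows: 131554'
}

# Same sed script as above: print the number from the first "Rows:" line, then quit.
count=$(sample_dump | sed -n '/^Rows:/ {;s/^Rows:[^0-9]*\([0-9][0-9]*\).*/\1/p;q;}')
echo "count=$count"
```

Because of the q command, the second "Rows:" line is never read, so only one value ends up in count.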

I hope this helps.

bakunin

Oh dear, my last post missed the "multiple occurrences" requirement.
Another variant of the sed solution

count=$(hive --orcfiledump /hdfs_path/ | sed -n '/^ *Rows: */{;s///;s/ .*//;p;q;}')

With awk

count=$(hive --orcfiledump /hdfs_path/ | awk '$1=="Rows:"{print $2; exit}')

A pure shell variant

while
  read X count Y &&
  [ "$X" != "Rows:" ]
do
  :
done < <(hive --orcfiledump /hdfs_path/)
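
If no line starts with "Rows:", the loop runs to the end of the input and count is left empty, so it may be worth checking the result before using it. A minimal sketch of that idea, assuming the output format shown in this thread; sample_dump is a hypothetical stand-in for the hive command, and a here-document replaces the < <(...) redirection so that the sketch also runs in shells other than bash.

```shell
# Hypothetical stand-in for: hive --orcfiledump /hdfs_path/
# The non-"Rows:" lines are placeholders, not real hive output.
sample_dump() {
  printf '%s\n' 'some header line' 'Rows: 131554' 'some other line' 'Rows: 131554'
}

count=
# Read line by line until the first field is "Rows:"; count then holds the second field.
while
  read -r X count Y &&
  [ "$X" != "Rows:" ]
do
  :
done <<EOF
$(sample_dump)
EOF

# Accept only a non-empty string of digits.
case $count in
  ''|*[!0-9]*) echo "no row count found" >&2; status=1 ;;
  *)           echo "count=$count"; status=0 ;;
esac
```

The case pattern rejects both an empty value and anything containing a non-digit, so a changed output format is noticed instead of silently producing a bad count.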