Reading a file from a remote system and working on it locally

aaOzymandias · February 22, 2011, 9:11am

Hi!

I am still fairly new to shell programming, but I have taken an interest in it and want to try some new things.

I intend to make a shell script (using bash) to read a file on a remote system, then do some work on it on the local system and display it.

In the long run I want a script that does this automatically: I only need to start it, and it updates, say, every 10 minutes or at another user-defined interval (eventually specified as an argument as well, I guess).

But first things first. I need to be able to read the file and manipulate it. More specifically, it is a comma-separated file containing some values I am interested in. I do not want all of them, only values 2, 7 and 13 from each line, which I want to list on my local system.

So far I have designed my script to take the IP address and filename as arguments:

#!/bin/sh

# Check the arguments; if the number is wrong, print an error and exit.
# ARGS is the number of arguments the script expects.
ARGS=2

if [ $# -ne $ARGS ]
then
    echo "$0:Error, missing or too many arguments!"
    echo "Usage: $0 IP filename"
    exit 1
fi

IPADRESS=$1
FILENAME=$2
echo "$IPADRESS"
echo "$FILENAME"

As you can see, it is very basic so far and not really useful as is. My next "milestone", so to speak, is reading the file from a remote system. I do not want to manipulate the remote file at all, only read it and get it to the local system. There I will do the manipulation and print what I want, in a manner that disturbs the remote system the least.

But I am not entirely sure how to read the file from the remote system in a good way (after all, I am still really new to shell scripting; I have only ever done some C++ programming before). Should I make a copy of it locally and use that further in my script? Or should I read the information I want from the remote system and only fetch those values to work with?

I would appreciate some help and hints on how to proceed. Assume I am a total newbie (examples, if any, would also help me understand).

If I have posted in the wrong section, I am sorry, but this looked to be the place where such things belong.

(Sorry if my English is not top notch, it's not my native language.)

pludi · February 22, 2011, 9:25am

As always: it depends. It depends on the type of the remote system and the local system, and on what services are available. Can you access the remote side using SSH/FTP/CIFS/NFS/...? What tool will you use to further process the file? Which side has more processing power available? How fast is the network between the two?

For example, with a fast network, a quick local machine, and SSH access I'd do something like this:

ssh user@remote 'cat /path/to/file' | while read line; ...

On a slow network, with enough processing capability on the remote side I'd let that machine do the processing and only transfer the results:

ssh user@remote 'while read line; do ...; done < /path/to/file'
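To make the loop body in the first variant a bit more concrete, here is a minimal sketch of splitting each line in the shell itself. The helper name print_second_field is made up for illustration, and the sample lines stand in for the output of the ssh command, so this can be tried without any remote system.

```shell
#!/bin/sh
# Sketch only: print_second_field is a hypothetical helper name.
# It reads comma-separated lines on stdin and prints the second field.
print_second_field()
{
    # IFS=, applies to this read only; -r keeps backslashes literal.
    # f1 and f2 take the first two fields, rest takes everything else.
    while IFS=, read -r f1 f2 rest
    do
        printf '%s\n' "$f2"
    done
}

# Local stand-in for: ssh user@remote 'cat /path/to/file' | print_second_field
printf '%s\n' 'a,b b,c' 'd,e,f' | print_second_field
```

Picking out a field far down the line this way needs one variable per field, which is why a tool such as awk is usually more comfortable for wide files.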

aaOzymandias · February 22, 2011, 9:41am

Thank you, a good question. Actually, there are two different systems. One is far away with high latency, across the globe over a satellite link, but it has relatively high bandwidth. The other is just a few meters away, with even higher bandwidth and very low latency. Both have ssh (I use ssh all the time to access them), so I think ssh is the preferred method. Processing locally might be better as well, I think, despite the satellite link. (The files I am interested in are relatively small, usually not more than ~60 lines, but they can reach ~1000 lines in some rare cases.)

pludi · February 22, 2011, 10:00am

OK, so bandwidth is not a problem. Are there a lot of files (> ~5k)? Do they each have a header? The reason I'm asking is that for a high file count on a high latency line it's more efficient to do fewer invocations of ssh.

aaOzymandias · February 22, 2011, 10:14am

There is only one file for each system.

The files look roughly like this:

value1,value2,...,value12,value13
value1,value2,...,value12,value13
...
value1,value2,...,value12,value13
value1,value2,...,value12,value13

Some of the values are strings with spaces included.

pludi · February 22, 2011, 10:21am

Well, then all you need would be

ssh user@$IPADRESS "cat $FILENAME" | awk 'BEGIN{ FS=OFS="," } { print $2, $7, $13 }'
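The field selection can be tried locally before ssh is involved. The sketch below wraps the same awk program in a function (pick_fields is a made-up name) and feeds it one invented 13-field sample line. Because the field separator is only the comma, values containing spaces come through intact; values that themselves contain commas inside quotes would not be handled by this approach.

```shell
#!/bin/sh
# Sketch only: pick_fields is a hypothetical helper name.
# It reads comma-separated lines on stdin and prints fields 2, 7 and 13.
pick_fields()
{
    awk 'BEGIN{ FS=OFS="," } { print $2, $7, $13 }'
}

# Invented sample line with 13 fields; the second one contains a space.
printf '%s\n' 'v1,second value,v3,v4,v5,v6,v7,v8,v9,v10,v11,v12,v13' | pick_fields
```

With real data the call would be: ssh user@$IPADRESS "cat $FILENAME" | pick_fields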

aaOzymandias · February 22, 2011, 10:31am

Thanks a bunch! Gonna try it out later today

aaOzymandias · March 22, 2011, 5:00am

Hi again!

I have made some modifications to my script, and it displays fairly OK, but I have an issue when the first value is sometimes a longer string than the other values. Then my display is messed up.

Code looks like this:

#!/bin/sh
# Usage: script [IP] [user] [filename]

ARGS=3

if [ $# -ne $ARGS ]
then
    echo "$0:Error, missing or too many arguments!"
    echo "Usage: $0 IP user filename"
    exit 1
fi

IPADRESS=$1
USERNAME=$2
FILENAME=$3

get_data()
{
    ssh "$USERNAME@$IPADRESS" "cat $FILENAME" | awk 'BEGIN{ FS=OFS="," } { print $2"\t\t"$4"\t"$9"\t"$11 }'
}

while true
do
    clear
    printf 'Last updated: '; date -u
    printf 'Value2\t\tValue4\tValue9\tValue11\n'
    echo "-----------------------------------------------------------------"
    get_data
    sleep 600
done

This is just to make it look prettier (I guess the infinite loop is not optimal either). The data is retrieved fine as is, but it is handy to get an easy overview of the data by displaying it properly.
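The fixed sleep of 600 seconds could be replaced by the user-defined interval mentioned at the start of the thread. A minimal sketch, assuming the interval is given in seconds as an optional fourth argument (the argument count check would then have to accept 3 or 4 arguments); the helper names is_positive_int and pick_interval are made up for illustration:

```shell
#!/bin/sh
# Sketch only: is_positive_int and pick_interval are hypothetical helper names.

# Succeeds if $1 consists of digits only and is greater than zero.
is_positive_int()
{
    case $1 in
        ''|*[!0-9]*) return 1 ;;    # empty, or contains a non-digit
        *) [ "$1" -gt 0 ] ;;
    esac
}

# $1: interval given by the user, may be empty.
# Prints the interval to use; falls back to 600 seconds when none is given.
pick_interval()
{
    if [ -z "$1" ]; then
        echo 600
    elif is_positive_int "$1"; then
        echo "$1"
    else
        echo "Error: interval must be a positive whole number of seconds" >&2
        return 1
    fi
}

# Possible use in the script (assumption: interval is the fourth argument):
#   INTERVAL=$(pick_interval "$4") || exit 1
#   ... and in the loop: sleep "$INTERVAL"
pick_interval ""      # no interval given, prints the default
pick_interval 300     # prints the given value
```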

I have used tabs to get the desired positions, and this works fine except on the occasions where the length of Value2 is different (the other values do not change as much: Value4 might be 2 or 3 characters long, and Value9 and Value11 are the same length all the time). So my question is: what can I do to make sure the values are displayed properly? E.g. I want something that looks like this for all lengths of values:

Description          Description   Description   Description
Value2               Value4        Value9        Value11
Value2               Value4        Value9        Value11
Value2               Value4        Value9        Value11

pravin27 · March 22, 2011, 5:31am

Use printf in awk, like below:

printf "%-20s\t%-10s\t%-10s\t%-10s\n",$2,$4,$9,$11
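For context, here is a minimal sketch of how that printf could sit in a complete awk call, with a header line using the same widths. The function name format_table and the two sample lines are invented for illustration, and the columns are separated by a single space instead of a tab, which is a small deviation from the line above.

```shell
#!/bin/sh
# Sketch only: format_table is a hypothetical helper name.
# It reads comma-separated lines on stdin and prints fields 2, 4, 9 and 11
# left-aligned in fixed-width columns, preceded by a header line.
format_table()
{
    awk -F, '
        BEGIN { printf "%-20s %-10s %-10s %-10s\n", "Value2", "Value4", "Value9", "Value11" }
              { printf "%-20s %-10s %-10s %-10s\n", $2, $4, $9, $11 }
    '
}

# Invented sample data: the second field differs a lot in length.
printf '%s\n' \
    'a,short,c,12,e,f,g,h,up,j,ok' \
    'a,a much longer name,c,123,e,f,g,h,up,j,ok' | format_table
```

%-20s pads to 20 characters but does not cut longer values, so a value longer than 20 characters would still push the following columns to the right. Writing %-20.20s instead truncates such values to 20 characters.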

aaOzymandias · March 22, 2011, 5:39am

Ah! nice, thanks!