Creating a PATH variable

Circuits · December 26, 2018, 3:41pm

I am new to shell scripting and I ran into a couple lines of code which I don't completely understand:

PATH=/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/entity/bin
data_dir=/usr/local/entity/project

I believe data_dir to be a more conventional link to a directory. However, I am not sure what PATH is. It also seems like a location but maybe it's multiple possible locations? Does the : signify that the location could be either: /usr/bin, /bin, /usr/sbin or /usr/local/entity/bin? Is the : in this context similar to some function like [ is to the test function?

Scrutinizer · December 26, 2018, 4:59pm

Hi, that is where the shell looks for executables.

PATH
This variable shall represent the sequence of path prefixes that certain functions and utilities apply in searching for an executable file known only by a filename. The prefixes shall be separated by a <colon> ( ':' ). When a non-zero-length prefix is applied to this filename, a <slash> shall be inserted between the prefix and the filename if the prefix did not end in <slash>. A zero-length prefix is a legacy feature that indicates the current working directory. It appears as two adjacent <colon> characters ( "::" ), as an initial <colon> preceding the rest of the list, or as a trailing <colon> following the rest of the list. A strictly conforming application shall use an actual pathname (such as .) to represent the current working directory in PATH. The list shall be searched from beginning to end, applying the filename to each prefix, until an executable file with the specified name and appropriate execution permissions is found. If the pathname being sought contains a <slash>, the search through the path prefixes shall not be performed. If the pathname begins with a <slash>, the specified path is resolved (see Pathname Resolution). If PATH is unset or is set to null, the path search is implementation-defined.
Since <colon> is a separator in this context, directory names that might be used in PATH should not include a <colon> character.

The Open Group Base Specifications Issue 7, 2018 edition
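
The search rule quoted above can be sketched in a few lines of shell. This only illustrates the left-to-right search; it is not how any particular shell implements it. The function name find_in_path and the example list /nonexistent:/bin are made up for this sketch.

```shell
# find_in_path NAME [LIST]: hypothetical helper, not a standard utility.
# Walks a colon-separated list from left to right and prints the first
# entry that holds an executable regular file called NAME.
find_in_path() {
    name=$1
    list=${2:-$PATH}           # default to the real PATH
    old_ifs=$IFS
    IFS=:                      # split the list on colons
    for dir in $list; do
        IFS=$old_ifs
        if [ -z "$dir" ]; then
            dir=.              # zero-length prefix means current directory
        fi
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1                   # nothing found in any prefix
}

# /nonexistent is skipped, the search continues with /bin
find_in_path sh /nonexistent:/bin
```

Called with only a name, the function searches the real PATH; the second argument is there so the list can be varied while experimenting.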

joeyg · December 27, 2018, 8:29am

Or, think of the PATH as your favorites. An ordered list of places to look for programs.
In your example, there were five BIN directories to search for any referenced programs.
And the BIN directories are where executable programs are typically stored.

gull04 · December 27, 2018, 9:35am

Hi,

It is quite common to change the order of the PATH directories. On Solaris, for example, you may want to pick up a different version of a binary, for instance the POSIX-compliant version of awk.

Regards

Gull04
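
The effect of the directory order can be tried out safely with two throwaway directories. This is a sketch under assumptions: the program name hello and the directories a and b are invented for the demonstration, and mktemp -d is assumed to be available (it is on most current systems, but it is not part of POSIX).

```shell
demo=$(mktemp -d)
mkdir "$demo/a" "$demo/b"

# Two different programs with the same (made-up) name "hello"
printf '#!/bin/sh\necho "version A"\n' > "$demo/a/hello"
printf '#!/bin/sh\necho "version B"\n' > "$demo/b/hello"
chmod +x "$demo/a/hello" "$demo/b/hello"

# Each command substitution runs in a subshell, so the PATH change
# does not leak into the rest of the script.
first=$(PATH="$demo/a:$demo/b:$PATH"; hello)
second=$(PATH="$demo/b:$demo/a:$PATH"; hello)

echo "a before b: $first"
echo "b before a: $second"

rm -rf "$demo"
```

Whichever directory comes first in the list wins, which is why reordering PATH is enough to select a different version of a program.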

bakunin · December 27, 2018, 9:44am

No. In fact it is a "variable". Suppose the following: i don't know where you installed a certain program, but i know what the place where the program is installed looks like because i work with it a lot. So, if you ask me where you can find the logs of the program, i'd tell you: "go to where you installed the program". In there, wherever that is, you find a directory "<somewhere>/logs" and inside of this you find a directory "<somewhere>/logs/program-logs", ... Now, you might ask me something else about this program and again, i'd refer to the (to me unknown) location of the program as "<somewhere>" and direct you to some place relative to there.

This is what variables are for: you define them by assigning some value (in your case the value /usr/local/entity/project ) to a name ( data_dir ) and then you can use it everywhere without needing to know the exact value of it. Just like i used the name "<somewhere>" above. I could have used a variable:

program_location= .....    <= you would have to fill in the actual path here, say: /usr/myprogram

and in this case the logs would have been there:

program_location=/usr/myprogram
cd $program_location/logs
cd $program_location/logs/program-logs

The second and third lines use the variable we have defined before: $program_location is first replaced by what we assigned to the name "program_location" before, only then "/logs" or respectively "/logs/program-logs" is appended. So the real commands read:

cd /usr/myprogram/logs
cd /usr/myprogram/logs/program-logs

PATH is a special kind of variable. Or, rather, a normal variable (everything i said above applies) but with a special function. You see, a program is a certain file (where executable code is stored). To execute the program you have to write the full name of the program file:

/path/to/the/executable/file/of/my/program

As the paths get longer this might be a lot of work to type. Computer people are in general of the lazy sort (this is why we use computers to do it for us). Since many programs are collected in a few directories (e.g. most of the system's commands are stored in /usr/bin ) the PATH variable was invented. It holds a list of directories, separated by ":". The rule is: if a program is located in one of the directories in this list then you do not need to specify the path when calling it.

That means: your path looks like this:

PATH=/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/entity/bin

This is a list of directories: /usr/bin , /bin , /usr/sbin , /sbin and /usr/local/entity/bin . For instance there is a command cp (it copies files) located in /usr/bin . So, to call it and copy the file "file" to a new file called "new_file" you would have to use the command

/usr/bin/cp file new_file

But since /usr/bin is included in your PATH variable you can omit that and write:

cp file new_file
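
To see which file the shell actually picks for a bare command name, command -v can be used. The following is a small sketch; the temporary file names are created only for the demonstration, and mktemp is assumed to be installed.

```shell
# command -v prints the path the shell would use for an external command
cp_path=$(command -v cp)
echo "cp is found at: $cp_path"

src=$(mktemp)
echo "hello" > "$src"

"$cp_path" "$src" "$src.full"   # full path, like /usr/bin/cp file new_file
cp "$src" "$src.short"          # bare name, found through PATH

# cmp is silent and returns 0 when both copies are identical
same=no
cmp "$src.full" "$src.short" && same=yes
echo "both copies identical: $same"

rm -f "$src" "$src.full" "$src.short"
```

Both calls run the same program; the second form simply leaves the lookup to the shell.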

I hope this helps.

bakunin

Circuits · December 27, 2018, 11:29am

Just want to make sure I understand completely. Specifying the PATH variable means that the program will automatically assume a path for functions like cp or what-have-you without actually referencing the variable: PATH? This would make sense because the declaration of PATH is the only place in the script where it is being referenced by name. Then again, the only place data_dir is being referenced by name is in the instantiation as well. Is it possible that data_dir is acting in a similar way to PATH or is it more likely that data_dir was just never used after it was declared?

joeyg · December 27, 2018, 11:51am

PATH is something that tells the computer where to look for programs, and in what order to search for the programs.
By programs, I am referring to many of the commands you use in unix - cp, rm, mv, cat - are actually programs.
So, if you type 'cat myfile', unix would look in the /usr/bin folder first to find the program cat; then it will do what cat is programmed for: display the contents of the file named by the parameter you typed, myfile

Defining data_dir is a common programming step. This avoids the need to specify it everywhere in your script, and (MOST IMPORTANTLY) allows you to adjust the location just once rather than throughout your script.

Thus, in your script you could have the following commands (if archive_dir was also defined):
cat "$data_dir/workfile.txt"
mv "$data_dir/workfile.txt" "$archive_dir/workfile.arc"
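
A runnable variant of this idea, as a sketch: the directories live under a temporary location so that nothing on the system is touched, and the names archive_dir and workfile.txt are only illustrative.

```shell
work=$(mktemp -d)

# Define the locations once, at the top of the script
data_dir=$work/project
archive_dir=$work/archive
mkdir -p "$data_dir" "$archive_dir"

echo "some data" > "$data_dir/workfile.txt"

# Everything below refers to the variables, never to the literal paths
cat "$data_dir/workfile.txt"
mv "$data_dir/workfile.txt" "$archive_dir/workfile.arc"

archived=$(cat "$archive_dir/workfile.arc")
echo "archived content: $archived"

rm -rf "$work"
```

If the project moves, only the two assignments at the top need to change.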

Circuits · December 27, 2018, 3:44pm

Oh okay I was confused then. So the locations /usr/bin etc. are where programs like cp or rm are located. In other words, it's similar to importing a library in Python or Java. Once I specify those locations if I call one of the programs it will look there and see if that program exists and if it does it will run the program. I am only wondering because I am attempting to understand what this script is doing. Some parts of it are obvious, other parts are not. However, I find myself wondering why some programs like echo can be called without first specifying a PATH.

bakunin · December 27, 2018, 5:49pm

Yes, exactly. You could write a script using the mv , cp , rm , etc. programs, which are all located in /usr/bin (or /bin , depending on your system) by specifying the complete path to them every time. I suggest you have a look: the ls command is also located there, and issuing the following command will give you output similar to this:

$ ls -l /usr/bin
total 145352
-rwxr-xr-x 1 root root          96 Nov 12 15:31 2to3-2.7
-rwxr-xr-x 1 root root       10104 Apr 23  2016 411toppm
-rwxr-xr-x 1 root root          39 Feb  5  2018 7z
-rwxr-xr-x 1 root root          40 Feb  5  2018 7za
-rwxr-xr-x 1 root root          40 Feb  5  2018 7zr
[...]

Somewhere in this list you will see all the programs i mentioned. You could always write /usr/bin/mv to call mv , /usr/bin/cp instead of cp and so on. But you can also include /usr/bin in your PATH variable. If you then enter mv , the shell (this is the program which takes your input and processes it) will, since mv is not something it knows by itself, look through the directories in PATH, find mv in /usr/bin and use it.

Why don't you just post the script (enclosed in CODE-tags, please)? We can go over it together and answer your questions. Actually what we appreciate most are people willing to learn something for themselves instead of relying on us to do their work. I'd like to understand.... is a highly regarded goal here, infinitely higher than please write for me a script which does....

This is actually an excellent question! The reason is that not everything you encounter in a script is a "command". There are three (four) distinct types of "things" (for lack of a better word) in a script (apart from such things as variable assignments, calculations and similar things):

1) "reserved words". These are basically what makes for the script language: while...do....done , if...then...else....fi and others are such "reserved words".

2) "built-in commands" or "built-ins", for short: over time it turned out that some commands were used so often that the effort of loading them as external programs considerably slowed down the execution of shell scripts. This is why most shells (re-)created these programs inside themselves so that they could be used without having to load them as external programs. They are called "built-ins" because this is exactly what they are: commands, but built into the shell already. An example would be the echo command. In fact there is a /usr/bin/echo (or /bin/echo ) program you can use but there is also a built-in command echo in most shells. Built-ins take precedence over external commands, so if you use echo without a path then the built-in is used. You can still use the external program by specifying its full path (regardless of what your PATH variable says).

3) external commands: These are the programs i told you about before. If you want to use them you either have to specify their full path or put the path where they are located in your PATH variable. Notice that built-ins and reserved words do (for obvious reasons) NOT need any path to be found.

4) aliases: you can create an "alias" for frequently used commands, even for commands with a certain set of options. For instance when i use the ls command most times i use ls -lai (long form, show hidden files, show the inode number). Since i do not always want to type ls -lai (after all these are seven!!! keystrokes - way too much) i defined an alias:

alias l='/usr/bin/ls -lai'

and now i can use l and ls -lai would be executed. Think of an alias as something like a macro in other programming languages or a preprocessor statement in C.

You can find out which type the word you want to use is by using the type command. In fact this is also an alias (for the built-in command whence -v ), which is built into the shell, so that it is set by default - at least in the shell i use (Korn shell). In the bash shell the type command is a built-in itself; bash has no whence , so there type (or the portable command -v ) does the same job. Here is what using the type command will look like:

# type while
while is a keyword

# type whereis
whereis is a tracked alias for /usr/bin/whereis

# type echo
echo is a shell builtin

# type /bin/echo
/bin/echo is /bin/echo

I suggest you try yourself and play around yourself to get acquainted.
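
Since the wording of the type output differs from shell to shell, a script that needs the same information can use command -v, which is specified by POSIX. A minimal sketch; the three words checked are just examples:

```shell
# command -v prints just the name for built-ins and reserved words,
# and the full path for external programs found through PATH.
for word in while echo ls; do
    printf '%-6s -> %s\n' "$word" "$(command -v "$word")"
done

echo_kind=$(command -v echo)   # the bare name in shells where echo is built in
ls_kind=$(command -v ls)       # a path such as /bin/ls or /usr/bin/ls
```

A result that starts with a slash means the word was resolved through PATH; a bare name means the shell handles it itself.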

I hope this helps.

bakunin

Circuits · December 28, 2018, 12:02pm

Sure yeah you guys have all been a great help so far, thanks. I will go ahead and show the script here. I have commented it to show what I understand and what I do not understand. Note that my comments are marked with // to differentiate them from the preexisting comments:

#!/bin/sh 
# Sound Manager 
#

### BEGIN INIT INFO
# Provides:          soundmanager
# Required-Start:    $local_fs $remote_fs $syslog $network
# Required-Stop:     $local_fs $remote_fs $syslog $network
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Sound Manager
# Description:       Start the Sound Manager daemon
### END INIT INFO


PATH=/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/entity/bin //designate the various path's so that the shell knows where to look for executable's
data_dir=/usr/local/entity/project //designate a location where the shell can store information (does not seem to be used in the script)
SOUNDMANAGER=bmsoundmanager //give the bmsoundmanager a variable or an alias in this case "SOUNDMANAGER" the bmsoundmanager 
                            //is a program within entity/bin which controls various aspects
                            //of the sound system, it is an extension of SoundManagerAlsa

 
start() { //the "start" function, first statement in the case statement 
    echo -n "Starting Sound Manager:             " //print statement, will not output the trailing newline
    # Enable audio amplifier mode input
    echo 107 > /sys/class/gpio/export //not completely sure what this does, it might compare the value 107 to a value found in the export directory
    # Mute the amplifier
    echo 0 > /sys/class/gpio/gpio107/value //again not completely sure what this does
      ( ${SOUNDMANAGER} -q -n& ) && echo '[ OK ]' || echo '[FAIL]' //start the bmsoundmanager and feed it the
                                                                   //conditions -q (quiet = true: debug output is muted) and -n (detach = false: do not run as a daemon) 
                                                                   //&& means 'and' while || means 'or'
                                                                    //I am not sure how the program knows to echo OK versus FAIL
}

 
 
stop() { //the "stop" function, second statement in the case statement
    echo -n "Stopping Sound Manager:             " //print statement, will not output the trailing newline
    killall -HUP ${SOUNDMANAGER} //stops the program from running; however, H U P aren't in the man pages for the killall command so I am not sure what they do
}

//Here we have the case statement or the conditional logic. The first condition being that a user entered the command: 'start' and the second being
//that the user entered 'stop' when running the shell. 
case "$1" in
"start")
        start
        ;;
"stop")
        stop
        echo "OK"
    ;;
reload|restart)
        stop
        sleep 1
        start
    ;;
*)
    echo "$0: unknown argument $1." >&2;
    ;;
esac

#

EDIT: I hope the formatting shows up the same on your computer screens as it does on mine. I had some trouble formatting my comments so that it looks neat but you never know how it will appear to other users.

gull04 · December 28, 2018, 12:30pm

Hi Circuits,

Just a quick observation, the start function does not seem to be complete. The closing brace seems to be missing as does the command line entry to actually start the daemon.

Having said that, there are entries in the script indicating that originally the start function may have been an entry in /etc/inittab , functionality which I think has been mostly deprecated now.

Regards

Gull04

Circuits · December 28, 2018, 1:08pm

I accidentally deleted the closing brace when putting in my comments, that's my fault. Not sure about the command line entry to start the daemon.

bakunin · December 28, 2018, 3:23pm

That was a very commendable idea. Now let us go over the script. First, its "big structure":

#!/bin/sh 
PATH=/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/entity/bin //designate the various path's so that the shell knows where to look for executable's
data_dir=/usr/local/entity/project //designate a location where the shell can store information (does not seem to be used in the script)
SOUNDMANAGER=bmsoundmanager //give the bmsoundmanager a variable or an alias in this case "SOUNDMANAGER" the bmsoundmanager 
                            //is a program within entity/bin which controls various aspects
                            //of the sound system, it is an extension of SoundManagerAlsa

 
start() {
[...]
}

 
 
stop() {
[...]
}

case "$1" in
"start")
        start
        ;;
"stop")
        stop
        echo "OK"
    ;;
reload|restart)
        stop
        sleep 1
        start
    ;;
*)
    echo "$0: unknown argument $1." >&2;
    ;;
esac

The first thing you need to know is: Every shell script has a "main" part, like every C program has a "main()" function. Alas, the main part is not denoted as such. It is just what remains after (or rather "outside" of) all the function definitions. I suppose you know from other languages what a function is. Here we have two functions: start() and stop(). One is supposed to start the program, one to stop the program. Functions are just read in and used only when they are called. So, this is the main part, which is executed first when the program is executed:

#!/bin/sh 
PATH=/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/entity/bin //designate the various path's so that the shell knows where to look for executable's
data_dir=/usr/local/entity/project //designate a location where the shell can store information (does not seem to be used in the script)
SOUNDMANAGER=bmsoundmanager //give the bmsoundmanager a variable or an alias in this case "SOUNDMANAGER" the bmsoundmanager 
                            //is a program within entity/bin which controls various aspects
                            //of the sound system, it is an extension of SoundManagerAlsa
case "$1" in
"start")
        start
        ;;
"stop")
        stop
        echo "OK"
    ;;
reload|restart)
        stop
        sleep 1
        start
    ;;
*)
    echo "$0: unknown argument $1." >&2;
    ;;
esac
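
That a function body is only stored when it is defined, and runs only when it is called from the main part, can be checked with a tiny sketch. The function name greet and the variable trace are made up for this demonstration.

```shell
trace=""

# Definition: the body is stored, nothing is executed yet
greet() {
    trace="$trace greet"
    echo "inside greet"
}

# Main part: these lines run in order, top to bottom
trace="$trace main-start"
echo "main part starts"
greet                          # only now does the function body run
trace="$trace main-end"
echo "main part ends"

echo "order of execution:$trace"
```

Although greet is written first, its body runs between the two main-part messages.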

First thing to notice is line 1:

#!/bin/sh

This tells the operating system which command processor to use. A typical UNIX (or Linux) system has several possible command processors (= shells), one of which is the "default shell", which is usually located in /bin/sh . All shells in UNIX are not only a (text-based) interface to the operating system but also a scripting language. Since the scripting languages of the most widespread shells (the "Bourne shell" and its successors "Korn shell" and "bash") are very similar but not exactly the same, this line makes sure that the OS does not have to guess which command processor to use but is told exactly which one it should call and then feed the script to.

Next thing: "$0", "$1", ...

[...]
case "$1" in
[...]
    echo "$0: unknown argument $1." >&2;

There are some variables the shell maintains itself and any script can use them. "$0" is the name of the program itself, as it is called. Here is an exercise for you: create a directory in /tmp named "myprogram". Then put a file named "firstprog.sh" there with this content:

#! /bin/sh
echo  "The name of this program is $0"
exit 0

Then make it executable, here is the whole procedure, which you can cut & paste on the command prompt. How it works should (for now) not be your concern, it uses some advanced features:

mkdir -p /tmp/myprogram
cat > /tmp/myprogram/firstprog.sh <<-EOF
#! /bin/sh
echo  "The name of this program is \$0"
exit 0
EOF
chmod 754 /tmp/myprogram/firstprog.sh

After this you have a file /tmp/myprogram/firstprog.sh which you can look at:

$ ls -l /tmp/myprogram
total 4
-rwxr-xr-- 1 bakunin users 57 Dec 28 20:52 firstprog.sh

and which you can execute. First, change into the directory and call it this way:

$ cd /tmp/myprogram
$ ./firstprog.sh 
The name of this program is ./firstprog.sh

Now change to your home directory (the cd shell builtin without an argument will get you there) and try again, using the full path:

$ cd
$ /tmp/myprogram/firstprog.sh 
The name of this program is /tmp/myprogram/firstprog.sh

You see the difference? This is what $0 is for. The other variables with numbers ( $1 , $2 , ...) are the (first, second, ...) representation of the commandline argument(s) given to the program inside of it. We will change our example script a little and see how that works:

mkdir -p /tmp/myprogram
cat > /tmp/myprogram/firstprog.sh <<-EOF
#! /bin/sh
echo  "The first argument was: \$1"
echo  "The second argument was: \$2"
echo  "The third argument was: \$3"
exit 0
EOF
chmod 754 /tmp/myprogram/firstprog.sh

Now try the following calls of this program one after the other. Observe the result:

$ /tmp/myprogram/firstprog.sh one two three
$ /tmp/myprogram/firstprog.sh one "two three"
$ /tmp/myprogram/firstprog.sh one "two three" four
$ /tmp/myprogram/firstprog.sh one two three four five
$ /tmp/myprogram/firstprog.sh

What have we seen? First, the variables $1, $2 and so on are filled on a "first come first served" basis. If there are fewer command line arguments than there are variables, some (or maybe even all, if there are no arguments) simply are empty. Second: normally the shell separates arguments along word boundaries: "two" and "three" were separated because there was white space in between. With the use of quotation (see the second example) you can switch this behavior off and create arguments which contain whitespace. Notice that both double and single quotes will have this effect although there are subtle differences between them which i will not address yet. Now, in light of this, let us try to examine what the main program does:

#!/bin/sh 
case "$1" in
"start")
        start
        ;;
"stop")
        stop
        echo "OK"
    ;;
reload|restart)
        stop
        sleep 1
        start
    ;;
*)
    echo "$0: unknown argument $1." >&2;
    ;;
esac

You see: the program can be called with a single commandline argument (it can be called with more too but it will ignore them since only "$1" is used). This argument can be one word out of the following:

1) "start"
2) "stop"
3) "reload"
4) "restart"
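
The dispatch on the first argument can be tried in isolation with a small sketch. The function name dispatch and the messages are invented here, and the non-zero return status for unknown words is an addition that the original script does not have.

```shell
# dispatch WORD: hypothetical stand-in for the main part of the script.
# It only reports what would happen instead of starting or stopping anything.
dispatch() {
    case "$1" in
        start)          echo "would call start" ;;
        stop)           echo "would call stop" ;;
        reload|restart) echo "would call stop, wait, then call start" ;;
        *)              echo "unknown argument $1" >&2
                        return 1 ;;
    esac
}

dispatch start
dispatch restart
dispatch bogus || echo "dispatch returned a non-zero status"
```

The pattern reload|restart matches either word, and * catches everything that no earlier pattern matched.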

"start" will call the start function, "stop" will call the stop function, "reload" and "restart" will do the same, namely, first call the stop function, then wait for 1 second ("sleep 1"), then call the start function. Every other value of the first argument will lead to an error message of "unknown argument" and the name of the program so that the user knows which program this error message comes from.

There is nothing more disappointing than having dozens of scripts executed at system start and you get an error message of "failed" without an indication of what failed and why. When you start to program: don't do that to your users. Every script fails from time to time. That is nothing bad. Bad is to deny the user the means to find out why it failed, exactly where it failed and why. A well-written script can do complex things and its output will read like this:

did thing 1: success
did thing 2: success
did thing 3: failed
cannot do ....
exit 1

A badly written script will just write "failed" and be done. Now find out which of the possibly 25 things the script could do exactly failed and you start to call the programmer every swear name in your vocabulary.
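
One possible way to get output of the first kind is a small wrapper around every step. This is only a sketch: the name run_step is made up, and true and false stand in for real commands that succeed or fail.

```shell
# run_step DESCRIPTION COMMAND [ARGS...]: hypothetical helper.
# Runs the command and reports which step succeeded or failed.
run_step() {
    desc=$1
    shift
    if "$@"; then
        echo "$desc: success"
    else
        status=$?              # exit status of the failed command
        echo "$desc: failed (exit status $status)" >&2
        return "$status"
    fi
}

run_step "did thing 1" true
run_step "did thing 2" true
run_step "did thing 3" false || echo "cannot continue after thing 3" >&2
```

The user now sees which of the steps failed and with what exit status, instead of a bare "failed".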

OK, i stop ranting here. You sure are eager to further analyse your script in light of what i said.

I hope this helps.

bakunin