Random Line

I am trying to build a script written in ksh93 that pulls a random line out of each of four text files.

Since the files are all different, I can't think of a way to take the number of lines (in one case, let's say 50), then randomly pick a line out, making sure that all lines have a fair chance...

I tried playing with some variations of division and comparison of the $RANDOM variable, but I was unsuccessful in making sure I didn't pick the same few lines over and over...

Any suggestions?

try something like:

VAR=`expr $RANDOM % 50`

I'll give that a shot!
I tried using division, but not using % (I can't remember what it's called ATM...). It appears that I get a good mix out of it!

Thanks crashnburn - I'll post my script if it turns out how I want it (i.e. - I don't abandon it first :stuck_out_tongue: ...)

Suppose that you want to pick a random line from a file that has exactly 100 lines. You can label the lines 0 through 99. Then if you get a random integer, you just take the last two digits as your line number.

Those last 2 digits are the remainder that you get when you divide the random integer by 100. So for example, if the random integer is 10726352, you would have a remainder of 52 after dividing 10726352 by 100. This is expressed as
10726352 % 100 = 52
and % is sometimes called "the remainder function", and sometimes "modulus".
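As a quick check of that arithmetic, here is a minimal sketch using the shell's built-in arithmetic expansion (ksh93 and bash both accept this form). The variable names n and remainder are just for illustration.

```shell
# The remainder after dividing by 100 is the last two digits of the number.
n=10726352
remainder=$((n % 100))
echo "$remainder"
```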

But it was important to divide by 100 since we had 100 lines in our file. When we divide by 100, we have 100 possible remainders, so each line in the file has a chance.

To make this general, don't divide by 100 (or 50) all the time, divide by the count of items.
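One caveat on "a fair chance": the remainders are only exactly equally likely when the count of items divides the number of possible random values evenly. A rough sketch of the arithmetic for 50 lines follows, assuming RANDOM ranges over 0 to 32767 (check "man ksh" for your implementation). The variable names are illustrative only.

```shell
range=32768   # assumption: 0 <= RANDOM <= 32767, i.e. 32768 possible values
nlines=50

base=$((range / nlines))    # how many times every remainder is hit at least
extra=$((range % nlines))   # how many remainders get one extra hit

echo "$extra remainders occur $((base + 1)) times, the other $((nlines - extra)) occur $base times"
```

For small files the difference is tiny, so the remainder approach is usually good enough for this kind of script.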

Thanks to Perderabo and Crashburn for this idea...


nlines=`wc -l < filename`
VAR=`expr $RANDOM % $nlines + 1`

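We need to add 1 so that we generate line numbers from 1 to $nlines and not from 0 to $nlines - 1. Note the "<" in the wc call: without it, wc prints the file name after the count and expr fails.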
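Putting the count and the remainder together, a minimal sketch of a helper that prints one random line might look like this. The function name random_line is made up for illustration, and it assumes a shell that provides RANDOM (ksh or bash) and a sed in your PATH.

```shell
# random_line FILE: print one randomly chosen line of FILE (hypothetical helper).
random_line() {
    file=$1
    # Redirect into wc so only the count is printed; $(( )) strips any padding.
    nlines=$(( $(wc -l < "$file") ))
    # Nothing to pick from an empty file.
    [ "$nlines" -gt 0 ] || return 1
    # RANDOM % nlines is 0..nlines-1; add 1 because sed numbers lines from 1.
    linenum=$((RANDOM % nlines + 1))
    sed -n "${linenum}p" "$file"
}
```

You would call it as, for example, random_line ex1_lst, once for each of the four files.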

Or this should also work... here I'm ensuring, by dividing by 32768, that the "random" number generated will fall within the number of lines that the file contains.

On HP-UX 10 and 11, "man ksh" gives

0 <= $RANDOM <= 32767


nlines=`wc -l < filename`
VAR=`expr \( 1 + $RANDOM \) \* $nlines / 32768`

You can see the range for RANDOM by referring to "man ksh" for your implementation...

Cheers!
Vishnu.

assuming a file size of 5 lines execute both these loops on your machine... you need to type CTRL+C to break the loops when you want...

You will notice that newfile2 almost never contains 5, so the second way won't lead to a uniform distribution between 1 and $nlines. In fact, most of the time it won't generate $nlines at all, at least for small values of $nlines.
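A bounded sketch of such a comparison, which stops by itself after a fixed number of draws, could look like the following. It assumes a 5-line file and 0 <= RANDOM <= 32767. The names out1, out2 and trials are illustrative, and the temporary files stand in for newfile1 and newfile2.

```shell
nlines=5        # pretend the file has 5 lines
trials=2000     # bounded, so no CTRL+C is needed
out1=$(mktemp)  # results of the remainder formula
out2=$(mktemp)  # results of the scaling formula
trap 'rm -f "$out1" "$out2"' EXIT

i=0
while [ "$i" -lt "$trials" ]; do
    echo $((RANDOM % nlines + 1)) >> "$out1"
    echo $(( (1 + RANDOM) * nlines / 32768 )) >> "$out2"
    i=$((i + 1))
done

# Count how often each result came up.
echo "remainder formula:"; sort -n "$out1" | uniq -c
echo "scaling formula:";   sort -n "$out2" | uniq -c
```

With these assumptions the scaling formula gives 0 through 4 nearly all the time and 5 only when RANDOM is exactly 32767, while the remainder formula stays within 1 through 5.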

Well, I used the basic suggestions above, and came up with this fun little waste of time (in a post below - I couldn't attach the file, even though it's only 12k)

A little background - this is based on the BOfH Excuse Calendar. If you don't know who / what that is, read over here first:
http://bofh.ntk.net/Bastard.html

It's stored in shar format, created by GNU shar.

the "excuse" script assumes you have a working "/bin/ksh", and have common utilities like sed in your PATH.
excuse.web will output a weak excuse for HTML, and is meant to be called from serve_exc. serve_exc assumes that you have netcat (nc), and it's in your PATH, and it was compiled with GAPING_SECURITY_HOLE defined (to allow it to use the "-l" option)... It'll listen on port 8080 for a connection from a web browser, run excuse.web, then start over...

It may take some fiddling, but take the code below, put it in a file called excuse.txt, and type "sh excuse.txt". Then "./excuse" and repeat for endless hours of fun!

#!/bin/sh
# This is a shell archive (produced by GNU sharutils 4.2.1).
# To extract the files from this archive, save it to some FILE, remove
# everything before the `!/bin/sh' line above, then type `sh FILE'.
#
# Made on 2002-11-20 11:30 PST
#
# Existing files will *not* be overwritten unless `-c' is specified.
#
# This shar contains:
# length mode       name
# ------ ---------- ------------------------------------------
#    549 -rw------- ex1_lst
#    507 -rw------- ex2_lst
#    314 -rw------- ex3_lst
#     86 -rw------- ex4_lst
#    950 -rwx------ excuse
#   1002 -rwx------ excuse.web
#    571 -rwx------ serve_exc
#
save_IFS="${IFS}"
IFS="${IFS}:"
gettext_dir=FAILED
locale_dir=FAILED
first_param="$1"
for dir in $PATH
do
  if test "$gettext_dir" = FAILED && test -f $dir/gettext \
     && ($dir/gettext --version >/dev/null 2>&1)
  then
    set `$dir/gettext --version 2>&1`
    if test "$3" = GNU
    then
      gettext_dir=$dir
    fi
  fi
  if test "$locale_dir" = FAILED && test -f $dir/shar \
     && ($dir/shar --print-text-domain-dir >/dev/null 2>&1)
  then
    locale_dir=`$dir/shar --print-text-domain-dir`
  fi
done
IFS="$save_IFS"
if test "$locale_dir" = FAILED || test "$gettext_dir" = FAILED
then
  echo=echo
else
  TEXTDOMAINDIR=$locale_dir
  export TEXTDOMAINDIR
  TEXTDOMAIN=sharutils
  export TEXTDOMAIN
  echo="$gettext_dir/gettext -s"
fi
if touch -am -t 200112312359.59 $$.touch >/dev/null 2>&1 && test ! -f 200112312359.59 -a -f $$.touch; then
  shar_touch='touch -am -t $1$2$3$4$5$6.$7 "$8"'
elif touch -am 123123592001.59 $$.touch >/dev/null 2>&1 && test ! -f 123123592001.59 -a ! -f 123123592001.5 -a -f $$.touch; then
  shar_touch='touch -am $3$4$5$6$1$2.$7 "$8"'
elif touch -am 1231235901 $$.touch >/dev/null 2>&1 && test ! -f 1231235901 -a -f $$.touch; then
  shar_touch='touch -am $3$4$5$6$2 "$8"'
else
  shar_touch=:
  echo
  $echo 'WARNING: not restoring timestamps.  Consider getting and'
  $echo "installing GNU \`touch', distributed in GNU File Utilities..."
  echo
fi
rm -f 200112312359.59 123123592001.59 123123592001.5 1231235901 $$.touch
#
if mkdir _sh26541; then
  $echo 'x -' 'creating lock directory'
else
  $echo 'failed to create lock directory'
  exit 1
fi
# ============= ex1_lst ==============
if test -f 'ex1_lst' && test "$first_param" != -c; then
  $echo 'x -' SKIPPING 'ex1_lst' '(file already exists)'
else
  $echo 'x -' extracting 'ex1_lst' '(text)'
  sed 's/^X//' << 'SHAR_EOF' > 'ex1_lst' &&
Temporary
Intermittant
Partial
Redundant
Total
Multiplexed
Inherent
Duplicated
Dual-Homed
Synchronous
Bidirectional
Serial
Asynchronous
Multiple
Replicated
Non-Replicated
Unregistered
Non-Specific
Generic
Migrated
Localised
Resignalled
Dereferenced
Nullified
Aborted
Serious
Minor
Major
Extraneous
Illegal
Insufficient
Viral
Unsupported
Outmoded
Legacy
Permanent
Invalid
Deprecated
Virtual
Unreportable
Undetermined
Undiagnosable
Unfiltered
Static
Dynamic
Delayed
Immediate
Nonfatal
Fatal
Non-Valid
Unvalidated
Non-Static
Unreplicatable
Non-Serious
SHAR_EOF
  (set 20 02 11 13 11 45 14 'ex1_lst'; eval "$shar_touch") &&
  chmod 0600 'ex1_lst' ||
  $echo 'restore of' 'ex1_lst' 'failed'
  if ( md5sum --help 2>&1 | grep 'sage: md5sum \[' ) >/dev/null 2>&1 \
  && ( md5sum --version 2>&1 | grep -v 'textutils 1.12' ) >/dev/null; then
    md5sum -c << SHAR_EOF >/dev/null 2>&1 \
    || $echo 'ex1_lst:' 'MD5 check failed'
00aecd2e37a5ddb28f14718ff212b9f6  ex1_lst
SHAR_EOF
  else
    shar_count="`LC_ALL= LC_CTYPE= LANG= wc -c < 'ex1_lst'`"
    test 549 -eq "$shar_count" ||
    $echo 'ex1_lst:' 'original size' '549,' 'current size' "$shar_count!"
  fi
fi
# ============= ex2_lst ==============
if test -f 'ex2_lst' && test "$first_param" != -c; then
  $echo 'x -' SKIPPING 'ex2_lst' '(file already exists)'
else
  $echo 'x -' extracting 'ex2_lst' '(text)'
  sed 's/^X//' << 'SHAR_EOF' > 'ex2_lst' &&
Array
Systems
Hardware
Software
Firmware
Backplane
Logic-Subsystem
Integrity
Subsystem
Memory
Comms
Integrity
Checksum
Protocol
Parity
Bus
Timing
Synchronisation
Topology
Transmission
Reception
Stack
Framing
Code
Programming
Peripheral
Environmental
Loading
Operation
Parameter
Syntax
Initialisation
Execution
Resource
Encryption
Decryption
File
Precondition
Authentication
Paging
Swapfile
Service
Gateway
Request
Proxy
Media
Registry
Configuration
Metadata
Streaming
Retrieval
Installation
Library
Handler
SHAR_EOF
  (set 20 02 11 13 11 46 46 'ex2_lst'; eval "$shar_touch") &&
  chmod 0600 'ex2_lst' ||
  $echo 'restore of' 'ex2_lst' 'failed'
  if ( md5sum --help 2>&1 | grep 'sage: md5sum \[' ) >/dev/null 2>&1 \
  && ( md5sum --version 2>&1 | grep -v 'textutils 1.12' ) >/dev/null; then
    md5sum -c << SHAR_EOF >/dev/null 2>&1 \
    || $echo 'ex2_lst:' 'MD5 check failed'
09e966a2adba5158ba54b6af16a25419  ex2_lst
SHAR_EOF
  else
    shar_count="`LC_ALL= LC_CTYPE= LANG= wc -c < 'ex2_lst'`"
    test 507 -eq "$shar_count" ||
    $echo 'ex2_lst:' 'original size' '507,' 'current size' "$shar_count!"
  fi
fi
# ============= ex3_lst ==============
if test -f 'ex3_lst' && test "$first_param" != -c; then
  $echo 'x -' SKIPPING 'ex3_lst' '(file already exists)'
else
  $echo 'x -' extracting 'ex3_lst' '(text)'
  sed 's/^X//' << 'SHAR_EOF' > 'ex3_lst' &&
Interruption
Destabilisation
Destruction
Desynchronisation
Failure
Dereferencing
Overflow
Underflow
NMI
Interrupt
Corruption
Anomoly
Seizure
Override
Reclock
Rejection
Invalidation
Halt
Exhaustion
Infection
Incompatibility
Timeout
Expiry
Unavailability
Bug
Condition
Crash
Dump
Crashdump
Stackdump
Problem
Lockout
SHAR_EOF
  (set 20 02 11 13 11 48 15 'ex3_lst'; eval "$shar_touch") &&
  chmod 0600 'ex3_lst' ||
  $echo 'restore of' 'ex3_lst' 'failed'
  if ( md5sum --help 2>&1 | grep 'sage: md5sum \[' ) >/dev/null 2>&1 \
  && ( md5sum --version 2>&1 | grep -v 'textutils 1.12' ) >/dev/null; then
    md5sum -c << SHAR_EOF >/dev/null 2>&1 \
    || $echo 'ex3_lst:' 'MD5 check failed'
6281886225b51332f25e27db12d04c66  ex3_lst
SHAR_EOF
  else
    shar_count="`LC_ALL= LC_CTYPE= LANG= wc -c < 'ex3_lst'`"
    test 314 -eq "$shar_count" ||
    $echo 'ex3_lst:' 'original size' '314,' 'current size' "$shar_count!"
  fi
fi
# ============= ex4_lst ==============
if test -f 'ex4_lst' && test "$first_param" != -c; then
  $echo 'x -' SKIPPING 'ex4_lst' '(file already exists)'
else
  $echo 'x -' extracting 'ex4_lst' '(text)'
  sed 's/^X//' << 'SHAR_EOF' > 'ex4_lst' &&
Error
Problem
Warning
Signal
Flag
Issue
Connundrum
Complication
Diagnostic
Exhaustion
SHAR_EOF
  (set 20 02 11 14 15 54 58 'ex4_lst'; eval "$shar_touch") &&
  chmod 0600 'ex4_lst' ||
  $echo 'restore of' 'ex4_lst' 'failed'
  if ( md5sum --help 2>&1 | grep 'sage: md5sum \[' ) >/dev/null 2>&1 \
  && ( md5sum --version 2>&1 | grep -v 'textutils 1.12' ) >/dev/null; then
    md5sum -c << SHAR_EOF >/dev/null 2>&1 \
    || $echo 'ex4_lst:' 'MD5 check failed'
1712d7a60fa5844c03a5ef197e435215  ex4_lst
SHAR_EOF
  else
    shar_count="`LC_ALL= LC_CTYPE= LANG= wc -c < 'ex4_lst'`"
    test 86 -eq "$shar_count" ||
    $echo 'ex4_lst:' 'original size' '86,' 'current size' "$shar_count!"
  fi
fi
# ============= excuse ==============
if test -f 'excuse' && test "$first_param" != -c; then
  $echo 'x -' SKIPPING 'excuse' '(file already exists)'
else
  $echo 'x -' extracting 'excuse' '(text)'
  sed 's/^X//' << 'SHAR_EOF' > 'excuse' &&
#! /bin/ksh
X
fil1_len=$(wc -l < ex1_lst)
X linenum=$((RANDOM%fil1_len+1))
X var1=$(sed -n "${linenum}p" ex1_lst)
X
fil2_len=$(wc -l < ex2_lst)
X linenum=$((RANDOM%fil2_len+1))
X var2=$(sed -n "${linenum}p" ex2_lst)
X
((RANDOM%2)) || {
X  fil3_len=$(wc -l < ex2_lst)
X  linenum=$((RANDOM%fil3_len+1))
X  var3=$(sed -n "${linenum}p" ex2_lst)
X  [[ $var3 == $var2 ]] && var3=""
X }
X
fil4_len=$(wc -l < ex3_lst)
X linenum=$((RANDOM%fil4_len+1))
X var4=$(sed -n "${linenum}p" ex3_lst)
X
fil5_len=$(wc -l < ex4_lst)
X linenum=$((RANDOM%fil5_len+1))
X var5=$(sed -n "${linenum}p" ex4_lst)
X
typeset -L1 -l comp=$var1
X [[ $comp == ?([aeiou]) ]] && { conj=an; } || { conj=a; }
X
clear
print "
**************************************
**************************************
X
X     Excuse Calendar - $(date +%D)    
X
**************************************
**************************************
"
print The problem appears to be caused by $conj $var1 $var2 $var3 $var4 $var5
print
X
SHAR_EOF
  (set 20 02 11 14 14 34 49 'excuse'; eval "$shar_touch") &&
  chmod 0700 'excuse' ||
  $echo 'restore of' 'excuse' 'failed'
  if ( md5sum --help 2>&1 | grep 'sage: md5sum \[' ) >/dev/null 2>&1 \
  && ( md5sum --version 2>&1 | grep -v 'textutils 1.12' ) >/dev/null; then
    md5sum -c << SHAR_EOF >/dev/null 2>&1 \
    || $echo 'excuse:' 'MD5 check failed'
353fcaf0f218625560068a2feb16194a  excuse
SHAR_EOF
  else
    shar_count="`LC_ALL= LC_CTYPE= LANG= wc -c < 'excuse'`"
    test 950 -eq "$shar_count" ||
    $echo 'excuse:' 'original size' '950,' 'current size' "$shar_count!"
  fi
fi

Continued on next post...

# ============= excuse.web ==============
if test -f 'excuse.web' && test "$first_param" != -c; then
  $echo 'x -' SKIPPING 'excuse.web' '(file already exists)'
else
  $echo 'x -' extracting 'excuse.web' '(text)'
  sed 's/^X//' << 'SHAR_EOF' > 'excuse.web' &&
#! /bin/ksh
X
fil1_len=$(wc -l < ex1_lst)
X linenum=$((RANDOM%fil1_len+1))
X var1=$(sed -n "${linenum}p" ex1_lst)
X
fil2_len=$(wc -l < ex2_lst)
X linenum=$((RANDOM%fil2_len+1))
X var2=$(sed -n "${linenum}p" ex2_lst)
X
((RANDOM%2)) || {
X  fil3_len=$(wc -l < ex2_lst)
X  linenum=$((RANDOM%fil3_len+1))
X  var3=$(sed -n "${linenum}p" ex2_lst)
X  [[ $var3 == $var2 ]] && var3=""
X }
X
fil4_len=$(wc -l < ex3_lst)
X linenum=$((RANDOM%fil4_len+1))
X var4=$(sed -n "${linenum}p" ex3_lst)
X
fil5_len=$(wc -l < ex4_lst)
X linenum=$((RANDOM%fil5_len+1))
X var5=$(sed -n "${linenum}p" ex4_lst)
X
typeset -L1 -l comp=$var1
X [[ $comp == ?([aeiou]) ]] && { conj=an; } || { conj=a; }
X
print "<BR>
**************************************<BR>
**************************************<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Excuse Calendar - $(date +%D)<BR>    
**************************************<BR>
**************************************<BR>
<BR>
The problem appears to be caused by $conj $var1 $var2 $var3 $var4 $var5 <BR>
<BR>"
X
SHAR_EOF
  (set 20 02 11 14 11 57 43 'excuse.web'; eval "$shar_touch") &&
  chmod 0700 'excuse.web' ||
  $echo 'restore of' 'excuse.web' 'failed'
  if ( md5sum --help 2>&1 | grep 'sage: md5sum \[' ) >/dev/null 2>&1 \
  && ( md5sum --version 2>&1 | grep -v 'textutils 1.12' ) >/dev/null; then
    md5sum -c << SHAR_EOF >/dev/null 2>&1 \
    || $echo 'excuse.web:' 'MD5 check failed'
003805135a7f3654adb34d0a27edb4d0  excuse.web
SHAR_EOF
  else
    shar_count="`LC_ALL= LC_CTYPE= LANG= wc -c < 'excuse.web'`"
    test 1002 -eq "$shar_count" ||
    $echo 'excuse.web:' 'original size' '1002,' 'current size' "$shar_count!"
  fi
fi
# ============= serve_exc ==============
if test -f 'serve_exc' && test "$first_param" != -c; then
  $echo 'x -' SKIPPING 'serve_exc' '(file already exists)'
else
  $echo 'x -' extracting 'serve_exc' '(text)'
  sed 's/^X//' << 'SHAR_EOF' > 'serve_exc' &&
#! /bin/ksh
X
while :
do
X    nc -l -p 8080 |& 
X    exec 3<&p 4>&p
X    
X    while read -u3 line
X    do
X        [[ $line = ?(\r) ]] && break
X    done
X    
X    file=html.$$
X    print "<HTML><HEAD><TITLE>BOfH Excuse Calendar</TITLE></HEAD>" >$file
X    print "<BODY>" >>$file
X    ./excuse.web >>$file 
X    print "</BODY></HTML>" >>$file
X 
X	print -u4 "HTTP/1.0 200 OK\r"
X	print -u4 "Server: /bin/ksh\r"
X        print -u4 Content-Length: $(wc -c < $file)"\r"
X        print -u4 "\r"
X        cat "$file" >&4
X
X    rm -f $file
X    exec 3>&- 4>&-
X    kill -1 $! >/dev/null 2>&1
done
X
SHAR_EOF
  (set 20 02 11 14 15 20 36 'serve_exc'; eval "$shar_touch") &&
  chmod 0700 'serve_exc' ||
  $echo 'restore of' 'serve_exc' 'failed'
  if ( md5sum --help 2>&1 | grep 'sage: md5sum \[' ) >/dev/null 2>&1 \
  && ( md5sum --version 2>&1 | grep -v 'textutils 1.12' ) >/dev/null; then
    md5sum -c << SHAR_EOF >/dev/null 2>&1 \
    || $echo 'serve_exc:' 'MD5 check failed'
9775ce6be608ea64537597c30cce22ba  serve_exc
SHAR_EOF
  else
    shar_count="`LC_ALL= LC_CTYPE= LANG= wc -c < 'serve_exc'`"
    test 571 -eq "$shar_count" ||
    $echo 'serve_exc:' 'original size' '571,' 'current size' "$shar_count!"
  fi
fi
rm -fr _sh26541
exit 0

Make sure to append this one to the above without any spaces in between...