Script to copy certain info from several directories

Hi,
I am writing a script that saves the names of certain files to a text file [1].
It works fine if I give it a single directory (for example "/eos/uscms/store/user/pooja04//analysis2012/525/data/doubleele/2012/datav1/") that contains the files ending in *root [3].
But I want to modify the script so that it goes through all the directories listed in [2], finds the files ending in *root in each of them, and writes them to $FileName.

Thanks,
Pooja

[1]

#!/bin/bash
date
PATHNAME=/eos/uscms/store/user/pooja04//analysis2012/525/data/doubleele/2012/datav1/

FileName=DataFileName
 ls -ltr $PATHNAME grep root | awk '{print "$PATHNAME"$9}' > "$FileName"

[2]

[pooja04@cmslpc09 tcsh]$ ls /eos/uscms/store/user/pooja04//analysis2012/525/data/doubleele/2012/
dataJv1  dataMv2  datav1  datav4
[pooja04@cmslpc09 tcsh]$ 

[3]

[pooja04@cmslpc09 tcsh]$ l /eos/uscms/store/user/pooja04//analysis2012/525/data/doubleele/2012/datav1/
-rw-r--r-- 1 pooja04 us_cms 128761223 Jul  1 20:48 vgtree_276_4_8Dj.root
-rw-r--r-- 1 pooja04 us_cms 153347935 Jul  1 21:36 vgtree_183_2_aZm.root
-rw-r--r-- 1 pooja04 us_cms 139629983 Jul  1 22:01 vgtree_76_5_I3U.root
-rw-r--r-- 1 pooja04 us_cms 128422302 Jul  1 22:40 vgtree_95_4_KXv.root
-rw-r--r-- 1 pooja04 us_cms 139629983 Jul  2 02:36 vgtree_76_6_wEd.root

There's no point doing ls -l if all you're going to do is extract the filename from it. The ordinary listing will do.

find /eos/uscms/store/user/pooja04//analysis2012/525/data/doubleele/2012/ -name '*root' | while read FILE
do
        echo mv "$FILE" /path/to/dest
done

Hi,
Sorry, I forgot to mention that I need the following information in a text file:

/eos/uscms/store/user/pooja04//analysis2012/525/data/doubleele/2012/datav1/vgtree_434_1_AD5.root
/eos/uscms/store/user/pooja04//analysis2012/525/data/doubleele/2012/datav1/vgtree_253_1_Cie.root
/eos/uscms/store/user/pooja04//analysis2012/525/data/doubleele/2012/datav1/vgtree_252_1_AAy.root
/eos/uscms/store/user/pooja04//analysis2012/525/data/doubleele/2012/datav1/vgtree_253_1_bKi.root
/eos/uscms/store/user/pooja04//analysis2012/525/data/doubleele/2012/datav1/vgtree_107_1_b56.root


And because of this, I need to use "ls -ltr".

Thanks
Pooja

No you don't. You need -t for time, and -r for reverse. All -l does is add columns that you instantly throw away.

-1, as in one, not ell, might make sense to force single-column output, but you get it anyway whenever ls writes to a non-terminal.

Since you need it sorted by time:

find /eos/uscms/store/user/pooja04//analysis2012/525/data/doubleele/2012/ -name '*.root' | xargs ls -1tr | tee listfile | while read LINE
do
        echo mv "$LINE" /path/to/dest
done

This may fail if there are too many files, since they won't all fit on one ls command line. xargs then runs ls more than once, and each batch gets sorted separately.
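To make the batching problem visible, here is a small sketch. It assumes GNU touch and xargs; the temporary directory and the file names are invented for the demo, and -n 2 is only there to imitate what happens when the real list is too long for one command line.

```shell
#!/bin/bash
# Sketch: what happens to "ls -tr" ordering when xargs has to split the list.
demo=$(mktemp -d)
touch -d '2012-07-01 10:00' "$demo/a.root"
touch -d '2012-07-01 11:00' "$demo/b.root"
touch -d '2012-07-01 12:00' "$demo/c.root"
touch -d '2012-07-01 13:00' "$demo/d.root"

# Feed the names in a fixed, deliberately unsorted order.
names() { printf '%s\n' "$demo/d.root" "$demo/a.root" "$demo/c.root" "$demo/b.root"; }

# One ls call: the whole list comes out oldest first.
one_call=$(names | xargs ls -1tr | xargs -n 1 basename | paste -sd' ' -)

# Two ls calls of two files each: every batch is sorted on its own.
batched=$(names | xargs -n 2 ls -1tr | xargs -n 1 basename | paste -sd' ' -)

echo "one call: $one_call"
echo "batched:  $batched"
rm -rf "$demo"
```

The second line shows the files ordered within each pair only, which is the kind of ordering you would get on a directory with 1000 files.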

Thanks for the suggestion. It copied the contents of one directory and about 10% of the second directory, but not the information for all the directories.

And yes, I sometimes have 1000 files in one directory.

pooja

Hmm, this may or may not be difficult. Do you have stat? It can print epoch times, which you feed into a numeric sort, and then get rid of with awk once they've served their purpose...

find . -name '*.root' | xargs stat -c "%Z %n" | sort -r -n | awk '{ print $2 }' > listfile

while read LINE
do
        echo mv "$LINE" /path/to/dest
done < listfile
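A note on the pipeline above: with GNU stat, %Z is the status-change time and sort -r -n puts the newest file first, while ls -tr sorts by modification time with the oldest first. If the goal is to reproduce the ls -tr order, something along these lines might be closer. This is only a sketch, assuming GNU stat and touch; the temporary directory and file names are made up for the demo.

```shell
#!/bin/bash
# Sketch: list *.root files from several directories, oldest modification first.
demo=$(mktemp -d)
mkdir "$demo/datav1" "$demo/datav4"
touch -d '2012-07-01 20:48' "$demo/datav1/old.root"
touch -d '2012-07-01 22:01' "$demo/datav4/mid.root"
touch -d '2012-07-02 02:36' "$demo/datav4/new.root"

# %Y = modification time in epoch seconds, %n = file name as given.
# cut keeps everything after the first space, so the time column is dropped.
find "$demo" -name '*.root' | xargs stat -c '%Y %n' | sort -n | cut -d' ' -f2- > "$demo/listfile"

first=$(head -n 1 "$demo/listfile")
last=$(tail -n 1 "$demo/listfile")
count=$(wc -l < "$demo/listfile")
echo "oldest: $first"
echo "newest: $last"
rm -rf "$demo"
```

Because sort sees the complete list, it does not matter how many times xargs has to call stat.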

Yes, I have.

I may have edited something in underneath you while you replied. Sorry, I'm sneaky like that. :smiley:

Are you sure it should work? It failed to pick up the information from all the directories.

Thanks
Pooja

---------- Post updated at 05:11 PM ---------- Previous update was at 05:01 PM ----------

I need some immediate help.
I am running the code below with the command

./ScriptName condorjob

The issue is that DataMC.sh, which is a simple script file, fails to be copied to temp.sh (TEMPSCRIPT). I have run this kind of script before and it worked fine.
I am not sure what I am missing here.
Please comment.

thanks


#!/bin/bash
date

# Global Parameters
#PATHNAME=/eos/uscms/store/user/pooja04//analysis2012/525/data/0001/data/30Aug2012/   #533
PATHNAME=/eos/uscms/store/user/pooja04//analysis2012/525/data/doubleele/2012/datav1/
FileName=DataFileName
#TARGETPATH=
SCRIPT=DataMC.sh
TEMPSCRIPT=temp.sh
CARD=card_Data2011AB_Zee_40GeV
CondorJob=Job_condor

function pause(){
   read -p "$*"
}

usage () {
    echo "Script to submit offline batch mode jobs"
    echo "Possible arguments:"
    echo "  batchjob - to submit batch jobs"
    echo "  condorjob - to submit batch jobs"
    echo "  help - shows this help"
    echo "   ./Script4OfflineJobs.sh batchjob NoofJobs"
}

# this sets CLASSPATH
path () {
    source /uscms/home/pooja04/ToBegin.sh
    cd /uscms_data/d3/pooja04/CMSSW_5_3_3_patch2/src/ElectroWeakAnalysis/MultiBosons/test
    cmsenv
    cd /uscms/home/pooja04/script/AnalysisCode/Code42012/v2
}


#To copy the files' information from the destination folder
CopyFilesInfo() {
    ls -ltr "$PATHNAME" | grep root | awk '{print string path $9}' string="$CONSTANT" path="$PATHNAME"  > "$FileName"
#    ls -ltr $PATHNAME grep root | awk '{print "$PATHNAME"$9}' > "$FileName"
    echo "FileName are Copied"
}

#Divide the jobs in small files
DivideJobs() {
    echo "Division Of Jobs"
    #split -$4  $3 chunk
    split -10  $FileName chunk
    
    i=0
    for file in chunk*
      do
      ((i=i+1))
      new_file="data"$i".list"
      perl -0pe 's/\n$//' $file >  $new_file
      
      #to remove the processed file   
      rm -rf $file
      echo $i ' succesful'
    done
}

#For Submiting the batchjobs
SubmitBatchJob () {
    max=$2
    for (( i=0; i<=$max; ++i )) ; 
      do
      echo "SHELL SCRIPTING ================ $i ==============" 
      new_file="temp"$i".sh"
      cp $SCRIPT $new_file
      
      if [ $i == 0 ]; then
	  source $new_file
	  pause 'Press [Enter] key to continue...'
	  rm $new_file
      else
	  if [ $i > 0 ]; then
	      { rm $CARD; sed -e "s/data$i/data$((i+1))/g" > $CARD; } < $CARD
	      source $new_file
	      pause 'Press [Enter] key to continue...'
	      rm $new_file
	  else
	      echo "SUCCESFULLY DONE.."
	  fi
      fi
    done
}    


#For Submiting the batchjobs
SubmitCondorJob () {

    for (( i=0; i < 89 ; i++ ))
      do
      echo "CondorJob Submission For  ================ $i ==============" 
      new_file="temp"$i".sh"

      if [ $i == 0 ]; then
      cp $SCRIPT $TEMPSCRIPT
	  echo "$SCRIPT"
	  

	  condor_submit $CondorJob
	  rm $TEMPSCRIPT
	  echo "Going To Sleep.."
	  sleep 100
      else
	  if [ $i > 0 ]; then
	      { rm $CARD; sed -e "s/data$i/data$((i+1))/g" > $CARD; } < $CARD
	     cp $SCRIPT $TEMPSCRIPT 
	    
	      condor_submit $CondorJob
	      rm $TEMPSCRIPT
	
	      echo "Going To Sleep.."
	      sleep 100
	      echo "SUCCESFULLY DONE.."
	  fi
      fi
    done
}    

if [ $# -lt 1 ]; then
    usage
    exit 1
fi

if [ $1 = "help" ]; then
    usage
    exit 0

elif [ "$1" = "batchjob" ]; then
    path  
    CopyFilesInfo
    DivideJobs
    SubmitBatchJob
    exit 0

elif [ "$1" == "condorjob" ]; then
    path
    CopyFilesInfo 
    DivideJobs
    SubmitCondorJob
    exit 0
fi
    

Don't know what goes wrong in your script, but I noticed some things...

# ls -ltr $PATHNAME grep root | awk '{print "$PATHNAME"$9}' > "$FileName"

  • this line is commented out, but it is missing a "|" (pipe) before "grep root", in case you use the line again.

elif [ "$1" == "condorjob" ]; then

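  • should be a single "=" I think, at least for portability. bash's [ also accepts "==", so this is probably not what breaks the script.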
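One way to sidestep the "=" versus "==" question altogether is a case statement. The sketch below is only an illustration: dispatch is a hypothetical helper that echoes the chosen action instead of submitting anything. It also shows -gt for numeric tests, since inside [ ] a bare ">" (as in the script's [ $i > 0 ]) is treated by the shell as an output redirection rather than a comparison.

```shell
#!/bin/bash
# Hypothetical dispatcher: echoes the chosen action instead of submitting jobs.
dispatch() {
    case "$1" in
        help)      echo "usage" ;;
        batchjob)  echo "batch" ;;
        condorjob) echo "condor" ;;
        *)         echo "unknown" ;;
    esac
}

picked=$(dispatch condorjob)
fallback=$(dispatch "")

# Numeric comparison: use -gt, not ">".
i=3
if [ "$i" -gt 0 ]; then positive=yes; else positive=no; fi

echo "$picked $fallback $positive"
```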

In what way did it not work? It worked on my machine.

Show me what you did and what happened word for word, letter for letter, keystroke for keystroke.

The script I ran had the following content.

#!/bin/bash
find /eos/uscms/store/user/pooja04//analysis2012/525/data/doubleele/2012/ -name '*.root' | xargs stat -c "%Z %n" | sort -r -n | awk '{ print $2 }' > listfile

while read LINE
do
        echo mv "$LINE" /path/to/dest
done < listfile 

It ran fine. But the final file "listfile" only has the information for one directory and part of a second directory, not for all the directories under /eos/uscms/store/user/pooja04//analysis2012/525/data/doubleele/2012/.

The output file is big. Do you want to look at it? I can post it if you wish.
Let me know.

Thanks
Pooja

---------- Post updated at 01:59 AM ---------- Previous update was at 01:57 AM ----------

Thanks for both suggestions. But in the end, the cp command is still giving me a hard time, although it should have run fine. Grrr :confused: No idea what is happening.

Thanks

I still haven't studied your script in detail, and I don't understand what you mean by "hard time in that CP command", but there is one more thing I noticed:

...
SubmitBatchJob () {
  max=$2
...

It seems that "max" will be empty, because you call the function "SubmitBatchJob" with no parameters. Inside a function, "$2" is the function's second parameter, not the main script's.
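A short sketch of the difference, using a stripped-down stand-in for SubmitBatchJob that only reports what it sees in $2. The "set --" line is just a way to simulate running the script as "./Script4OfflineJobs.sh batchjob 5" inside the demo.

```shell
#!/bin/bash
# Stand-in for SubmitBatchJob: only reports the value it finds in $2.
SubmitBatchJob() {
    max=$2
    echo "max=${max:-<empty>}"
}

# Simulate the script's own arguments: batchjob 5
set -- batchjob 5

without_args=$(SubmitBatchJob)       # called as in the script: $2 is empty inside
with_args=$(SubmitBatchJob "$@")     # the script's arguments are passed along

echo "$without_args"
echo "$with_args"
```

Calling the function as SubmitBatchJob "$@" in the main part of the script would be one way to hand the job count through.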


Find is recursive, it will print everything.

xargs can split the list across as many separate calls as needed.

sort can handle input of arbitrary size too.

I see no reason anything should be truncating or missing output there, except.. Hm. Do any of the directory names contain spaces?
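If spaces do turn out to be the problem, NUL-delimited output is the usual way around it. A minimal sketch, assuming a find and xargs that support -print0 and -0 (the GNU ones do) and GNU stat; the directory name "data v1" is invented to provoke the failure.

```shell
#!/bin/bash
# Sketch: a space in a directory name breaks the plain pipe but not the NUL-delimited one.
demo=$(mktemp -d)
mkdir "$demo/data v1"
touch "$demo/data v1/vgtree_1.root"

# Plain pipe: xargs splits the path at the space, so stat gets two bogus names.
plain=$(find "$demo" -name '*.root' | xargs stat -c '%n' 2>/dev/null | wc -l)

# NUL-delimited: each path reaches stat in one piece.
safe=$(find "$demo" -name '*.root' -print0 | xargs -0 stat -c '%n' | wc -l)

echo "plain pipe found: $plain"
echo "NUL-delimited found: $safe"
rm -rf "$demo"
```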

find can do the same thing by itself with -printf (GNU find), so xargs and stat are not needed:

-printf "%T@ %p\n"
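Put together as a whole pipeline, it could look like the sketch below. It assumes GNU find and touch, and the temporary directory and file names are made up. Note that %p prints the path as find found it, while %f prints only the base name without the directory, so %p is the one to use if the list needs full paths.

```shell
#!/bin/bash
# Sketch: time-sorted list of *.root files using only find, sort and cut.
demo=$(mktemp -d)
mkdir "$demo/datav1" "$demo/datav4"
touch -d '2012-07-01 20:48' "$demo/datav1/vgtree_a.root"
touch -d '2012-07-02 02:36' "$demo/datav4/vgtree_b.root"

# %T@ = modification time in seconds since the epoch, %p = path as found.
find "$demo" -name '*.root' -printf '%T@ %p\n' | sort -n | cut -d' ' -f2- > "$demo/listfile"

oldest=$(head -n 1 "$demo/listfile")
newest=$(tail -n 1 "$demo/listfile")
echo "oldest: $oldest"
echo "newest: $newest"
rm -rf "$demo"
```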