Scripting help please: splitting a file by line ranges (maybe I don't need a script :( )

Hi,

I asked the Windows SysAdmin to give me all available config files from a server.

Instead of sending them as separate files, he combined all of them into one single file, leaving it to me to split them up myself.

I was able to do so with the script below:

$ cat ./split_file.ksh
#!/bin/ksh
#

myPID=$$
grep -in "C:" file_to_split.txt > $myPID.tmp.00
cat $myPID.tmp.00

last_number=`tail -1 $myPID.tmp.00 | awk -F":" '{ print $1 }'`
switch=0
count=0

echo
echo "myPID = $myPID"
echo "last_number = $last_number"
echo

while read num
do
   if [[ $switch = 0 ]] ; then
      end=1
      switch=1
      continue
   else
      let count=$count+1
      start=$end
      end=`echo $num | awk -F":" '{ print $1-1 }'`
      echo "- Printing ${start} to ${end} ..."
      sed -n "${start},${end}p" file_to_split.txt > file_to_split.file.${count}.txt
      let end=$end+1

      if [[ $end = $last_number ]] ; then
         let count=$count+1
         start=$end
         echo "- Printing ${start} to the end ... "
         sed -n "${start},\$p" file_to_split.txt > file_to_split.file.${count}.txt
      fi
   fi
done < $myPID.tmp.00

rm $myPID.tmp*
echo
ls -l file_to_split.file.*txt
echo

wc -l file_to_split.txt
echo
wc -l file_to_split.file.*txt

exit 0

Sample output from running the script:

$ ./split_file.ksh
1:C:\Oracle\product\12.1.0\client_1_32Full\NETWORK\ADMIN\tnsnames.ora
6673:C:\Oracle\product\12.1.0\client_1_32Full\NETWORK\ADMIN\SAMPLE\tnsnames.oRA
13345:C:\Oracle\product\12.1.0\client_1_64Full\network\admin\tnsnames.ora
20017:C:\Oracle\product\12.1.0\client_1_64Full\network\admin\sample\tnsnames.oRA
26689:C:\Oracle\product\12.1.0\TNSNAMES\tnsnames.ora
33361:C:\Support\tnsnames.ora
39897:C:\Support\CHG0082341\tnsnames.ora

myPID = 26172
last_number = 39897

- Printing 1 to 6672 ...
- Printing 6673 to 13344 ...
- Printing 13345 to 20016 ...
- Printing 20017 to 26688 ...
- Printing 26689 to 33360 ...
- Printing 33361 to 39896 ...
- Printing 39897 to the end ...

-rw-r--r-- 1 oracle oinstall 363739 Feb 27 14:28 file_to_split.file.1.txt
-rw-r--r-- 1 oracle oinstall 363746 Feb 27 14:28 file_to_split.file.2.txt
-rw-r--r-- 1 oracle oinstall 363739 Feb 27 14:28 file_to_split.file.3.txt
-rw-r--r-- 1 oracle oinstall 363746 Feb 27 14:28 file_to_split.file.4.txt
-rw-r--r-- 1 oracle oinstall 363718 Feb 27 14:28 file_to_split.file.5.txt
-rw-r--r-- 1 oracle oinstall 352223 Feb 27 14:28 file_to_split.file.6.txt
-rw-r--r-- 1 oracle oinstall 363706 Feb 27 14:28 file_to_split.file.7.txt

46568 file_to_split.txt

   6672 file_to_split.file.1.txt
   6672 file_to_split.file.2.txt
   6672 file_to_split.file.3.txt
   6672 file_to_split.file.4.txt
   6672 file_to_split.file.5.txt
   6536 file_to_split.file.6.txt
   6672 file_to_split.file.7.txt
  46568 total

Is there an awk one-liner that achieves the same result? :o

Hi,
I can't get csplit to start numbering the output files at 1; with these line numbers the first file gets index 0:

csplit -f file_to_split.file. -b "%d.txt" file_to_split.txt 6673 13345 20017 26689 33361 39897
file_to_split.file.0.txt
file_to_split.file.1.txt
...

--- Post updated at 10:57 ---

If that's the only way, pass 1 as the first line number:

csplit -f file_to_split.file. -b "%d.txt" file_to_split.txt 1 6673 13345 20017 26689 33361 39897

and then ignore or delete the empty file with index 0.
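
If all you want is to skip the cleanup step, GNU csplit has a -z (--elide-empty-files) option that suppresses zero-length output files. A minimal sketch, assuming GNU coreutils; the csplit_demo directory and the six-line sample file are made up for illustration:

```shell
# Hypothetical demo input: two small chunks, starting at lines 1 and 4.
mkdir -p csplit_demo
printf '%s\n' \
  'C:\Support\a.ora' 'alpha' 'beta' \
  'C:\Support\b.ora' 'gamma' 'delta' > csplit_demo/file_to_split.txt

# -z drops zero-length output files (here the empty section before line 1);
# -s suppresses the byte counts csplit normally prints.
csplit -s -z -f csplit_demo/file_to_split.file. -b "%d.txt" \
  csplit_demo/file_to_split.txt 1 4

ls csplit_demo
```

As far as I can tell from the GNU documentation, the sequence numbers still start at 0 when -z is used, so this removes the empty file but does not give you files numbered from 1.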

Looks like you're searching for the C: string to split upon. How about

csplit -f "file_to_split.file." -b "%02d.txt" file_to_split.txt "/C:/" "{*}"

to replace your entire script above?
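
Since the question asked for awk: a minimal sketch, assuming every chunk starts with a line containing C: (case-sensitive here, unlike the grep -i in the script). The split_demo directory, the sample data and the prefix variable are made up for illustration:

```shell
# Hypothetical demo input: three small "config files" concatenated,
# each introduced by a header line containing "C:".
mkdir -p split_demo
printf '%s\n' \
  'C:\Support\a.ora' 'alpha' 'beta' \
  'C:\Support\b.ora' 'gamma' \
  'C:\Support\c.ora' 'delta' 'epsilon' 'zeta' > split_demo/file_to_split.txt

# At every line containing "C:", close the current output file (if any)
# and switch to the next numbered one; then copy each line to the
# current output file. Numbering starts at 1.
awk -v prefix="split_demo/file_to_split.file." '
  /C:/ { if (out) close(out); n++; out = prefix n ".txt" }
  out  { print > out }
' split_demo/file_to_split.txt

wc -l split_demo/file_to_split.file.*.txt
```

The close() call is there so that a file with many chunks does not run out of open file descriptors. Lines before the first C: header, if there are any, are skipped because out is still empty at that point.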

Hi @RudiC,
Not all of the chunks there are the same size.

--- Post updated at 12:04 ---

I got the point. Thanks