Linux script help

I am creating a script that will incorporate multiple variables organized in a spreadsheet for pdftk. I have a 1000 page pdf that I have to split into about 300 individual pdfs. The basic command line to extract pages is as follows:

$ pdftk file.pdf cat 1-7 output newfile.pdf

file.pdf = 1000 page original file.
1-7 = pages to be extracted
newfile.pdf = new output file (pages 1-7 of 1000)

If I have a spreadsheet with all the sequential page ranges to be extracted in one column and the corresponding filename in another column, how do I insert those into the command line and have the script work through the entire list? I would appreciate any suggestions on how this could be done more easily.

(noob I know, I ask for your patience)

Thanks
Tank.

Can you post, say, the first 10 lines of your spreadsheet file, exactly as they are?

If you had a flat file that contained the pages and file names like:

1-7 newfile1
8-234 newfile2
238-499 newfile3
500-1000 newfile4

then you could use something like this:

perl -nle 'system("pdftk", "file.pdf", "cat", $1, "output", "$2.pdf") if /^(\S+)\s+(\S+)/' infile

Or ...

$ cat sample
1-7 file_1.pdf
8-14 file_2.pdf
15-22 file_3.pdf
23-30 file_4.pdf
while read -r s1 s2
  do
    pdftk file.pdf cat "$s1" output "$s2"
  done < sample
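
If the list is still in a spreadsheet rather than a flat file, the same loop can read a CSV export. The following is only a sketch: it assumes the sheet was saved as plain CSV with one "pages,filename" pair per row and no quoted fields, and the names split_csv and PDFTK are made up for illustration. PDFTK defaults to "echo pdftk", so the commands are printed for checking instead of being run.

```shell
#!/bin/sh
# Assumptions: rows look like "pages,filename" with no quoted fields;
# split_csv and PDFTK are hypothetical names used only in this sketch.
# PDFTK defaults to "echo pdftk" so the commands are printed, not run.
: "${PDFTK:=echo pdftk}"

# Reads CSV rows on stdin and issues one pdftk command per row.
split_csv() {
    # tr drops the carriage returns that spreadsheet exports often add.
    tr -d '\r' | while IFS=, read -r pages name
    do
        # Skip blank or incomplete rows.
        [ -n "$name" ] || continue
        $PDFTK file.pdf cat "$pages" output "$name"
    done
}

# Sample rows; with a real export this would be: split_csv < sample.csv
split_csv <<'EOF'
1-7,file_1.pdf
8-14,file_2.pdf
15-22,file_3.pdf
23-30,file_4.pdf
EOF
```

Once the printed commands look right, running the script with PDFTK=pdftk in the environment would execute them for real.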

Spreadsheet or flat file would look like:

1-5 newfile1.pdf
6-7 newfile2.pdf
8-15 newfile3.pdf
16-20 newfile4.pdf

bigfile.pdf is the original 1000-page PDF.

# pdftk bigfile.pdf cat 1-5 output newfile1.pdf
# pdftk bigfile.pdf cat 6-7 output newfile2.pdf
# pdftk bigfile.pdf cat 8-15 output newfile3.pdf
# pdftk bigfile.pdf cat 16-20 output newfile4.pdf
 

Both solutions guessed the right format of the spreadsheet file, and they worked OK. The problem is already solved.

P.S.
BTW, thanks for posting the file, I just wanted to confirm I was working with the right sample.

Thanks for all the suggestions. One more question: if I want to insert two variables to create a more complex output filename, what would be the easiest way?

For example, if my flat file/spreadsheet included a third variable, a year, it would look like this:

1-7 newfile1 1993
8-234 newfile2 2002
238-499 newfile3 2006
500-1000 newfile4 2009

I want to combine the second and third variables ("newfile1" and "1993"), separated by a hyphen, as the output filename. The actual commands would look like this:

# pdftk bigfile.pdf cat 1-7 output newfile1-1993.pdf
# pdftk bigfile.pdf cat 8-234 output newfile2-2002.pdf
# pdftk bigfile.pdf cat 238-499 output newfile3-2006.pdf
# pdftk bigfile.pdf cat 500-1000 output newfile4-2009.pdf

What would be the easiest method to have the two variables inserted, separated by a hyphen, and then add the .pdf extension?

Thanks for your help, again. Much appreciated.

Simply add a third variable (say s3) to the read in the loop, and join the fields when building the output name ( ... output "$s2-$s3.pdf" ... ).
Modifying the posted perl code is equally simple.
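
For the three-column file, the loop might be extended along these lines. This is a minimal sketch, not a tested production script: split_pages and PDFTK are names invented for the example, and PDFTK defaults to "echo pdftk" so the generated commands are only printed until you are happy with them.

```shell
#!/bin/sh
# Sketch only: split_pages and PDFTK are hypothetical names for this example.
# PDFTK defaults to "echo pdftk", so the commands are printed, not run.
# Set PDFTK=pdftk once the printed commands look right.
: "${PDFTK:=echo pdftk}"

# Reads "<pages> <name> <year>" lines on stdin.
split_pages() {
    while read -r s1 s2 s3
    do
        # Skip blank or incomplete lines.
        [ -n "$s3" ] || continue
        # Join name and year with a hyphen, then add the extension.
        $PDFTK bigfile.pdf cat "$s1" output "$s2-$s3.pdf"
    done
}

# Sample data from the post; with a real list this would be: split_pages < infile
split_pages <<'EOF'
1-7 newfile1 1993
8-234 newfile2 2002
238-499 newfile3 2006
500-1000 newfile4 2009
EOF
```

Keeping both variables inside one pair of double quotes ("$s2-$s3.pdf") produces a single argument even if a field contains unusual characters.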

Thanks, I guessed that was probably the most logical way. I appreciate the assistance.