How to read and split a file?

Don_Cragun · January 29, 2014, 2:35am

I don't get it. Shouldn't the number of records in the last file be whatever is left over after putting the first 100 lines from your input file in one file, and putting 1000 lines in subsequent files until fewer than 1000 lines are left? Are you saying that if your input file contains 1500 lines, you want 100 lines in one output file, 1000 lines in another file, 367 lines in the next file, and 33 lines in the last file? Why shouldn't it be 100 lines in the first file, 1000 lines in the second file, and 400 lines in the last file?

And in message #16 in this thread, you said you wanted to call your script as:

./a.sh 2 1000 5

I see where the 1000 makes sense, but I don't understand what the 2 and the 5 are intended to tell your script. Why isn't the script to be invoked with something like:

./a.sh lines_to_go_in_file1 max_lines_to_go_in_remaining_files input_filename
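As a rough illustration of the invocation suggested above, here is a minimal sketch in POSIX shell. The function name `split_first_then_rest`, the optional output prefix, and the `.000`, `.001`, ... naming are assumptions made for this example and do not come from the thread. The first file gets the requested number of lines, and the last file simply receives whatever is left over.

```shell
#!/bin/sh
# Hypothetical helper (names are illustrative, not from the thread):
#   split_first_then_rest FIRST REST INPUT [PREFIX]
# Writes the first FIRST lines of INPUT to PREFIX.000 and the remaining
# lines to PREFIX.001, PREFIX.002, ... with at most REST lines each.
split_first_then_rest() {
    first=$1 rest=$2 input=$3 prefix=${4:-part}

    # First output file: at most FIRST lines.
    head -n "$first" "$input" > "$prefix.000"

    # Everything after line FIRST is piped to awk, which opens a new
    # output file every REST lines; the last file gets the remainder.
    tail -n "+$((first + 1))" "$input" |
    awk -v n="$rest" -v p="$prefix" '
        (NR - 1) % n == 0 {
            if (out != "") close(out)
            out = sprintf("%s.%03d", p, ++k)
        }
        { print > out }
    '
}
```

With a 1500-line input, `split_first_then_rest 100 1000 data.csv` would be expected to leave 100, 1000, and 400 lines in the three output files. If the input has no more than FIRST lines, only the first file is created.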

wisecracker · January 29, 2014, 3:05am

I am with Don here...

There is no way of knowing at any one point in time for any *.csv file that your last
split will always be 33. It surely must be the remainder.

There is also the situation where you only have a few lines in total, fewer than 100 and/or
1000, as well as the possibility of 0 remaining lines.

If, however, this is purely a one-off, then the command line is highly interactive and makes
it possible to do it manually.
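The edge cases mentioned above (a short input, or nothing left over) can be checked with plain arithmetic before any file is written. The following is only a sketch; the function name `plan_sizes` and its argument order are assumptions for illustration. It prints the line count each output file would receive if the last file takes the remainder.

```shell
#!/bin/sh
# Hypothetical helper: plan_sizes TOTAL FIRST REST
# Prints one line count per output file, assuming the first file takes
# FIRST lines, later files take REST lines, and the last takes the rest.
plan_sizes() {
    total=$1 first=$2 rest=$3

    # First file: FIRST lines, or the whole input if it is shorter.
    if [ "$total" -lt "$first" ]; then n=$total; else n=$first; fi
    echo "$n"

    # Full-sized chunks while more than REST lines are left.
    left=$((total - n))
    while [ "$left" -gt "$rest" ]; do
        echo "$rest"
        left=$((left - rest))
    done

    # Final chunk: whatever remains (nothing is printed if 0 are left).
    if [ "$left" -gt 0 ]; then echo "$left"; fi
}
```

For example, `plan_sizes 1500 100 1000` prints 100, 1000, and 400 on separate lines, while `plan_sizes 60 100 1000` prints only 60.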

azherkn3 · February 3, 2014, 8:24am

Hi guys,

I have a requirement where I need to split a .csv file into multiple files.
Say, for example, I have a data.csv file and I have split it into multiple files based on some conditions, i.e. the first file should have 100 records, the last file 50, and the other files 1000 each. I am passing the values as command line arguments, say ./samp.sh 100 50 1000.
I need to check the following conditions, which is where I got stuck:

1) If a file has 1000 records in it, it will not be split further
2) Any file with more than 1000 records will be split

How can I achieve this? Please guide me through it.

RudiC · February 3, 2014, 12:29pm

That request will work only for files having exactly (n * 1000) + 150 records. Are you sure you can guarantee that condition? And, what if the file has 999 records?
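The condition can be tested with shell arithmetic. This is a sketch only; the function name `fits_fixed_layout` and its argument order are assumptions for illustration. It succeeds when the record count equals first + last + n * middle for some n >= 0.

```shell
#!/bin/sh
# Hypothetical helper: fits_fixed_layout TOTAL FIRST LAST MIDDLE
# Exit status 0 if TOTAL = FIRST + LAST + n * MIDDLE for some n >= 0,
# i.e. fixed first, last, and middle sizes can all be honoured exactly.
fits_fixed_layout() {
    total=$1 first=$2 last=$3 middle=$4
    remainder=$((total - first - last))
    [ "$remainder" -ge 0 ] && [ $((remainder % middle)) -eq 0 ]
}

# Example: record count taken from a hypothetical data.csv.
# total=$(wc -l < data.csv)
# fits_fixed_layout "$total" 100 50 1000 || echo "sizes cannot all be honoured"
```

With sizes 100, 50, and 1000, a 2150-record file passes the check, while files with 999 or 1500 records do not.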

Don_Cragun · February 3, 2014, 1:02pm

Hi Rudi,
These questions have been asked before in this thread. The original poster hasn't supplied an answer yet, even though this thread has been open for a week. The number of lines to be stored in the first, last, and other files keeps changing, but there is no explanation of why the last file shouldn't just contain what is left over after putting one requested number of lines in one file and splitting the remainder into chunks of another requested number of lines.

RudiC · February 3, 2014, 1:04pm

Hi Don,

Thanks for pointing that out. I asked because his last post was as vague as the others. And, in his other thread, I asked for the broad picture.

Scrutinizer · February 3, 2014, 1:59pm

@azherkn3.

Please:

Answer the questions put forward by various posters in this thread.
Comment on why the solutions given in this thread are not working for you.
Be very specific about what you want.
Do not keep repeating what you wrote earlier.

Post a clear answer this time, or this thread will need to be closed.