Split a file into multiple files with an extension

Hi

I have a file with 100 million rows. I want to split them into 1000 subfiles and name them from 1.xls to 1000.xls.. Can I do it in awk?

Thanks,

Do a man on split command but if you wish to use awk, try this (untested):

 
awk 'NR%100000==1 {filenum=++i;} {print > filenum".xls"}' input.file

Hi,

Thanks, It does not work for a file with lines 37086834...

It just simply returns nothing..

If your .xls files are excel spreadsheets, awk and split are not going to work. The awk utility is designed to work on text files. Microsoft's excel spreadsheets are binary files; not text files. The cut utility will work on binary files, but there is no reason to believe that you could use cut -l since there is no reason to believe that rows in a table in a .xls file have a 1-to-1 correspondence to newlines in the .xls file.

If you export the table from excel into a .csv text file, then awk and cut and lots of other utilities could be used to split the .csv file into smaller .csv files containing a set number of lines desired in each resulting file.

Yes, you can do it. Use 100000 instead of 10 in the awk command below:

$ seq 100 > file

$ awk 'BEGIN { f = 1 } i == 10 { f++; i = 0 } { print >> f ".xls"; i++ }' file

$ ls
1.xls   2.xls  4.xls  6.xls  8.xls  file
10.xls  3.xls  5.xls  7.xls  9.xls

$ cat 6.xls
51
52
53
54
55
56
57
58
59
60

hanson44,
Note that your file is a text file. It is not an excel spreadsheet!

Create a spreadsheet using excel with the values 1 to 100 in a table in the spreadsheet and then show me that this awk script will produce 10 output files with the values 51 through 60 in the 6th output file when given the .xls file produced by excel as its input.

I never suggested the script would work on an excel spreadsheet. The script is designed to operate on a text file, as I made clear by providing an example.

The poster just said "file". You're right to assume an xls file "should be" a spreadsheet. But maybe the poster gives text files an xls extension for some odd reason of their choosing. They will have to clarify.

If it is a spreadsheet (100 million rows? :eek:), the poster could of course convert it to a csv or tsv file first, and change the xls extension in the script to "csv" or "tsv".

Yes. And your awk script is logically equivalent to the awk script mjr provided that Diya23 already said does not work.

Based on those earlier postings I made the wild assumption that the 1000 .xls files Diya23 wants to create actually come from a HUGE excel spreadsheet as an input file.

Of course, another possibility is that there just isn't enough space to create two copies of the 100,000,000 row (line) text file in the current directory/filesystem.

We'll have to wait for more information from Diya23 to determine what is really going on.

I too was under the impression that it was a text file, not an Excel file. The Excel file would need to be first exported to a text/delimited file or, alternatively, a VBA/macro could be used to split the file within Excel without the need for a Unix solution.