I assume Ravinder didn't reply to this because you pointed out that it didn't work correctly if the number of input fields wasn't an exact multiple of the number of desired fields on each output line (and he gave you an xargs solution that worked as you requested).
Here is a slightly modified version of his script that prints the leftover fields at the end of an input line and lets you set the desired number of output fields (as in ongoto's perl script):
#!/bin/ksh
FieldsPerLine=${1:-3}               # Set the number of fields to be placed
                                    # on each output line with a default of
                                    # 3 if no operands are given.
awk -F, -v fpl="$FieldsPerLine" '   # Set input field separator to comma and
                                    # set "fpl" to the number of fields to put
                                    # on each output line.
{   for(i=1;i<=NF;i++){             # For each input field...
        S=((S!="")?S OFS $i:$i)     # If string "S" is not an empty string,
                                    # set string "S" to the current
                                    # contents of "S" followed by the
                                    # output field separator followed by the
                                    # contents of the current field;
                                    # otherwise, set string "S" to the
                                    # contents of the current field.
        if(i%fpl==0){               # If the current field number is evenly
                                    # divisible by the specified number of
                                    # fields per output line...
            print S;S=""            # Print the current contents of the
                                    # string "S" and clear the contents of
                                    # "S".
        }
    }
    if(NF%fpl){                     # If the number of fields on this input
                                    # line is not evenly divisible by the
                                    # specified number of fields per output line...
        print S;S=""                # Print the remaining fields for this
                                    # line and clear the contents of "S".
    }
}' OFS=, Input_file                 # Set output field separator to comma
                                    # and specify the input file to be
                                    # processed.
Here is what I changed from Ravinder's original script:
- The FieldsPerLine shell variable and the fpl awk variable allow you to specify the desired number of fields per output line instead of always using the default value (3).
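- The change from S=S?S OFS $i:$i to S=((S!="")?S OFS $i:$i) corrects a bug that is only seen when the first field on an output line comes from an input field that is a string of one or more zeroes. awk treats such a field as the number 0, which tests as false, so the next field overwrote it instead of being appended to it.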
- The if statement added after the for loop takes care of the missing output lines from input lines whose field count is not an exact multiple of the number of fields desired on each output line.
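The effect of the second change can be checked with a small side-by-side run. This is only a sketch: the sample input and the variable names input, buggy, and fixed are made up for the demonstration, and the number of fields per line is hard-coded to 3.

```shell
#!/bin/sh
# Hypothetical sample input whose first field is a zero.
input='0,1,2,3,4,5'

# Original test: a field containing 0 looks numeric to awk, so "S ? ..."
# is false and the 0 is replaced by the next field instead of kept.
buggy=$(printf '%s\n' "$input" | awk -F, -v OFS=, '
{   for (i = 1; i <= NF; i++) {
        S = (S ? S OFS $i : $i)
        if (i % 3 == 0) { print S; S = "" }
    }
}')

# Modified test: compare S against the empty string explicitly.
fixed=$(printf '%s\n' "$input" | awk -F, -v OFS=, '
{   for (i = 1; i <= NF; i++) {
        S = ((S != "") ? S OFS $i : $i)
        if (i % 3 == 0) { print S; S = "" }
    }
}')

printf 'original test:\n%s\n\nmodified test:\n%s\n' "$buggy" "$fixed"
```

With the original test, the first output line should come out without its leading 0; with the modified test, both output lines should be complete.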
This was written and tested using the Korn shell, but it will also work with any other shell that supports basic POSIX parameter expansions (such as bash). As with any standard awk script, if you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk.
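If you would rather not edit the script for each system, one possible approach is to choose the awk at run time. The sketch below assumes that uname -s reports SunOS on Solaris systems; AWK is a hypothetical variable name that is not part of the script above, and the short sample input is made up.

```shell
#!/bin/sh
# AWK is a hypothetical variable name; the script above calls awk directly.
case "$(uname -s)" in
SunOS)  AWK=/usr/xpg4/bin/awk ;;    # standards-conforming awk on Solaris
*)      AWK=awk ;;                  # assume the default awk is adequate
esac

# Quick check using the same logic as the script with 2 fields per line.
out=$(printf 'a,b,c,d,e\n' | "$AWK" -F, -v OFS=, -v fpl=2 '
{   for (i = 1; i <= NF; i++) {
        S = ((S != "") ? S OFS $i : $i)
        if (i % fpl == 0) { print S; S = "" }
    }
    if (NF % fpl) { print S; S = "" }
}')
printf '%s\n' "$out"
```

In the real script you would then replace the awk command name with "$AWK".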
If Input_file contains:
a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z
0,1,2,3,4,5,6,7,8,9
A,B,C,D,E,F
the default output produced by this script is:
a,b,c
d,e,f
g,h,i
j,k,l
m,n,o
p,q,r
s,t,u
v,w,x
y,z
0,1,2
3,4,5
6,7,8
9
A,B,C
D,E,F
and if it is run with a different number of fields per output line, such as with ./splitline 5, it produces the output:
a,b,c,d,e
f,g,h,i,j
k,l,m,n,o
p,q,r,s,t
u,v,w,x,y
z
0,1,2,3,4
5,6,7,8,9
A,B,C,D,E
F