The command below finds the maximum length of a field (here field 83):
awk -F "|" '{ if (length($83) > max) max = length($83) } END { print max }' file
grep -E '[0-9]' filename    --- lines containing digits
grep -E '[a-zA-Z]' filename --- lines containing letters
Thanks a lot for your reply. Can you please explain it? Is there any way to check for a date format? Regarding decimals and precision: using awk, check for numbers with a decimal point. If a column has any decimal values, I want to hard-code it as decimal(38,4); otherwise use number with the length.
awk -F'|' '
# Header line: remember the column names and the column count.
NR==1 {
    for ( c=1; c<=NF; c++ ) {
        colName[c] = $c;
    }
    colCount = NF;
    next;
}
# Data lines: count, per column, how many values look like each type,
# and track the longest value seen.
{
    for ( c=1; c<=NF; c++ ) {
        if ( $c ~ /^[0-9]+$/ ) colIsNum[c]++
        else if ( $c ~ /^[0-9]*\.[0-9]*$/ ) colIsDec[c]++
        else if ( $c ~ /^[0-9]+\/[0-9]+\/[0-9]+$/ ) colIsDat[c]++
        else colIsTxt[c]++;
        if (length($c) > colLength[c]) colLength[c] = length($c);
    }
}
# Any text value forces varchar2; then date, decimal and integer in that order.
END {
    print "create table tablename("
    for ( c=1; c<=colCount; c++ ) {
        out = " " colName[c];
        if ( colIsTxt[c] ) out = out " varchar2(" colLength[c] ")"
        else if ( colIsDat[c] ) out = out " date"
        else if ( colIsDec[c] ) out = out " number(38,4)"
        else if ( colIsNum[c] ) out = out " number(" colLength[c] ")"
        else out = out " Error!";
        if ( c != colCount ) out = out ",";
        print out
    }
    print ");";
}
' inputfile
Hi Aigles,
Thanks for your reply. It is working for all columns except the last column.
For the last column, the data type is output together with the column name.
There is also a problem with the 4th column: it holds decimal values, but the output shows varchar2(13).
Input file
col1|col2|col3|col4|col5|col6|col7|col8
1234|zxxxx|999| |01/01/1970|||xxxx
12|abcdefghij|xyz|13.0000000000|10/08/2010||commentsxx|REAL xxxx- xxxx WILL ONLY BE xxxx; renegotiated 12/16/2000, yyy 47106812 ;payment xxx
The 4th Column in row #2 doesn't contain a number (only spaces), so the column is assigned the varchar2 type.
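To see why a field holding only spaces ends up counted as text, here is a small sketch that runs the same three regular expressions from the script against a single value. The function name classify_field is hypothetical and used only for illustration; the patterns are copied from the script above.

```shell
# classify_field VALUE
# Print which counter the script would increment for one field value.
# (classify_field is a made-up helper name; the regexes come from the script.)
classify_field() {
    printf '%s\n' "$1" | awk '{
        if      ($0 ~ /^[0-9]+$/)                 print "num"
        else if ($0 ~ /^[0-9]*\.[0-9]*$/)         print "dec"
        else if ($0 ~ /^[0-9]+\/[0-9]+\/[0-9]+$/) print "date"
        else                                      print "txt"
    }'
}

classify_field '13.0000000000'   # prints dec
classify_field ' '               # prints txt
```

Because one text value is enough to make the END block choose varchar2, the single blank in col4 outweighs the decimal value on the next row.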
Jean-Pierre.
---------- Post updated at 08:27 ---------- Previous update was at 08:18 ----------
With your last sample data, I don't have the problem with the last column name:
Input file:
col1|col2|col3|col4|col5|col6|col7|col8
1234|zxxxx|999| |01/01/1970|||xxxx
12|abcdefghij|xyz|13.0000000000|10/08/2010||commentsxx|REAL xxxx- xxxx WILL ONLY BE xxxx; renegotiated 12/16/2000, yyy 47106812 ;payment xxx
Hi Aigles,
There is no space in the 4th column, but it is still showing the string data type. Please advise.
input file
col1|col2|col3|col4|col5|col6|col7|col8
1234|zxxxx|999|32.3000|01/01/1970|||xxxx
12|abcdefghij|xyz|13.0000000000|10/08/2010||commentsxx|REAL xxxx- xxxx WILL ONLY BE xxxx; renegotiated 12/16/2000, yyy 47106812 ;payment xxx
Thanks for your great help. I want to check with you: if a column contains spaces or nulls as well as numbers, it is not assigned the number data type. In the input file some of the columns are null and some have date/number values, and they are assigned the varchar2 data type. Is there any other way to ignore nulls or spaces and assign the corresponding data type?
The following new version of the AWK program ignores nulls and spaces.
awk -F'|' '
BEGIN {
    colDefaultLength = 1;
}
# Header line: remember the column names and the column count.
NR==1 {
    for ( c=1; c<=NF; c++ ) {
        colName[c] = $c;
    }
    colCount = NF;
    next;
}
# Data lines: track the longest value, then skip empty or blank fields
# so that they do not influence the type of the column.
{
    for ( c=1; c<=NF; c++ ) {
        if (length($c) > colLength[c]) colLength[c] = length($c);
        if ( $c ~ /^ *$/ ) continue;
        if ( $c ~ /^[0-9]+$/ ) colIsNum[c]++
        else if ( $c ~ /^[0-9]*\.[0-9]*$/ ) colIsDec[c]++
        else if ( $c ~ /^[0-9]+\/[0-9]+\/[0-9]+$/ ) colIsDat[c]++
        else colIsTxt[c]++;
    }
}
# Columns with no usable value at all fall back to varchar2.
END {
    print "create table tablename("
    for ( c=1; c<=colCount; c++ ) {
        if ( colLength[c]+0 == 0 ) colLength[c] = colDefaultLength;
        out = " " colName[c];
        if ( colIsTxt[c] ) out = out " varchar2(" colLength[c] ")"
        else if ( colIsDat[c] ) out = out " date"
        else if ( colIsDec[c] ) out = out " number(38,4)"
        else if ( colIsNum[c] ) out = out " number(" colLength[c] ")"
        else out = out " varchar2(" colLength[c] ")";
        if ( c != colCount ) out = out ",";
        print out
    }
    print ");";
}
' inputfile
Inputfile :
col1|col2|col3|col4|col5|col6|col7|col8
1234|zxxxx|999|32.3000|01/01/1970|||xxxx
12|abcdefghij|xyz|13.0000000000|10/08/2010||commentsxx|REAL xxxx- xxxx WILL ONLY BE xxxx
|not empty|foo| |05/05/2010||text|
Thanks for your help, the script is working fine. I need some additional modifications to the script:
If the string data length is below 50, set the string default length to 50; if it is between 50 and less than 100, set the default length to 100.
If it is between 100 and less than 200, set the default length to 500.
If it is between 500 and less than 1000, set the default length to 1000.
For numbers the data length is not an issue: set it to 38 whatever length comes.
If there are spaces or no value in the last column, the script should remove those values.
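The string length rules above could be sketched as a small helper like the one below. This is only an illustration: the name bucket_len is hypothetical, and two points are assumptions because the request does not cover them (lengths from 200 to 499 are folded into the 500 bucket, and lengths of 1000 or more are left unchanged).

```shell
# bucket_len N
# Print the varchar2 length to use for an observed maximum length N.
# (bucket_len is a made-up helper name, not part of the posted script.)
bucket_len() {
    awk -v n="$1" 'BEGIN {
        if      (n < 50)   print 50
        else if (n < 100)  print 100
        else if (n < 500)  print 500    # assumption: also covers 200 to 499
        else if (n < 1000) print 1000
        else               print n      # assumption: keep the real length
    }'
}

bucket_len 13    # prints 50
bucket_len 150   # prints 500
```

Inside the END block of the script, the same chain of comparisons could be applied to colLength[c] just before the varchar2 definition is built.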
The script works fine if the last field is null and its data is also null (after col4), but if any other column's data is null, it gives the Error data type. For instance, when column 2 has null values, executing the script gives the Error data type. Is there any way to check only the column name of the last field: if the column name exists but there is no data, set string 50, or if there is no column name and no data, consider only up to col4? Your help is greatly appreciated.
Are you sure about your script?
The string 'Error!' doesn't exist in the script.
When I execute the script from my last post with your last sample data, the result is: