Hi
Description of input file I have:
-------------------------
1) CSV, with string fields enclosed in double quotes.
2) Some string fields contain commas as part of the field value.
3) Contains duplicate lines.
4) Has 200 columns/fields.
5) File size is more than 10 GB.
Description of output file I need:
-------------------------------
1) Can be either CSV or pipe-delimited.
2) Commas within field values must be preserved.
3) No duplicate lines.
4) Only the first 150 columns are needed.
Code I have used so far:
-------------------
cat file.in | awk -F"," '!($1$2$3 in a){a[$1$2$3];print $0}' | cut -d, -f1-150 > file.out
But with this code, commas inside quoted field values are treated as delimiters, so both awk and cut split those fields in the wrong place.
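
One possible direction, as a rough sketch rather than a tested solution for the real file: have awk consume each line field by field, treating a double-quoted string as a single field, instead of splitting on every comma. The function name keep_first_fields is made up for illustration. The sketch assumes that quoted fields never contain line breaks, that a literal quote inside a quoted field is written as two double quotes, and that "duplicate" means the whole truncated line is identical (the one-liner above keys on $1$2$3 only).

```sh
# keep_first_fields N  (hypothetical helper name)
# Reads CSV on stdin, writes the first N fields of each line to stdout,
# and skips any output line that has already been printed.
# Assumptions: no newlines inside quoted fields; embedded quotes are doubled ("").
keep_first_fields() {
    awk -v keep="$1" '
    {
        rest = $0; out = ""; n = 0
        while (n < keep) {
            if (match(rest, /^"([^"]|"")*"/)) {
                len = RLENGTH          # quoted field, may contain commas
            } else if (match(rest, /^[^,]+/)) {
                len = RLENGTH          # plain unquoted field
            } else {
                len = 0                # empty field
            }
            field = substr(rest, 1, len)
            out = (n == 0) ? field : out "," field
            n++
            rest = substr(rest, len + 1)
            if (rest == "") break      # no more fields on this line
            rest = substr(rest, 2)     # drop the comma separator
        }
        # Deduplicate on the truncated line; the array is held in memory.
        if (!(out in seen)) { seen[out] = 1; print out }
    }'
}
```

It would be called as: keep_first_fields 150 < file.in > file.out. The seen array keeps every distinct output line in memory, which may be too much for a 10 GB input; if line order does not matter, one alternative is to drop the seen check and pipe the result through sort -u instead.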