Output file with <Tab> or <Space> Delimited

TechGyaann · December 31, 2015, 2:12am

Input file:

xyz,pqrs.lmno,NA,NA,NA,NA,NA,NA,NA
abcd,pqrs.xyz,NA,NA,NA,NA,NA,NA,NA

Expected Output:

 
 xyz pqrs.lmno NA NA NA NA NA NA NA
abcd pqrs.xyz NA NA NA NA NA NA NA

Command Tried so far:

 
 awk -F"," 'BEGIN{OFS=" ";} {print}' $File_Path/File_Name.csv

Issue:
The output is not as expected: it is the same as the input.

balajesuri · December 31, 2015, 2:20am

To me, sed seems a simpler alternative:

user@host~> sed 's/,/ /g' file
xyz pqrs.lmno NA NA NA NA NA NA NA
abcd pqrs.xyz NA NA NA NA NA NA NA
user@host~>

Correction to your command:

awk -F"," 'BEGIN{OFS=" ";} {$1=$1; print}' file
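Why the extra $1=$1 helps: awk only rebuilds $0 using OFS when a field is assigned, so a plain print emits the record exactly as it was read. A minimal sketch of the difference, assuming a made-up three-column input and using a tab as the output separator (the variable names and sample data are illustrative, not from the original post):

```shell
# Hypothetical sample data (shortened from the thread's input).
input='xyz,pqrs.lmno,NA
abcd,pqrs.xyz,NA'

# No field is modified, so $0 is printed as read and OFS is never applied.
unchanged=$(printf '%s\n' "$input" | awk -F, 'BEGIN { OFS = "\t" } { print }')

# Assigning a field to itself makes awk rebuild $0 with OFS between fields.
tabbed=$(printf '%s\n' "$input" | awk -F, 'BEGIN { OFS = "\t" } { $1 = $1; print }')

printf '%s\n' "$unchanged" "$tabbed"
```

Swapping "\t" for " " gives the space-delimited variant asked for in the question.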

RavinderSingh13 · December 31, 2015, 2:23am

Hello TechGyaann,

The following may help you.
1st:

 tr ',' ' ' < Input_file

2nd:

awk '{gsub(/\,/," ",$0);print}'  Input_file

Both commands above produce the following output.

xyz pqrs.lmno NA NA NA NA NA NA NA
abcd pqrs.xyz NA NA NA NA NA NA NA

NOTE: I am assuming that the leading space on the first line of your sample output is a typo.

Thanks,
R. Singh

TechGyaann · December 31, 2015, 3:40am

Yes, that was a typo.

I shall try it; I hope it works. (I'm not sure about gsub etc.; I'm learning awk and Unix shell scripting.)

Aia · December 31, 2015, 12:17pm

Awk has two built-in functions for string substitution: sub() and gsub(). There are other built-in functions for matching and extracting substrings, but these two are for substitutions. The main difference between them is that sub() substitutes only the first instance of the match, while gsub() substitutes every instance; the g in front denotes this distinction and stands for global, as in all or the whole.
The syntax can be expressed as gsub(regex, replacement, string), where regex is an Extended Regular Expression to match, replacement is the text to put in its place, and string is the sequence of characters you want the work to be done on. If string is omitted, $0 is used.
Using the example given by RavinderSingh13:

awk '{gsub(/\,/," ",$0);print}'

In gsub(), now we know:
/\,/ is the regex representing the comma we want to find
" " is what we want to put in its place
$0 is the string we want to work on, in this case the whole record.
After the function is done:
print displays the whole modified record.
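To make the sub() versus gsub() distinction concrete, here is a small sketch run against a single made-up line (the variable names and the sample line are illustrative assumptions):

```shell
# Hypothetical one-line sample.
line='xyz,pqrs.lmno,NA,NA'

# sub() replaces only the first comma in the record.
first_only=$(printf '%s\n' "$line" | awk '{ sub(/,/, " "); print }')

# gsub() replaces every comma in the record.
all=$(printf '%s\n' "$line" | awk '{ gsub(/,/, " "); print }')

printf '%s\n' "$first_only" "$all"
# prints:
# xyz pqrs.lmno,NA,NA
# xyz pqrs.lmno NA NA
```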

However, awk users are attracted by the terseness of the language and by how much it does with so little code, so they tend to use the most idiomatic and succinct expressions possible. Don't be surprised if you see, instead of:

awk '{gsub(/\,/," ",$0);print}' file.here

this:

awk 'gsub(/,/, " ")' file.here

or even

awk 'gsub(/,/, OFS)' file.here

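The original command and these last two produce the same output in this case, because every input line contains a comma. gsub() returns the number of substitutions it made, and the terse forms use that number as the pattern, so only lines where a substitution happened are printed. OFS defaults to a single space, which is why it can stand in for " ".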
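To see where the terse form diverges from the explicit one, here is a small sketch with a made-up input in which one line has no commas (the sample data and variable names are assumptions, not from the original post):

```shell
# Hypothetical input; the second line contains no comma.
input='a,b,c
no-commas-here
d,e'

# gsub() as the pattern: a line is printed only if at least one
# substitution was made, so the comma-free line is dropped.
terse=$(printf '%s\n' "$input" | awk 'gsub(/,/, " ")')

# Explicit action with print: every line is printed, changed or not.
explicit=$(printf '%s\n' "$input" | awk '{ gsub(/,/, " "); print }')

printf 'terse:\n%s\n\nexplicit:\n%s\n' "$terse" "$explicit"
```

If the input may contain lines without the delimiter, the explicit form is the safer choice.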

Or instead of:

awk -F"," 'BEGIN{OFS=" ";} {$1=$1; print}' file

this:

awk -F, '$1=$1' file

Don_Cragun · December 31, 2015, 8:10pm

Note that, although it works with the sample data provided, the last suggestion above will delete lines found in file that have an empty 1st field or have a 1st field that is just a string of one or more 0 characters. The following idiom is safer if either of these could be present in your input:

awk -F, '{$1=$1}1' file
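A short sketch of that difference, assuming a made-up input with one ordinary line, one line with an empty 1st field, and one line whose 1st field is 0 (sample data and variable names are illustrative):

```shell
# Hypothetical input covering the two problem cases.
input='xyz,pqrs,NA
,empty-first,NA
0,zero-first,NA'

# The value assigned to $1 is used as the condition, so an empty
# or zero 1st field makes the pattern false and the line is skipped.
short=$(printf '%s\n' "$input" | awk -F, '$1=$1')

# The assignment runs for every line; the always-true pattern 1 prints it.
safe=$(printf '%s\n' "$input" | awk -F, '{$1=$1}1')

printf 'short:\n%s\n\nsafe:\n%s\n' "$short" "$safe"
```

With this input the short form should print only the first line, while the safer idiom prints all three with commas replaced by spaces.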