Removing blank/white spaces and special characters

Hello all,

  1. I need to remove blank spaces from my file. I am using:
awk '{$1=$1}{print}' file>file1
Input :-
;05/12/1990                      ;31/03/2014                         ;
Output:-
;05/12/1990 ;31/03/2014 ;

This command does not remove all the spaces from the file; I can still see a single space before each semicolon.

Could you help me with this?

  2. My file also contains some special characters, mainly accented French characters:
derni�res ann�es

When I try to replace these characters with sed/awk, I cannot type them at the command prompt. When I copy and paste the two words into my command, the special characters are replaced by other characters.

So unless I can get them into my command, I cannot replace them.

Is there any way to do this? I need to replace the accented A with a lowercase a and the accented E with a lowercase e.

Kindly help on this.

Thanks,
Himanshu Sood

 
tr -d ' ' < file > file1

or

sed 's/ //g' file > file1

awk approach

awk '{gsub(/ /, x)}1' file > file1

Could you explain what this is doing?

Also, why was my awk command unable to remove all the spaces?

Thanks,
Himanshu Sood

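Here, gsub(/ /, x) replaces every space with the value of x. Since x is an uninitialized variable, its value is the empty string, so the spaces are simply deleted.
In your code, assigning $1=$1 makes awk rebuild the line from its fields, joined by the output field separator (a single space by default), so each run of consecutive spaces is squeezed down to one space instead of being removed.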
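
To make the difference concrete, here is a small sketch that runs both commands on the sample line from the first post. The variable names (line, collapsed, stripped) are only for illustration, and "" is used in place of the uninitialized x, which has the same effect.

```shell
# Sample line from the question.
line=';05/12/1990                      ;31/03/2014                         ;'

# Touching a field makes awk rebuild $0 from its fields, joined by OFS
# (a single space by default), so each run of spaces shrinks to one.
collapsed=$(printf '%s\n' "$line" | awk '{$1=$1}1')

# gsub() edits the record text itself and deletes every space.
stripped=$(printf '%s\n' "$line" | awk '{gsub(/ /, "")}1')

printf '%s\n' "$collapsed"   # ;05/12/1990 ;31/03/2014 ;
printf '%s\n' "$stripped"    # ;05/12/1990;31/03/2014;
```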

$1=$1 rebuilds the record: the runs of spaces between the fields are discarded, and the existing fields are joined again with the default output field separator (a single space).

You could remove the spaces if you set an empty value to the built-in variable OFS (Output Field Separator):

awk '$1=$1' OFS= file

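@Franklin, if $1 is 0 (or empty), the pattern $1=$1 evaluates to false and that line will not be printed. Putting the assignment in an action and following it with 1 prints every line: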

awk '{$1=$1}1' OFS= file
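
A quick way to see the difference between the two forms, as a sketch on a made-up two-line input whose first line starts with 0. The sample data and variable names are assumptions for illustration; -v OFS= is used here because the input comes from a pipe, and it sets OFS to the empty string just like the trailing OFS= above.

```shell
# Hypothetical two-line input: the first field of line 1 is 0.
sample=$(printf '0  a  b\nx  y  z\n')

# As a pattern, the value of the assignment decides whether the line
# prints; 0 counts as false, so the first line is silently dropped.
as_pattern=$(printf '%s\n' "$sample" | awk -v OFS= '$1=$1')

# As an action followed by the always-true pattern 1, every line prints.
as_action=$(printf '%s\n' "$sample" | awk -v OFS= '{$1=$1}1')

printf 'pattern form:\n%s\n' "$as_pattern"   # xyz
printf 'action form:\n%s\n' "$as_action"     # 0ab, then xyz
```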

Hi.

For the second task, one could use iconv, like so:

#!/usr/bin/env bash

# @(#) s1	Demonstrate conversion of UTF-8 to ASCII, iconv.

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C iconv

FILE=${1-data1}

pl " Input data file $FILE:"
cat "$FILE"

pl " Results:"
LANG=en_US.UTF-8 LC_ALL="" iconv -c -f utf8 -t ascii//TRANSLIT "$FILE" |
tee data2
pe
file "$FILE" data2

exit 0

producing:

$ ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian 5.0.8 (lenny, workstation) 
bash GNU bash 3.2.39
iconv (GNU libc) 2.7

-----
 Input data file data1:
derni�res ann�es

-----
 Results:
derniEres annAes

data1: UTF-8 Unicode text
data2: ASCII text

I usually use "C" for the locale, but you may need to set it to something different as noted on the line executing the iconv command. See man pages for details.

Best wishes ... cheers, drl
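
If iconv is not an option, or the characters cannot be typed at the prompt, another possibility is to match their bytes instead of the characters themselves. This is only a rough sketch: it assumes the file is UTF-8 and that the two letters are E-grave (octal bytes 303 210) and A-grave (octal bytes 303 200). Those two letters, the sample data, and the helper name fix_accents are assumptions for illustration, so check the real bytes first with something like od -c file and adjust the escapes.

```shell
# Assumed characters, written as octal byte escapes so nothing
# non-ASCII has to be typed: E-grave (U+00C8) and A-grave (U+00C0) in UTF-8.
e_bytes=$(printf '\303\210')
a_bytes=$(printf '\303\200')

# Hypothetical helper: replace each byte sequence with a plain letter.
# LC_ALL=C makes sed treat its input as raw bytes.
fix_accents() {
    LC_ALL=C sed -e "s/$e_bytes/e/g" -e "s/$a_bytes/a/g"
}

# Sample data built from the same assumed bytes.
result=$(printf 'derni\303\210res ann\303\200es\n' | fix_accents)
printf '%s\n' "$result"   # dernieres annaes
```

To use it on a real file, something like fix_accents < file > file1 should work, with one -e expression added per byte sequence that od reports.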
