Hi i have a file in which i am doing some processing.
The code is as follows:
#!/bin/ksh
grep DATA File1.txt >> File2.txt
sed 's/DATA//' File2.txt | tr -d � � >> File4.xls
As you can see my output is going in a xl file.The output consist of four columns/feilds out of which the first column has Date and time which looks like:
12 Sep 2014 14:00 EST
so if i give tr -d ' ' It works fine for other feilds( because they have values without internal space ) but it deletes spaces for my Date and time feild which looks like :
12Sep201414:00EST
Is there a way where i can delete only the spaces in rest of the feild leaving my first column?
xyz Date Name Age State
------------------------------------------------------------------------------------
DATA 12 Sep 2014 14:00 EST Alia 22 Chennai
SAND 12 Sep 2014 14:00 EST Alia 22 Chennai
DATA 12 Sep 2014 14:00 EST Ashika 22 Chennai
DATA 12 Sep 2014 14:00 EST Bosky 22 Chennai
DATA 12 Sep 2014 14:00 EST Swati 22 Chennai
Input file File2:
xyz Date Name Age State
------------------------------------------------------------------------------------
DATA 12 Sep 2014 14:00 EST Alia 22 Chennai
DATA 12 Sep 2014 14:00 EST Ashika 22 Chennai
DATA 12 Sep 2014 14:00 EST Bosky 22 Chennai
DATA 12 Sep 2014 14:00 EST Swati 22 Chennai
As you can see there are some extra leading and trailling spaces.
input file File4(removes DATA)(xls file)
Date Name Age State
-------------------------------------------------------------------------------
12Sep201414:00EST Alia 22 Chennai
12Sep201414:00EST Ashika 22 Chennai
12Sep201414:00EST Bosky 22 Chennai
12Sep201414:00EST Swati 22 Chennai
I dont want my date column to be edited with space i just want name age and state column to be removed the trailing and leading spaces.
sed 's/ //7g; s/DATA //p; 1s/^[^ ]* //p; 2!d' file1
Date Name Age State
------------------------------------------------------------------------------------
12 Sep 2014 14:00 EST Alia22Chennai
12 Sep 2014 14:00 EST Ashika22Chennai
12 Sep 2014 14:00 EST Bosky22Chennai
12 Sep 2014 14:00 EST Swati22Chennai
Following may help you in same, please use code tags while posting commands and codes in posts.
awk -vvar=" " 'NR==1{$1="";print} NR==2{print} NR>2{$1="";V=$(NF-2) var $(NF-1) var $NF;{gsub(/EST.*/,"EST",$0);gsub(/[[:space:]]/,X,$0);print $0 OFS V}}' Input_file
Output will be as follows.
Date Name Age State
------------------------------------------------------------------------------------
12Sep201414:00EST Alia 22 Chennai
12Sep201414:00EST Ashika 22 Chennai
12Sep201414:00EST Bosky 22 Chennai
12Sep201414:00EST Swati 22 Chennai
sed '2!{s/[^ ]* //};1!{s/ //;s/ //;s/ //;s/ //;}' yourfile
Using Rudi's 5g stuff :
sed '2!{s/[^ ]* //;};1!{s/ /#/g;s/#/ /5g;s/#//g;}' yourfile
# cat tst
xyz Date Name Age State
------------------------------------------------------------------------------------
DATA 12 Sep 2014 14:00 EST Alia 22 Chennai
SAND 12 Sep 2014 14:00 EST Alia 22 Chennai
DATA 12 Sep 2014 14:00 EST Ashika 22 Chennai
DATA 12 Sep 2014 14:00 EST Bosky 22 Chennai
DATA 12 Sep 2014 14:00 EST Swati 22 Chennai
# sed '2!{s/[^ ]* //;};1!{s/ /#/g;s/#/ /5g;s/#//g;}' tst
Date Name Age State
------------------------------------------------------------------------------------
12Sep201414:00EST Alia 22 Chennai
12Sep201414:00EST Alia 22 Chennai
12Sep201414:00EST Ashika 22 Chennai
12Sep201414:00EST Bosky 22 Chennai
12Sep201414:00EST Swati 22 Chennai
Just another one :
# sed '2!{s/[^ ]* //;};1!{s/ /#/5g;y/ #/- /;}' tst
Date Name Age State
------------------------------------------------------------------------------------
12-Sep-2014-14:00-EST Alia 22 Chennai
12-Sep-2014-14:00-EST Alia 22 Chennai
12-Sep-2014-14:00-EST Ashika 22 Chennai
12-Sep-2014-14:00-EST Bosky 22 Chennai
12-Sep-2014-14:00-EST Swati 22 Chennai
Leaving out the code tags you hid all those multiple spaces from us thus leading to inappropriate solutions. With that new info, try
sed -r 's/ +/ /g; s/DATA //p; 1s/^[^ ]* //p; 2!d' file1
Date Name Age State
------------------------------------------------------------------------------------
12 Sep 2014 14:00 EST Alia 22 Chennai
12 Sep 2014 14:00 EST Ashika 22 Chennai
12 Sep 2014 14:00 EST Bosky 22 Chennai
12 Sep 2014 14:00 EST Swati 22 Chennai
How would you propose to distinguish that space from the one(s) before? In awk default, ALL spaces are the field separators, so they would be gone anyhow. In sed, there's no fields, so you would need to know the space count in order to remove that one...
Ok so its very simple i have a text file okay which has four columns like below:
column1 column2 column3 column4
16 jan 1994 _alia1 22 chennai
17 jan 1994 _alia2 22 chennai
18 jan 1994 _alia3 22 chennai
19 jan 1994 _alia4 22 chennai
and from this file i want to remove that one space which is coming in name and move it to another file which is an excel sheet so that it comes like below:
column1 column2 column3 column4
16 jan 1994 alia1 22 chennai
17 jan 1994 alia2 22 chennai
18 jan 1994 alia3 22 chennai
19 jan 1994 alia4 22 chennai
---------- Post updated at 10:23 AM ---------- Previous update was at 08:11 AM ----------
can we do this?? please reply i need help with this
First, let me be very clear: If you have a text file, feeding it through awk , grep , sed , and/or tr is not going to magically convert that text file into the proprietary format of a Microsoft Excel Spreadsheet!
If you have the text file you have shown us in post #12 in this thread, you can create the output you say you want in that post as a text file using the command:
sorry for the confusion and the mess created by me.
I have only the xls file which has this data
column1 column2 column3 column4
16 jan 1994 _alia1 22 chennai
17 jan 1994 _alia2 22 chennai
18 jan 1994 _alia3 22 chennai
19 jan 1994 _alia4 22 chennai
Underscores means spaces. so there is only one leading space before the data in column 2 and the position of the space is shown above.
and i want to remove space from the second column to make it look like below:
column1 column2 column3 column4
16 jan 1994 alia1 22 chennai
17 jan 1994 alia2 22 chennai
18 jan 1994 alia3 22 chennai
19 jan 1994 alia4 22 chennai
NO! What you have shown us is a text file not a binary Microsoft Excel spreadsheet file. An underscore and a space are very different characters. If you are saying that you have converted a space to an underscore to indicate which one of the nine spaces between 1994 and alia is to be deleted; I say it doesn't matter whether you delete the 2nd of those nine spaces (which the sed script I provided does) or the 9th of those nine spaces. Either way, you get the output you requested (as long as the input you showed us above is an accurate representation of the text file you have as an input file). Did you try the sed command I suggested, or have you determined by looking at it that it won't do what you want?
Please try it before deciding it won't work. The sed script I suggested produces exactly the output you said you want when I run it!
I tried your code it doesn't work for me..
and i have a excel file with .xls extension. I am doing all the data processing in text file and finally transferring it to the xls file.
the xls file has that space.
Underscore is used to show the space its not the actual underscore.. I have used it to show the space,the leading space.
Again (see post#9): How would you propose to distinguish that space from the one(s) before? Are those spaces or <TAB>s?
If it is a REAL .xls file (which is hard to believe reading all the posts), we cannot work on it with text tools. You need to run EXCEL and remove that column's leading spaces (or maybe the perl excel module?).
Just calling a file "something.xls" does not make it a (binary) EXCEL file. What you provided above is producing text files, and in those you can't distinguish one space from the other, and removing the first of nine is as good as removing the last (as Don Cragun said). If his proposal doesn't fit your needs, I'm afraid you're out of luck.