Hi guys,
I need help modifying a large text file containing more than 1-2 lakh (100,000-200,000) rows of data using Unix commands. I am quite new to Unix.
The text file contains data in a pipe-delimited format:
sdfsdfs
sdfsdfsd
START_ROW
sdfsd|sdfsdfsd|sdfsdfasdf|sdfsadf|sdfasdf
sdfsd|sdfsdfsd|sdfsdfasdf||sdfasdf
sdfsd||sdfsdfasdf|sdfsadf|sdfasdf
END_ROW
sdfsd
sdfsfsdf
I want to remove the header and the footer, so the final file would look like this:
sdfsd|sdfsdfsd|sdfsdfasdf|sdfsadf|sdfasdf
sdfsd|sdfsdfsd|sdfsdfasdf||sdfasdf
sdfsd||sdfsdfasdf|sdfsadf|sdfasdf
I tried various VB methods to do it. However, when I use them on large files, the program hangs and closes.
Thanks very much.
Try this with grep (it keeps only the lines that contain a pipe, so it assumes the header and footer lines have none):
grep \| < infile >outfile
If your data file contains only one START/END block:
sed -n -e '/START_ROW/,/END_ROW/ {p} ; /END_ROW/ q' file.txt > newfile.txt
Can I use the grep command to print lines longer than a specific length, let's say lines with length > 25?
Yes, but in this case performance will not be good, as the output of grep would have to be piped to another command that selects records longer than 25 characters.
In this context, sed would be faster.
sed -n '/|/ { /.\{26\}/p; }' < infile > outfile
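For anyone who wants to try the length filter on a small file first, here is a minimal sketch. The sample lines and the variable names (sample, long_awk, long_sed) are made up for illustration; "longer than 25" is read as 26 characters or more.

```shell
# Sample input: a short line, a line of exactly 25 characters,
# a line of 26 characters, and a long line without any pipe.
sample=$(mktemp)
cat > "$sample" <<'EOF'
a|b
12345|7890123456789012345
12345|78901234567890123456
this line has no pipe but is long
EOF

# awk: keep lines that contain a pipe and are longer than 25 characters.
long_awk=$(awk 'index($0, "|") && length($0) > 25' "$sample")

# sed: the same filter; .\{26\} matches any run of 26 characters.
long_sed=$(sed -n '/|/{/.\{26\}/p;}' "$sample")

printf '%s\n' "$long_awk"
rm -f "$sample"
```

Only the 26-character line should come out; the 25-character line is dropped because the test is strictly greater than 25.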
sed '1,/START_ROW/d;/END_ROW/,$d' infile
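A quick way to check the two-range delete above is to run it on a small copy of the data from the first post. This is only a sketch; the temporary file and the variable names (sample, trimmed) are assumptions for the demo.

```shell
# Build a small sample file shaped like the one in the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
sdfsdfs
sdfsdfsd
START_ROW
sdfsd|sdfsdfsd|sdfsdfasdf|sdfsadf|sdfasdf
sdfsd|sdfsdfsd|sdfsdfasdf||sdfasdf
sdfsd||sdfsdfasdf|sdfsadf|sdfasdf
END_ROW
sdfsd
sdfsfsdf
EOF

# Delete line 1 through the START_ROW line, then END_ROW through the last line.
trimmed=$(sed '1,/START_ROW/d;/END_ROW/,$d' "$sample")

printf '%s\n' "$trimmed"
rm -f "$sample"
```

One caveat: if START_ROW happens to be the very first line of the file, the range 1,/START_ROW/ starts looking for the pattern from line 2 onward, so the result would not be what you expect.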
With the command below I am getting "sed: garbage after command" and a blank file is generated:
sed -n -e '/START_ROW/,/END_ROW/ {p} ; /END_ROW/ q' file.txt > newfile.txt
thanks
Manish
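A possible workaround, offered as a sketch rather than a confirmed diagnosis: some sed implementations seem to reject a one-line {p} group, so writing each command on its own line may avoid the "garbage after command" error. This version also skips the START_ROW and END_ROW marker lines and quits at END_ROW. The sample file and the variable names (sample, block) are made up for the demo.

```shell
# Small sample with one START_ROW/END_ROW block.
sample=$(mktemp)
cat > "$sample" <<'EOF'
sdfsdfs
START_ROW
sdfsd|sdfsdfsd|sdfsdfasdf|sdfsadf|sdfasdf
sdfsd||sdfsdfasdf|sdfsadf|sdfasdf
END_ROW
sdfsd
EOF

# Inside the block: drop the START_ROW line, quit at END_ROW without
# printing it (because of -n), and print every other line.
block=$(sed -n '/START_ROW/,/END_ROW/{
/START_ROW/d
/END_ROW/q
p
}' "$sample")

printf '%s\n' "$block"
rm -f "$sample"
```

Because sed quits at END_ROW, the rest of a large file is never read, which should help with files of a few lakh rows.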
awk '/END/{f=0}/START/{f=1;next}f' file
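The awk one-liner above uses a flag f that is switched on after a START line and off at an END line, so it also copes with several blocks in one file. A small sketch of the same idea, with the marker names matched exactly (my own tightening, so a data line that merely contains "END" is not treated as a marker); the sample data and the variable names are assumptions:

```shell
# Sample with two START_ROW/END_ROW blocks and junk in between.
sample=$(mktemp)
cat > "$sample" <<'EOF'
header
START_ROW
a|b
END_ROW
junk
START_ROW
c|d
END_ROW
footer
EOF

# f is 1 only between the markers; a bare "f" prints the line when f is non-zero.
rows=$(awk '$0 == "END_ROW" { f = 0 }
            $0 == "START_ROW" { f = 1; next }
            f' "$sample")

printf '%s\n' "$rows"
rm -f "$sample"
```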
The code below works fine.
sed '1,/START-OF-DATA/d;/END-OF-DATA/,$d' corp_pfd_asia.out > corp_pfd_asiaout.txt
However, there is a slight problem. The header of the data ends with "# PRODUCT=Corp/Pfd" after START-OF-DATA.
When I use "# PRODUCT=Corp/Pfd" instead of "START-OF-DATA", it gives an error, I think because of the "/":
sed '1,/# PRODUCT=Corp/Pfd/d;/END-OF-DATA/,$d' corp_pfd_asia.out > corp_pfd_asiaout.txt
Thank you very much
Manish
Hi Manish 2009,
You will have to escape the forward slash inside the pattern, like this: \/
sed '1,/# PRODUCT=Corp\/Pfd/d;/END-OF-DATA/,$d' corp_pfd_asia.out > corp_pfd_asiaout.txt
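An alternative to escaping, in case the pattern contains many slashes: sed lets you pick another delimiter for an address by starting it with a backslash, for example \%pattern%. A minimal sketch, where the sample data and the variable names (sample, data) are made up and % is assumed not to occur in the pattern:

```shell
# Sample shaped like the file described above: the header ends with
# the "# PRODUCT=Corp/Pfd" line, the data ends at END-OF-DATA.
sample=$(mktemp)
cat > "$sample" <<'EOF'
junk
START-OF-DATA
# PRODUCT=Corp/Pfd
a|b|c
d|e|f
END-OF-DATA
trailer
EOF

# \%...% uses % as the pattern delimiter, so the / needs no escaping.
data=$(sed '1,\%# PRODUCT=Corp/Pfd%d;/END-OF-DATA/,$d' "$sample")

printf '%s\n' "$data"
rm -f "$sample"
```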