Hi,
I need help removing the tab-delimited empty $2 from a specific row. My files look like this:
file1.txt
No_1 4 139 156
No_1 5 161 205
No_4 91 227 212
No_19 254 243 263
No_19 645 249 258
No_19 101 2492 2635
No_90 8 277 288
file2.txt
ID L_254
NAME L_254
START 39644
END 37193
LINE unknown
TYPE R
N 37736-37861
@@
ID L_101
NAME L_101
START 314257
END 312432
LINE unknown
TYPE R
@@
ID L_8
NAME L_8
START 3196078
END 3194948
LINE unknown
TYPE R
@@
I used a script like this to update file2.txt with the START and END values from file1.txt:
FNR==NR{b[$2]=$3;f[$2]=$4; OFS="\t"; next}
$1=="ID" {id=substr($2,index($2,"_")+1)}
id in b {$2=($1=="START")?b[id]:(($1=="END")?f[id]:$2)}
1
My output is tab-separated. The code above works well for updating the values, but there is a problem on each '@@' line: I don't want a tab-separated column after '@@'. It should be treated as the end of the line, so the output should be just @@ instead of @@\t. Thanks in advance.
sed 's/^\(your_first_pattern\)TAB\(your_end_pattern\)$/\1\2/' input_file >output_file
TAB is a literal tab; your_first_pattern and your_end_pattern must be regular expressions that match what you want to keep.
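To make the template above concrete, here is a minimal sketch for the '@@' case. It assumes the first pattern is simply @@ and the end pattern is empty; the sample input and the variable names (tab, input, output) are made up for illustration.

```shell
# Literal tab, stored in a variable so it survives copy and paste
tab=$(printf '\t')

# Hypothetical sample input: the second line carries the unwanted trailing tab
input=$(printf 'TYPE\tR\n@@\t\nID\tL_101\n')

# Keep the captured "@@" (\1) and drop the tab that follows it at end of line
output=$(printf '%s\n' "$input" | sed "s/^\(@@\)${tab}\$/\1/")

printf '%s\n' "$output"
```

Only lines that consist of exactly @@ followed by a tab are changed; the tabs between the other fields are left alone.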
Hi DGpickett,
Thanks for your response. I believe your code handles one file only. I have thousands of separate files from which I need to remove the tab. Is there another way to do this? Thanks.
As long as the pattern does not change, this is quick and robust:
find /top_directory_path -name your_pattern -type f | while read f
do
sed ... $f >$f.new
if [ "$( cmp $f $f.new 2>&1 )" != "" -a -s $f -a -s $f.new ]
then
mv $f $f.old
mv $f.new $f
else
rm $f.new
fi
done
If file names contain spaces or shell metacharacters, write "$f" instead of $f.
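For reference, here is a sketch of the same loop with the sed command filled in for the '@@' case and every $f quoted. The scratch directory, the sample file names and the '*.txt' pattern are assumptions for the demo only; cmp -s is used as a silent way to ask whether the two files differ.

```shell
# Hypothetical scratch directory with two sample files; one name contains a space
top=$(mktemp -d)
tab=$(printf '\t')
printf 'TYPE\tR\n@@\t\n' >"$top/rec one.txt"
printf 'TYPE\tR\n@@\n'   >"$top/rec2.txt"

find "$top" -name '*.txt' -type f | while IFS= read -r f
do
    # Write the edited copy next to the original
    sed "s/@@${tab}/@@/g" "$f" >"$f.new"
    # Replace the original only if sed changed something and the result is non-empty
    if ! cmp -s "$f" "$f.new" && [ -s "$f.new" ]
    then
        mv "$f" "$f.old"      # keep a backup of the original
        mv "$f.new" "$f"
    else
        rm "$f.new"           # nothing changed, discard the copy
    fi
done
```

After the run, "rec one.txt" no longer has the tab and a "rec one.txt.old" backup sits beside it, while rec2.txt is untouched. Try it on a copy of your data first, and remove the .old files once you are happy with the result.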
Hi,
Can you please explain your code to me? I'm not really familiar with sed. I tried your code a couple of times, but I got an infinite loop in my shell.
sed is a stream editor: it reads its input in a loop, applies the script to each line, and then writes the line out. When you use it on pipes, it never needs a temp file and does not run out of space on huge data sets.
The command s/pattern/new_data/ is a substitution. The pattern is a regular expression (regex), slightly extended so that \( \) picks up part of the match and \number puts it back down in the replacement. In your case, the regex can identify the lines where the tab must be removed and where in the line the tab is. To handle every @@TAB, you can simply replace that sequence as many times as it appears on any line:
s/@@TAB/@@/g
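A different approach from the sed clean-up, offered only as a sketch: the tab seems to appear because the original awk script assigns to $2 on every line of a matching record, including '@@', which makes awk rebuild that line as @@ plus OFS plus an empty field. Guarding the assignment with NF>1 should leave one-field lines alone. This assumes the '@@' lines in the input have no trailing tab to begin with; the sample data is shortened from the files above and the scratch directory is hypothetical.

```shell
# Hypothetical sample data, shortened from the thread
dir=$(mktemp -d)
printf 'No_19 254 243 263\n' >"$dir/file1.txt"
printf 'ID L_254\nSTART 39644\nEND 37193\n@@\n' >"$dir/file2.txt"

output=$(awk '
FNR==NR { b[$2]=$3; f[$2]=$4; next }          # first file: remember START/END per id
$1=="ID" { id=substr($2, index($2,"_")+1) }   # second file: id of the current record
# Only rebuild lines that have a second field, so "@@" is printed untouched
id in b && NF>1 { $2=($1=="START")?b[id]:(($1=="END")?f[id]:$2) }
1
' OFS='\t' "$dir/file1.txt" "$dir/file2.txt")

printf '%s\n' "$output"
rm -r "$dir"
```

With this guard the updated lines are still tab-separated, and the '@@' line comes out exactly as it went in, so no second pass over thousands of files is needed.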