I have files whose fields are separated by tabs, but the field values can themselves contain whitespace (each one is basically a text string). I want to tell join to treat only tabs, not spaces, as separators.
I have tried:
join -t "<Tab>" file1 file2
join -t "<tab>" file1 file2
join -t "\t" file1 file2
Kindly post some sample input and the desired output.
BR
I can't type tabs on the forum, but here is a sample (<Tab> stands for a tab character):
file1
abc def <Tab> X1
ghi jkl <Tab> X2
file2
abc def <Tab> Y1
ghi jkl <Tab> Y2
output
abc def <Tab> X1 <Tab> Y1
ghi jkl <Tab> X2 <Tab> Y2
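For what it's worth, the three attempts above most likely fail because join -t expects a single character, and "<Tab>", "<tab>" and "\t" are all passed to join as several literal characters. A minimal sketch, assuming the two sample files above with real tabs and both files sorted on the first field (the scratch directory and the variable names tab and result are only for illustration):

```shell
# Work in a scratch directory so no real files are touched.
tmpdir=$(mktemp -d)

# Rebuild the sample files with real tab characters between the two fields.
printf 'abc def\tX1\nghi jkl\tX2\n' > "$tmpdir/file1"
printf 'abc def\tY1\nghi jkl\tY2\n' > "$tmpdir/file2"

# Store one literal tab; printf is a portable way to produce it.
tab=$(printf '\t')

# Pass that single character as the field separator.
result=$(join -t "$tab" "$tmpdir/file1" "$tmpdir/file2")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

In bash, ksh93 or zsh, join -t $'\t' file1 file2 should do the same thing without the helper variable.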
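Use this code:
nawk '
# First file (file2): store its value, keyed on the two words of the first field.
NR==FNR {
    a[$1$2]=$3 ; next
}
# Second file (file1): if the key is known, append the stored value;
# otherwise print the line unchanged, followed by an empty line.
{ print s=( $1$2 in a ) ? $1FS$2FS"\t"FS$3FS"\t"FS a[$1$2] : $0 RS }
' file2 file1
Note: if you don't have nawk, use /usr/xpg4/bin/awk.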
BR
To keep the forums high quality for all users, please take the time to format your posts correctly.
First of all, use Code Tags when you post any code or data samples so others can easily read your code. You can easily do this by highlighting your code and then clicking on the # in the editing menu. (You can also type the code tags [code] and [/code] by hand.)
Second, avoid adding color or different fonts and font size to your posts. Selective use of color to highlight a single word or phrase can be useful at times, but using color, in general, makes the forums harder to read, especially bright colors like red.
Third, be careful when you cut-and-paste: edit any odd characters and make sure all links are working properly.
Thank You.
The UNIX and Linux Forums
Thanks, Mr. vgersh99, for the advice. I hope the code is easier to read now after the modification :)
It seems to me the solution you propose would only work if there is exactly one space in the first field, and wouldn't if there were several (i.e. "a b cdef"). Anyway, I tried a solution of my own, which isn't that pretty, but works:
- replace every space with @@@
- replace every tab with a space
- join
- turn the spaces back into tabs
- turn @@@ back into spaces
sed -e 's/ /@@@/g' ./temp/file1 | sed -e 's/<tab>/ /g' > ./temp/file1.rdy2join
sed -e 's/ /@@@/g' ./temp/file2 | sed -e 's/<tab>/ /g' > ./temp/file2.rdy2join
join -t " " -1 1 -2 1 ./temp/file1.rdy2join ./temp/file2.rdy2join > ./temp/output
Then I reverted the two substitutions with sed again.
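The whole round trip, including the revert step, could be sketched roughly like this. It assumes that @@@ never occurs in the data and that the files are still in join order after the substitution; the prep helper, the scratch directory and the two-line sample data are made up for the illustration:

```shell
# Scratch directory with two small tab-separated sample files.
tmpdir=$(mktemp -d)
printf 'abc def\tX1\nghi jkl\tX2\n' > "$tmpdir/file1"
printf 'abc def\tY1\nghi jkl\tY2\n' > "$tmpdir/file2"

# One literal tab, so sed sees the real character instead of a placeholder.
tab=$(printf '\t')

# Hypothetical helper: protect spaces as @@@, then turn tabs into spaces.
prep() {
    sed -e 's/ /@@@/g' -e "s/$tab/ /g" "$1"
}

prep "$tmpdir/file1" > "$tmpdir/file1.rdy2join"
prep "$tmpdir/file2" > "$tmpdir/file2.rdy2join"

# Join on the first space-separated field, then undo both substitutions:
# spaces back to tabs first, @@@ back to spaces second.
result=$(join -t ' ' -1 1 -2 1 "$tmpdir/file1.rdy2join" "$tmpdir/file2.rdy2join" |
    sed -e "s/ /$tab/g" -e 's/@@@/ /g')
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

The order of the two revert substitutions matters: if @@@ were turned back into spaces first, those spaces would then be converted to tabs as well.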
It will work; you only need to change the field numbers in the associative array to:
a[$1$2$3]=$4 ... and that's it.
Try it...
BR :)
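One way to avoid counting the words in the key at all is to make awk split on tabs only, so the whole first field becomes the key no matter how many spaces it contains. A rough sketch, assuming a POSIX awk (nawk or /usr/xpg4/bin/awk on the systems mentioned earlier), two tab-separated fields per line, and made-up sample data:

```shell
# Scratch directory; the first key contains two spaces on purpose.
tmpdir=$(mktemp -d)
printf 'a b cdef\tX1\nghi jkl\tX2\n' > "$tmpdir/file1"
printf 'a b cdef\tY1\nghi jkl\tY2\n' > "$tmpdir/file2"

result=$(awk '
BEGIN     { FS = OFS = "\t" }        # split and rebuild records on tabs only
NR == FNR { a[$1] = $2; next }       # first file (file2): key -> value
$1 in a   { print $0, a[$1] }        # second file (file1): append the match
' "$tmpdir/file2" "$tmpdir/file1")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

Lines of file1 with no partner in file2 are dropped here, which mirrors what join does by default; the input does not need to be sorted for this variant.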