and I want to
print the lines of file2
whose first word (in the first column)
matches the first word of file1 (in the first column), BUT keep the order of the first file.
The output should look like
but the order in file3 is like file2.
Moreover, awk trips over the quote symbol '. Is there a way to ignore it and read only the name inside the quotes?
Thanks in advance for the help and time
One last question: what can I do if I want to remove a space
when it is followed by a single quote, wherever it occurs in the file?
The single quotes around the preceding and following words in each column must be kept.
e.g.
'numbers1' 'te1 ' text
'numbers2' 'te2 ' text
...
should produce the output:
'numbers1' 'te1' text
'numbers2' 'te2' text
...
Note that the problematic quotes always contain exactly 4 characters (like 'tes '), including the space.
(note that there is a tab between the last two fields instead of spaces on the line containing "gamma"), Aia's code in message #8 in this thread produces:
If I understand the third set of requirements properly (only remove a single space at the end of fields between pairs of single quotes; keep spaces between fields as they were), I think this does what you want:
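awk -v sq="'" '
BEGIN {
	# Split and rejoin lines on single quotes while reading file2.
	FS = OFS = sq
}
FNR == NR {
	# First file (file2): even-numbered fields are the quoted strings.
	for(i = 2; i <= NF; i += 2)
		# Drop one trailing space from a quoted field, if present.
		if(substr($i, length($i)) == " ")
			$i = substr($i, 1, length($i) - 1)
	# Save the cleaned line under its first quoted name.
	d[$2] = $0
	next
}
# Second file (file1, split on whitespace): print saved lines in file1 order.
$1 in d {
	print d[$1]
}' file2 FS=" " file1 > file3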
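As a rough check of the script above, here is a small run against made-up input. The contents of file1 and file2 below are assumptions modelled on the example lines earlier in this thread (file1 is assumed to hold unquoted names, one per line), and the temporary directory is only there to avoid overwriting real files:

```shell
# Hypothetical sample data; the real file1 and file2 may differ.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# file2: quoted fields, with a trailing space inside the second one.
printf '%s\n' \
	"'numbers1' 'te1 ' text" \
	"'numbers2' 'te2 ' text" \
	"'numbers3' 'te3 ' text" > file2

# file1: the names to select, in the order the output should follow.
printf '%s\n' numbers3 numbers1 > file1

awk -v sq="'" '
BEGIN {
	FS = OFS = sq
}
FNR == NR {
	for(i = 2; i <= NF; i += 2)
		if(substr($i, length($i)) == " ")
			$i = substr($i, 1, length($i) - 1)
	d[$2] = $0
	next
}
$1 in d {
	print d[$1]
}' file2 FS=" " file1 > file3

cat file3
# Prints:
# 'numbers3' 'te3' text
# 'numbers1' 'te1' text
```

The output follows the order of file1, and numbers2 is dropped because it does not appear in file1.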
If you have an empty quoted field in the input such as in:
'alfa ' 'keepnumbers ' 'keepnumbers ' ''
this will remove spaces at the ends of the quoted fields that have spaces and add a space to the empty quoted field (it also still removes a space between fields when there are multiple spaces between fields in the input), producing:
'alfa' 'keepnumbers' 'keepnumbers' ' '
instead of:
'alfa' 'keepnumbers' 'keepnumbers' ''
But, of course, we have no way of knowing whether or not this matters to the OP since requirements for these cases were not specified.