I want a ksh script that parses two text files (my original files are actually .xls). Input data:
- the first file, file1, contains lines whose fields are separated by spaces (or another delimiter)
- the second file, file2, contains only one numerical value per line (for simplicity; it might also have the same form as file1)
- I know the 2nd field in file1 is also numerical, and some of those values can be found in file2
Output data:
The result should be a file that contains only those lines of file1 whose 2nd field cannot be found in any line of file2.
I know this is easy, but I am too tired after a long, hard working day, and an expert can probably fix it in a minute.
I think it can also be done with a single (complex?) command.
Please post sample input files AND the desired output based on that sample input!
An example of line in file1 is like:
GAGLIARDI 7 GILBERTO TREZZANO - DG 30450 3TECH 3TECH 3TECH
All the lines are of this form.
Column 2 is interesting for me.
file2 might contain a line with only a number, let's say 7:
- then do not output the line;
otherwise, if 7 does not exist in file2,
- then output the line (to a file).
At the moment both files are .xls files containing the same columns.
Both files contain thousands of lines...
A sample for file2, please!
What do you consider a 'column' in file1 and file2?
In your sample file1, the 2nd column has the value '7'. Is that correct?
A line in file1:
GAGLIARDI 743 GILBERTO TREZZANO - DG 30450 3TECH 3TECH 3TECH
A line in file2:
GAGLIARDI 743 GILBERTO TREZZANO - DG 30450 SupportCRM TeamLead 3TECH
Assuming file1 and file2 are of the same format:
nawk 'FNR==NR {file2[$2];next} !($2 in file2)' file2 file1
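To see what that one-liner does, here is a small self-contained sketch. It uses plain awk instead of nawk, on the assumption that the local awk is POSIX-style. Only the 743 row comes from this thread; the other rows and the array name seen are made up for illustration.

```shell
# Hypothetical sample data; only the GAGLIARDI/743 rows appear in the thread.
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
GAGLIARDI 743 GILBERTO TREZZANO - DG 30450 3TECH 3TECH 3TECH
ROSSI 100 MARIO MILANO - DG 30451 3TECH 3TECH 3TECH
EOF
cat > "$tmp/file2" <<'EOF'
GAGLIARDI 743 GILBERTO TREZZANO - DG 30450 SupportCRM TeamLead 3TECH
BIANCHI 999 LUCA ROMA - DG 30452 SupportCRM TeamLead 3TECH
EOF

# While reading the first operand (file2), FNR==NR holds:
#   store each 2nd field as a key of the array "seen", then skip to the next line.
# While reading the second operand (file1):
#   print only the lines whose 2nd field is not a key of "seen".
result=$(awk 'FNR==NR {seen[$2]; next} !($2 in seen)' "$tmp/file2" "$tmp/file1")
printf '%s\n' "$result"

rm -rf "$tmp"
```

With this sample data only the ROSSI line is printed, because 743 occurs in file2 and 100 does not. Note that the FNR==NR test cannot tell the two files apart if file2 is empty.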
I think the code should be:
nawk 'FNR==NR {file2[$2];next} !($2 in file1)' file2 file1
But I've checked the test machines where I am allowed to write and run scripts: nawk is not installed there, so I cannot use it, but awk is available:
ttss...@hk... /home/ttss...> nawk 'FNR==NR {ff2[$2];next} !($2 in ff1)' ff2 ff1
ksh: nawk: not found
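Since awk is available on that machine, the same program could be tried with awk in place of nawk. This is only a sketch: it assumes the installed awk is POSIX-style (it needs FNR and the "in" test), and the function name filter_unmatched and the array name seen are made up here. The array filled in the first part has to be the same one tested in the second part; a test against an array that was never filled is false for every line, so every line of the second file would be printed.

```shell
# filter_unmatched KEYFILE DATAFILE   (hypothetical helper name)
# Prints the lines of DATAFILE whose 2nd field does not occur
# as the 2nd field of any line in KEYFILE.
filter_unmatched() {
    awk '
        FNR == NR { seen[$2]; next }   # first file: remember each 2nd field
        !($2 in seen)                  # second file: print lines with an unseen key
    ' "$1" "$2"
}

# Usage, with the file names from the session above:
#   filter_unmatched ff2 ff1 > result.txt
```

If file2 really holds just one number per line, the key would be in $1 rather than $2 in the first part.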