I have two text files, each holding on the order of a thousand rows. Each row has around 40 tab-delimited columns, and rows are separated by a newline character.
My requirement is to find, for each row, whether any column differs between the two files and, if so, which columns are different. An example is below.
Thanks for the reply. Sorry for writing | instead of tab.
I am able to run the command, and the results are as expected. One problem I am facing is that the two files may hold their data in a different order, meaning the 1st row in file 1 could point to the 5th row in file 2.
What I can think of for now is to take the key column (number) from the user and then sort the files on that column. Is there any other way of doing this?
Could it also be done by filtering, meaning that for each row of file 1 we take the primary key and then filter file 2 on it? I think that would be cumbersome, though. Sorting the whole file on the primary key seems the better option.
Can you please suggest a better way of doing this? What would the Unix commands be? Thanks in advance for helping me out.
What does it mean when you say that '1 st row in file 1 could point to 5th row in file 2'?
Is there a common key (a common row cell OR a combination of cells) that relates 2 rows from 2 different files?
If you know that, you can rewrite the initial script - no need for sorting.
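To illustrate the "no need for sorting" idea, here is a minimal Python sketch (not the script discussed in this thread, which is not shown here). It assumes the key is a single column, given as a 1-based column number, and that each key occurs only once per file. The function names `load_rows` and `diff_by_key` are made up for this example.

```python
import csv


def load_rows(lines, key_col, delimiter="\t"):
    """Index rows by the value in key_col (1-based column number)."""
    reader = csv.reader(lines, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    # Assumption: keys are unique; a repeated key overwrites the earlier row.
    return {row[key_col - 1]: row for row in reader if row}


def diff_by_key(lines1, lines2, key_col, delimiter="\t"):
    """Return {key: [differing 1-based column numbers]} for keys in both inputs."""
    rows1 = load_rows(lines1, key_col, delimiter)
    rows2 = load_rows(lines2, key_col, delimiter)
    report = {}
    for key, row1 in rows1.items():
        row2 = rows2.get(key)
        if row2 is None:
            continue  # key absent from the second input; not reported in this sketch
        width = max(len(row1), len(row2))
        # Pad the shorter row so a missing trailing column counts as a difference.
        a = row1 + [None] * (width - len(row1))
        b = row2 + [None] * (width - len(row2))
        cols = [i + 1 for i in range(width) if a[i] != b[i]]
        if cols:
            report[key] = cols
    return report
```

With real files you could pass open file objects, for example `diff_by_key(open("file1.txt", newline=""), open("file2.txt", newline=""), 1)`, where the file names are placeholders. Because the second file is indexed in a dictionary, the row order in either file does not matter.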
Thanks. What I am trying to say is that the data in the 1st row of file 1 may need to be compared with the data in the 5th row of file 2. (This is just an example, as the data in the two files is in a different sort order.)
In a nutshell, the user should be prompted to provide the common key (that is, the user will enter the column number). The data should then be compared based on that common key.
So, considering the same example as above:
File1
1|check|test|plan|672
4|checked|this|plan|610
3|just|no|plan|612
Thanks a lot. It works perfectly.
Now the problem I am facing is that some rows are missing from file2, so they do not appear in the report. The current code reports the differences between the columns of each row. There should also be a report listing the rows that are missing from file2 (meaning the primary key in file1 is not found in file2).
I hope I have made my question clear.
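One possible way to get such a "missing in file2" report is sketched below in Python. This is only an illustration under assumptions, not a change to the script from this thread: the key is taken to be a single column given as a 1-based number, and the helper names `column_values` and `missing_keys` are invented here.

```python
def column_values(lines, key_col, delimiter="\t"):
    """Collect the key column (1-based) of every non-empty line, in order."""
    keys = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line:
            keys.append(line.split(delimiter)[key_col - 1])
    return keys


def missing_keys(lines1, lines2, key_col, delimiter="\t"):
    """Return keys that appear in the first input but not in the second."""
    seen = set(column_values(lines2, key_col, delimiter))
    return [k for k in column_values(lines1, key_col, delimiter) if k not in seen]


# File1 rows are the ones posted above; the file2 rows are hypothetical,
# made up only to show a key (4) that has no match.
file1 = ["1|check|test|plan|672", "4|checked|this|plan|610", "3|just|no|plan|612"]
file2 = ["3|just|no|plan|612", "1|check|test|plan|672"]
print(missing_keys(file1, file2, 1, delimiter="|"))
```

Swapping the two arguments would list keys that exist only in file2, if that direction is needed as well.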
Hopefully by now you understand the current implementation and can modify it to fulfill your changing requirements.
Do come back with any specific concerns.
Good luck.