I posted the incorrect files yesterday and apologize. I also modified the awk script, but with no luck. There are two text files in the zip (name.txt and output.txt). I am trying to match $2 in name.txt with $1 in output.txt, and if they match, $1 of name.txt should be copied to $7 of output.txt. The tricky part (well, at least for me) is that only part of $2 will match $1. Thank you :).
Could you describe what should be matched? Apparently DTE3504500000001ref should match DTE3504500000001, but DTE3504500000001antiref should not. What is the criterion?
Sorry, the correct files are attached. DTE3504500000001 is the criterion to match, so that both records will be assigned the same value. Also, the final output.txt needs to be delimited so it can be opened in Excel. I am not sure where to put the
Then in your example, why does only the record with DTE3504500000001ref get an extra 1 at the end and why doesn't the one with DTE3504500000001antiref get one?
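If both the ref and the antiref record should receive the same value, one way to express the match is to strip the suffix before comparing. The following is only a sketch under assumptions: the sample rows are made up, both files are taken to be whitespace-separated, the only suffixes are taken to be "ref" and "antiref", and the helper name stem is hypothetical. Setting OFS to a tab makes the result tab-delimited, which Excel can open.

```sh
# Hypothetical sample data; the real name.txt and output.txt may differ.
tmp=$(mktemp -d)
printf '1 DTE3504500000001\n2 DTE3504500000002\n' > "$tmp/name.txt"
printf '%s\n' \
    'DTE3504500000001ref a b c d e' \
    'DTE3504500000001antiref a b c d e' \
    'DTE3504500000002ref a b c d e' \
    'DTE3504500000009ref a b c d e' > "$tmp/output.txt"

result=$(awk -v OFS='\t' '
    # Assumed rule: an ID is compared without a trailing "ref" or "antiref".
    function stem(id) { sub(/(anti)?ref$/, "", id); return id }

    # First file (name.txt): remember $1 under the stripped form of $2.
    NR == FNR { value[stem($2)] = $1; next }

    # Second file (output.txt): rebuild every record with tabs,
    # then put the remembered value into $7 when the stripped $1 is known.
    {
        $1 = $1
        key = stem($1)
        if (key in value) $7 = value[key]
        print
    }
' "$tmp/name.txt" "$tmp/output.txt")

rm -rf "$tmp"
printf '%s\n' "$result"
```

Records whose ID is not found in name.txt are printed unchanged apart from the tab delimiter. If output.txt has fewer than six fields, awk creates empty fields up to $7.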
Thank you very much. I attached the combined.txt but forgot that $5 needs to be copied to $8 and the blank row in between the two lines removed. The combined.txt really only needs to look like the below, but I'm not sure how to do this. Basically, I am going to import the sheet into a SQL database and am trying to format the data accordingly: the values in the sheet are combined (.....ref and .....antiref) and the chromosome is matched. Thank you very much :).
It works great. I am just trying to have combined.txt contain the values from the output sheet combined (.....ref and .....antiref) with the chromosome matched. This way the blank row between them can be removed. Thank you very much :).
current combined.txt
DTE3504500000001ref 1
DTE3504500000001antiref "space"
DTE3504500000002ref 1
DTE3504500000002antiref "space"
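
One possible reading of the request is: drop the blank rows and give each antiref row the value of the ref row with the same ID. The sketch below assumes exactly that, plus a few things the thread does not confirm: "space" means the second field is simply missing, the ref row comes before its antiref row, and the sample input is made up to mirror the listing above.

```sh
# Hypothetical input mirroring the listing above, with blank rows in between.
combined=$(printf '%s\n' \
    'DTE3504500000001ref 1' \
    '' \
    'DTE3504500000001antiref' \
    '' \
    'DTE3504500000002ref 1' \
    '' \
    'DTE3504500000002antiref' |
awk -v OFS='\t' '
    # Skip blank rows entirely.
    NF == 0 { next }
    {
        # Assumed rule: rows belong together when their IDs are equal
        # once a trailing "ref" or "antiref" is removed.
        key = $1
        sub(/(anti)?ref$/, "", key)

        if (NF >= 2)
            last[key] = $2          # remember the value seen for this ID
        else if (key in last)
            $2 = last[key]          # fill the missing value from its partner

        $1 = $1                     # rebuild the record with tab separators
        print
    }
')

printf '%s\n' "$combined"
```

If an antiref row can appear before its ref row, a single pass like this is not enough and the file would need to be read twice.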