However, I am having issues. I want to remove the first volume in the 2nd field if it has another entry under rearrange. So I want the file to look like this:
While searching for similar issues, I also found a one-liner that builds an array from the file. I am not sure how it works, but I think it does not consider the "rearrange" part:
cat test3 |gawk '!arr[$2]++'
The above expression gets rid of the last line, which is NOT what I want. I want only the rearrange line for that volume to be output. In addition, there is a command "tac" that I have seen some people work with, but I don't have it on my distribution.
Does anybody have ideas? I am really a novice at removing duplicates and am not sure how the process works.
awk 'NR==1; ---> prints your header (line number 1)
NR>1{A[$2]=$0} ---> when the line number is greater than 1, array A, indexed by column 2 ($2), stores the whole line $0, so a later line with the same $2 overwrites the earlier one
END{for(i in A)print A[i]}' ---> in the END block, prints the array contents
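Put together, the annotated pieces above form a single command. The sketch below runs it on made-up data: the file name sample_keep_last.txt and its contents are assumptions, since the real layout of test3 is not shown here, and it assumes the rearrange line comes after the original line for the same volume. Note that awk does not guarantee the order of "for (i in A)", so the body lines may come out in any order.

```shell
# Hypothetical sample data: the real layout of test3 is not shown in the thread.
cat > sample_keep_last.txt <<'EOF'
action volume
original vol1
original vol2
rearrange vol1
EOF

# NR==1           : print the header as soon as it is read.
# NR>1{A[$2]=$0}  : remember the latest line seen for each value of field 2.
# END{...}        : print every remembered line (order is unspecified).
keep_last=$(awk 'NR==1; NR>1{A[$2]=$0} END{for(i in A)print A[i]}' sample_keep_last.txt)
printf '%s\n' "$keep_last"
```

With this data the "original vol1" line is dropped, because the later "rearrange vol1" line overwrites it in the array.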
gawk '{if (Line!=$1$2) print; Line=$2}'
Line!=$1$2 --> if Line is not equal to column 1 and column 2 joined together, the line is printed. This is true for the first line, since Line is not set yet. After that, Line is assigned the value of $2 (Line=$2), and the same check is made for the 2nd line.
--edit--
Your code will not work because it only considers the previous line, so a duplicate that appears anywhere else in the file still gets printed.
And awk '!arr[$2]++' prints only the first line found for each value of field 2 ($2), which is the reason why the rearrange line is not getting printed.
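To see why the first occurrence wins, here is a small sketch on made-up data (the file name sample_first_seen.txt and its contents are assumptions, not the real test3). It uses plain awk, but gawk behaves the same way here.

```shell
# Hypothetical sample data (no header, to keep the demo short).
printf 'original vol1\noriginal vol2\nrearrange vol1\n' > sample_first_seen.txt

# arr[$2]++ evaluates to 0 the first time a volume appears, so !0 is true and
# the line is printed. On later appearances the counter is non-zero, the
# pattern is false, and the line is skipped.
first_seen=$(awk '!arr[$2]++' sample_first_seen.txt)
printf '%s\n' "$first_seen"
# prints:
# original vol1
# original vol2
```

The "rearrange vol1" line is the second line with vol1 in field 2, so it is the one that gets dropped.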
Yoda's solution keeps track of rearrange in field 1. My solution assumes the file is sorted, so it saves the last value found. If the file is not sorted, I think you should go with Yoda's solution.
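If the original line order matters and tac is not available, one possible alternative is to read the file twice. This is only a sketch and is not Yoda's solution. The file name sample_two_pass.txt and its contents are assumptions, and it assumes the line you want to keep is the last one for each volume.

```shell
# Hypothetical sample data: the real layout of test3 is not shown in the thread.
cat > sample_two_pass.txt <<'EOF'
action volume
original vol1
original vol2
rearrange vol1
EOF

# Pass 1 (NR==FNR): record the line number of the last occurrence of each $2.
# Pass 2: print the header, plus every line that is the last one for its $2.
two_pass=$(awk 'NR==FNR{last[$2]=FNR; next} FNR==1 || last[$2]==FNR' \
    sample_two_pass.txt sample_two_pass.txt)
printf '%s\n' "$two_pass"
# prints:
# action volume
# original vol2
# rearrange vol1
```

Because the second pass prints lines as it reads them, the output keeps the input order, unlike the "for (i in A)" loop.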