The arrays can be taken out of order, but the array contents need to stay intact. I have been using awk to pull arrays and all their elements from several files into one, like this:
awk '/string-array name/,/string-array>/' "$file"
but it seems that the same array appears in more than one file ><
Sometimes the arrays have elements and sometimes they do not. The contents of the arrays can be anything: special characters, newlines (I have been using sed to escape the newlines), and regular characters.
Is it possible to include these items in the awk search-delete routine?
This command is nice: it puts everything between <array> and </array> on a single line, which makes processing easier. However, it does not remove duplicate definitions.
Yes I did, it produced the same output as the first nawk suggestion.
Having each array and its elements on one line might make this easier. Is there a way to delete lines based on a comparison of the contents between the first < and > characters?
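For illustration, a minimal sketch of that idea, assuming each array already sits on a single line. The function name dedupeByFirstTag and the sample data are hypothetical. Splitting fields on < and > makes $2 the text between the first < and the first >, and awk keeps only the first line seen for each such key.

```shell
# Hypothetical helper: keep the first line for each distinct opening tag.
# With -F'[<>]', $2 is whatever sits between the first < and the first >.
dedupeByFirstTag() {
    awk -F'[<>]' '!seen[$2]++' "$@"
}

# Made-up sample: "colors" is defined twice, so only its first line survives.
printf '%s\n' \
    '<string-array name="colors"><item>red</item></string-array>' \
    '<string-array name="sizes"><item>small</item></string-array>' \
    '<string-array name="colors"><item>blue</item></string-array>' |
    dedupeByFirstTag
```

This prints the first "colors" line and the "sizes" line. The key is the whole opening tag, so two tags that differ only in spacing or attribute order would count as different.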
I have it wrapped in a function that takes the array name as an argument, and the function is run once for each name. It takes several input files, one each for string-arrays, styles, plurals, dimens, strings, colors, drawables, etc. (all the Android resources), and produces two final xml files: one for strings and colors, with each item on one line, and one called arrays.xml, which is what I am working with now. I hope that clears it up.
---------- Post updated at 02:59 PM ---------- Previous update was at 02:47 PM ----------
I figured it out...
Here is my final function:
dupArrayDelete() {
echo "removing duplicate"
arrayName=$1
echo "$arrayName"
#get arrays and their contents on their own line
#first awk prints the file ignoring new lines, putting the whole file in one line
#sed inserts newlines after each closing </style> or </string-array>, etc
#second awk removes all lines that have the same column 2
awk '{printf "%s", $0}' "$2" | sed 's#</'"$arrayName"'>#&\n#g' | awk '!A[$2]++' >> "$3"
}
$1 is the array name (the tag, e.g. string-array), $2 is the source file, and $3 is the destination file, which is appended to.
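A small usage sketch, in case it helps anyone else. The file names and sample arrays are made up, and it assumes GNU sed, since \n in the replacement text is not portable to every sed. The function is repeated here, with the format string and quoting made explicit, only so the snippet runs on its own.

```shell
# Same pipeline as the function above, repeated so this demo is self-contained.
dupArrayDelete() {
    arrayName=$1
    # join the file into one line, split after each closing tag, drop repeated column 2
    awk '{printf "%s", $0}' "$2" | sed 's#</'"$arrayName"'>#&\n#g' | awk '!A[$2]++' >> "$3"
}

# Hypothetical input: "colors" is defined twice.
workdir=$(mktemp -d)
cat > "$workdir/arrays.in.xml" <<'EOF'
<string-array name="colors">
    <item>red</item>
</string-array>
<string-array name="sizes">
    <item>small</item>
</string-array>
<string-array name="colors">
    <item>red</item>
</string-array>
EOF

dupArrayDelete string-array "$workdir/arrays.in.xml" "$workdir/arrays.out.xml"
cat "$workdir/arrays.out.xml"
```

With this input the output file should hold two lines, one for "colors" and one for "sizes". Because the function appends with >>, delete the destination file before running it again on the same input.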