I need to scan my 'vector<string> mystrings' for duplicate entries.

Basically I'm trying to avoid writing paragraphs to do it. It is possible I could convert over to a vector<int>'s if that may help. Does anyone know how to do this without really turning it into a huge brainbuster? I have considered using a for() loop but I wasn't able to put anything into practice just yet.

Thanks so much for any help.
Eric :smiley:

The only way to avoid N^2/2 cost is to remember the items in a more efficient way or sort them all, which is almost saying the same thing. For instance, put them is a hash map or a tree. Pardon the pseudo code.

for ( i = 0 to n-1 ){                // N^2/2
for ( j = i+1 to n-1 ){
 if ( mystrings = mystrings[j] ){
   mystrings[j].delete() ;
   n-- ;
}}}

map<string> smap ;                                // map
for ( i = 0 to n-1 ){
  if ( !smap.insert_key( mystrings ){
   // insert returns 0 if unique key already in map
   mystrings[j].delete() ;
   n-- ;
}}

Now, if the original order is moot, you can sort them in place, but it does not beat the map for large cases. Since the map is unique by nature, the obvious thing is not to have a vector -- wrong container for requirements. If sorted output is needed, a tree can do that at a slight disadvantage of having log(n) speed. Since map's hash, they do not store things in order: abcd might hash to bucket 137 and bcdef hash to 136.

1 Like