Storing pointers in a container
-------------------------------

This handout clarifies potential problems with storing character
strings in containers (e.g., in hash_sets).

In general, a container stores objects of a particular
type/struct/class. When you add (insert) an object to a container,
the object is copied. Hopefully, everything is obvious so far :)

In HWK5, hash_set is supposed to store objects of type char*.
These objects are pointers. When you add (insert) such a pointer to
a hash_set, the pointer is copied. When a pointer is copied, what it
points to is *not* copied automatically (because in some applications
you don't want to copy and in others you do).

In the example on the SGI page for hash_set, all pointers point to
different string constants (such as "kiwi" and "plum"). However, in
the example below, we get into trouble with hash_sets (in addition to
potential buffer overflows).

    char buffer[1000];  // if we ever read more than 1000 characters at once
                        // we'll have a buffer overflow, possible crashes, etc
    while ( inFile )    // almost the same as ! inFile.eof()
    {
        inFile >> buffer;     // no words in the dictionary are supposed
                              // to be longer than 1000 characters, but
                              // we aren't checking for this in any way !
        hSet.insert(buffer);
    }

This code populates hSet (which is assumed to be a hash_set) with
*identical* pointers to buffer, so the hash_set contains a large
number of *identical* strings. It's actually just one string (not
copies), and it is stored in buffer. If you are using the c_str()
method, the effect is practically the same, since c_str() returns a
pointer to the string's internal buffer.

So, you need to allocate memory for those strings and copy them from
the buffer before adding pointers to the hash_set. Note that at the
end of your program, you need to iterate through the hash_set and
delete[] (or free, depending on how you allocated that memory) those
strings.

To copy, you can use strcpy() or strncpy().
In both cases, you need to compute the length of the string with
strlen() and allocate enough memory with new[] (strlen() + 1 bytes,
to leave room for the terminating '\0'). The strdup() function does
all of this, except that it uses malloc() instead of new. If you use
strdup(), you need to use free() instead of delete[]. In general, it
is not recommended to use malloc and new in the same program, but I
haven't seen this causing problems in practice, as long as each block
is released by the matching function.

Igor Markov
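
The fix described above could be sketched roughly as follows. This is
only an illustration, not the homework solution. It assumes
std::unordered_set as a stand-in for hash_set (which is a non-standard
SGI extension), and the names CStrHash, CStrEq, WordSet, loadWords and
freeWords are made up for this example. Unlike the loop in the handout,
it tests the result of the extraction itself and uses setw() to cap how
much is read into buffer.

```cpp
#include <cstddef>
#include <cstring>
#include <functional>
#include <iomanip>
#include <istream>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helpers: hash and compare C strings by content,
// not by address.
struct CStrHash {
    std::size_t operator()(const char* s) const {
        return std::hash<std::string>()(s);  // simple, not the fastest
    }
};
struct CStrEq {
    bool operator()(const char* a, const char* b) const {
        return std::strcmp(a, b) == 0;
    }
};

// Assumed stand-in for the handout's hash_set of char*.
typedef std::unordered_set<const char*, CStrHash, CStrEq> WordSet;

// Read whitespace-separated words; store a private copy of each new word.
void loadWords(std::istream& in, WordSet& words) {
    char buffer[1000];
    // setw() makes operator>> stop after sizeof(buffer) - 1 characters.
    while (in >> std::setw(static_cast<int>(sizeof buffer)) >> buffer) {
        if (words.count(buffer) != 0) continue;          // already stored
        char* copy = new char[std::strlen(buffer) + 1];  // +1 for '\0'
        std::strcpy(copy, buffer);
        words.insert(copy);  // the set now points to its own copy
    }
}

// Release every copy made by loadWords().
void freeWords(WordSet& words) {
    // Collect the pointers first, so the set never holds freed memory.
    std::vector<const char*> ptrs(words.begin(), words.end());
    words.clear();
    for (std::size_t i = 0; i < ptrs.size(); ++i)
        delete[] ptrs[i];  // allocated with new[], so delete[] (not free)
}
```

If the copies were made with strdup() instead, freeWords() would have
to call free() on each pointer rather than delete[].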