While writing code to replace htseq-count, I ran some experiments to work out how to get the most out of the gcc compiler. The code was compiled with the -O3 optimisation level.
The string implementation that ships with the compiler appears to handle copies efficiently, probably such that when a string is copied, both locations reference the same internal buffer and an associated reference count is increased to two. A real copy of the string is only made when it is processed, e.g. when a substring is created. This is consistent with the reference-counted, copy-on-write std::string that GCC's standard library (libstdc++) used before the ABI change in GCC 5.
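One way to check the sharing behaviour described above is to compare the buffer addresses of a string and its copy. The following is a minimal sketch, not the code used in the experiments; the function names are hypothetical, and the result of the first function depends on which std::string implementation the toolchain uses.

```cpp
#include <string>

// Hypothetical helper: returns true if a copy of `s` points at the same
// character buffer as `s` (shared, reference-counted storage), and false
// if the copy owns a separate buffer.
bool copy_shares_buffer(const std::string& s) {
    const std::string copy = s;        // copy construction
    return copy.data() == s.data();    // same address means shared storage
}

// Hypothetical helper: whatever the implementation does internally,
// modifying the copy must never change the original.
bool copy_is_independent(const std::string& s) {
    std::string copy = s;
    copy += "_modified";               // forces a private buffer if shared
    return copy != s;
}
```

Calling copy_shares_buffer on a long string (long enough to avoid any small-string buffer) shows which behaviour a given build has; copy_is_independent should hold in either case.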
Using rvalue (move) constructors and std::move on strings appeared to achieve nothing, and possibly made the code slightly slower, suggesting that this was getting in the way of the internal optimisation.
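For reference, the two construction paths being compared look roughly like this. It is only a sketch under assumptions: the Record class and the make_by_copy / make_by_move helpers are hypothetical names, not taken from the actual code.

```cpp
#include <string>
#include <utility>

// Hypothetical record type offering both a copying and a moving constructor.
class Record {
public:
    // Copy path: the member is copy-constructed from the argument,
    // which is left unchanged.
    explicit Record(const std::string& name) : name_(name) {}
    // Move path: the member is move-constructed, taking over the
    // argument's contents.
    explicit Record(std::string&& name) : name_(std::move(name)) {}

    const std::string& name() const { return name_; }

private:
    std::string name_;
};

// Builds a Record through the copy constructor.
Record make_by_copy(const std::string& s) { return Record(s); }

// Builds a Record through the move constructor; `s` is a by-value
// parameter, so the caller's string is not affected.
Record make_by_move(std::string s) { return Record(std::move(s)); }
```

If copying a string only bumps a reference count, the copy path is already cheap, which would explain why the move path gave no measurable gain.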
An attempt was made to make a map keyed by strings more efficient by using only a substring as the key, but the cost of calculating the substring, and again the need to allocate more memory for the new string, made it less efficient.
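The difference between the two lookups can be sketched as follows. The Counts alias, the function names and the prefix length are all hypothetical and chosen for illustration only.

```cpp
#include <cstddef>
#include <map>
#include <string>

using Counts = std::map<std::string, int>;

// Full-string key: the existing string is used for the lookup directly.
void count_full(Counts& counts, const std::string& id) {
    ++counts[id];
}

// Prefix key: substr() constructs a new temporary std::string on every
// call before the lookup can even start.
void count_prefix(Counts& counts, const std::string& id, std::size_t n) {
    ++counts[id.substr(0, n)];
}
```

The shorter key makes each comparison inside the map cheaper, but every call to count_prefix pays for building the temporary key first.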
I experimented with using "string & var" to declare some class variables so that I explicitly told the compiler not to make a copy. This made things more efficient for a class that was continuously being constructed and destructed, but less efficient for a class that was only constructed once and referred to lots of times. This may suggest that accessing a member variable declared as "string &" is less efficient than accessing one declared as a plain string, possibly because the code has to follow an additional indirection.
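The two kinds of member can be sketched side by side. The class names are hypothetical, and the sketch uses a const reference where the experiment used "string &"; the trade-off between construction cost and access cost is the same.

```cpp
#include <cstddef>
#include <string>

// Member held by value: construction copies the string, and later
// accesses use the member directly.
class OwningTag {
public:
    explicit OwningTag(const std::string& name) : name_(name) {}
    std::size_t length() const { return name_.size(); }

private:
    std::string name_;
};

// Member held by reference: construction only binds the reference, but
// every access goes through it to the string stored elsewhere.
// The referenced string must outlive the BorrowingTag object.
class BorrowingTag {
public:
    explicit BorrowingTag(const std::string& name) : name_(name) {}
    std::size_t length() const { return name_.size(); }

private:
    const std::string& name_;
};
```

A short-lived object favours BorrowingTag because nothing is copied on construction, while a long-lived, heavily read object may favour OwningTag. Note that a reference member also makes the class non-assignable by default.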