Thursday, September 18, 2008

Assignment operator

Today I went through a piece of code similar to this (what matters here is how the assignment operator was written):


class Test {
Test()
:theId(0), theName()
{}

Test& operator=(const Test& aRhs) {
theID = aRhs.theID;
theName = aRhs.theName;

return *this;
}

private:
int theID;
std::string theName;

};

as it is, the code works and I have nothing to say at first glance. However , who knows how the class will be extended in the future?

Imagine that in a future version some members are added:

class Test {
Test()
:theId(0), theName(), theHammer(), theHeap(0)
{}

Test& operator=(const Test& aRhs) {
theID = aRhs.theID;
theName = aRhs.theName;
theHammer = aRhs.theHammer; //This is going to be a bottleneck
theHeap = aRhs.theHeap; //This is wrong

return *this;
}

private:
int theID;
std::string theName;
HugeClass theHammer; //Extra member with huge footprint
HeapClass* theHeap; //Extra member on heap
};
as you can see with these two extra members following the same code line as the previous version the code becomes invalid. The objection I get is: "if one day someone adds those two members then they will take care to write it correctly", unfortunately in the real world it doesn't work like this. The average programmer will just add those two extra members following the code line already in place, following the rule: "after all if the code works for the already present members why shouldn't it be the same for the two extra members I have been told to add?"

The trick to avoid problems like this is to write the right code, from the start thinking of what will happen in the future, when possible of course. Usually the change to make is just a matter of a few lines of code and a two line comment in order to warn the future coders.

The original well written class would have been:

class Test {
Test()
:theId(0), theName()
{}

Test& operator=(const Test& aRhs) {

//Check for a self assignment
if (this != &aRhs) {
theID = aRhs.theID;
theName = aRhs.theName;
}

return *this;
}
private:
int theID;
std::string theName;
};

At this point the objection is: why check for a self assignment when the event is never going to happen? Who will ever write:

Test t;

t=t;
well, the code is valid so be sure someone will, also it is not always easy to spot a self assignment, consider this:

t[j] = t[i];

or even:

Test t;
Test &a = t;
...
t = a;

The real question here is: "Why was the assignment operator in that class was ever implemented?" The question makes a point, the operator was not needed at all, in that case the right thing to do is to remove it.

Let's then suppose the class is the one with the two extra members, the class is managing dynamically allocated memory then the operator must be implemented. The almost correct version is:



class Test {
Test()
:theId(0), theName(), theHammer(), theHeap(new HeapClass)
{}

Test& operator=(const Test& aRhs) {
if (this != &aRhs) {
theID = aRhs.theID;
theName = aRhs.theName;
theHammer = aRhs.theHammer;

delete theHeap;
theHeap = new HeapClass(*aRhs.theHeap);
}

return *this;
}

private:
int theID;
std::string theName;
HugeClass theHammer; //Extra member with huge footprint
HeapClass* theHeap; //Extra member on heap
};


I wrote "almost correct" because the assignment operator is correct but not exception safe, imagine what will happen if an exception is thrown, if that is the case then the class will be left in an inconsistent state. The goal is not an easy one, in order to make an assignment operator exception safe before modifying the internal state it is better to create a temporary object and then swap it with the internal state with operations that do not throw exceptions. This is achieved using the Pimpl idiom, moving all the internal state of Test inside another class and then leaving inside the class Test a pointer to this new class, it is better to use a boost::shared_ptr in this case to avoid exception catches:



class TestImpl {

friend class Test;

private:
TestImpl()
:theID(0), theName(), theHammer(), theHeap(new HeapClass)
{}

TestImpl(const TestImpl& aRhs)
:theId(aRhs.theID), theName(aRhs.theName),
theHammer(aRhs.theHammer), theHeap(new HeapClass(aRhs.theHeap))
{}

int theID;
std::string theName;
HugeClass theHammer; //Extra member with huge footprint
HeapClass* theHeap; //Extra member on heap
};

class Test {

Test()
:theImplementation(new TestImpl)
{}

Test& operator=(const Test& aRhs) {
if (this != &aRhs) {
boost::shared_ptr<TestImpl> tmp(new TestImpl(*aRhs.theImplementation));

std::swap(theImplementation, tmp);
}

return *this;
}

private:
boost::shared_ptr<TestImpl> theImplementation;
};

As you can see writing right code is not easy, and this becomes more difficult if you want to write exception safe code. What I suggest is, to remove assignment operator and copy constructor and write those only if really needed:


class Test {

Test()
:theID(0), theName(), theHammer(), theHeap(new HeapClass)
{}

private:
Test& operator=(const Test& aRhs); //disabled (do not even implement it)
Test(const Test& aRhs); //disabled (do not even implement it)


int theID;
std::string theName;
HugeClass theHammer; //Extra member with huge footprint
HeapClass* theHeap; //Extra member on heap
};