Sunday, May 6, 2007

Rollbacks -- Destructors for successfully and partially constructed objects.

keywords: try-except, try-finally, constructor-destructor, allocate-free, create-free, deallocate, dispose




Dealing with exceptions within a method
Normally, any programming task involves allocating resources (memory, system objects, files, devices, etc.), using them, and disposing of them. Schematically, this can be exemplified by the following listing:
-- Listing 1 --
res1 = allocate_res1();
res2 = allocate_res2();
use(res1, res2);
free(res2);
free(res1);

The example with two resources can easily be extended to n resources. The problem is that allocating and using resources may cause errors (memory is not available, a device does not support an operation). Errors interrupt the normal flow of execution and unwind the call stack until the error is 'caught' and processed. In languages that do not support exceptions, you are taught to write code like listings 2, 3 or 4:
-- Listing 2 --
if (res1 = allocate_res1()) {

    if (res2 = allocate_res2()) {
        if (!a_use(res1, res2)) {
            print("a failed");
            goto use_failure;
        }

        if (!b_use(res1, res2)) {
            print("b failed");
            goto use_failure;
        }

    use_failure:
        free(res2);
    } else
        print("could not alloc res2");

    free(res1);

} else
    print("could not alloc res1");



-- Listing 3 --
res1 = allocate_res1();
if (!res1) { // failure
    print("failure allocating res1");
    goto exit;
}
res2 = allocate_res2();
if (!res2) { // failure
    print("failure allocating res2");
    goto res1_rollback;
}

if (!a_use(res1, res2)) {
    print("a failure");
    goto res2_rollback;
}

if (!b_use(res1, res2)) {
    print("b failure");
    goto res2_rollback;
}

res2_rollback:
    free(res2);

res1_rollback:
    free(res1);

exit: ; // nothing to free


-- Listing 4 --

res1 = null;
res2 = null;

res1 = allocate_res1();
if (!res1) {
    print("res1 allocation failed");
    goto release;
}

res2 = allocate_res2();
if (!res2) {
    print("res2 allocation failed");
    goto release;
}

if (!a_use(res1, res2)) {
    print("a failure");
    goto release;
}

if (!b_use(res1, res2)) {
    print("b failure");
    goto release;
}

release:
    // release in reverse order of allocation
    if (res2) free(res2);
    if (res1) free(res1);


The first observation is that error handling inflates the code several times over, which makes it a serious issue. The repeated code patterns could, to some extent, be generated by C macros. The widely propagated approach presented in listing 4 entails two additional redundant sections: initialization before allocation and an if-initialized check at release. This additional bloat hurts human manageability of the code and adds storage, compilation and runtime overhead. Let's call it the 'wasting approach'.

In exception-enabled languages, the amount of code can be considerably reduced, and the gotos eliminated along the way:
-- Listing 5 --
Win32Check(DWord RetVal, context) {
    if (RetVal == 0) // zero signals failure
        raise Win32Exception.Create(context);
}

Win32Check(res1 = allocate_res1(), "allocating res1");

// allocation was successful, enter try-finally for res1
try {
    Win32Check(res2 = allocate_res2(), "allocating res2");
    try { // res2 try-finally
        Win32Check(a_use(res1, res2), "execution of a");
        Win32Check(b_use(res1, res2), "execution of b");
    } finally {
        free(res2);
    }
} finally {
    free(res1);
}

When the library is exception-aware, the code is further simplified to:
-- Listing 6 --
Res1 res1 = new Res1();
try { // try-finally to deallocate res1
    Res2 res2 = new Res2();
    try {
        res1.use(res2);
        res2.use(res1);
    } finally {
        res2.Free();
    }
} finally {
    res1.Free();
}
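
To make the control flow of the nested try-finally blocks concrete, here is a minimal Java sketch. The `TracedRes` class, its shared log and the `res2Fails` switch are assumptions made up for this illustration and do not come from any real library; `free()` stands in for the `Free()` calls of the listing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical resource that records its lifecycle in a shared log.
class TracedRes {
    static final List<String> log = new ArrayList<>();
    private final String name;

    TracedRes(String name, boolean failToAllocate) {
        if (failToAllocate) {
            throw new IllegalStateException("cannot allocate " + name);
        }
        this.name = name;
        log.add("alloc " + name);
    }

    void use() { log.add("use " + name); }

    void free() { log.add("free " + name); }
}

class NestedFinallyDemo {
    // Allocates res1 and res2, uses both, and returns a copy of the event log.
    static List<String> run(boolean res2Fails) {
        TracedRes.log.clear();
        try {
            TracedRes res1 = new TracedRes("res1", false);
            try { // entered only after res1 was allocated
                TracedRes res2 = new TracedRes("res2", res2Fails);
                try { // entered only after res2 was allocated
                    res1.use();
                    res2.use();
                } finally {
                    res2.free();
                }
            } finally {
                res1.free();
            }
        } catch (IllegalStateException e) {
            TracedRes.log.add("error: " + e.getMessage());
        }
        return new ArrayList<>(TracedRes.log);
    }
}
```

With `run(false)` the log shows both resources being freed in reverse order of allocation. With `run(true)` only res1 is freed, because the try block guarding res2 is never entered, so no null check is needed anywhere.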


I do not understand the pinheads who still insist on the wasting approach with a single finally section:
-- Listing 7 --
Res1 res1 = null; // initialize to invalids
Res2 res2 = null;
...
Resn resn = null;
try {
    res1 = new Res1();
    res2 = new Res2();
    res1.use(res2);
    res2.use(res1);
} finally {
    if (res1 != null) res1.Free(); // check for initialized before calling free
    if (res2 != null) res2.Free();
    ...
    if (resn != null) resn.Free();
}



Imho, the latter, in addition to being the wasting approach, defeats OOP. Exceptions are mostly a feature of good OOP languages, and OOP is intended to replace the 'big switch' solution of structured programming with polymorphism:
-- Listing 8 --
void draw(shape) {
    switch (shape) {
        case point: drawPoint(); break;
        case line: drawLine(); break;
        case circle: drawCircle(); break;
        ...
    }
}

In OOP, the same dispatch is implemented by a set of shape classes:
class Point extends Shape {
    void draw() {
        ...

class Circle extends Shape {
    void draw() {
        ...

void draw(shape) {
    shape.draw();
}


I would like to ask them: what is the advantage of a conditional resource release if we know that an object should be released only if it has been created successfully, and we know that it has been created successfully if construction succeeded? IMO, try-finally exists exactly for this purpose, to be entered after every successful allocation:
-- Listing 9 --
obj = create()
try
    use(obj)
finally
    free(obj)
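
The same shape appears in the standard Java idiom for explicit locks, which may serve as a familiar instance of the pattern: the lock is acquired before the try block and released unconditionally in finally. The sketch below uses `java.util.concurrent.locks.ReentrantLock`; the `LockDemo` class, its counter and the `failDuringUse` flag are hypothetical and exist only for illustration.

```java
import java.util.concurrent.locks.ReentrantLock;

class LockDemo {
    static final ReentrantLock lock = new ReentrantLock();
    static int counter = 0;

    // Acquire first; the try block is entered only after acquisition succeeded.
    static void increment(boolean failDuringUse) {
        lock.lock();                 // obj = create()
        try {
            if (failDuringUse) {
                throw new IllegalStateException("use failed");
            }
            counter++;               // use(obj)
        } finally {
            lock.unlock();           // free(obj), no "is it held?" check needed
        }
    }
}
```

Because `lock()` sits outside the try block, the finally section never has to ask whether the lock is actually held. If the use step throws, the exception still propagates to the caller, and the lock is released on the way out.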


We have reviewed the simple allocate-use-deallocate blocks that are typically found in routines. But what if allocation, use and release are spread across different routines? The use of constructors and destructors implies such a separation. Such scattered code cannot use try-finally to ensure the release of resources.


Con- and De-structors
Constructors are intended for initializing object fields. Since objects consist of other subobjects, it is natural for constructors to allocate these basic parts. This typically involves memory but can also include opening files, synchronization objects, connections and other devices. Destructors are used to free the allocated resources. It is the user's responsibility to call the destructor once the object has been created. Destructors must not raise errors/exceptions because that leaves the user stuck: he can neither proceed nor exit (deallocation of resources is usually done at program/routine exit). But what if an exception occurs in a constructor, so that it cannot proceed and exits with a failure? We cannot just jump to the finalization code because there is none in the constructor. Nor can the destructor simply be called on an object that is incomplete/invalid.
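
The problem of a constructor failing midway can be sketched in Java as follows. `Part`, `Composite` and the `secondFails` flag are hypothetical names invented for this example; the static `open` list only tracks which parts are currently held so that a leak would be visible. The caller never receives a reference to the half-built object, so it cannot call `close()` on it, and the constructor has to release what it already acquired.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sub-resource; 'open' lists the parts currently held.
class Part {
    static final List<String> open = new ArrayList<>();
    private final String name;

    Part(String name, boolean fail) {
        if (fail) {
            throw new IllegalStateException("cannot open " + name);
        }
        this.name = name;
        open.add(name);
    }

    void close() { open.remove(name); }
}

class Composite {
    private Part first;
    private Part second;

    Composite(boolean secondFails) {
        first = new Part("first", false);
        try {
            second = new Part("second", secondFails);
        } catch (RuntimeException e) {
            first.close(); // nobody else can: the object never reaches the caller
            throw e;
        }
    }

    // Plays the role of the destructor for a fully constructed object.
    void close() {
        second.close();
        first.close();
    }
}
```

Without the catch block, a failure while opening `second` would leave `first` open with no reference through which to close it.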

In Delphi, an exception in a constructor calls the destructor. In Java, destructor methods are class-specific, and the constructor is free to clean up as it sees fit in case of an exception. So, the constructor and the destructor must agree on how to dispose of partially constructed objects. The wasting approach is widely recommended:
res1 := nil; // init to unallocated flag
res2 := nil;

try
  res1 := Res1.Create();
  res2 := Res2.Create();

  // use resources

finally
  if res1 <> nil then res1.Free;
  if res2 <> nil then res2.Free;
end;

When we know the order in which resources are allocated, the deallocation checks can be optimized. Here is how I did it in plain C:

Resource1 r1;
Resource2 r2;
...

// Allocation check macro. On failure: report the error, destroy the
// partially created object and leave init(). On success: advance the
// rollback marker. (A // comment cannot follow a line-continuation backslash.)
#define CHECK_ALLOC(alloc_expr, success_rollback, err_msg) \
    do { \
        if (!(alloc_expr)) { \
            print(err_msg); \
            deinit_from(rollback); \
            return; \
        } \
        rollback = (success_rollback); \
    } while (0)


void init() {
    int rollback = nothing; // a local variable to track the progress
    CHECK_ALLOC(r1 = alloc1(), rb_r1, "allocating r1");
    CHECK_ALLOC(r2 = alloc2(), rb_r2, "allocating r2");
    ..
}

// partial destructor (plain C has no overloading, hence the separate name)
void deinit_from(int rollback_point) {
    switch (rollback_point) { // the cases fall through on purpose
        case rb_rn: release(rn);
        ...
        case rb_r2: release(r2);
        case rb_r1: release(r1);
        case nothing: break; // nothing to deallocate
    }
}

void deinit() { // normal destructor
    deinit_from(rb_rn); // start cleaning up from the last allocated resource
}


The idea is that if an allocation fails, the progress is rolled back starting from the highest point reached. If everything is allocated OK, the object is created and can be destroyed by deinit(). The proposed constructor-destructor organization lets the normal workflow and errors in the constructor share the same deallocation code.

The code becomes trivial with exceptions in decent OOP languages, where every result is checked automatically. This not only considerably reduces code complexity but also ensures that no check is omitted (the destructor is the same as above):

void Create() {

    rollback = nothing; // define a local variable to keep building progress
    try {
        r1 = alloc1(); rollback = rb_r1; // an allocation error throws an exception
        r2 = alloc2(); rollback = rb_r2;
        ..
    } catch (Exception e) { // roll back on catch
        Free(rollback);
        throw; // re-throw
    }
}
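
For readers who want to run the rollback-marker idea, here is a minimal Java sketch of it. The `RollbackObject` class, the three made-up resources r1..r3, the `failAt` parameter and the log passed in by the caller are all assumptions for illustration; real code would acquire and release actual resources where this sketch only writes log entries.

```java
import java.util.List;

class RollbackObject {
    // Rollback points in allocation order (names mirror the listing above).
    static final int NOTHING = 0, RB_R1 = 1, RB_R2 = 2, RB_R3 = 3;

    private final List<String> log;

    // failAt selects which allocation (1..3) throws; 0 means none fails.
    RollbackObject(List<String> log, int failAt) {
        this.log = log;
        int rollback = NOTHING;          // tracks how far construction got
        try {
            alloc(1, failAt); rollback = RB_R1;
            alloc(2, failAt); rollback = RB_R2;
            alloc(3, failAt); rollback = RB_R3;
        } catch (RuntimeException e) {
            free(rollback);              // undo only what was allocated
            throw e;                     // re-throw to the caller
        }
    }

    // Normal destructor: roll back from the last resource.
    void free() { free(RB_R3); }

    // Partial destructor: deliberate fall-through releases in reverse order.
    private void free(int rollbackPoint) {
        switch (rollbackPoint) {
            case RB_R3: log.add("release r3"); // fall through
            case RB_R2: log.add("release r2"); // fall through
            case RB_R1: log.add("release r1"); // fall through
            case NOTHING: break;               // nothing to deallocate
            default: throw new IllegalArgumentException("unknown rollback point");
        }
    }

    private void alloc(int n, int failAt) {
        if (n == failAt) {
            throw new IllegalStateException("cannot allocate r" + n);
        }
        log.add("alloc r" + n);
    }
}
```

When the third allocation fails, only r2 and r1 are released, in that order. A fully constructed object is released through the very same switch, entered at its top case.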
