Define a variable inside or outside of a loop

Whenever you implement something you will write a lot of loops, and within these loops you will often use variables. This sounds trivial, but it is not as easy as it looks at first, because you have to make a decision: do you define the needed variable(s) inside or outside the loop?

// define variable outside of loop
MyClass myInstance;

for (int i = 0; i < n; i++)
{
  myInstance = DoExecute();
}

// define variable inside of loop  
for (int i = 0; i < n; i++)
{
  MyClass myInstance(DoExecute());
}

 

I have heard many arguments for and against both approaches. Therefore I want to use this article to discuss the most important aspects of this decision: readability, maintainability and costs (speed and memory). These are basic criteria which you can use for almost any kind of design decision.

Readability

In my opinion both implementations are equally readable.

I have heard the argument that the definition outside of the loop is less readable, because when you read the assignment you sometimes have to look at the variable definition and therefore scroll up until you find it. This may be necessary to understand what kind of variable you are dealing with. But there is something wrong with this argument: the root cause of the issue is not where the variable is defined. There are two other design issues. First, your variable should have a name which makes the code readable without the need to know its type. Second, your loop and your function should not contain so much code that you have to scroll.

Maintainability

I think there is one difference between the two implementations which may have an influence on the maintainability of the code: the scope of the variable. If you define the variable outside of the loop, it can be accessed in a wider scope. The variable, which was created for use within the loop only, may then be used in other parts of the function, which creates dependencies and reduces the maintainability of the source code.

But unfortunately this argument isn't that solid either. If this issue occurs, your function probably does more than one thing. If a function does one thing only, the function-wide scope of the variable should not be an issue.

Costs

As we have not seen any decisive differences so far, we will now look at the costs of the two implementations, namely memory usage and execution speed.

One common guideline says: "Define a variable when it is needed." The thought behind this guideline is that you should not define a variable which may never be used. A function may return early in case of an error, or it may have explicit return statements, for example in parameter checks at the beginning of the function. If you define a variable prior to these possible function returns, it may happen that it is never used, and then defining it was a waste of time and memory. In the case of a loop the probability of this situation is small: parameter checks which may lead to an early function return should already be done. But if you call functions within the loop which may throw an exception, the argument becomes relevant again. So with respect to this common guideline we should define the variable when it is needed and prefer the definition inside the loop.

Next I want to look at the execution speed. In terms of operations the two approaches create the following costs ('n' is the number of loop iterations):

Variable definition outside of the loop:
  • 1 constructor
  • n assignments
  • 1 destructor

Variable definition inside of the loop:
  • n constructors
  • n destructors

The costs of constructors, assignments and destructors depend on the programming language and on the variable type. Therefore it is not possible to say that you should always use the first or the second approach. But when you implement the loop you should think about the object in use. Is it a large object which has to manage resources? In this case construction and destruction may be very expensive, while an assignment may be able to reuse already acquired resources. Or is it a lightweight object with small construction costs? In this case construction and destruction may be very fast.
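To make the trade-off concrete, here is a small sketch. It is an illustration only: the function and its data are made up, and std::string stands in for any "heavy" object whose assignment may reuse already allocated resources.

#include <string>
#include <vector>

std::string BuildReport(const std::vector<std::string>& lines)
{
  // defined outside: 1 constructor, n assignments, 1 destructor;
  // the assignments may reuse the buffer allocated in earlier iterations
  std::string buffer;
  std::string report;

  for (size_t i = 0; i < lines.size(); i++)
  {
    buffer = lines[i];
    buffer += '\n';
    report += buffer;
  }

  return report;
}

If the string were defined inside the loop instead, it would cost n constructor and n destructor calls, and each construction would start without the previously allocated buffer.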

 

Summary

In terms of readability and maintainability the arguments are weak; if anything, they slightly favor defining the variable inside the loop. Therefore the main aspect of the decision is the execution cost.

If you don't have any performance issues, or if you don't know the costs of the destructor, constructor and assignment of the object, you should prefer to define the variable inside the loop.

Only if you are dealing with a performance-sensitive part of your application, and you know that the constructor-destructor pair costs more than an assignment, should you prefer to define the variable outside of the loop.


Slicing problem

By default C++ passes parameters by value. Therefore, when you pass an object to a function, a copy of this object is created. Other languages, for example C#, pass objects via implicit references instead.

If you work with interfaces and pass parameters by value, the slicing problem can occur. "Working with interfaces" means you normally have a superclass defining virtual functions and a couple of subclasses overriding these functions.

In the following example we implement a window base class with a drawing function. To keep it simple we use a console output in place of a complex drawing algorithm. The window base class is meant to be used by a couple of subclasses. The example contains one subclass which, of course, overrides the drawing function and adds its own visualization.

#include <iostream>

class Window
{
public:
  virtual void Draw() const
  {
    std::cout << "window" << std::endl;
  }
};

class ToolWindow : public Window
{
public:
  virtual void Draw() const
  {
    std::cout << "tool window" << std::endl;
  }
};

int _tmain(int argc, _TCHAR* argv[])
{
  ToolWindow toolWindow = ToolWindow();

  toolWindow.Draw();

  return 0;
}

If we execute the application the output “tool window” is shown.

Now we add a generic drawing function which takes the window interface and calls the window's drawing method. This way we can pass any kind of window subclass to the generic function.

void Draw(Window window)
{
  window.Draw();
}

int _tmain(int argc, _TCHAR* argv[])
{
  ToolWindow toolWindow = ToolWindow();

  Draw(toolWindow);

  return 0;
}

But now the output is "window". What's wrong with this implementation? We created a tool window and passed it to a function which expects a window (the interface).

If we go back to the start of this article we find the explanation: "C++ passes parameters by value and therefore creates a copy of the object passed to a function." As our function expects a window object, a new window object was created as a copy of the given tool window. But of course the window object does not know the additional implementations of the tool window, so this part of the object was "sliced" away. The slicing problem occurs when an object of a subclass type is copied to an object of superclass type, thereby losing the part of the information which was contained only in the subclass.
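The slicing is not tied to the function call itself; a plain copy already shows it. Here is a minimal sketch based on the classes above:

int _tmain(int argc, _TCHAR* argv[])
{
  ToolWindow toolWindow = ToolWindow();

  // copy construction: only the Window subobject of toolWindow is
  // copied, the ToolWindow part is sliced away
  Window window = toolWindow;

  window.Draw(); // prints "window"

  return 0;
}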

To implement the functionality we want – a generic function expecting an interface (superclass) – we have to pass the parameter by reference. This can be done by using one of the most important implementation patterns in C++: "pass by reference to const". The following source code shows the adapted example. Now we pass the parameter by using a const reference.

void Draw(const Window& window)
{
  window.Draw();
}

int _tmain(int argc, _TCHAR* argv[])
{
  ToolWindow toolWindow = ToolWindow();

  Draw(toolWindow);

  return 0;
}

This time the output is as expected: "tool window". A reference to the tool window object was passed to the function and nothing was sliced away.
Of course, the slicing problem does not only occur for function parameters. Every cast to a superclass object may slice the subclass part away, while casts to superclass pointers or references do not. Therefore I want to finish this article with a casting example. The following source code contains two casts: one to a superclass object and one to a superclass pointer.

int _tmain(int argc, _TCHAR* argv[])
{
  ToolWindow toolWindow = ToolWindow();

  static_cast<Window>(toolWindow).Draw();
  static_cast<Window*>(&toolWindow)->Draw();

  return 0;
}

Of course we can expect the same slicing problem as in the examples above. The first cast slices the subclass implementation away and outputs "window"; the second cast outputs "tool window", because casting the pointer does not copy the object.


Methods should do one thing only

According to the Single Responsibility Principle a class should have one, and only one, reason to change. The same statement is valid for methods: a method should do one thing only and therefore have only one reason to change.

Unfortunately this principle is often disregarded. I think the main reason is that it isn't that easy to say whether a method does one or several things. Within this article I want to give you some hints which may help you to count the responsibilities of a method.

Does your method delegate or implement tasks?

In general, source code can either execute a task or delegate the task execution to another method or class. For example, say you want to create a file with some content. You may use the File class of the .NET framework and use its open, write and close methods to implement your file handling task. This implementation contains execution code: your source code itself implements the task. This implementation will normally be part of a method which is used by other classes. For example, prior to closing your application you want to store all data, so your shutdown module calls the implemented file creation method. The shutdown method therefore delegates the work to another method.

Methods which delegate work to other methods will most often contain several such delegations and call them in a logical order or workflow. Therefore I normally distinguish two types of code: "execution code" and "logical code". Execution code is the implementation of a task, and logical code contains a workflow which uses the methods of the execution code.

In my opinion the separation between execution code and logical code is a base concept of clean software architecture, and it helps to solve the question of this article. So let me ask: "Can a method which contains both execution code and logical code do one thing only, or will it always do several things?"

I think a little example will help. The following example implements a report generation. The report shall contain the salary of all employees. To keep it simple we will not implement every detail. Let's say we have the following components: a data class for an employee and a report component which can create pages and add headers and text elements. Furthermore it is possible to add a new page by defining its content.

interface IReportContent
{
	//...
}

interface IReport
{
	void CreateNewPage();
	void AddHeader(string header);
	void AddText(string content);
	void AddPage(IReportContent content);
}

interface IEmployee
{
	string Name { get; }
	double Salary { get; }
}

Our report shall contain a page with some base information, pages for all employees and a summary page. Therefore we implement the following method.

class ReportGenerator
{
	public IReport CreateSalaryReport(List<IEmployee> employees)
	{
		IReport report = new Report();
					
		//create header
		report.CreateNewPage();
		report.AddHeader("Employee Report");
		report.AddText("Date: " + DateTime.Now);
		report.AddText("Created by: " + "...");

		//add all employees
		foreach (IEmployee employee in employees)
		{
			IReportContent content = GetEmployeeSalaryReportPage(employee);
			report.AddPage(content);
		}
			
		//add summary
		report.CreateNewPage();
		report.AddHeader("Total Pays:");
		report.AddText("....");

		return report;
	}

	private IReportContent GetEmployeeSalaryReportPage(IEmployee employee)
	{
		//...
	}
}

What do you think when you look at this method? How many responsibilities does it have? Or, in the words of the Single Responsibility Principle: how many reasons for a change exist?

I think the method does three things. It creates the header page, it creates the summary page and it concatenates all pages to create the report. Or in other words, the method has three reasons for a change: A change of the content of the header page, a change of the content of the summary page or a change of the report structure.

Therefore we have to refactor this method. In this case we can extract the header page creation and the summary page creation into methods of their own.

class ReportGenerator
{
	public IReport CreateSalaryReport(List<IEmployee> employees)
	{
		IReport report = new Report();
		IReportContent content;

		content = GetReportHeader();
		report.AddPage(content);

		foreach (IEmployee employee in employees)
		{
			content = GetEmployeeSalaryReportPage(employee);
			report.AddPage(content);
		}

		content = GetReportSummary();
		report.AddPage(content);

		return report;
	}

	private IReportContent GetReportHeader()
	{
		// assumption: ReportContent is a concrete class implementing IReportContent
		IReportContent content = new ReportContent();
		content.AddHeader("Employee Report");
		content.AddText("Date: " + DateTime.Now);
		content.AddText("Created by: " + "...");

		return content;
	}

	private IReportContent GetEmployeeSalaryReportPage(IEmployee employee)
	{
		//...
	}

	private IReportContent GetReportSummary()
	{
		// assumption: ReportContent is a concrete class implementing IReportContent
		IReportContent content = new ReportContent();
		content.AddHeader("Total Pays:");
		content.AddText("....");

		return content;
	}
}

With this small refactoring the responsibilities of the method have changed greatly. Now the method has only one responsibility: to create the report by combining the report parts. As a result there is only one reason for a change: the page structure of the report changes, for example the summary page shall be removed.

After this example I want to come back to the topic and question of this chapter: “Does your method delegate or implement tasks?”

As you can see, the first version of the method does both. It implements the creation of the header and summary pages, and it delegates the creation of the employee salary pages. This mix of execution code and logical code is a clear signal that the method does several things. In summary I want to make the following statement:

A method which contains both execution code and logical code will always do several things. Therefore you should never mix execution code and logical code within a method.

Does your method contain several separate logical workflows?

If your method contains only execution code or only logical code, you need another way to see whether it does several things. In this case you should look for the logical workflows within the method. If you can find more than one workflow, or if you see a chance to split the existing workflow into different parts, then there is a high probability that your method does more than one thing.

I want to show a little example. Let's say we have a settings engine which stores settings data. The data is stored as an encrypted string. To keep it simple we will look at the settings engine only and hide the details of the database component and the settings data component.

class SettingsEngine
{
	IDatabaseController _database;
	IConfigurationController _configuration;

	public void SaveSettings()
	{
		//connect or create database
		if(_database.Connect() == false)
		{
			_database.Create();
		}

		//get settings data as encrypted data
		string settings;
		settings = _configuration.GetEncryptedSettingsData();

		//write to database
		if(_database.WriteSettingsData(settings) == false)
		{
			throw new DatabaseWriteException(...);
		}
	}
}

The method contains logical code only. So far so good, but let us check the second criterion: does the method contain several logical workflows? Unfortunately, yes! There are three. First, we have the overall workflow to store the settings data, which is the main workflow of the method. Second, we have the procedure to initialize the database by connecting to it, or creating a new one if necessary. And there is a small third workflow which tries to write the data and throws an exception if the update fails. As a result of this design there are three possible reasons for a change of the method: the connection procedure could change, for example to throw an error instead of creating the database; the data update could change, for example to repeat the write step with additional rights instead of throwing an error; and the whole storage workflow could change, for example by removing the connection step because it is done by another service class.

In summary we have two separate sub-workflows within the main workflow of the method. We can therefore refactor the method by extracting these workflows.

class SettingsEngine
{
	IDatabaseController _database;
	IConfigurationController _configuration;

	public void SaveSettings()
	{
		InitializeDatabase();

		string settings;
		settings = _configuration.GetEncryptedSettingsData();

		WriteSettingsToDatabase(settings);
	}

	private void InitializeDatabase()
	{
		if (_database.Connect() == false)
		{
			_database.Create();
		}            
	}

	private void WriteSettingsToDatabase(string settings)
	{
		if (_database.WriteSettingsData(settings) == false)
		{
			throw new DatabaseWriteException(...);
		}
	}
}

Now the method contains no individual sub-workflows; there is only the main workflow to store the settings. The method does one thing only, and therefore there is only one reason to change it.

In summary, I can give the following recommendation:

A method which contains several separate logical workflows will always do several things. Therefore you should never implement more than one workflow within one method.

Summary

A basic software design rule is: a method should do one thing only, or in other words, there should be only one reason to change a method. But the source code of nearly every application contains methods which violate this rule. One reason may be that it isn't that easy to see whether a method does more than one thing or not. Within this article I have introduced two easy checks which you can use to identify methods with several responsibilities.


Manage resources by using smart pointer RAII objects

A common way to create objects is by using a factory method or an abstract factory. Such a factory will normally create and return the object and hand over the responsibility for the object to the client. Therefore the client has to release the created object to free resources which are no longer needed. The following source code shows a typical implementation pattern: the object is created by using a factory function and released at the end of the function.

class MyObject
{
};

MyObject* CreateMyObject()
{
  return new MyObject();
}

int _tmain(int argc, _TCHAR* argv[])
{
  MyObject *pMyObject = CreateMyObject();

  // ...

  delete pMyObject;

  return 0;
}

This looks fine at first, but this pattern is a source of errors: it may happen that the "delete" statement is never executed. For example, an error may occur in one of the statements before "delete" and the function returns early. But not only runtime errors are a problem; this pattern also increases the probability of implementation errors, especially in huge functions where the factory call and the delete statement are far away from each other. For example, a developer may add parameter checks, or check the return values of sub-function calls, and return in case of wrong parameters or results, without adding the necessary delete.

To avoid such issues you should manage resources in dedicated objects. When the object's scope is left, the object is destroyed and its destructor is called. This happens in any case, regardless of whether the function is executed completely, an early return is taken or an error occurred.

By using a resource management object you can free the resources in the dtor, and you don't have to think about all the possible ways the scope can be left.

This concept is called RAII (resource acquisition is initialization): objects acquire resources in their constructors and release them in their destructors.
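To illustrate the concept, a hand-written RAII wrapper for the factory above could look like the following sketch. It is a sketch only (the copy protection uses the old idiom of declared-but-undefined members), and in practice you would use an existing smart pointer, as shown below.

class MyObjectHolder
{
public:
  explicit MyObjectHolder(MyObject* pObject)
    : _pObject(pObject)  // acquire the resource in the ctor
  {
  }

  ~MyObjectHolder()
  {
    delete _pObject;     // release it in the dtor, on every exit path
  }

  MyObject* Get() const
  {
    return _pObject;
  }

private:
  // forbid copying: two holders must not delete the same object
  MyObjectHolder(const MyObjectHolder&);
  MyObjectHolder& operator=(const MyObjectHolder&);

  MyObject* _pObject;
};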

For simple cases like the one above you don't have to implement a class of your own. Instead you can use an existing one offered by the standard library, for example a shared pointer.

#include <memory>

int _tmain(int argc, _TCHAR* argv[])
{
  std::shared_ptr<MyObject> pMyObject(CreateMyObject());

  // ...

  return 0;
}

 

Within its destructor, the shared pointer will delete the contained object, so its destructor is invoked and you have the possibility to release resources. You have to keep one important thing in mind whenever you use a shared pointer: it calls "delete" but not "delete[]". Therefore you cannot use it for dynamically allocated arrays. This is usually not a problem, because vector and string can almost always replace dynamically allocated arrays and should be used instead. In case you do need a shared pointer for arrays, you can find one within the Boost library (boost::shared_array).
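For completeness, here is a short sketch of the Boost variant; it assumes the Boost headers are available:

#include <boost/shared_array.hpp>

int _tmain(int argc, _TCHAR* argv[])
{
  // boost::shared_array calls delete[] instead of delete,
  // so it can manage a dynamically allocated array
  boost::shared_array<int> values(new int[10]);

  values[0] = 42;

  return 0;
}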

Another important guideline is to create the shared pointer in a statement of its own and not inside another statement. Let us extend the previous example to explain why. Say you want to execute a function "DoSomething" which has two parameters: the object pointer and an object with application settings. These settings are read by a function "GetApplicationSettings()". So you may call the DoSomething function and execute the creation of the shared pointer and the function call for the application settings as nested expressions.

int _tmain(int argc, _TCHAR* argv[])
{
  DoSomething(std::shared_ptr<MyObject>(CreateMyObject()), GetApplicationSettings());

  return 0;
}

 

Besides the fact that such nested function calls are difficult to read, this call may leak resources. But why? We use an object to manage the resources, and as we have learned so far this should solve the resource leaking issue.

To understand this behavior we have to think about the freedom the compiler has. The compiler has to generate the following steps:

  • call GetApplicationSettings()
  • call CreateMyObject()
  • call the std::shared_ptr ctor

But the compiler does not have to execute them in this order. It may change the execution order to create more efficient code. So the compiler can choose the following order:

  • call CreateMyObject()
  • call GetApplicationSettings()
  • call the std::shared_ptr ctor

What happens if "GetApplicationSettings()" throws an exception? In this case we have already created our object by calling the factory method, but we have not yet stored it within the shared pointer. So we end up with a resource leak, as the object is never deleted. To avoid this issue we should create new objects and store them within the smart pointer in a standalone statement. Furthermore, to increase source code readability, I recommend avoiding nested function calls in general. So I prefer to call "GetApplicationSettings()" in a standalone statement too.

int _tmain(int argc, _TCHAR* argv[])
{
  std::shared_ptr<MyObject> pMyObject(CreateMyObject());
  ApplicationSettings mySettings = GetApplicationSettings();

  DoSomething(pMyObject, mySettings);

  return 0;
}

 

Summary

Follow the RAII concept and use objects to manage resources. Create the resource and the managing object in a standalone statement.


Virtual Destructor

You may have heard the advice that a destructor should be declared virtual if the class is used as a base class. Within this article I want to show the technical background behind this advice and explain why you should follow this guideline.

Let’s start by implementing a base class and a derived class and test the behavior of the destructor.

#include <iostream>

class Animal
{
public:
  ~Animal()
  {
    std::cout << "dtor of Animal\n";
  }
};

class Bird : public Animal
{
public:
  ~Bird()
  {
    std::cout << "dtor of Bird\n";
  }
};

int _tmain(int argc, _TCHAR* argv[])
{
  Animal *pAnimal = new Animal();
  delete pAnimal;

  std::cout << "\n";
  Bird *pBird = new Bird();
  delete pBird;

  return 0;
}

 
The console application creates the following output:

dtor of Animal

dtor of Bird
dtor of Animal

As expected, the corresponding destructors are called. So far we don't see the need for a virtual base class destructor.

But what happens if a derived class object is deleted through a base class pointer? If the base class has a non-virtual destructor, the result is undefined according to the C++ standard. The typical outcome is easy to explain: without a virtual destructor the destructor call is bound statically to the type of the pointer, so the compiler calls the base class destructor only, and the derived class part of the object is never destroyed. The result is a partially destroyed object. This is what happens in the following console application example, but since the behavior is undefined, another compiler may do something else.

int _tmain(int argc, _TCHAR* argv[])
{
  Animal *pAnimal = new Bird();
  delete pAnimal;

  return 0;
}

Output:

dtor of Animal

As explained, it is undefined how the compiler handles the above implementation, and therefore it is absolutely necessary to implement the destructor in a way that gives us defined behavior. This can be done by using a virtual destructor. If your base class has a virtual destructor, the derived class overrides it and the destructor call is dispatched dynamically. Now the above example works without any issue: regardless of whether you use a pointer to the derived class or a pointer to the base class, the whole object is destroyed correctly, because the derived class destructor is called first and the base class destructor afterwards.

class Animal
{
public:
  virtual ~Animal()
  {
    std::cout << "dtor of Animal\n";
  }
};

class Bird : public Animal
{
public:
  ~Bird()
  {
    std::cout << "dtor of Bird\n";
  }
};

int _tmain(int argc, _TCHAR* argv[])
{
  Animal *pAnimal = new Bird();
  delete pAnimal;

  return 0;
}

Output:

dtor of Bird
dtor of Animal

If a non-virtual destructor can cause such trouble, you may come to the conclusion to always declare your destructors virtual, no matter whether you want to use the class as a base class or not. But unfortunately that's not a good idea either. Virtual functions lead to some overhead behind the scenes. A virtual table pointer is added to the class, pointing to the virtual table, which is an array of function pointers. When a virtual function is called, the actual function is determined by looking it up within this virtual table.

If you use a virtual destructor, or virtual functions in general, you have to deal with two main issues: the overhead of the virtual table pointer and the table itself, and the incompatibility with other programming languages. For example, if you want to pass your object pointer from C++ to a C component, you may have to implement additional converters, as there is no virtual table pointer in C and therefore the C++ object layout is no longer equal to that of the same object implemented in C.
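You can make the size overhead visible with sizeof. The following sketch is illustrative only; the exact sizes are implementation-defined and depend on compiler, platform and padding:

#include <iostream>

class PlainClass
{
public:
  ~PlainClass() {}
  int _value;
};

class PolymorphicClass
{
public:
  virtual ~PolymorphicClass() {}
  int _value;
};

int _tmain(int argc, _TCHAR* argv[])
{
  // PolymorphicClass is typically larger by one pointer (the vptr),
  // plus possible alignment padding
  std::cout << sizeof(PlainClass) << "\n";
  std::cout << sizeof(PolymorphicClass) << "\n";

  return 0;
}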

STL types

As we now know that we should not inherit from a class with a non-virtual destructor, we may ask the question: can we inherit from STL types?

For example, you may want to implement your own vector class which offers some additional functions compared to the STL vector type. In this case you may think it is a good idea to inherit from the STL type:

#include <vector>

template <typename T>
class MyVector : public std::vector<T> {};

As the STL container types do not have virtual destructors, this implementation may lead to the destructor issue shown above. Therefore you should not use the STL types as base classes.

Summary

If a derived class object is deleted through a base class pointer and the base class has a non-virtual destructor, the result is undefined. To handle this use case you have to implement a virtual destructor. On the other hand, always declaring your destructors virtual is just as wrong as never declaring them virtual. Therefore you have to choose the right implementation carefully with respect to the according use cases: if a class is used for inheritance you should implement a virtual destructor, otherwise implement a non-virtual one.


Virtual functions in ctor and dtor

During construction and destruction of an object you are able to call virtual functions. But you should not do this, because it is a source of errors: the behavior of the application may differ from your expectations, or from the expectations of other developers who have to change your code in the future.

Before we think about the ctor and dtor issue, we want to have a look at the normal behavior of a virtual function. The following source code shows a base class with a virtual function which creates some logging output, and another function which executes something and writes a log entry. A derived class uses the base class implementations but overrides the virtual function for the log entry. So we can create an object of the derived class, execute the function provided by the base class, and the log information of the derived class is used, as we have overridden the virtual function.

#include <iostream>

class BaseClass
{
public:
  void DoSomething()
  {
    //do something...

    //... and log what you have done
    LogClassInformation();
  }

  virtual void LogClassInformation()
  {
    std::cout << "BaseClass log" << std::endl;
  }
};

class DerivedClass : public BaseClass
{
public:
  void LogClassInformation()
  {
    std::cout << "DerivedClass log" << std::endl;
  }
};


int _tmain(int argc, _TCHAR* argv[])
{
  DerivedClass myClass = DerivedClass();

  myClass.DoSomething();

  return 0;
}

 

But why is it dangerous to call a virtual function in ctor and dtor? Let us think about the execution order of ctor and dtor to understand the issue. Again we have a base class and a derived class. If we create an object of the derived class, the ctor of the base class is called first, followed by the ctor of the derived class. If your base class ctor calls a virtual function which is overridden by the derived class, we have the issue that we want to execute functionality of a class which has not been initialized by its ctor yet. The class members may not be initialized, and executing the function could result in undefined behavior. As this is dangerous, C++ gives you no way to call the derived class function here: it is fundamental language behavior that during base class construction the dynamic type of the object is that of the base class itself, not of the derived class.

The same is true for the dtor. During destruction of the object, the dtor of the derived class has already been called when the dtor of the base class is executed. It would not be safe to call a function of the derived class, as the derived class members are no longer valid. Therefore, on entry to the base class dtor, C++ treats the object as an object of the base class type.

The following source code shows an according example.

class BaseClass
{
public:
  BaseClass()
  {
    std::cout << "BaseClass ctor" << std::endl;
    LogClassInformation();
  }

  ~BaseClass()
  {
    std::cout << "BaseClass dtor" << std::endl;
    LogClassInformation();
  }

  virtual void LogClassInformation()
  {
    std::cout << "BaseClass log" << std::endl;
  }
};

class DerivedClass : public BaseClass
{
public:
  DerivedClass()
  {
    std::cout << "DerivedClass ctor" << std::endl;
    LogClassInformation();
  }

  ~DerivedClass()
  {    
    std::cout << "DerivedClass dtor" << std::endl;    
    LogClassInformation();
  }

  void LogClassInformation()
  {
    std::cout << "DerivedClass log" << std::endl;
  }
};


int _tmain(int argc, _TCHAR* argv[])
{
  DerivedClass myClass = DerivedClass();
  
  return 0;
}

The console application will create the following output.

BaseClass ctor
BaseClass log
DerivedClass ctor
DerivedClass log
DerivedClass dtor
DerivedClass log
BaseClass dtor
BaseClass log

As you can see, the base class ctor and dtor call the base class logging function and not the one of the derived class. After reading this article you may argue that this is the expected behavior and therefore it is safe to implement it this way. But it is recommended to avoid such implementations. Within the short example it is easy to understand the source code, but in real applications such implementations are a reliable source of errors.

Furthermore, within this example the virtual function is implemented in both the base class and the derived class, so we have a valid and running application. But often you explicitly want to move the responsibility for the implementation to the derived class. In this case you can declare a pure virtual function. The following source code shows the adapted example.

class BaseClass
{
public:
  BaseClass()
  {
    std::cout << "BaseClass ctor" << std::endl;
    LogClassInformation();
  }

  ~BaseClass()
  {
    std::cout << "BaseClass dtor" << std::endl;
    LogClassInformation();
  }

  virtual void LogClassInformation() = 0;
};

class DerivedClass : public BaseClass
{
public:
  DerivedClass()
  {
    std::cout << "DerivedClass ctor" << std::endl;
    LogClassInformation();
  }

  ~DerivedClass()
  {    
    std::cout << "DerivedClass dtor" << std::endl;    
    LogClassInformation();
  }

  void LogClassInformation()
  {
    std::cout << "DerivedClass log" << std::endl;
  }
};


int _tmain(int argc, _TCHAR* argv[])
{
  DerivedClass myClass = DerivedClass();
  
  return 0;
}

Now things become more difficult. What's the expected behavior? Of course the ctor and dtor execution order stays the same, and therefore the base class tries to call the base class logging function. But this time it is a pure virtual function without an implementation and cannot be called. Some compilers recognize such errors and report a linker error; other compilers do not recognize the issue, which results in an error at runtime (typically a "pure virtual function call" abort).

Summary

Do not call virtual functions during construction or destruction, because such calls will never go to a more derived class than that of the currently executing constructor or destructor. This may result in undefined behavior and is a source of errors.


Design patterns: Strategy

The strategy design pattern is used when different algorithms exist for one task and the algorithm in use shall be selectable dynamically, depending on the use case.

Very often you can implement a functionality in different ways. In most cases you look at the pros and cons of the alternatives and select one. But sometimes you need the flexibility to offer different implementations and select the best one dynamically in the context of the actual use case. For example, say you want to read and save data in an encrypted format. You may choose and implement one serialization strategy and one encryption strategy, or you can offer different strategies and let the client application select the ones it needs. This flexibility can be implemented by using the strategy pattern: you implement different algorithms for one purpose, and the client chooses the strategy which fits its context.

The following example shows a possible implementation of the strategy pattern. In our application we have different kinds of lists, and these lists shall be sortable. There exist many different sort algorithms, each with its own context-specific pros and cons. So you want the flexibility to select the algorithm depending on the kind of list you use.

Therefore we create a sort strategy by defining the corresponding interface.

interface ISortStrategy<T>
{
    void Sort(List<T> list);
}

And we implement some algorithms for this strategy.

class ShellSort<T> : ISortStrategy<T>
{
    public void Sort(List<T> list)
    {
        //todo: implement sort algorithm
    }
}

class QuickSort<T> : ISortStrategy<T>
{
    public void Sort(List<T> list)
    {
        //todo: implement sort algorithm
    }
}

class MergeSort<T> : ISortStrategy<T>
{
    public void Sort(List<T> list)
    {
        //todo: implement sort algorithm
    }
}

Next we create a context-specific class, in this case a customer list, which wants to use the sort strategy.

interface ICustomer
{
    string Name { get; set; }
}

class Customers
{
    private ISortStrategy<ICustomer> _sortStrategy;
    private List<ICustomer> _customers;

    public Customers(ISortStrategy<ICustomer> sortStrategy)
    {
        _sortStrategy = sortStrategy;
    }

    public void Sort()
    {
        _sortStrategy.Sort(_customers);
    }
}

Depending on its needs, the client application has the possibility to select the best strategy dynamically. The following source code shows a console application which creates the customer object instance and sets the sort strategy.

class Program
{
    static void Main(string[] args)
    {
        ISortStrategy<ICustomer> sortStrategy = new MergeSort<ICustomer>();

        Customers customers = new Customers(sortStrategy);

        customers.Sort();
    }
}

In the example code the strategy was set by dependency injection via the constructor. Of course, that's only one possible implementation. You could also pass the strategy as a parameter of the sort function, or implement it in another way, for example as a delegate if the strategy contains only a single function.
