Software Engineering and Programming

SOLID Design Principles

  • S - Single responsibility principle: A class/method should have only one responsibility.
  • O - Open/closed principle: An object should be open for extension but closed for modification.
  • L - Liskov substitution principle: A base type should be replaceable with subtypes in each and every situation. Is-A inheritance is actually Is-Substitutable-For.
  • I - Interface segregation principle: Use client-specific interfaces instead of one general interface. By using small interfaces, you don't force clients to implement more than they need. Prefer small, cohesive interfaces to “fat” interfaces.
  • D - Dependency inversion principle: Depend upon abstractions such as an interface or an abstract class. This makes the code less coupled to the actual implementation.
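
As a minimal sketch of the dependency inversion principle, a high-level class can depend on an abstract interface rather than on a concrete implementation. The names Logger, MemoryLogger, and OrderService are hypothetical and chosen only for illustration:

```cpp
#include <string>
#include <vector>

// Abstraction that the high-level code depends on (hypothetical name).
class Logger {
public:
    virtual ~Logger() = default;
    virtual void Log(const std::string& message) = 0;
};

// One concrete implementation; others can be added without touching OrderService.
class MemoryLogger : public Logger {
public:
    void Log(const std::string& message) override { messages_.push_back(message); }
    const std::vector<std::string>& Messages() const { return messages_; }

private:
    std::vector<std::string> messages_;
};

// High-level class: it knows only the Logger interface, not the concrete type.
class OrderService {
public:
    explicit OrderService(Logger& logger) : logger_(logger) {}
    void PlaceOrder(const std::string& item) { logger_.Log("order placed: " + item); }

private:
    Logger& logger_;
};
```

Because OrderService receives its Logger from outside, a new logger type extends the program without modifying OrderService, which also illustrates the open/closed principle.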

Software Design

  • Inheritance means “is a” i.e., base classes describe what an object is.
  • Interface means “behaves like” i.e., interfaces describe a way an object behaves.
  • Working software is a primary measure of progress. (Agile Principles)
  • The purpose of modeling is to communicate and understand, not to document.
  • The exact format of the design document is less important than the process of thinking about your design. The point of designing is to think about your program before you write it.
  • The key to writing good programs is to design classes so that each cleanly represents a single concept.
  • Organizing the relationship between classes in a program is often harder than laying out the individual classes.
  • Stroustrup: One of the most powerful intellectual tools for managing complexity is hierarchical ordering - organizing related concepts into a tree structure with the most general concept as the root.
  • Abstraction helps you to manage complexity by providing models that allow you to ignore implementation details. A class interface that presents a good abstraction usually has strong cohesion.
  • Encapsulation prevents you from looking at the details even if you want to.
  • Nicholas A. Solter “Professional C++”: “Too often programmers jump into applications without a clear plan: they design as they code. This approach inevitably leads to convoluted and overly complicated designs.”
  • Role interfaces:
    • Few members, preferably just one.
    • It's easier to fulfill the Liskov substitution principle with lots of small interfaces than with a few large interfaces.
    • If you don't want a certain feature, you don't implement a corresponding interface.
  • Header interfaces:
    • Old-fashioned .h headers.
    • Extracted from classes.
  • Postel's Law:
    • Output / sender: conservative - give as many guarantees about your output as possible.
    • Input / receiver: liberal - try to be as tolerant of input as possible.
  • Fail Fast: inform the sender as quickly as possible that something went wrong and tell them how to fix it.
  • Returning a null value is a bad design decision.
  • Two characteristics of well-designed code:
    • High cohesion
    • Low coupling
  • Nicholas A. Solter “Professional C++”: If you find that you are stuck, you can take one of the following actions:
    • Ask for help. Consult a coworker, mentor, book, newsgroup, or web page.
    • Work on something else for a while. Come back to this design choice later.
    • Make a decision and move on. Even if it's not an ideal solution, decide on something and try to work with it. An incorrect choice will soon become apparent. However, it may turn out to be an acceptable method. Whatever you decide, make sure you document your decision, so that you and others in the future know why you made it.
  • Bohm and Jacopini: All programs can be written in terms of only three control structures:
    • Sequence structure
    • Selection structure (if, if/else, switch)
    • Repetition structure (for, while, do/while)
  • Steve McConnell “Code Complete”: Classes and routines are first and foremost intellectual tools for reducing complexity. If they're not making your job simpler, they're not doing their jobs.
  • Instantiating objects within a class couples the class to the objects' concrete types and violates the Open/Closed Principle of SOLID: the class must be modified every time a new object type needs to be instantiated. The only thing that our program should be aware of is a common interface shared by all instantiated objects.
  • Fundamental Challenge [Brooks 1987]: “The hardest single part of building a software system is deciding precisely what to build… Therefore, the most important function that the software builder performs for the client is the iterative extraction and refinement of the product requirements… I would go a step further and assert that it is really impossible for a client, even working with a software engineer, to specify completely, precisely, and correctly the exact requirements of a modern software product before trying some versions of the product”.
  • Semantic Interface and Programmatic Interface [Code Complete by Steve McConnell]:
    • Make interfaces programmatic rather than semantic when possible.
    • Each interface consists of a programmatic part and a semantic part. The programmatic part consists of the data types and other attributes of the interface that can be enforced by the compiler. The semantic part of the interface consists of the assumptions about how the interface will be used, which cannot be enforced by the compiler. The semantic interface includes considerations such as “RoutineA must be called before RoutineB” or “RoutineA will crash if dataMember1 isn't initialized before it's passed to RoutineA.”
    • The semantic interface should be documented in comments, but try to keep interfaces minimally dependent on documentation. Any aspect of an interface that can't be enforced by the compiler is an aspect that's likely to be misused. Look for ways to convert semantic interface elements to programmatic interface elements by using Asserts or other techniques.
  • The classes and interfaces that you expose publicly to the outside world are your contract. The more cluttered the public contract is, the more constrained your future direction is. The fewer public types you expose, the more options you have to extend and modify any implementation in the future.
Sequential Approach | Iterative Approach
The requirements are fairly stable. | The requirements are not well understood or you expect them to be unstable.
The design is straightforward and fairly well understood. | The design is complex, challenging, or both.
The development team is familiar with the applications area. | The development team is unfamiliar with the applications area.
The project contains little risk. | The project contains a lot of risk.
Long-term predictability is important. | Long-term predictability is not important.
The cost of changing requirements, design, and code downstream is likely to be high. | The cost of changing requirements, design, and code downstream is likely to be low.
  • Iterative approaches are useful much more often than sequential approaches.
  • Specify a clear statement of the problem that the system is supposed to solve before beginning construction: product vision, vision statement, mission statement, or product definition:
    • Define what the problem is without any reference to possible solutions.
    • Make it short: one or two pages.
    • Use user language.
    • Describe the problem from a user's point of view rather than in technical computer terms.
    • Keep in mind that without a good problem definition, you might put effort into solving the wrong problem.

Requirements

  • Requirements (a functional specification):
    • Describe in detail what a software system is supposed to do.
    • Help to ensure that the user rather than the programmer drives the system's functionality.
    • Keep the programmer from guessing what the user wants.
    • Help to avoid arguments. You decide on the scope of the system before you begin programming.
    • Help to minimize changes to a system after development begins.

The development process helps customers better understand their own needs, and this is a major source of requirements changes. [Curtis, Krasner, and Iscoe 1988; Jones 1998; Wiegers 2003]

  • Specifying requirements adequately is a key to project success, perhaps even more important than effective construction techniques.
  • On a typical project, the customer can't reliably describe what is needed before the code is written.
  • A plan to follow the requirements rigidly is a plan not to respond to your customer.
  • If your requirements aren't good enough, stop work, back up, and make them right before you proceed.

Exception Handling

  • The exception handling pattern is to catch the exceptions at the lowest possible level and pass a return value or a Result class to the higher levels, for example:
    • Throw an exception in DataLayer.
    • Catch the exception in Repository, log it, and turn it into a Result. Alternatively, you could catch the exception in a VM's private helper.
    • Return Result to VM.
  • If you still are not able to handle the exception in the client layer, re-throw the exception. It will eventually be caught by the global handler.
  • Wrap try/catch around specific statement(s) only. Otherwise, you won't be able to tell where the exception comes from.
  • Consider using the “fail fast” approach: do not catch exceptions and allow them to propagate to the global level. Log them in the global handler.
  • Use exceptions for stating that there is a bug in your app.
  • Exception messages are aimed at programmers, not end-users. End-users should see non-technical messages.
  • Custom controls rarely require any built-in default values. The default values are supposed to be provided by a client application. Throw an exception if a required value is not provided by the client.
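
The catch-low-and-return-a-Result pattern described above could look roughly like the following sketch. The Result struct and the functions ReadFromDataLayer and LoadRecord are hypothetical names used only for illustration:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical Result type carrying either a value or an error message.
struct Result {
    bool ok;
    std::string value;
    std::string error;
};

// Data layer: throws when something goes wrong.
std::string ReadFromDataLayer(int id) {
    if (id < 0) {
        throw std::invalid_argument("id must be non-negative");
    }
    return "record-" + std::to_string(id);
}

// Repository: wraps only the statement that can throw and converts
// the exception into a Result for the higher layers.
Result LoadRecord(int id) {
    try {
        return Result{true, ReadFromDataLayer(id), ""};
    } catch (const std::invalid_argument& e) {
        // A real application would also log the exception here.
        return Result{false, "", e.what()};
    }
}
```

The caller (the VM in the notes above) inspects Result::ok instead of handling exceptions itself.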

Abstraction Design

Quotes from “Professional C++” by Marc Gregoire:

“Experience and iteration are essential to good abstractions. Truly well-designed interfaces come from years of writing and using other abstractions. The best interface is rarely the first one you put on paper, so keep iterating.”

“Don't be afraid to change the abstraction once coding has begun, even if it means forcing other programmers to adapt. Sometimes you need to evangelize a bit when communicating your design to other programmers. Perhaps the rest of the team didn't see a problem with the previous design or feels that your approach requires too much work on their part. In those situations, be prepared both to defend your work and to incorporate their ideas when appropriate.”

“Iteration is worth mentioning again because it is the most important point. Seek and respond to feedback on your design, change it when necessary, and learn from mistakes.”

Open-Source Software

An excerpt from “Professional C++” by Marc Gregoire:

  • Free software. Note that the term “free” does not imply that the finished product must be available without cost. Developers are welcome to charge as much or as little as they want. Instead, the term “free” refers to the freedom for people to examine the source code, modify the source code, and redistribute the software.
  • Source code available. The Open Source Initiative uses the term open-source software to describe software in which the source must be available. As with free software, open-source software does not require the product or library to be available for free.

Using a library under the GPL (GNU) might require you to make your own product open-source as well. Licenses such as Boost, OpenBSD, CodeGuru, CodeProject, and Creative Commons allow an open-source library to be used in a closed-source product.

Open-source libraries are usually written by people in their “free” time. As a good programming citizen, you should try to contribute to open-source projects if you find yourself reaping the benefits of open-source libraries. If you work for a company, you may find resistance to this idea from your management because it does not lead directly to revenue for your company. However, you might be able to convince management that indirect benefits, such as exposure of your company name, and perceived support from your company for the open-source movement, should allow you to pursue this activity.

Intellectual Property Rights

When you design or write code as an employee of a company, the company, not you, owns the intellectual property rights. It is often illegal to retain copies of your designs or code when you terminate your employment with the company. The same is also true when you are self-employed working for clients.

Extreme Programming

Characteristics of XP:

  • designed for use by teams of up to 10 people
  • emphasizes iterative development
  • centered on user requirements that drive design

XP encourages:

  • rapid iterations
  • developer-created schedule estimates
  • programming in pairs
  • continuous testing
  • active user involvement

Directory structure

Directory structure for a game and a game engine:

  • Docs - Design documents, technical specs, contracts, milestone acceptance criteria, etc.
  • Media - Art and sound assets in source formats. When assets are imported and packed into game files they go into the Bin folder.
  • Source - The source code. Any 3rd party libraries should go to Source\3rdParty.
  • Obj - Temporary files of build targets (e.g. Obj\Debug, Obj\Release, etc.)
  • Bin - The release executables and the data files (Bin\Data) that are used to create the project (not the source code!). It should contain everything that is needed to run and test the game.
  • Test - The debug and profile targets as well as test scripts, test utilities, logs, etc. It should also contain the release notes for the latest build.
  • Inc - #include files (for the game engine only). It allows the game engine to be published with only the #include files and the library.

Endianness

  • Little-endian: the LSB (least significant byte) is stored at a lower memory address than the MSB (most significant byte), e.g. 0xABCD is stored as 0xCD, 0xAB.
  • Big-endian: the MSB is stored at a lower memory address than the LSB, e.g. 0xABCD is stored as 0xAB, 0xCD.
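
A minimal sketch of both byte orders in C++, assuming hypothetical helper names IsLittleEndian and StoreBigEndian16: the first inspects how the host lays out 0xABCD in memory, the second writes a value in big-endian order regardless of the host.

```cpp
#include <cstdint>
#include <cstring>

// Returns true if the machine stores the least significant byte first.
bool IsLittleEndian() {
    const std::uint16_t value = 0xABCD;
    unsigned char bytes[sizeof(value)];
    std::memcpy(bytes, &value, sizeof(value));  // copy out the in-memory layout
    return bytes[0] == 0xCD;
}

// Serializes a 16-bit value in big-endian order on any host.
void StoreBigEndian16(std::uint16_t value, unsigned char out[2]) {
    out[0] = static_cast<unsigned char>(value >> 8);    // MSB at the lower address
    out[1] = static_cast<unsigned char>(value & 0xFF);  // LSB at the higher address
}
```

Shifting and masking, as in StoreBigEndian16, produces the same bytes on every platform, which is why it is commonly preferred over copying raw memory when writing files or network packets.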

| a – b | < e rule

Do not compare floats for equality or inequality. Rather, test that the absolute value of the difference is less than a specified small value.
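
The rule can be sketched as a small helper. The name NearlyEqual and the default tolerance of 1e-9 are assumptions for illustration; a suitable tolerance depends on the magnitude of the values being compared.

```cpp
#include <cmath>

// Returns true when |a - b| < epsilon.
// The default epsilon is an illustrative assumption, not a universal constant.
bool NearlyEqual(double a, double b, double epsilon = 1e-9) {
    return std::fabs(a - b) < epsilon;
}
```

For example, 0.1 + 0.2 == 0.3 is false with IEEE 754 doubles, while NearlyEqual(0.1 + 0.2, 0.3) is true.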

Hashing and Dictionaries

Hashing is the process of turning a key of an arbitrary data type into an integer. This integer can be used modulo the table size as an index into the table.

Given a key k, we want to generate an integer hash value h using the hash function H, and then find the index i into the table:

h = H(k)
i = h mod N

N is the number of slots in the table.
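
The two formulas above can be sketched in C++ as follows. Here std::hash stands in for the hash function H, and SlotIndex is a hypothetical helper name:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Maps a key k to a slot index: h = H(k), i = h mod N.
// slot_count plays the role of N and must be non-zero.
std::size_t SlotIndex(const std::string& key, std::size_t slot_count) {
    assert(slot_count > 0);
    const std::size_t h = std::hash<std::string>{}(key);  // h = H(k)
    return h % slot_count;                                // i = h mod N
}
```

The actual value of h is implementation-defined, but the resulting index always lies in the range [0, N).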

A dictionary data structure is usually implemented as a binary search tree or as a hash table. In a hash table implementation, the values are stored in a fixed-size table, where each slot in the table represents one or more keys. Finding a key-value pair is an O(1) operation in the absence of collisions.

A collision is a situation when two or more keys end up occupying the same slot in the hash table. There are two ways to resolve a collision, each related to a different kind of a hash table:

In an open hash table, collisions are resolved by storing more than one key-value pair at each index, usually as a linked list. This approach does not impose an upper bound on the number of key-value pairs that can be stored. However, it requires memory to be allocated dynamically whenever a new key-value pair is added to the table.

In a closed hash table, collisions are resolved via probing until an empty slot is found. This approach imposes an upper limit on the number of key-value pairs that can be stored. The main benefit is that it uses a fixed amount of memory hence no memory allocations are needed.
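
A closed hash table with linear probing might be sketched as below. This is an illustrative sketch under stated assumptions, not a production container: keys and values are plain ints, removal is omitted, the class name ClosedHashTable is hypothetical, and C++17 is assumed for std::optional.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <optional>
#include <utility>

// Closed hash table with a fixed number of slots N and linear probing.
template <std::size_t N>
class ClosedHashTable {
public:
    // Inserts or updates a key. Returns false when the table is full
    // and the key is not already present.
    bool Insert(int key, int value) {
        for (std::size_t probe = 0; probe < N; ++probe) {
            auto& slot = slots_[(Home(key) + probe) % N];
            if (!slot || slot->first == key) {
                slot = std::make_pair(key, value);
                return true;
            }
        }
        return false;
    }

    // Returns the stored value, or std::nullopt if the key is absent.
    std::optional<int> Find(int key) const {
        for (std::size_t probe = 0; probe < N; ++probe) {
            const auto& slot = slots_[(Home(key) + probe) % N];
            if (!slot) return std::nullopt;  // an empty slot ends the probe sequence
            if (slot->first == key) return slot->second;
        }
        return std::nullopt;
    }

private:
    // Home slot of a key: hash value modulo the table size.
    static std::size_t Home(int key) { return std::hash<int>{}(key) % N; }

    std::array<std::optional<std::pair<int, int>>, N> slots_{};
};
```

All storage lives in the fixed-size array, so inserting never allocates memory, and the fifth insertion into a ClosedHashTable<4> fails because no empty slot remains.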

API Design

Reference: API Design for C++ by Martin Reddy

Q: What is an API?

A: An API is a logical interface to a software component that hides the internal details required to implement it. An API is written to solve a particular problem or perform a specific task.

Q: What are the requirements for a good API?

A: An API must be well-designed, documented, regression tested, and stable between releases.

Q: What are the benefits of using or writing APIs?

A: API:

  • hides implementation
  • increases longevity of codebase - systems that expose their implementation details tend to devolve into spaghetti code where every part of the system depends on the internal details of other parts of the system
  • promotes modularization - APIs define a modular grouping of functionality; developing an application on top of APIs promotes loosely coupled and modular architectures where behaviour of one module does not depend on the internal details of another module
  • reduces code duplication - by keeping all of your code's logic behind an interface, you centralize the behaviour in a single place.
  • easier to change the implementation
  • easier to optimize - for example, you could add caching and invalidate your cached results
  • enables code reuse - effective code reuse follows from a deep understanding of the clients of your software

Q: How should APIs model the problem domain?

A: The API should model the problem domain i.e., API should be formulated in terms of high-level concepts that make sense in the problem domain rather than exposing low-level implementation details. A non-programmer should be able to understand the concepts of the API's interface and how it works.

Q: What are the two techniques used to hide implementation details?

A: The two techniques are:

  • Physical hiding - the source code is not available to users.
  • Logical hiding - uses language features to limit access to certain elements of the API.

Q: How is physical hiding achieved in C++?

A: Physical hiding means storing internal details in a separate file (.cpp) from the public interface (.h).

Q: How is logical hiding achieved in C++?

A: Logical hiding means using the C++ access specifiers protected and private to restrict access to internal details.

Q: What are the benefits of using the getter/setter routines, rather than exposing member variables directly?

A: The benefits are:

  • Validation
  • Lazy evaluation
  • Caching
  • Extra computation
  • Notifications
  • Debugging/logging
  • Synchronization/thread safety
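
As a brief sketch of the first benefit (validation), a setter can reject illegal values, which a public member variable cannot do. The Thermostat class and its limits are hypothetical:

```cpp
#include <stdexcept>

// Hypothetical class: the data member is private (logical hiding),
// and all writes go through a validating setter.
class Thermostat {
public:
    double TargetCelsius() const { return target_celsius_; }

    void SetTargetCelsius(double value) {
        if (value < 5.0 || value > 35.0) {
            throw std::out_of_range("target temperature must be within [5, 35]");
        }
        target_celsius_ = value;
    }

private:
    double target_celsius_ = 20.0;
};
```

The same setter is also a natural place to add logging, change notifications, or locking later without changing the public interface.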

Q: What level of encapsulation should you use for data members?

A: Data members should always be private, never public or protected. Alan Snyder [1986]: “inheritance severely compromises the benefits of encapsulation in object-oriented programming languages”.

Q: In what scenario might you consider exposing member variables?

A: The only plausible argument for exposing member variables is for performance reasons, such as in a tight loop that performs a large number of operations. However, even in this case the careful use of inlining, combined with a modern optimizing compiler, may completely eradicate the method call overhead.

Q: Should an API be minimally complete?

A: A good API should be minimally complete. This is one of the most important qualities of an API. You should try to keep your APIs as simple as you can: minimize the number of classes you expose and the number of public members in those classes. This will also make your API easier to understand, easier for your users to keep a mental model of the API in their heads, and easier for you to debug.

Don't Overpromise

Every public element in your API is a promise - a promise that you will support that functionality for the lifetime of the API. The key point is that once you release an API and have clients using it, adding new functionality is easy, but removing functionality is really difficult. When in doubt, leave it out [Bloch 2008; Tulach 2008].

There is a temptation to add extra levels of abstraction or generality to an API because you think it might be useful in the future. You should resist this temptation for the following reasons:

  • The day may never come when you need the extra generality.
  • If that day does come, you may have more knowledge about the use of your API and a different solution may present itself from the one you envisioned originally.
  • If you do need to add the extra functionality, it will be easier to add it to a simple API than a complex one.

Add Virtual Functions Judiciously

One subtle way that you can expose more functionality than you intended is through inheritance, that is, by making certain member functions virtual:

  • Changes to your base classes may break the clients (“fragile base class problem”).
  • Clients may extend your API in incorrect or error-prone ways.
  • Overridden functions may break the internal integrity of your class. Example: the default implementation of a virtual method may call other methods in the same class to update its internal state. If an overridden method does not perform these same calls, then the object could be left in an inconsistent state and behave unexpectedly or crash.
  • Virtual function calls must be resolved at run-time by performing a vtable lookup, whereas non-virtual function calls can be resolved at compile time. This can make virtual function calls slower than non-virtual calls, although in reality this overhead may be negligible.
  • The use of virtual functions increases the size of an object, typically by the size of a pointer to the vtable.
  • Adding, reordering, or removing a virtual function breaks binary compatibility. This is because a virtual function call is typically represented as an integer offset into the vtable for the class. Therefore, changing its order means that existing code needs to be recompiled to ensure that it calls the right functions.
  • Virtual functions cannot always be inlined. Inlining is a compile-time optimization, whereas virtual functions are resolved at run-time, so a compiler can inline a virtual function call only in certain situations.
  • It's better to avoid overloading virtual functions. A symbol declared in a derived class hides all symbols with the same name in the base class. Therefore, a set of overloaded virtual functions in a base class is hidden by a single overriding function in a subclass.
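
The last point, name hiding of overloaded virtual functions, can be sketched as follows. The classes Shape, Circle, and Square are hypothetical; each Scale overload returns a string naming the function that ran, purely to make the dispatch visible.

```cpp
#include <string>

class Shape {
public:
    virtual ~Shape() = default;
    virtual std::string Scale(int) { return "Shape::Scale(int)"; }
    virtual std::string Scale(double) { return "Shape::Scale(double)"; }
};

// Overrides only one overload: the name Scale declared in Circle
// hides BOTH base-class overloads for calls made through a Circle.
class Circle : public Shape {
public:
    std::string Scale(int) override { return "Circle::Scale(int)"; }
};

// A using-declaration brings the hidden base overloads back into scope.
class Square : public Shape {
public:
    using Shape::Scale;
    std::string Scale(int) override { return "Square::Scale(int)"; }
};
```

Calling Scale(2.5) on a Circle object silently converts 2.5 to int and runs Circle::Scale(int), whereas the same call on a Square runs Shape::Scale(double).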

A class with no virtual functions tends to be more robust and requires less maintenance than one with virtual functions. A rule of thumb: if your API does not call a particular method internally, then that method probably should not be virtual. You should also allow subclassing only in situations where the potential subclasses form an “is-a” relationship with the base class.

If you still decide that you want to allow subclassing, remember the following rules:

  • Always declare your destructor to be virtual if there are any virtual functions in your class. This is so that subclasses can free any extra resources that they may have allocated.
  • Always document how methods of your class call each other. If a client wants to provide an alternative implementation for a virtual function, they need to know which methods need to be called in order to preserve the integrity of the object.
  • Never call virtual functions from your constructor and destructor. These calls will never be directed to a subclass [Meyers, 2005].
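
The last rule can be demonstrated with a short sketch (Base and Derived are hypothetical names). It shows the pitfall itself, not a recommended pattern: the virtual call made in the Base constructor runs Base::Name even when a Derived object is being constructed.

```cpp
#include <string>

class Base {
public:
    // During Base construction the object is still a Base,
    // so this call resolves to Base::Name, never Derived::Name.
    Base() { name_at_construction_ = Name(); }
    virtual ~Base() = default;  // virtual destructor, as the first rule requires

    virtual std::string Name() const { return "Base"; }
    const std::string& NameAtConstruction() const { return name_at_construction_; }

private:
    std::string name_at_construction_;
};

class Derived : public Base {
public:
    std::string Name() const override { return "Derived"; }
};
```

After construction, a Derived object reports "Derived" from Name() but "Base" from NameAtConstruction().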

Convenience APIs

On one hand, an API should only provide one way to perform one task. This ensures that the API is minimal, focused, consistent, and easy to understand. On the other hand, an API should make simple things easy to do. Clients should not be required to write lots of code to perform basic tasks.

Both these goals can be achieved by writing convenience wrappers according to certain rules. Convenience wrappers are utility routines that encapsulate multiple API calls to provide simpler higher-level operations.

  • Do not mix your convenience API in the same classes as your core API.
  • Produce supplementary classes that wrap public functionality of your core API.
  • Isolate the convenience classes from your core API, for example, in different source files or even separate libraries. This has a benefit of ensuring that your convenience API depends only on the public interface of your core API, not on any internal methods or classes.

Add convenience APIs as separate modules or libraries that sit on top of your minimal core API.

Q: What does it mean that an API is easy to use?

A: A well-designed API should make simple tasks easy. It should be possible for a client to look at the method signatures of your API and be able to glean how to use it, without any additional documentation. This API quality follows from minimalism: if your API is simple, it should be easy to understand.

Q: What additional artifacts help clients to use your API?

A: Your API should provide:

  • good supporting documentation
  • sample code
  • additional complex functionality for expert users; this functionality does not necessarily have to be easy to use

Q: What does it mean that an API is discoverable?

A: A discoverable API is one where users are able to work out how to use the API on their own, without any accompanying explanation or documentation.

Q: Give an example of misusing an API.

A: A good API should be difficult to misuse. For example, an API may be misused by passing the wrong arguments to a method or passing illegal values to a method.

Q: How can you minimize misuse of your API?

A: You can make your API more difficult to misuse by following the guidelines:

  • Prefer enums to Booleans to improve code readability.
  • Avoid functions with multiple parameters of the same type.
  • Use consistent function naming and parameter ordering.
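
As a sketch of the first guideline, two scoped enums can replace a pair of adjacent bool parameters. The function FindText and both enum types are hypothetical names used for illustration:

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

enum class Case { Sensitive, Insensitive };
enum class Direction { Forward, Backward };

// Returns the position of the first (Forward) or last (Backward) occurrence
// of needle in text, or std::string::npos if there is none.
std::size_t FindText(std::string text, std::string needle, Case c, Direction d) {
    if (c == Case::Insensitive) {
        auto lower = [](unsigned char ch) { return static_cast<char>(std::tolower(ch)); };
        std::transform(text.begin(), text.end(), text.begin(), lower);
        std::transform(needle.begin(), needle.end(), needle.begin(), lower);
    }
    return d == Direction::Forward ? text.find(needle) : text.rfind(needle);
}
```

A call such as FindText(s, "abc", Case::Insensitive, Direction::Backward) documents itself, whereas the Boolean equivalent FindText(s, "abc", true, false) is unreadable at the call site and compiles even if the two flags are swapped by mistake.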

Q: What is static polymorphism?

A: Static polymorphism refers to identifying the common concepts across your classes and using the same conventions to represent these concepts in each class. For example, STL containers do not inherit from a common base class but they follow the same patterns and idioms, such as iterators.
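
A minimal sketch of this idea: a function template (the name Sum is hypothetical) that works with any container following the begin()/end() iterator convention, with the concrete type resolved at compile time.

```cpp
#include <list>
#include <numeric>
#include <vector>

// Works with any container of ints that provides begin() and end().
// No common base class is required.
template <typename Container>
int Sum(const Container& values) {
    return std::accumulate(values.begin(), values.end(), 0);
}
```

Sum accepts both std::vector<int> and std::list<int> even though the two types are unrelated by inheritance.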

Q: What does it mean that an API is orthogonal?

A: An orthogonal API means that functions do not have side effects. For example, calling a method that sets a particular property should change only that property and not additionally change other publicly accessible properties. Another interpretation of orthogonal design is that different operations can all be applied to each available data type. For example, STL offers a collection of generic algorithms and iterators that can be used on any container.

Q: What are the two important factors of orthogonal APIs?

A: The two factors are:

  • Reduce redundancy. Ensure that the same information is not represented in more than one way.
  • Increase independence. Ensure that there is no overlapping of meaning in the concepts that are exposed. Any overlapping concepts should be decomposed into their basal components.

Q: What is RAII (Resource Acquisition Is Initialization)?

A: RAII is a technique related to resource management where resources are chunks of memory, file handles, mutex locks, etc. Consider providing a class to manage resources. Think of resource allocation and deallocation as object construction and destruction.
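
A minimal RAII sketch for a mutex lock follows. The standard library already provides std::lock_guard for this purpose; the hypothetical ScopedLock class re-implements the idea only to show acquisition in the constructor and release in the destructor.

```cpp
#include <mutex>

// Acquires the mutex on construction and releases it on destruction.
class ScopedLock {
public:
    explicit ScopedLock(std::mutex& m) : mutex_(m) { mutex_.lock(); }
    ~ScopedLock() { mutex_.unlock(); }

    // Copying would release the same lock twice, so it is disabled.
    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;

private:
    std::mutex& mutex_;
};

// The lock is released on every exit path, including exceptions.
int Increment(std::mutex& m, int& counter) {
    ScopedLock lock(m);
    return ++counter;
}
```

The caller never has to remember to unlock: the lifetime of the lock object defines how long the resource is held.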

Q: What are good practices to make an API platform independent?

A: A platform independent API should:

  • Avoid platform-specific #if/#ifdef lines in its public headers. There are very few cases where API should be different for different platforms.
  • Hide the fact that a function only works on certain platforms by providing a method to determine whether the implementation offers the desired capabilities on the current platform, for example: a HasGps method.

Q: What is coupling and cohesion?

A: Coupling. A measure of the strength of interconnection between software components, that is, the degree to which each component depends on other components in the system. One way to think of coupling is that given two components, A and B, how much code in B must change if A changes. Cohesion. A measure of how coherent or strongly related the various functions of a single software component are.

Good APIs exhibit loose coupling and high cohesion.

Q: What are the benefits of loose coupling and high cohesion?

A: Loose coupling and high cohesion allow components to be used, understood, and maintained independently of each other.

Q: What are measures to evaluate the degree of coupling between components?

A: Measures to evaluate the degree of coupling between components are:

  • Size. The number of connections between components, including the number of classes, methods, arguments per method, etc. For example, a method with fewer parameters is more loosely coupled to components that call it.
  • Visibility. The prominence of the connection between components. For example, changing a global variable in order to affect the state of another component indirectly is a poor level of visibility.
  • Intimacy. The directness of the connection between components. If A is coupled to B and B is coupled to C, then A is indirectly coupled to C. Another example is that inheriting from a class is a tighter coupling than including that class as a member variable (composition) because inheritance provides access to all protected members of the class.
  • Flexibility. The ease of changing the connections between components. For example, if the signature of a method in A needs to change so that B can call it, how easy is it to change that method and all dependent code.

Q: What is circular dependency?

A: Circular dependency (or a dependency cycle) is a form of tight coupling where two components depend on each other directly or indirectly.

Q: What are techniques to reduce coupling between classes and methods within an API (inter-API coupling)?

A: Techniques to reduce coupling within an API are:

  • Coupling by name only. If class A only needs to know the name of class B i.e., it does not need to know the size of class B or call any methods in the class, then class A does not need to depend on the full declaration of B. In C++ you can use a forward declaration for class B.
  • Reducing class coupling. Whenever you have a choice, you should prefer declaring a function as a non-member non-friend function rather than as a member function. Doing so improves encapsulation and reduces the degree of coupling of those functions to the class. The non-member non-friend functions are not coupled to the internal details of the class. They are therefore less likely to break when the internal details of the class are changed. Examples in C++ include the STL algorithms such as std::for_each and std::unique which are declared outside of container classes.
  • Intentional redundancy. Reuse of code and data implies coupling. Sometimes, it may make sense to duplicate code (code redundancy) or duplicate data (data redundancy) in order to avoid coupling.
  • Manager classes. A manager class owns and coordinates several lower-level classes. This can be used to break the dependency of one or more classes. It may require abstracting the underlying lower-level classes. Therefore, the manager class could put in place a generic interface that abstracts the specifics of a particular lower-level class.
  • Callbacks, observers, and notifications. These techniques address the problem of notifying other classes when some event occurs.

Q: How do your API design decisions affect the cohesion and coupling of your clients' applications (intra-API coupling)?

A: The larger your API is (the more classes, methods, and arguments you expose), the more ways in which your API can be accessed and coupled to your clients' applications. As such, the quality of being minimally complete can contribute toward loose coupling.

Q: What issues do you need to be aware of when working with callbacks, observers, and notifications?

A: Some general issues to be aware of when working with callbacks, observers, and notifications:

  • Reentrancy. Your client code may call back into your API. At a minimum, your API should guard against this behaviour by reporting a coding error. However, a more elegant solution would be to allow this reentrant behaviour and implement your code such that it maintains a consistent state.
  • Lifetime management. Clients should have a clean way to disconnect from your API. Also, your API may wish to guard against duplicate registrations to avoid calling the same client code multiple times for the same event.
  • Event ordering. The sequence of callbacks or notifications should be clear to the user of your API i.e., the API should make it clear whether a notification is sent before or after an event by using names such as willChange and didChange.

Q: What is a callback in C/C++?

A: In C/C++, a callback is a pointer to a function within module A that is passed to module B so that B can invoke the function in A at an appropriate time. Keep in mind that the usage of callbacks can be convoluted in object-oriented C++ programs without using libraries such as boost::bind.
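
A C-style callback might be sketched as follows. All names (ProgressCallback, RunTask, RecordProgress) are hypothetical; the void* user_data parameter is the conventional way to pass context through a plain function pointer.

```cpp
#include <vector>

// Module B: declares the callback type and invokes it at the appropriate times.
using ProgressCallback = void (*)(int percent, void* user_data);

void RunTask(ProgressCallback on_progress, void* user_data) {
    for (int percent = 0; percent <= 100; percent += 50) {
        if (on_progress != nullptr) {
            on_progress(percent, user_data);
        }
    }
}

// Module A: the function whose address is passed to module B.
void RecordProgress(int percent, void* user_data) {
    auto* log = static_cast<std::vector<int>*>(user_data);
    log->push_back(percent);
}
```

Passing &RecordProgress together with a pointer to a std::vector<int> makes RunTask append 0, 50, and 100 to that vector. Since C++11, std::function and lambdas offer a type-safe alternative to the void* idiom.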

Q: What mechanism can you use in object-oriented languages such as C++ as an alternative to callbacks?

A: A more object-oriented solution is to use the observer pattern. In this pattern an object maintains a list of its dependent objects (observers) and notifies them by calling one of their methods.
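
A compact sketch of the observer pattern, with hypothetical names Observer, Subject, and CountingObserver. Attach guards against duplicate registration and Detach gives clients a clean way to disconnect, in line with the lifetime management issue mentioned earlier.

```cpp
#include <algorithm>
#include <vector>

class Observer {
public:
    virtual ~Observer() = default;
    virtual void OnValueChanged(int new_value) = 0;
};

class Subject {
public:
    void Attach(Observer* observer) {
        // Ignore duplicate registrations so each observer is notified once.
        if (std::find(observers_.begin(), observers_.end(), observer) == observers_.end()) {
            observers_.push_back(observer);
        }
    }

    void Detach(Observer* observer) {
        observers_.erase(std::remove(observers_.begin(), observers_.end(), observer),
                         observers_.end());
    }

    void SetValue(int value) {
        value_ = value;
        for (Observer* observer : observers_) {
            observer->OnValueChanged(value_);
        }
    }

private:
    int value_ = 0;
    std::vector<Observer*> observers_;  // non-owning pointers
};

// Example observer that remembers the last value and counts notifications.
class CountingObserver : public Observer {
public:
    void OnValueChanged(int new_value) override {
        last_value_ = new_value;
        ++calls_;
    }
    int LastValue() const { return last_value_; }
    int Calls() const { return calls_; }

private:
    int last_value_ = 0;
    int calls_ = 0;
};
```

The subject stores non-owning pointers, so in this sketch an observer must be detached before it is destroyed.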

Q: What are notifications? Give an example of a notification mechanism.

A: Notifications (or events) are sent by a centralized mechanism that serves as an intermediate layer between unconnected parts of the system. There are several kinds of notification schemes such as signals and slots. Signals can be thought of as callbacks with multiple targets (slots). All of the slots for a signal are called when that signal is invoked.
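
The idea of a signal as a callback with multiple targets can be sketched with std::function. The Signal class below is a hypothetical minimal version; real signal/slot libraries additionally handle disconnection, object lifetimes, and threading.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Minimal signal: stores any number of slots and calls all of them on Emit.
template <typename... Args>
class Signal {
public:
    void Connect(std::function<void(Args...)> slot) {
        slots_.push_back(std::move(slot));
    }

    // Invokes every connected slot in connection order.
    void Emit(Args... args) const {
        for (const auto& slot : slots_) {
            slot(args...);
        }
    }

private:
    std::vector<std::function<void(Args...)>> slots_;
};
```

Connecting two lambdas to a Signal<int> and calling Emit(5) runs both lambdas with the argument 5.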

notes/misc/software_engineering.txt · Last modified: 2021/06/16 by leszek