|
2. Error Prevention

It is important to prevent as many errors as possible from being inserted into the code. The fewer errors present, the less time is required to remove them at a later stage. As the project develops, errors become defects if they are not removed at the source of insertion. Latent defects can cause extensive problems. A ripple effect (Dunn, Ullman, 1994) occurs when a single defect in an early stage multiplies into many defects at a later stage. Therefore, it is desirable to prevent these errors from becoming defects. Defects consume resources in the removal process, and not all defects will be removed by the final release of the software. By reducing the error insertion rate, the number of defects that reach the customer is likely to be reduced. This section describes some techniques used to minimise error insertion.

2.1. Reuse

Much code generation is repetitive. If a software developer can use code for a current project that was written for a previous one, they will benefit in a number of ways. The most apparent benefit is in man hours, assuming that the code module to be reused is easy to understand (well documented) and generic enough to be applicable to the current problem. In the best case, a module can be placed into the project without so much as an expression being written.

Another advantage of reuse is its quality implications. Provided that modules targeted for reuse have been thoroughly tested for defects, the number of errors will be considerably smaller than in a module written from scratch (Smith, 1995). Modules can be modified to a greater or lesser extent to achieve the desired function. However, it is important to use configuration management (section 7) to keep track of modules that have been altered. Code is not the only item that can be reused: documentation, testing procedures and requirements specifications can all be used again.
A developer needs to be careful in selecting items for reuse. Poorly chosen items may result in an increase in errors because of difficulty in the conversion process.

2.2. Information hiding

Closely linked to the object oriented approach to software development is information hiding. This restricts the interface between modules to the minimum level of data required. Other modules cannot change the data that a module is currently using, because that data is ‘hidden’. This increases the integrity of the data a program uses, and therefore reduces the likelihood of data corruption, which would otherwise reduce the quality of the software.

2.3. Coding standards

A software development organisation can reduce the errors inserted into code by implementing a coding standard. This aims at getting all developers to write code in a consistent, simple, and well documented manner. An example of such a standard exists for the programming language Java. Ambler (2000, pg 3) describes the Ambysoft Coding Standards for Java as “a collection of standards, conventions, and guidelines for writing solid Java code. They are based on sound, proven software engineering principles that lead to code that is easy to understand, to maintain, and to enhance”.
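As a rough illustration of what a coding standard aims for, the Python sketch below shows the same calculation written twice. Both function names, the parameter names and the choice of calculation are hypothetical and are not drawn from the Ambysoft standard; the point is only that consistent naming and a short description remove guesswork for the next reader.

```python
import math


# Hard to follow: the names say nothing about intent, there is no
# description, and the constant is typed in by hand.
def f(x):
    return 3.14159 * x * x


# Easier to follow: descriptive names, a stated unit, and a docstring.
def circle_area(radius_m: float) -> float:
    """Return the area, in square metres, of a circle of radius radius_m."""
    return math.pi * radius_m ** 2
```

Both functions compute essentially the same value, but only the second can be checked against its stated purpose without the reader having to guess what was intended.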
An important aspect of any coding standard is how program variables are named. Variable names should be easy to understand at a glance, rather than random character strings. In addition to understandable naming, code needs to be sufficiently commented to explain the function being implemented. All of these things make code more understandable, and therefore easier to maintain and modify. This reduces error insertion by minimising the assumptions the programmer needs to make about the code they are viewing.

2.4. Tool use

A variety of tools are available in the world of software engineering to increase productivity, improve quality, and reduce the risk associated with project management. Configuration Management (CM) uses tools that keep track of all the items in a software development process. These items include documentation, source code, binary code, test results etc. Section 7 will take a closer look at CM, but it is useful to examine briefly its effect on defect prevention. CM ensures that items are stored safely in a database. If a developer wishes to alter an important item, they need to get the change signed off by a CM committee. This means that the full impact of the change is taken into consideration, so developers are less likely to unwittingly introduce errors through a lack of overall knowledge of the project.

Another important issue, discussed in sections 2.5 and 6, is customer requirements. If these can be well defined, fewer errors will be introduced because the developers have a better understanding of the problem to be solved. Software tools are available to aid in the capture, use and management of requirements.

2.5. Well defined customer requirements

As mentioned above, customer requirements are important to the issue of prevention. Requirements must be converted into technical specifications, which are then converted into a software solution.
If the original requirements are not accurate or detailed enough, then the resulting software might be useless. Consider a module of code (perhaps 20 lines) whose function is to calculate the area of a circle with given dimensions. If the customer really wants the circumference of the circle, the module contains an error, even though the code might be perfectly correct. The requirements for the function have not been adequately expressed. This problem can also appear at a larger scale. For example, the customer requirements for a new database may not state that the system is to have a response time of 20 seconds or less. If the system consistently retrieves records from its database at a rate of one per minute, then the system may not even get used. This would result in the entire project being a waste of money, time and effort. Therefore, to prevent both small errors and huge system shortfalls, a suitable methodology for requirements elicitation and analysis needs to be in place (discussed in section 6).

2.6. Documentation

Each piece of code needs to have supporting documentation. This is in addition to the comments, and completely defines all inputs, outputs, data types used, and most importantly, what the code does and how it does it. These documents are stored in the CM database so that all developers can have access to them. Again, the better understood the code is, the more correct the modifications to, or interactions with, that code will be.

2.7. High level language usage

Various programming languages offer differing levels of abstraction from the hardware. Assembly language is closely linked to the hardware and requires multiple instructions to achieve simple processing tasks. In higher level languages such as C/C++, the commands are more descriptive. As a result, only a small number of instructions are required to implement simple processing tasks.
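The contrast can be loosely sketched within a single language. The two Python functions below (hypothetical names, written only for illustration) compute the same total. The first spells out every step by hand, roughly in the way a low level program must manage its own counter and accumulator; the second relies on a single descriptive built-in command. This is an analogy only and is not real assembly code.

```python
def total_low_level(values):
    """Sum a list with an explicit index and accumulator, one step at a time."""
    accumulator = 0
    index = 0
    while index < len(values):
        accumulator = accumulator + values[index]
        index = index + 1
    return accumulator


def total_high_level(values):
    """Sum a list with one descriptive built-in call."""
    return sum(values)
```

The first version offers more places for a slip, such as a wrong loop bound or a forgotten increment, which is the sense in which fewer, more descriptive instructions can reduce error insertion.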
A basic measure of error injection into code is the number of errors per thousand lines of code (KLOC). By moving to a more descriptive language, the size of a project in KLOC might be reduced. Even if the error rate per KLOC stays the same, the total number of errors in the project will then be reduced. This reasoning does not take into account the possibility that the increased complexity of each instruction might raise the error rate in some cases.

Kaikkonen et al. (1998) give two reasons why assembler poses a quality problem. First, there are fewer tools available for debugging than for C/C++. Secondly, the steep learning curve creates problems for novice programmers (resulting in a higher error insertion rate).

C is a third generation language (3GL), whereas Matlab (a mathematical programming environment) can be seen to straddle the 3GL – 4GL boundary. Matlab offers a significant improvement in expressiveness over C at the expense of computational speed. It can be argued that Matlab code is more readable because it is more descriptive. For example, to perform a Fast Fourier transform in Matlab, the single command fft(vector); is given. To do this in C from scratch would require pages of code. Anything that improves readability aids in error prevention by making the code easier to understand.

2.8. Testability

When it comes to testing the completed code (discussed in section 3), a test set needs to be developed to wring out as many defects as possible. If the code has not been designed in a way that facilitates testing, then defects may not be uncovered. One way to maximise testability is to write small modules that are completely testable. Large modules can reside in such a large number of states that it becomes infeasible to test all of them. Small modules also promote reduced complexity (section 2.10).

2.9. Maintainability

If code can be updated easily to operate with new modules or different working environments, then the effort of recoding from scratch is spared.
Maintainable code is therefore a means of preventing errors from being inserted during maintenance. The issues already discussed, including readability, documentation, information hiding and coding standards, all help to generate maintainable code.

2.10. Complexity

Decreasing the code’s complexity reduces the scope for errors. Modularity in code forces this approach. It is natural to take a complicated project and break it up into smaller, more manageable fragments. The code becomes more understandable in modular form. There are many methods to determine a piece of code’s complexity rating. The McCabe Cyclomatic Complexity Analysis uses directed flow graphs to show the structure of the code graphically. The structure is based on the following programming constructs: repeat-until, if-then-else, case, if-then, do-while, and sequence. McCabe gives three methods to arrive at a complexity value:

1) Edges – nodes + 2
2) Count of regions
3) Predicate nodes + 1

where the edges are the number of vectors, the nodes are the number of points, the regions are the areas bounded by the graph (the area outside the graph counts as one region), and a predicate node is one having two or more exit points. Figure 1 below shows a simple program, whereas figure 2 demonstrates how complicated things can get (taken from McCabe et al, 1994). Table 1 gives a method of determining if a program is too complicated.
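As a rough illustration of the first and third methods, the Python sketch below computes the complexity of a small flow graph. The graph itself (an if-then-else followed by a loop), the function names and the dictionary representation are all assumptions made for this example and are not taken from McCabe's figures. The predicate count as written also assumes that every predicate node has exactly two exits.

```python
from typing import Dict, List

# Hypothetical flow graph: each node maps to the nodes it can branch to.
FLOW_GRAPH: Dict[str, List[str]] = {
    "entry": ["if_test"],
    "if_test": ["then", "else"],    # predicate node (if-then-else)
    "then": ["loop_test"],
    "else": ["loop_test"],
    "loop_test": ["body", "exit"],  # predicate node (loop condition)
    "body": ["loop_test"],
    "exit": [],
}


def complexity_from_edges(graph: Dict[str, List[str]]) -> int:
    """Method 1: edges - nodes + 2, for a single connected flow graph."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    edges = sum(len(targets) for targets in graph.values())
    return edges - len(nodes) + 2


def complexity_from_predicates(graph: Dict[str, List[str]]) -> int:
    """Method 3: predicate nodes + 1, assuming two exits per predicate."""
    predicates = sum(1 for targets in graph.values() if len(targets) >= 2)
    return predicates + 1


print(complexity_from_edges(FLOW_GRAPH))       # 8 edges - 7 nodes + 2 = 3
print(complexity_from_predicates(FLOW_GRAPH))  # 2 predicate nodes + 1 = 3
```

Both methods agree on a value of 3 for this graph. Counting regions gives the same result: the if-then-else diamond, the loop, and the area outside the graph.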
Figure 1. Graph of a simple program
Figure 2. Graph of a complicated program

Table 1. Level of risk associated with different levels of Cyclomatic complexity
|