Much of this page is adapted from chapter 5 of Watts Humphrey's book "A Discipline for Software Engineering". It is from him that I have borrowed the idea that size is measured through the use of proxies. The general idea is that, in order to estimate the size of a project, one should have a representative for it (a proxy) that is easier to construct. If one knows the size of the proxy and has good historical data on the relationship between the size of the proxy and the size of the eventual project, one can then estimate the size of the current project.
Before looking at the process in more detail, one should be aware of its shortcomings. First, it provides little guidance when the project is of a completely new type. Secondly, it must be corrected for the tendency of a given individual to either over- or underestimate. Lastly, the process itself takes time, so convincing management that this time is needed may be a problem.
The usual measure of program size is lines of source code (LOC), or more usually KLOC, for one thousand lines of source code. It is important to be consistent in how one measures LOC. Ideally it should be done automatically. For this purpose one does well to adopt a set of standards for laying out source code, including documentation.
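As an illustration of counting LOC automatically, here is a minimal sketch. The counting standard it applies (skip blank lines and lines that contain only a comment) is an assumption made for this example, not a rule taken from Humphrey; the function name and the comment_prefix parameter are likewise hypothetical. Whatever standard is chosen, the point is to apply the same one every time.

```python
def count_loc(source, comment_prefix="#"):
    """Count lines of code in a string of source text.

    Assumed counting standard (illustrative only): a line counts if it is
    not blank and does not consist solely of a comment.
    """
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment-only lines.
        if stripped and not stripped.startswith(comment_prefix):
            count += 1
    return count


if __name__ == "__main__":
    sample = "# header\n\nx = 1\n  # note\ny = x + 1  # trailing comment\n"
    print(count_loc(sample))
```

A line with a trailing comment still counts as code under this standard, because it contains a statement.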
The actual number of LOC is dependent on the language that is used. The following table gives typical figures for the number of LOC required to code the same process in different languages. The measure used is lines of code per function point (FP).
Language                 LOC per FP
Assembler                320
C                        150
COBOL                    106
FORTRAN                  106
Pascal                   91
PL/I                     80
Ada                      71
Prolog                   64
APL                      32
Smalltalk                21
Spreadsheet Languages    6
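The table can be used to turn a function-point count into a rough LOC figure, or to rescale a LOC figure from one language to another. The sketch below copies the figures from the table; the function names are hypothetical, and the results are only as good as the "typical" figures themselves.

```python
# LOC-per-function-point figures copied from the table above.
LOC_PER_FP = {
    "Assembler": 320,
    "C": 150,
    "COBOL": 106,
    "FORTRAN": 106,
    "Pascal": 91,
    "PL/I": 80,
    "Ada": 71,
    "Prolog": 64,
    "APL": 32,
    "Smalltalk": 21,
    "Spreadsheet Languages": 6,
}


def estimate_loc(function_points, language):
    """Scale a function-point count to LOC for the given language."""
    return function_points * LOC_PER_FP[language]


def convert_loc(loc, from_language, to_language):
    """Rescale a LOC figure between languages via their LOC-per-FP ratios."""
    return loc * LOC_PER_FP[to_language] / LOC_PER_FP[from_language]


if __name__ == "__main__":
    print(estimate_loc(100, "C"))
    print(convert_loc(15000, "C", "Smalltalk"))
```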
Function points are extensively used as proxies for code. The idea is due to Albrecht. It is based upon analysis of a large number of actual commercial applications. Most of the applications use screens for data entry and display. The function points are: inputs (Inp), outputs (Out), inquiries (Inq), master files (Maf) and interfaces (Inf).
These categories can be further weighted according to their complexity. An estimate of the size of the project (in FP) is given by
FP = 4 * Inp + 5 * Out + 4 * Inq + 10 * Maf + 7 * Inf
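The formula is easy to apply mechanically. The sketch below reads Inp, Out, Inq, Maf and Inf as counts of inputs, outputs, inquiries, master files and interfaces (an interpretation based on the abbreviations); the class and function names are hypothetical, and the counts in the example are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class FunctionPointCounts:
    """Counts per category; field names are an assumed reading of the formula."""
    inputs: int        # Inp
    outputs: int       # Out
    inquiries: int     # Inq
    master_files: int  # Maf
    interfaces: int    # Inf


def function_points(c):
    """FP = 4 * Inp + 5 * Out + 4 * Inq + 10 * Maf + 7 * Inf."""
    return (4 * c.inputs + 5 * c.outputs + 4 * c.inquiries
            + 10 * c.master_files + 7 * c.interfaces)


if __name__ == "__main__":
    # Made-up counts for a small data-entry application.
    counts = FunctionPointCounts(inputs=10, outputs=8, inquiries=5,
                                 master_files=3, interfaces=2)
    print(function_points(counts))  # 40 + 40 + 20 + 30 + 14 = 144
```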
An alternative proxy is the object line of code (OLC). The idea here is that there is a relationship between the size of the code when written as messages to objects and the final code size. The link between the two is made by estimating the size of the objects. Humphrey shows in his book a good regression line between OLC and LOC.
The size of an object can be estimated in terms of the number of methods that are required to implement the object and an estimate of the category of the object. The following table, taken from Humphrey's book, shows a collection of estimates of object size in LOC per method. Your mileage may vary!
C++ Object Size in LOC per method.

Category       Very Small   Small   Medium   Large   Very Large
Calculation    2.34         5.13    11.25    24.66   54.04
Data           2.60         4.79    8.84     16.31   30.09
I/O            9.01         12.06   16.15    21.62   30.09
Logic          7.55         10.98   15.98    23.25   33.83
Set-Up         3.88         5.04    6.56     8.53    11.09
Text           3.75         8.00    17.07    36.41   77.66
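To show how the table might be used, the sketch below multiplies the number of methods in each planned object by the LOC-per-method figure for its category and size, then sums over the objects. It assumes the five columns run from very small to very large, in that order; the function names and the example objects are hypothetical.

```python
# Column order assumed: very small, small, medium, large, very large.
SIZES = ("very small", "small", "medium", "large", "very large")

# LOC per method, copied from the table above.
LOC_PER_METHOD = {
    "Calculation": (2.34, 5.13, 11.25, 24.66, 54.04),
    "Data":        (2.60, 4.79, 8.84, 16.31, 30.09),
    "I/O":         (9.01, 12.06, 16.15, 21.62, 30.09),
    "Logic":       (7.55, 10.98, 15.98, 23.25, 33.83),
    "Set-Up":      (3.88, 5.04, 6.56, 8.53, 11.09),
    "Text":        (3.75, 8.00, 17.07, 36.41, 77.66),
}


def object_loc(category, size, methods):
    """Estimated LOC for one object: methods times LOC per method."""
    return methods * LOC_PER_METHOD[category][SIZES.index(size)]


def project_loc(objects):
    """Sum the estimates for a list of (category, size, methods) tuples."""
    return sum(object_loc(category, size, methods)
               for category, size, methods in objects)


if __name__ == "__main__":
    # A made-up design with three objects.
    design = [("Data", "medium", 4), ("I/O", "small", 3), ("Logic", "large", 2)]
    print(round(project_loc(design), 2))
```

The figures are one person's historical data for C++, so anyone using this approach would want to replace them with figures from their own past projects.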
Note that this process requires that you design the system before you estimate its size. (You cannot estimate how much a house will cost until you have designed it in broad outline!)
The process can be considerably improved by using statistical methods to track the accuracy of your estimates.
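One simple statistical method, offered here as a sketch and not as the procedure the book prescribes, is to fit a least-squares line through past (estimated size, actual size) pairs and use it to correct new estimates. This also addresses an individual's tendency to over- or underestimate. The function names and the historical figures in the example are hypothetical.

```python
def fit_line(estimated, actual):
    """Least-squares fit of actual = intercept + slope * estimated."""
    n = len(estimated)
    if n < 2 or n != len(actual):
        raise ValueError("need at least two matching data points")
    mean_x = sum(estimated) / n
    mean_y = sum(actual) / n
    sxx = sum((x - mean_x) ** 2 for x in estimated)
    if sxx == 0:
        raise ValueError("estimated sizes must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(estimated, actual))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def corrected_estimate(raw_estimate, intercept, slope):
    """Adjust a raw estimate using the fitted line."""
    return intercept + slope * raw_estimate


if __name__ == "__main__":
    # Made-up history: this estimator tends to underestimate.
    past_estimates = [100, 200, 300]
    past_actuals = [130, 250, 370]
    intercept, slope = fit_line(past_estimates, past_actuals)
    print(round(corrected_estimate(250, intercept, slope), 1))
```

With only a handful of past projects the fitted line is unreliable, so the correction should be treated with caution until more data has accumulated.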