One of the best ways to determine realistic expectations for any project is to examine historical data. Where have you done something similar? How long did it take? Where did the spikes in activity occur? Obtaining these answers will aid in establishing baselines for how the organization can feasibly complete new and future projects. If you have not been collecting this data all along, it may sound like an overwhelming journey on which to embark. But it doesn't have to be.
There are actually only a few pieces of information that you need to collect to get started:
- Size: A measure of the amount of functionality or value delivered in a system. This can be measured in Source Lines of Code (SLOC), Function Points, User Stories, Business or Technical Requirements, or any other quantifiable artifact of product size.
- Time: The duration required to complete the system (i.e. schedule). This is typically measured in months, weeks or days.
- Effort: A measure of the resources expended. This is typically measured in Person Hours or Person Months of work and is related to staffing as well as cost. The total effort multiplied by the labor rate is typically used to determine the project cost.
- Quality: The reliability of the system. This is typically measured by the number of errors injected into a system, or by the Mean Time to Defect (MTTD): the elapsed time the system can run between discoveries of new errors.
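The four core metrics above, together with the effort-times-rate cost rule, can be sketched as a simple project record. The field names and figures here are hypothetical, invented purely for illustration:

```python
from dataclasses import dataclass

@dataclass
class ProjectRecord:
    """One completed project's core metrics (illustrative field names)."""
    name: str
    size_sloc: int          # Size: source lines of code delivered
    duration_months: float  # Time: schedule in calendar months
    effort_pm: float        # Effort: person-months expended
    defects: int            # Quality: errors found in the system

    def cost(self, labor_rate_per_pm: float) -> float:
        # Total effort multiplied by the labor rate gives project cost.
        return self.effort_pm * labor_rate_per_pm

# Hypothetical completed project
billing = ProjectRecord("Billing System", size_sloc=50_000,
                        duration_months=10.0, effort_pm=60.0, defects=120)
print(billing.cost(labor_rate_per_pm=15_000))  # -> 900000.0
```

Collecting even this minimal record for each completed project is enough to begin building a historical dataset.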
Once the size, schedule, effort and quality data have been collected, a fifth metric -- the Productivity Index (PI) -- can be calculated. The PI is a measure that reflects both the capability of the development organization and the complexity of the system. It is measured in Index Points ranging from 0.1 to 40 and takes into account a variety of factors including personnel, skills and methods, process factors and reuse.
What is an ideal PI to shoot for in your projects? That depends on the size and type of software that you're building. Typically business systems, such as billing systems or online banking portals, have the highest productivities, averaging around 20 Index Points. These systems are relatively simple and straightforward to build because they have lower reliability requirements than other software applications. Engineering systems and real-time embedded systems have lower productivities, ranging from 15-18 Index Points and 10-12 Index Points respectively (see Figure 1). This does not mean that software engineers working on real-time systems are less productive; the lower PI reflects the greater complexity of the work and the need for more thorough testing compared with business systems. The size of the application also impacts the productivity rating. Smaller projects, or those with less functionality, will have lower productivities than larger projects.
In short, a lot of factors need to be taken into account to assess the productivity of a project, but we’ll get more into that later.
Figure 1. Typical PI Ranges for Different Application Types.
Together, these five metrics give a complete view of the project, which can be used to assess its performance. In order to establish a true baseline, a broad-reaching sample of historic performances is preferred (see Figure 2). However, it is better to start with something rather than nothing at all, so begin by creating your baseline with whatever is practical and then build on that as you realize success from newly completed projects.
Figure 2. Build datasets to include a sample of historical data at a variety of project sizes.
THE INTERNAL BASELINE
Once a repository of an organization's completed projects has been established, custom trend lines can be calculated to use for creating the baseline. These trend lines serve as a reference point for comparing projects within your organization. Where your projects fall relative to the trend line will indicate better or worse performance than the average. This will give insight into your organization's current capabilities.
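One simple way to compute such a trend line, shown here as a sketch rather than QSM's actual method, is a least-squares fit of log(effort) against log(size), a common shape for software trend lines. The historical data points below are invented for illustration:

```python
import math

# Hypothetical completed projects: (size in KSLOC, effort in person-months)
history = [(10, 20.0), (25, 60.0), (50, 140.0), (100, 330.0), (200, 800.0)]

# Least-squares fit in log-log space: log(effort) = a + b * log(size)
xs = [math.log(s) for s, _ in history]
ys = [math.log(e) for _, e in history]
n = len(history)
xbar, ybar = sum(xs) / n, sum(ys) / n
b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
    sum((x - xbar) ** 2 for x in xs)
a = ybar - b * xbar

def expected_effort(size_ksloc: float) -> float:
    """Trend-line effort for a given size. Projects above this line spent
    more effort than the organization's average; those below spent less."""
    return math.exp(a) * size_ksloc ** b
```

Comparing a new project's actual effort against `expected_effort` for its size gives the "relative to the trend line" reading described above.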
Understanding the baseline for your current development operation can help set reasonable expectations for future projects by showing what has been accomplished in the past. If the desired project parameters push the estimate into uncharted territory, you can use the historical baseline to negotiate for something more reasonable. This baseline can also be used for contract negotiation, evaluating bids and vendor performance, and navigating customer constraints, thus allowing you to achieve your cost reduction and process improvement goals.
Figure 3. Project baseline and outliers.
We can learn a lot about what we do well and what we can improve upon from looking at our projects relative to the baseline. Examining how far a project deviates from the various trends can help isolate best- or worst-in-class performances. Project outliers can also provide great insight into this. Figure 3 displays a company's project portfolio against its baseline average productivity. Two of the projects stand out, falling outside two standard deviations, one above the average and one below. Examining the factors that influenced these projects (e.g., new technology, tools and methods, personnel, or project complexity) will help shed some light on why these projects performed so well or so poorly. Mimicking what went well for best-in-class projects and avoiding what didn't for the worst-in-class projects can help improve the performance of future projects.
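The two-standard-deviation outlier check can be sketched in a few lines. The portfolio PIs below are made up for illustration; only the flagging logic is the point:

```python
from statistics import mean, stdev

def flag_outliers(pi_by_project: dict, k: float = 2.0) -> dict:
    """Return projects whose PI deviates more than k standard deviations
    from the portfolio average, labelled best- or worst-in-class."""
    avg = mean(pi_by_project.values())
    sd = stdev(pi_by_project.values())
    flagged = {}
    for name, pi in pi_by_project.items():
        if abs(pi - avg) > k * sd:
            flagged[name] = "best-in-class" if pi > avg else "worst-in-class"
    return flagged

# Hypothetical portfolio of Productivity Index values
portfolio = {"A": 17.0, "B": 17.2, "C": 17.5, "D": 17.6, "E": 17.8,
             "F": 18.0, "G": 17.4, "H": 17.9, "I": 17.3, "J": 17.7,
             "K": 25.0, "L": 10.0}
print(flag_outliers(portfolio))  # K above the average, L below it
```

Projects K and L here play the role of the two outliers in Figure 3; they are the ones whose influencing factors deserve a closer look.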
While the internal baseline provides valuable insight into the practices of an individual organization, we have the market to think about as well. The internal project standings within an organization may not matter if they’re not competitive with the industry. Therefore, it is important to have an external comparison for your project data.
When using these trend lines to determine performance, it is important to examine the project holistically. Judging a project's performance on one metric can be very misleading because of the tradeoffs that occur in software development. For instance, a project may have been viewed favorably because it was delivered quickly. However, looking beyond the schedule may reveal that the project did not perform well overall.
Figure 4. Project delivered 4.7 months earlier than average.
Figure 4 shows a project that was delivered 4.7 months ahead of the industry average, an accomplishment that is often viewed favorably by management because it provides an advantage over the competition. While speed of delivery may be desirable in some circumstances, compressing the schedule unnecessarily can have tradeoffs, including higher costs and lower quality.
Figure 5 shows how the project performed in other areas such as productivity, staffing, effort expended, and the number of defects present during testing. These graphs tell a very different story.
Figure 5. Holistic view of project shows that the effort, staffing, and defects are higher than industry average.
While the productivity of this project was comparable with the industry average, the peak staffing and effort expended were drastically higher than what is typical for other projects of similar scopes. This directly translates into higher project costs to pay for the additional labor hours.
Additionally, the defects present in the system were considerably higher than what is typical for the industry (see Figure 5). This is likely a result of the higher staffing. When more people work on a project, there is a greater chance that miscommunication between team members could lead to errors being injected into the system. Also, utilizing more people further divides the code base, which can result in more defects during integration.
Figure 6. The Five Star Report.
Another way to look at overall historic performance is with Five Star Report views, like the one shown in Figure 6. These can be used to rate the overall performance of a project or group of projects. For each metric, the project is given a rating of 1-5 stars, with one star being the worst performance and five stars being the best. The Composite Project Star Rating column on the right gives an overall grade to the project by factoring in the performance of all the individual metric scores. This value helps determine what effects adjusting the staffing or schedule will have on the overall project rating. Here it is easy to see when a project staffs up in order to compress its schedule. In such a situation, the project would have a duration rating of 4 or 5 stars but a staffing and effort rating around 1 or 2 stars. The opposite can also occur, thus indicating a project that used fewer than the average number of staff and lengthened the schedule duration. Ideally, project managers should shoot for an average of 3 or more stars for their projects.
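The article does not give the exact weighting behind the Composite Project Star Rating, so the sketch below simply averages the per-metric stars; the metric names and scores are hypothetical:

```python
def composite_rating(stars: dict) -> float:
    """Overall grade as a plain average of per-metric star ratings
    (an assumed scheme, not QSM's actual formula)."""
    return round(sum(stars.values()) / len(stars), 1)

# A project that staffed up to compress its schedule: a strong duration
# rating paired with weak staffing and effort ratings.
crash_project = {"duration": 5, "staffing": 1, "effort": 2,
                 "productivity": 3, "quality": 2}
print(composite_rating(crash_project))  # -> 2.6
```

Even under this simple averaging, the schedule-compressed project lands below the 3-star target, which is exactly the tradeoff the Five Star Report is designed to expose.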
Establishing a baseline will eliminate much of the up-front ambiguity and will provide detailed recommendations based on quantifiable data. As various organizations strive for improvement, knowing where they stand relative to the competition is important. A company with a lower than average productivity will have different goals and implement different process improvement measures than one that is average, or better than average. Knowing where you stand as an organization can help you determine the most appropriate measures to take, and decide the best method for moving forward. With this data you will empower the organization to move toward fact-based decision-making, thereby maximizing the possibility of having successful project outcomes.
ABOUT THE AUTHOR(S)
Doug Putnam is Co-CEO for Quantitative Software Management (QSM) Inc. He has 35 years of experience in the software measurement field and is considered a pioneer in the development of this industry. Mr. Putnam has been instrumental in directing the development of the industry-leading SLIM Suite of software estimation and measurement tools, and is a sought-after international author, speaker and consultant. His responsibilities include managing the delivery of QSM software measurement services, defining requirements for the SLIM Product Suite, and overseeing the research activities derived from the QSM benchmark database.
Taylor Putnam-Majarian is a Consulting Analyst at QSM and has over seven years of specialized data analysis, testing, and research experience. In addition to providing consulting support in software estimation and benchmarking engagements to clients from both the commercial and government sectors, Taylor has authored numerous publications about Agile development, software estimation, and process improvement, and is a regular blog contributor for QSM. Most recently, Taylor presented research titled Does Agile Scale? A Quantitative Look at Agile Projects at the 2014 Agile in Government conference in Washington, DC. Taylor holds a bachelor’s degree from Dickinson College.