Using an Agile Waterfall hybrid to manage a major Collaborative Computational Project
Collaborative Computational Project Number 4 (CCP4) in Protein Crystallography was set up in 1979 to support collaboration between researchers working in structural biology, and to assemble a comprehensive collection of software to satisfy the computational requirements of relevant UK groups.
Demand gave rise to the CCP4 program suite, now distributed to academic and commercial users worldwide.
Taking a lead role
Scientist Eugene Krissinel from the Science and Technology Facilities Council (STFC) Scientific Computing Department leads the CCP4 core team, project managing the vast volume of collaborative software development and its distribution. “I am responsible for CCP4 infrastructure, software distribution, and everything which goes from CCP4 to users, including some program development,” he says.
CCP4 is a well-known and respected open collaboration with a very good reputation and large numbers of users – upward of 25,000 worldwide.
The project now has a mature agile management style with an Executive Committee to drive targets, and two working groups to advise on software requirements and user needs.
Challenges of the project
“The software suite grew very fast and now the size and complexity is comparable to a Linux distribution, and is managed by only a handful of people.” Eugene Krissinel.
One of the first things that Eugene needed to address when he joined the CCP4 team in 2009 was the issue of size: the volume of software to be distributed was more than the resources and technology of the time could manage. The suite had grown so large that the purely technical work of managing it, from archiving, compilation and testing to packaging and distribution, was taking all of the core team's effort. This was a considerable problem, so his first goal was to propose a more efficient way of handling the software.
The team adopted technologies used by Linux maintainers, which enabled them to develop automatic software management pipelines and introduce hot updates, so CCP4 updates just like an operating system.
This is something Eugene designed, and it took about three years to implement to the stage where it became the team's established modus operandi. “It took quite a sizeable development of new graphical installers, updaters and new pipelines,” he said. “Those pipelines are big because we have about 10 million lines of code.”
With such a huge infrastructure there is a lot to manage, and the way forward was to automate certain processes. Eugene explains: “Regression testing of our software is an ongoing problem but now it’s completely automatic and happens every night.”
Achieving the project's shared goal takes a great deal of communication and collaboration. The CCP4 team links the research community and developers, making sure that users’ feedback reaches program authors. The volume of code to be distributed is considerable, so this takes a lot of time and effort.
Eugene uses an agile style of management to organise the project. He works from a coarse-grained plan, and progress on tasks within projects is tracked through regular group meetings. Usable outputs are discussed with stakeholders, giving the team a continuous stream of deliverables.
Eugene highlights the importance of good working relationships and mutual respect. His management style is to give team members assignments that play to their strengths as well as matching the project’s needs. The team has a diverse set of skills and interests, and together they successfully deal with a wide variety of tasks, from scientific problems to very technical ones, and from mundane jobs to very creative ones.
The project's benefits are identified by monitoring software updates. If the research community likes an update, they will use the software, and this shows in the download and start-up statistics for the programs. These statistics are collected only from academic users (not industrial users), and the information is completely anonymous.
The theory is that if academics are happy, then industry will listen. The more industry uses the software, the more sustainable the project's funding becomes. The number of industrial licences is a crucial indicator of financial health. Currently CCP4 sells on average 140 industry licences per year, and that number is growing.
Feedback is key, and the CCP4 team has always been very strong on communicating with the community directly. They support the CCP4 ‘bulletin board’, a mailing list of about 8,000 subscribers who post between 20 and 100 messages each day. They also have a dedicated line for submitting bug reports, which are frequent and dealt with quickly. “If this line is completely silent I would personally worry because there are always bugs. If nobody is talking to us about them or thinks we can’t be reached, that becomes a big problem,” said Eugene.
The project is a great example of agile management because it focuses on regular, direct discussion between the development team and the users, with two-week continuous delivery slots. There is also a strong emphasis on stakeholder communication and reciprocal respect within the industry.
CCP4’s success can be attributed in part to generous industry support. Its roots are in drug research and its industrial customers are all big pharma companies. By purchasing software licences, these companies provide important funding to ensure the continuity of the project. Other funding comes from competitive grants, and STFC’s Scientific Computing Department provides the overall setup and home for the project.
Improvement going forward
Despite its success, the team is always trying to improve. Going forward, Eugene would like to see an easier process for supporting short-term activities.
A little more financial autonomy would also benefit the project, especially when purchasing hardware, which can be slowed by the many channels a purchase has to pass through.
This project, like so many others, has been affected by COVID-19: less spending has occurred, and the funds do not automatically carry over to the following financial year.
Rising to challenges is something Eugene and his colleagues take in their stride, though. CCP4 is hugely successful, something that is borne out by its longevity, its ever-evolving software, its growing community of users and high demand from industry. Importantly, CCP4 software was used to solve the first COVID-19 virus structures. Taking the agile approach to managing the project has brought a further advantage: it has increased the bandwidth of dialogue between the development team and the users.