Can you combine Agile software development with safety-critical certification requirements?

Posted by Vincent Lambercy

It’s easy to think that software development is like creating a new aeroplane. First, you agree the specifications - range, speed etc. Then you make detailed blueprints. A prototype emerges. Testing begins. Eventually, the plane enters service and goes through regular maintenance and occasional upgrades.

This metaphor describes the ‘waterfall’ approach to software development: requirements, design, implementation, testing and maintenance.

This structured approach has an intuitive appeal and a long history when considering safety-critical systems like air traffic management software. Careful requirements-gathering, detailed design and extensive testing are essential to ensure software quality: plan the code, code the plan.

The challenges of traditional development

[Figure: cartoon contrasting the 'specification' for a garden swing with its comically wrong 'implementations']

However, heavily structured waterfall methodologies come at a cost.

  • They can easily ignore the real needs of users. “What I had in mind as an ATCO was very difficult for a developer to understand,” says Benoit Maffioli, a former air traffic controller (ATCO) who is now in charge of framing ATC decision support tools at Skyguide, the Swiss air navigation service provider (ANSP). He says that traditional methods were “not good at shaping the correct requirements.”
  • They can be inflexible. A NASA paper describing a waterfall approach to mission control software says, “Six months is a long time from specification to delivery. Over each six-month period, we fell out of touch with our customers. Our expectations diverged, creating a mismatch that resulted in disappointment and frustration for both the developers and our NASA colleagues to whom we delivered.”
  • They downplay the creativity and skill of software developers. By reducing coding to the delivery of carefully specified modules with carefully scripted testing requirements, developers have a more limited role in the process. In part, this is deliberate because it reduces the scope for mistakes. But it also reduces the opportunity for creativity, problem-solving and efficient coding.

    The Agile alternative

    These concerns led to the publication of the Agile Manifesto in 2001. (The Atlantic covers the story of its creation in a fascinating article.)

    The manifesto opens by stating, “We are uncovering better ways of developing software by doing it and helping others do it.” It has four core values:

    • Individuals and interactions over processes and tools
    • Working software over comprehensive documentation
    • Customer collaboration over contract negotiation
    • Responding to change over following a plan

    In short, Agile emphasises collaboration, flexibility, and iterative development.

    Skyguide has been experimenting with Agile since 2017, and the approach has since spread through the whole organisation.

    Benoit Maffioli describes an example of this process in action while working on a conflict detection function. Because ATCOs need to develop trust in a critical system like this - showing too many warnings (or too few) would quickly undermine confidence - getting regular ATCO input into the evolution of the software was critical.

    He explains that “by sitting next to each other and looking [at the software] on a weekly basis, it was really helpful in terms of quality and avoiding developing things that were not really usable. The HMI that was implemented was very close to what I was expecting. That was not the case before.” The main benefits are improved functionality, software quality and user acceptance.

    Changing focus from release dates to functionality

    Another important benefit is the ability to “desynchronise delivery and implementation,” as Benoit Maffioli explains.

    With the traditional waterfall methodology, development is structured around specific delivery dates, which can lead to a reduction in scope as the launch date approaches and features aren’t ready. In this model, the constant is the date.

    Agile development, on the other hand, has a regular cadence of updates (for some software this can be weekly or even daily), but features are released when they are ready. They just slot into the next available release. This means that there is no need to reduce the scope to meet deadlines, and features can be tested more thoroughly before they are integrated into the live environment. With this model, the constant is functionality.
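The "next available release" idea can be made concrete with a small sketch. It assumes a fixed weekly cadence and a single readiness date per feature. The names `Feature`, `release_dates` and `assign_to_releases` are hypothetical and do not describe Skyguide's actual tooling.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Feature:
    name: str
    ready_on: date  # hypothetical: the day the feature finished testing


def release_dates(start: date, cadence_days: int, count: int) -> list[date]:
    """Return `count` release dates spaced `cadence_days` apart."""
    return [start + timedelta(days=cadence_days * i) for i in range(count)]


def assign_to_releases(features: list[Feature], releases: list[date]):
    """Slot each feature into the first release on or after its readiness date.

    Features that are not ready before the last planned release are returned
    separately; they wait for a later release instead of being cut or rushed.
    """
    plan: dict[date, list[str]] = {r: [] for r in releases}
    waiting: list[str] = []
    for feature in features:
        target = next((r for r in releases if r >= feature.ready_on), None)
        if target is None:
            waiting.append(feature.name)
        else:
            plan[target].append(feature.name)
    return plan, waiting


# Illustrative data only: three weekly releases and three features.
releases = release_dates(date(2024, 1, 1), cadence_days=7, count=3)
features = [
    Feature("conflict warning tuning", ready_on=date(2024, 1, 3)),
    Feature("HMI layout change", ready_on=date(2024, 1, 8)),
    Feature("new decision support tool", ready_on=date(2024, 1, 20)),
]
plan, waiting = assign_to_releases(features, releases)
for release_day, names in plan.items():
    print(release_day, names)
print("waiting for a later release:", waiting)
```

In this sketch the release dates never move. A feature that misses one release simply joins the next, so the release date stops being a reason to cut scope.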

    By analogy, waterfall is like publishing a novel, while Agile is like publishing a magazine or a newspaper.

    How to reconcile safety and agility

    One of the main challenges associated with using Agile development in safety-critical systems is the need to exhaustively document what is being produced.

    Switzerland's Federal Office of Civil Aviation (FOCA) and other regulators require this documentation, and it can be challenging to document individual features while keeping an overview of the whole system.

    Maffioli described how Skyguide recently improved its documentation as a result of an audit by the regulator. “It’s always a matter of being aware. Agile is not just a recipe that you implement,” says Maffioli. “You need to take time to look at how you are organised and identify points of failure in order to improve over time the way you [work with it].”

    Embracing Agile

    As Skyguide’s example shows, even safety-critical development can embrace Agile. The promised benefits include better-quality functionality, increased responsiveness to user needs and a more regular schedule of software releases. It takes work to make the shift, especially around regulatory compliance and documentation, but it’s clear that air traffic management (ATM) software developers can use Agile to deliver better software.