DevOps, ITIL, and Operability

Author: Marco Abis

By identifying Operability as a key concern for both development teams and operations teams, we can bring DevOps and ITIL together in order to build and run resilient, so-called ‘antifragile’ software systems.


In recent years the term ‘DevOps’ has come to refer to a set of practices and approaches to building software that emphasise close collaboration between the various teams involved (especially development teams and operations teams, hence Dev & Ops, or ‘DevOps’). DevOps recognises the harm done by the IT outsourcing craze of the early 21st century (and by ongoing CAPEX/OPEX budget splits), in which the development and operation of software systems are treated as quite separate activities. In the DevOps view they are two equally important parts of producing valuable, working software, and they require close communication between teams in order to avoid costly bugs and outages in Production.

The speed with which IT infrastructure can now be ordered, configured, and made available – thanks in part to virtualisation technologies and ‘cloud’ hosting – has driven a need to define and manage much IT infrastructure using software techniques to take advantage of the rapid time-to-market now possible. Version control, test-driven development, continuous integration, and deployment pipelines – all established software development practices – are increasingly being used to automate and test activities that were once manually undertaken by system administrators: this is termed ‘infrastructure as code’.
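As a rough illustration of ‘infrastructure as code’, the sketch below describes the desired servers as data that can live in version control, and computes the actions needed to reach that state. Everything in it is hypothetical (the `Server` type, the `plan` and `converge` functions, the server names and sizes); a real tool would call a hosting provider's API rather than return a dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    """Hypothetical description of one virtual machine."""
    name: str
    size: str


# Desired state, kept in version control alongside application code.
DESIRED = {
    "web-1": Server("web-1", "small"),
    "web-2": Server("web-2", "small"),
}


def plan(desired, actual):
    """Return the actions needed to move `actual` to `desired`.

    Both arguments are dicts mapping server name -> Server.
    """
    actions = []
    for name, server in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != server:
            actions.append(("resize", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


def converge(desired, actual):
    """Simulate applying the plan (assumption: every action succeeds)."""
    return dict(desired)


if __name__ == "__main__":
    print(plan(DESIRED, {}))
    print(plan(DESIRED, converge(DESIRED, {})))
```

Because the plan is a pure function of desired and actual state, it can be unit-tested and run in a continuous integration pipeline like any other code, and running it a second time after convergence produces no further actions.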

The increase in speed, combined with greater automation, has created a need for strong trust between the different teams and roles involved in software systems, which in turn puts a spotlight on the culture within (or between) organisations. Organisations with a ‘blame culture’ tend to find it difficult or impossible to achieve sustainable, rapid delivery of working software, whereas more open organisations, where learning and cooperation are highly valued, find this easier.

DevOps and Operability

The four ‘pillars’ of practice that support the collaborative DevOps approach are Culture, Automation, Measurement, and Sharing (CAMS), but none of these pillars is a goal of DevOps in itself; they are means to an end. Using the CAMS pillars of DevOps, organisations aim to build and operate software that works well in Production. Since software that works well in Production is by definition operable, good software operability is really one of the goals of DevOps.


The body of valuable knowledge and effective practice contained in ITIL (and other IT Service Management (ITSM) guides) is at its heart pragmatic and well-meaning, being based on real-world situations involving the management of IT services. ITIL/ITSM are often criticised for being too ‘heavy-handed’ with respect to change control, and too ‘anonymous’ with respect to a faceless, ‘hide-behind-tickets’ approach to customers; some people claim that ITIL prevents the rapid and frequent changes which modern organisations demand. However, much of the clumsiness associated with ITIL arguably derives from poor, over-complicated implementations (often by ‘experts’) that have lost sight of the real drivers for auditable change control: better fault diagnosis, rapid restoration of service, cross-service incident response and coordination, etc.

ITIL and Operability

By using aspects of ITIL we can capture and reason about operational criteria of software at a sufficiently early stage to be able to test and modify the operability before software reaches Production. With appropriate conversations and interactions with development teams during Service Design, Service Transition, and Service Operation, we can avoid a tendency towards operational costs increasing over time as ever more byzantine fixes and workarounds are applied to software which is itself not operationally ready.

Furthermore, by heavily automating the acceptance and auditing of Standard Changes, we remove much of the ‘bottleneck’ often caused by process-heavy ITIL implementations, freeing up time for reviewing (and possibly automating) aspects of Major Changes. More time is also available for greater collaboration with the software development teams about forthcoming software changes and deployments, which helps the operability of the software to improve.
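One way to picture the automated acceptance and auditing of Standard Changes is sketched below. It is only a sketch under stated assumptions: the catalogue of pre-approved change types, the `ChangeRequest` fields, and the `review` function are all hypothetical, and a real implementation would sit on top of an organisation's own change management tooling.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical catalogue of pre-approved (Standard) change types.
STANDARD_CHANGE_TYPES = {"restart-service", "rotate-certificate", "deploy-via-pipeline"}


@dataclass
class ChangeRequest:
    change_type: str
    requested_by: str
    tests_passed: bool


@dataclass
class AuditLog:
    entries: list = field(default_factory=list)

    def record(self, change, decision):
        """Store an auditable record of every decision, automated or not."""
        self.entries.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "change_type": change.change_type,
            "requested_by": change.requested_by,
            "decision": decision,
        })


def review(change, log):
    """Auto-approve known Standard Changes; route everything else to people."""
    if change.change_type in STANDARD_CHANGE_TYPES and change.tests_passed:
        decision = "auto-approved"
    else:
        decision = "needs-human-review"
    log.record(change, decision)
    return decision


if __name__ == "__main__":
    log = AuditLog()
    print(review(ChangeRequest("restart-service", "alice", True), log))
    print(review(ChangeRequest("migrate-database", "bob", True), log))
```

Every request leaves an audit entry whether or not a person was involved, so the drivers for auditable change control are still served while human attention is reserved for the changes that need it.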

