Research Software Engineering (RSE) applies professional software development practices in contexts where the end goal is not the software itself but a research output. Done well, the software becomes a tangible output in its own right, eligible for a DOI, publication and reuse to underpin other research.

All but one* of the Russell Group universities have dedicated RSEs, normally in centralised groups. Some institutions have dozens: the University of Manchester, for example, has a central group of 24 RSEs (up from 17 in 2018), with separate RSE groups within specialised centres. Nor is this specific to universities; the model is widely adopted within research facilities such as the UK Atomic Energy Authority and the Alan Turing Institute. These groups are typically partly core-funded and partly costed into grant applications, with a view to being cost neutral overall.

If a grant proposal involves a significant amount of software development but the PI/Co-Is have no track record in research software, review panels could legitimately question the team’s ability to deliver the project successfully. Having dedicated RSE resource attached to the project addresses this concern directly. Beyond individual projects, software engineering training delivered by an RSE can take researchers from being ‘coders’ who develop paperware (i.e., software used only to publish one paper) to producing properly engineered software. This can improve measures of impact and outcomes through release, open-sourcing or commercialisation. Feeding back to the point above, it also makes software development capability easier to demonstrate to review panels through open metrics such as downloads, forks and citations.

There is a major push, as part of the FAIR principles, towards recognition for the software artefacts that are essential in producing the results that underlie published articles. Indeed, Jisc1 make the following recommendations:

“Treat computer code like any other output of your research… Share your computer code like you would any other research output… Computer code should have a URL or a DOI (digital object identifier)… Always include these when citing the code, including information on the version you used.”
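As a purely illustrative sketch of what such a citation can look like (every name, version and identifier below is a placeholder rather than a real record), a biblatex @software entry captures the URL, DOI and version that Jisc recommend:

    @software{example_solver_2021,
      author  = {Smith, Jane},
      title   = {example-solver},
      version = {1.2.0},
      date    = {2021},
      doi     = {10.5281/zenodo.0000000},
      url     = {https://github.com/example/example-solver},
    }

Recording the exact version is the crucial detail: it is what lets a reader rerun the analysis against the same code that produced the published results.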

Specific journals exist to peer-review and publish research software (e.g., Journal of Software: Practice & Experience (impact factor: 2.028), Nature Toolbox, SoftwareX, Journal of Open Research Software), along with a growing number of ‘artifacts’ tracks at large conferences. The ACM have begun adding ‘badges’ to software deposited within these tracks, awarding levels of reusability based on code quality. Publishing code under this peer-review process pushes authors to write code to a higher standard. Better software means better research: greater research integrity, reproducibility and rigour.

Often within the PhD lifecycle, students are presented with a research area and expected not only to define their research gap but also to build the tools for producing research data, including software. Without training in the principles of software engineering needed to minimise technical debt, research software can become tightly coupled to an individual researcher or project, limiting reuse and incremental development. This leads to unmaintainable, unsustainable code and a loss of momentum and knowledge when its author moves on.

If a 3–4-year PhD student or a 1–2-year postdoc produces code and data that are hard to find, reproduce and reuse, incremental improvement on their results becomes very difficult. You are left with a paper, sometimes the data, but rarely the software needed to replicate or reproduce the results. Academia values journal publications, so that is what is encouraged and shared, when better science would involve sharing the whole methodology.

* No information could be found for the London School of Economics, though the University of London has several groups.