RECOMMENDATIONS FOR
Software
Disclaimer
The field of software engineering offers numerous guidelines and standards for developing robust and usable software. Your choice of architectures, technologies, and development tools should align with your software’s intended purpose and development context. While our recommendations are general and primarily focus on ensuring software FAIRness—particularly regarding metadata and reusability—we have included references to widely accepted best practices and standards to guide scholars with limited software development experience who wish to enter this field.
1. Before creating new software from scratch, investigate existing similar solutions and explore opportunities to further develop or adapt them, promoting the reuse and enhancement of existing resources.
IDENTIFY
Example: forks on GitHub of the EVT software.
2. Involve domain experts in software design and apply software engineering methodologies and best practices, in order to create robust software that is easy to use, maintain and further develop.
PLAN PRODUCE
- Utilise documented and shared design patterns (Gamma et al. 1994).
- When applying the object-oriented programming paradigm, follow the SOLID principles (Silén 2024).
- For complex software, implement the “domain-driven design” approach (Evans 2004).
- Organise code into modules to facilitate the reuse of individual components.
- Adopt DevOps practices to streamline development and deployment processes (plan, code, build and test, release, deploy, operate, and monitor) (Silén 2024).
- Ensure that all software dependencies, whether libraries, frameworks, or operating system components, are clearly documented and managed. This also includes defining the operational requirements, such as minimum and optimal hardware resources (e.g., CPU, RAM, disk space) needed to ensure that the software works properly.
- Integrate a structured testing phase as part of the software development process, establishing clear metrics and goals to determine testing success.
Developers working in the DH field can follow and join the activities of DHTech, an ADHO special interest group aimed at supporting the development and reuse of software in the Digital Humanities.
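The recommendation above to integrate a structured testing phase with clear metrics and goals can be sketched as follows. This is a minimal illustration using Python's standard `unittest` module, not a prescribed setup: the function `normalise_whitespace`, the helper `meets_goal`, and the pass-rate threshold are all hypothetical names and values chosen for the example.

```python
import io
import unittest


def normalise_whitespace(text: str) -> str:
    """Collapse every run of whitespace into one space and trim both ends."""
    return " ".join(text.split())


class NormaliseWhitespaceTests(unittest.TestCase):
    def test_collapses_internal_runs(self):
        self.assertEqual(normalise_whitespace("a  b\t\nc"), "a b c")

    def test_trims_ends(self):
        self.assertEqual(normalise_whitespace("  a b  "), "a b")

    def test_empty_input(self):
        self.assertEqual(normalise_whitespace(""), "")


def run_suite() -> unittest.TestResult:
    """Run the test case and return the collected result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormaliseWhitespaceTests)
    # The report is written to an in-memory buffer to keep the run quiet.
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite)


def meets_goal(result: unittest.TestResult, required_pass_rate: float = 1.0) -> bool:
    """Hypothetical success metric: the share of passing tests reaches the goal."""
    failed = len(result.failures) + len(result.errors)
    passed = result.testsRun - failed
    return result.testsRun > 0 and passed / result.testsRun >= required_pass_rate
```

Calling `meets_goal(run_suite())` turns the agreed goal into a single yes/no answer that a build script could act on. Real projects often add further metrics, such as code coverage, which this sketch does not measure.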
3. Define and implement software integration strategies with the goal of achieving a cohesive, scalable and maintainable software ecosystem, minimising the risks of incompatibility and the efforts required for adaptation.
PLAN PRODUCE
Such strategies also make format migration easier to handle. They can be achieved with the following steps:
- Define the integration approach (for example, API-based or file exchange) and adopt standard protocols to facilitate communication.
- Ensure interoperability and compatibility between different systems by considering standard data formats and structured schemas.
- Plan strategies for handling errors and malfunctions.
- Ensure scalability and the ability to handle increased load without compromising overall performance.
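The steps above on structured schemas and error handling can be illustrated with a small file-exchange sketch. It assumes two systems that exchange JSON records. The schema `RECORD_SCHEMA`, the `Record` type, and the `ExchangeError` exception are hypothetical names invented for this example, and a real project might prefer an established schema language instead of this hand-written check.

```python
import json
from dataclasses import dataclass


class ExchangeError(ValueError):
    """Raised when an incoming record does not match the agreed format."""


# Hypothetical exchange schema: field name -> expected Python type.
RECORD_SCHEMA = {"id": str, "title": str, "year": int}


@dataclass(frozen=True)
class Record:
    id: str
    title: str
    year: int


def parse_record(payload: str) -> Record:
    """Validate a JSON payload against RECORD_SCHEMA and build a Record."""
    try:
        data = json.loads(payload)
    except json.JSONDecodeError as exc:
        raise ExchangeError(f"payload is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ExchangeError("payload must be a JSON object")
    for field, expected in RECORD_SCHEMA.items():
        if field not in data:
            raise ExchangeError(f"missing field: {field}")
        # Exact type match, so that JSON true/false is not accepted as an integer.
        if type(data[field]) is not expected:
            raise ExchangeError(f"field {field!r} must be of type {expected.__name__}")
    # Fields outside the schema are ignored rather than rejected.
    return Record(**{field: data[field] for field in RECORD_SCHEMA})


def serialise_record(record: Record) -> str:
    """Write a Record back to the agreed JSON form with stable key order."""
    return json.dumps(
        {"id": record.id, "title": record.title, "year": record.year},
        sort_keys=True,
    )
```

Raising one dedicated exception type for every format problem gives the receiving system a single, predictable failure mode to handle, which is one possible way to plan for errors and malfunctions.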
4. Employ standard and non-proprietary programming languages and technologies to develop tools, ensuring greater longevity and easier maintainability.
PRODUCE
- W3C standards for web development.
- Community Development of Java Technology Specifications.
Choose a programming language with mature libraries that can ease the development and maintenance of your software. For example, use Python for NLP software development to easily integrate available tools.
5. When possible, develop in open source and foster collaborative development.
PRODUCE DEPOSIT
- Write clear, comprehensive code comments.
- Provide guidelines for contributing to software development.
- Utilise repositories such as GitHub that foster collaboration among developers.
- Follow shared methodologies and strategies for versioning (e.g., Semantic Versioning) and branching (e.g., GitFlow workflow).
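To make the Semantic Versioning recommendation concrete, the sketch below parses the core `MAJOR.MINOR.PATCH` form, compares versions by precedence, and computes the next version for a given kind of change. It is a simplified illustration: pre-release and build-metadata suffixes are deliberately left out, and the names `Version`, `parse`, and `bump` are hypothetical.

```python
import re
from typing import NamedTuple

# Core MAJOR.MINOR.PATCH only, without leading zeros.
# Pre-release and build metadata are out of scope for this sketch.
_CORE = re.compile(r"^(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)$")


class Version(NamedTuple):
    major: int
    minor: int
    patch: int

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def parse(text: str) -> Version:
    """Parse 'MAJOR.MINOR.PATCH' into a Version, rejecting anything else."""
    match = _CORE.match(text)
    if match is None:
        raise ValueError(f"not a core semantic version: {text!r}")
    return Version(*(int(part) for part in match.groups()))


def bump(version: Version, change: str) -> Version:
    """Return the next version for a 'major', 'minor', or 'patch' change."""
    if change == "major":  # incompatible API change
        return Version(version.major + 1, 0, 0)
    if change == "minor":  # backwards-compatible new functionality
        return Version(version.major, version.minor + 1, 0)
    if change == "patch":  # backwards-compatible bug fix
        return Version(version.major, version.minor, version.patch + 1)
    raise ValueError(f"unknown change type: {change!r}")
```

Because `Version` is a tuple of integers, ordinary comparison follows numeric precedence, so `parse("1.4.2") < parse("1.10.0")` holds even though the same comparison on the raw strings would not.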
6. Release software officially through freely accessible channels (e.g., GitHub), providing detailed and user-friendly documentation.
PRODUCE DEPOSIT
With each released version, always attach a changelog document that provides a clear and organised chronology of updates, improvements, bug fixes, and other changes.
- State your software licence clearly.
- OSI (Open Source Initiative) Approved licences.
- SPDX License List: a complete list of licences used for software artefacts.
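One common way to state a licence clearly inside the code itself is an `SPDX-License-Identifier:` comment near the top of each source file. The sketch below extracts that tag from a file's text. The function name `find_spdx_expression` and the choice to scan only the first ten lines are assumptions made for this example, and the function does not check the identifier against the SPDX list.

```python
import re
from typing import Optional

# Matches the tag inside a line comment ("# ...", "// ...") or a one-line
# block comment ("/* ... */"); a closing "*/" is not part of the expression.
_SPDX_TAG = re.compile(
    r"SPDX-License-Identifier:\s*([A-Za-z0-9.+() -]+?)\s*(?:\*/)?\s*$",
    re.MULTILINE,
)


def find_spdx_expression(source: str, max_lines: int = 10) -> Optional[str]:
    """Return the SPDX licence expression declared in the file header, if any.

    Only the first `max_lines` lines are inspected (an assumed convention).
    """
    head = "\n".join(source.splitlines()[:max_lines])
    match = _SPDX_TAG.search(head)
    return match.group(1) if match else None
```

For example, a file beginning with `# SPDX-License-Identifier: MIT` yields `"MIT"`, while a file without the tag yields `None`, which a release checklist could treat as a missing licence statement.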
7. Publish your released research software in a trusted scholarly repository (e.g., Zenodo, HAL) with rich metadata to ensure citability and credit to the development team.
DEPOSIT
- To prepare metadata, follow the Research Software MetaData Guidelines.
- Consider depositing your software source code in the Software Heritage Archive.
Software Heritage is an international organisation that collects and preserves software in source code form, recognising software as a valuable part of our cultural heritage that embodies technical and scientific knowledge. You can use Software Heritage services to store and preserve your own software or to search and access archived software. For details, see the Software Heritage Documentation.
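As an illustration of what rich, machine-readable metadata can look like, the sketch below builds a minimal `codemeta.json` record. CodeMeta is one vocabulary commonly used for describing research software; treating it as the right choice for a given project is an assumption here. The helper names `build_codemeta` and `to_json` are hypothetical, as are the example values, which are placeholders rather than a real project.

```python
import json
from typing import Dict, List, Tuple

# JSON-LD context of the CodeMeta 2.0 vocabulary.
CODEMETA_CONTEXT = "https://doi.org/10.5063/schema/codemeta-2.0"


def build_codemeta(
    name: str,
    version: str,
    description: str,
    license_url: str,
    repository_url: str,
    authors: List[Tuple[str, str]],
) -> Dict[str, object]:
    """Assemble a minimal CodeMeta record; authors are (given, family) pairs."""
    return {
        "@context": CODEMETA_CONTEXT,
        "@type": "SoftwareSourceCode",
        "name": name,
        "version": version,
        "description": description,
        "license": license_url,
        "codeRepository": repository_url,
        "author": [
            {"@type": "Person", "givenName": given, "familyName": family}
            for given, family in authors
        ],
    }


def to_json(record: Dict[str, object]) -> str:
    """Serialise the record as indented JSON, ready to save as codemeta.json."""
    return json.dumps(record, indent=2, ensure_ascii=False)


# Placeholder values for illustration only.
example = build_codemeta(
    name="ExampleTool",
    version="1.0.0",
    description="A placeholder description of what the tool does.",
    license_url="https://spdx.org/licenses/MIT",
    repository_url="https://example.org/exampletool.git",
    authors=[("Ada", "Example")],
)
```

Keeping such a file under version control next to the source code means the metadata is updated and released together with the software it describes.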
Existing guidelines
The FAIR Principles for Research Software (FAIR4RS Principles), developed by the FAIR for Research Software Working Group within the Research Data Alliance framework, were designed to improve the sharing and reuse of research software. They extend the FAIR principles to address specific characteristics of software, such as its executability, composite nature, and continuous evolution and versioning. The principles are:
- “F: Software, and its associated metadata, is easy for both humans and machines to find.
- A: Software, and its metadata, is retrievable via standardised protocols.
- I: Software interoperates with other software by exchanging data and/or metadata, and/or through interaction via application programming interfaces (APIs), described through standards.
- R: Software is both usable (can be executed) and reusable (can be understood, modified, built upon, or incorporated into other software).”
The Research Software MetaData Guidelines (RSMD) is a comprehensive set of guidelines aimed at enhancing the findability, accessibility, interoperability, and reusability of research software artefacts, developed within the FAIR-IMPACT project for the European Open Science Cloud (EOSC). The RSMD guidelines are organised into seven distinct aspects (accessibility and preservation, reference and identification, description and classification, credit and attribution, reuse and legal, re-execute, and general remarks), each with a clear objective and a set of actionable recommendations. Recommendations are categorised into three priority levels: essential, important, and useful. This prioritisation highlights the most critical recommendations and helps the guidelines address key areas effectively.
RSMD checklist: quick overview of the RSMD guidelines.
To assess the FAIRness level of a software artefact, you can use the metrics outlined within the FAIR-IMPACT project (Chue Hong et al. 2023). These metrics emphasise key aspects such as the importance of clearly describing the software’s purpose, defining its development status, and enabling the identification and reuse of individual software components. The complete list is given below.
- Does the software have a globally unique and persistent identifier?
- Do the different components of the software have their own identifiers?
- Does each version of the software have a unique identifier?
- Does the software include descriptive metadata which helps define its purpose?
- Does the software include development metadata which helps define its status?
- Does the software include metadata about the contributors and their roles?
- Does the software metadata include the identifier for the software?
- Does the software have a publicly available, openly accessible and persistent metadata record?
- Is the software developed in a code repository / forge that uses standard communications protocols?
- Are the formats used by the data consumed or produced by the software open and a reference provided to the format?
- Does the software use open APIs that support machine-readable interface definition?
- Does the software provide references to other objects that support its use?
- Does the software describe what is required to use it?
- Does the software come with test cases to demonstrate it is working?
- Does the software source code include licensing information for the software and any bundled external software?
- Does the software metadata record include licensing information?
- Does the software include provenance information that describes the development of the software?
Chue Hong, Neil, et al. D5.2 - Metrics for Automated FAIR Software Assessment in a Disciplinary Context. Oct. 2023, https://doi.org/10.5281/ZENODO.10047401.