Recent developments in data science ethics show that support and guidance for professionals have become an increasingly important topic. The European Insurance and Occupational Pensions Authority (EIOPA) established a 'Consultative Expert Group on Digital Ethics in Insurance' in September last year, the Actuarial Association of Europe published a document outlining its current perspective on the ethics and responsibility of data scientists, and the European Actuarial Academy is hosting its first 'Data Science and Data Ethics Conference' on 29 and 30 June 2020. Much of this activity has focused on the evolving role of actuaries in data science, noting the opportunities in the field while acknowledging that a number of risks still need to be addressed.
To assist in navigating some of the risks posed by the ethical considerations of data science, in October 2019 the Institute and Faculty of Actuaries (IFoA) and the Royal Statistical Society (RSS) Data Science Section jointly published 'A Guide for Ethical Data Science.' This guide aims to address the challenges faced by their members while complementing existing ethical and professional guidance. It is non-mandatory and does not impose any obligations upon RSS or IFoA members.
Development of the guide began in 2018 with four workshops attended by data science professionals. These workshops centred on four key questions raised by the RSS and IFoA:
- What does a good data science workflow look like?
- How should data science fit into the structure of an organisation?
- What do executives and managers need to know about data science?
- What is a data scientist's responsibility to wider society?
Although the data science practitioners found that best practice for data science is dependent on industry and even company-specific factors such as organisational design, historical workflows and the availability of skills within teams, they did broadly agree on several high-level principles and practices which have informed the guide. The guide does highlight, however, that some of the more complex issues faced by practitioners will require further thought and input from professionals.
The guide looks at five recurring ethical themes from existing frameworks relating to data science and artificial intelligence (AI):
- Seek to enhance the value of data science for society.
- Avoid harm.
- Apply and maintain professional competence.
- Seek to preserve or increase trustworthiness.
- Maintain accountability and oversight.
The guide gives an overview of each of the above principles, noting that they are not intended to be a comprehensive list of ethical principles. It then covers each principle in more detail, describing what could fall under the higher-level theme and giving practical examples of how the principle could be applied. For instance, under 'Seek to preserve or increase trustworthiness,' the guide lists the following two examples of how to avoid unnecessary complexity in methods in order to improve transparency:
- Consider simpler models and document the performance of different models tested
- Disclose the reasons for any improvements in accuracy if one model is favoured over another
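As a rough illustration of these two examples, the sketch below fits a very simple model and a slightly richer one, records the holdout error of each in a comparison log, and attaches a stated reason to the model that is favoured. It is not taken from the guide: the `ModelRecord` structure, the helper functions and the data are all hypothetical, and a real project would choose metrics and validation methods suited to its own context.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ModelRecord:
    """One row of a model comparison log (hypothetical structure)."""
    name: str
    n_parameters: int
    holdout_mae: float
    note: str = ""


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def fit_constant(xs, ys):
    """Simplest candidate: always predict the training mean."""
    level = mean(ys)
    return lambda x: level


def fit_line(xs, ys):
    """Slightly richer candidate: least-squares line with one feature."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def compare_models(train, holdout, candidates):
    """Fit every candidate and log its holdout error, simplest model first."""
    train_x, train_y = train
    hold_x, hold_y = holdout
    log = []
    for name, n_parameters, fit in candidates:
        model = fit(train_x, train_y)
        predictions = [model(x) for x in hold_x]
        error = mean_absolute_error(hold_y, predictions)
        log.append(ModelRecord(name, n_parameters, error))
    return sorted(log, key=lambda record: record.n_parameters)


# Made-up illustrative data, not drawn from the guide.
train = ([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
holdout = ([6, 7], [12.0, 14.1])

log = compare_models(train, holdout, [
    ("constant mean", 1, fit_constant),
    ("linear trend", 2, fit_line),
])

# Record why one model is favoured, not just that it scored better.
best = min(log, key=lambda record: record.holdout_mae)
best.note = "favoured: the data trend upwards, which a constant cannot capture"

for record in log:
    print(f"{record.name}: parameters={record.n_parameters}, "
          f"holdout MAE={record.holdout_mae:.3f} {record.note}")
```

Keeping every tested model in the log, including those that were rejected, means the case for any added complexity can be reviewed later by someone who was not involved in the modelling.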
There are a number of common threads across the principles, many of which align with existing actuarial guidance: for example, understanding the potential biases, errors, assumptions and risks in modelling, understanding the role of models in decision making, appropriately monitoring and communicating risks to stakeholders, and putting clear governance in place.
There are also aspects shared between the principles which are more specific to data science, such as consideration of the wider public as a stakeholder. That's not to say that other professional guidance doesn't also include this as a theme, but it is certainly prominent here, particularly around 'big data' sources such as social media and mobile phone and app data. The guide extends its consideration of this aspect beyond the more typical ethical concerns, like consent to use data and the fair treatment of individuals, and into wider public interests such as understanding how the benefits of a data science project could be distributed across society, publishing results and methods publicly, and involving the public so that their input feeds into projects.
A brief mention is also given to considering the energy cost of storing and processing large volumes of data, and the wider impact on the environment of data science projects.
A project 'Implementation Checklist' is included at the end of the guide, indicating how the data science ethical principles could be embedded within a project or an existing project framework. This includes tasks to be considered in the initial project planning phase, as well as throughout the project itself, with the following example project areas considered:
- Project planning
- Data management
- Analysis and development
- Implementation and delivery
- Communication and oversight
The guide acknowledges that the field of data science is still evolving. In keeping with this, the guide is to be reviewed regularly, and members of the RSS and IFoA are encouraged to send any queries or feedback to [email protected].