How to manage the risk of AI bias in Identity Verification
The spread of remote identity verification (IDV)
We all interact with companies providing us financial products and services. Most of these interactions appear seamless. But all financial transactions, whether with traditional banks or fintechs, are predicated on identity verification.
Regulations require financial services providers to know the identity of an individual before they are onboarded as customers (KYC rules). This helps to prevent money laundering and other illegal activity.
Previously, customers visited bank branches for identity verification and document validation.
The proliferation of online banking, and the rise of fintechs with no physical branch locations, has created the need for remote IDV.
IDV technology allows individuals to prove their identity remotely by submitting images of their face and identity documents. It replaces the role of the bank teller, who would previously have checked the document against the individual at the counter, by estimating the likelihood that:
the identity document provided is legitimate and not tampered with;
the facial image embedded in the identity document matches the individual’s face.
After this is completed, the IDV provider generates a report documenting the likelihood that the identity document belongs to the individual, alongside any red flags that may have been triggered by the algorithm.
The bank must then decide how to proceed, based upon the IDV report and its own internal thresholds.
The ethical implications are complex.
Whilst such technology increases efficiency, it also poses new risks. If online IDV is the only means of accessing the product and the algorithm fails to correctly match the individual, that individual has no other channel through which to access the service with that provider. This creates barriers to participation in banking and to time-critical products such as credit.
How does IDV work from a Machine Learning (ML) perspective?
The ML models that enable IDV perform two fundamental jobs:
they extract relevant data from the identity document, such as date of birth, first and last name and document number, and validate that the document is genuine; and
they perform a facial verification check between the photo embedded in the identity document and the selfie taken within the IDV app.
To perform the facial check, the ML model is trained to assign a numerical feature vector (an embedding) to each facial image presented: two images of the same individual should produce vectors with a high similarity score, while images of different people should produce vectors with a low similarity score. The decision rests on an agreed threshold: any pair of images whose similarity score equals or exceeds the threshold is deemed to belong to the same person, while any pair scoring below it fails the check.
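To make the mechanism concrete, here is a minimal sketch of that decision rule, assuming the facial images have already been converted into embedding vectors by a face recognition model. The cosine similarity measure and the 0.6 threshold are illustrative assumptions, not any particular provider's implementation.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity between two feature vectors, ranging from -1 to 1."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def passes_face_check(doc_embedding: np.ndarray,
                      selfie_embedding: np.ndarray,
                      threshold: float = 0.6) -> bool:
    """Accept the pair if the similarity score equals or exceeds the threshold."""
    return cosine_similarity(doc_embedding, selfie_embedding) >= threshold
```

The choice of threshold is ultimately a business decision: a higher threshold makes impostor acceptances rarer but rejects more genuine users.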
Whilst such ML models are reported to perform better on average than humans conducting the same checks, thanks to advances in deep learning and computing power, they are still prone to inaccuracies. This is because the performance of ML models is contingent on the quality of their training data.
Algorithmic bias in IDV
Companies relying on ML tools to power their IDV must consider:
Are the training data sets sufficiently large, complete and relevant?
Are the training data sets sufficiently diverse?
Is the system sufficiently robust?
The quality of the datasets is driven by both intrinsic and extrinsic factors. Intrinsic factors include diversity of gender, skin tone, age and facial geometry, whereas extrinsic factors include the background environment, image quality, facial expression and facial decoration.
A training dataset that is insufficiently representative can lead to poor model performance on under-represented populations, even if global metrics suggest strong performance.
In particular, if the dataset is deficient in key intrinsic factors such as skin tone and gender (and particularly an intersection thereof), the model will perform differently for the individuals represented by those factors, leading to algorithmic bias.
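One practical check is to evaluate the model separately on each subgroup rather than relying on a single global metric. The sketch below assumes a hypothetical evaluation table with `is_same_person`, `model_accepted` and a demographic column such as `skin_tone`, and computes the false rejection rate per group.

```python
import pandas as pd

def frr_by_group(eval_pairs: pd.DataFrame, group_col: str) -> pd.Series:
    """False rejection rate (share of genuine pairs rejected), computed per subgroup."""
    genuine = eval_pairs[eval_pairs["is_same_person"]]
    return 1.0 - genuine.groupby(group_col)["model_accepted"].mean()

# Hypothetical usage: a large gap between groups signals differential performance.
# frr_by_group(eval_pairs, "skin_tone")
```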
Algorithmic bias results in two types of outcome, each detrimental to both the individual and the business (illustrated in the sketch after this list):
Individuals rejected because two images of the same person are given a low similarity score (high false rejection rate)
Individuals face unfair treatment and are denied access to the service, whilst companies lose potential new customers
Individuals accepted even though the two images belong to different people, because they are assigned a high similarity score (high false acceptance rate)
Individuals face a greater risk of their identities being stolen and used for fraudulent purposes, whilst companies bear the liability for failing to prevent fraud
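As an illustration of how these two error rates are measured, the sketch below computes the false rejection rate and false acceptance rate from similarity scores and ground-truth labels at a given threshold; the data and threshold are assumed for illustration only.

```python
import numpy as np

def error_rates(scores: np.ndarray, is_same_person: np.ndarray, threshold: float):
    """Return (false rejection rate, false acceptance rate) at a given threshold."""
    accepted = scores >= threshold
    frr = float(np.mean(~accepted[is_same_person]))   # genuine pairs wrongly rejected
    far = float(np.mean(accepted[~is_same_person]))   # impostor pairs wrongly accepted
    return frr, far
```

Raising the threshold lowers the false acceptance rate but raises the false rejection rate, so the operating point should be validated across subgroups rather than only in aggregate.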
How do we manage AI bias risk in IDV?
The only effective solution is to implement robust AI Risk Management systems and processes.
AI Risk Management is the process of identifying, verifying, mitigating and preventing AI risks. Concrete steps must be taken at each stage of the AI lifecycle, to reduce the likelihood of bias.
Risk management approaches must be adapted to reflect the novel risks AI poses. For example, because AI systems continuously learn and evolve, and their performance tends to decay over time, they must be carefully monitored on an ongoing basis. This requires an automated and scalable solution.
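As a simple illustration of what such automated monitoring can look like, the sketch below assumes a recurring job that recomputes the false rejection rate on recent traffic and compares it against a validated baseline; the tolerance value and the alerting mechanism are illustrative assumptions.

```python
def frr_has_drifted(baseline_frr: float, recent_frr: float,
                    tolerance: float = 0.02) -> bool:
    """Flag the model for review if the recent FRR drifts above the agreed baseline."""
    return (recent_frr - baseline_frr) > tolerance

# Hypothetical scheduled check
if frr_has_drifted(baseline_frr=0.01, recent_frr=0.05):
    print("ALERT: false rejection rate has drifted above tolerance; trigger model review")
```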
Managing AI risks also requires technical assessment of the AI system’s code and data. Best practice entails the independent auditing, testing and review of AI tools against bias metrics and other industry standards.
We have the technical expertise to assess the quality and performance of ML models, and the representativeness of their training datasets, to support IDV providers in mitigating bias risks. We also support businesses in designing and establishing policies and processes to effectively govern the use of AI, such as training, governance and accountability, and other operational controls.
☎️ If you are interested to learn more about AI bias risk management in IDV and beyond, please get in touch with Anna (anna.nicolis@braithwate.com) and Sara (sara.zannone@holisticai.com).