Wojciech Samek presents a novel extension of Layer-wise Relevance Propagation (LRP) that addresses biases and hallucinations in large language models. The approach attributes both input and latent representations in transformer models faithfully and efficiently, at a computational cost comparable to a single backward pass. As a model-aware explainability method, LRP not only highlights relevant input features but also offers insight into the model's internal reasoning. Evaluations on Llama 2, Flan-T5, and Vision Transformer show that the method surpasses alternative approaches in faithfulness and enables concept-based explanations of latent representations.
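To illustrate the single-backward-sweep idea, here is a minimal sketch of the classic epsilon-LRP rule on a toy two-layer MLP. This is not the transformer-specific attention propagation presented in the talk; the model, the function name eps_lrp, and all parameters are illustrative assumptions for demonstration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy two-layer MLP standing in for a transformer block; the actual method
# in the talk additionally defines rules for attention and normalization
# layers, which this sketch does not cover.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 3))

def eps_lrp(model, x, target, eps=1e-6):
    """Epsilon-LRP for a Sequential of Linear/ReLU layers (illustrative).

    Relevance is redistributed layer by layer in one backward sweep,
    which is why the cost is comparable to a single backward pass.
    """
    # Forward pass, caching the input to every layer.
    activations = [x]
    for layer in model:
        activations.append(layer(activations[-1]))

    # Initialize relevance at the chosen output logit.
    relevance = torch.zeros_like(activations[-1])
    relevance[..., target] = activations[-1][..., target]

    # Backward sweep: epsilon rule for Linear, pass-through for ReLU.
    for layer, a in zip(reversed(model), reversed(activations[:-1])):
        if isinstance(layer, nn.Linear):
            z = layer(a)
            z = z + eps * torch.sign(z)      # stabilized pre-activations
            s = relevance / z
            c = s @ layer.weight             # redistribute to the inputs
            relevance = a * c
        # ReLU: relevance passes through unchanged.
    return relevance

x = torch.randn(1, 8)
print(eps_lrp(model, x, target=0))  # per-feature relevance for logit 0
```

The returned tensor assigns each input feature a share of the chosen output logit, and the relevances cached at intermediate layers can likewise be read off as latent attributions.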
To find out more, see the Nokia Bell Labs Responsible AI hub: [ Link ]
#transparency, #ai, #artificialintelligence, #responsibleai, #ResponsibleArtificialIntelligence, #ethicalai, #TrustworthyAI, #RegulatoryActivity, #researchanddevelopment, #trust, #fairness, #safety, #reliability, #security, #privacy, #sustainability, #accountability, #innovation, #technology, #BellLabs, #NokiaBellLabs, #nokia