Speaker: Marco Tulio Ribeiro @ Microsoft Research
Website: [ Link ]
Title: What is wrong with my model? Detection and analysis of bugs in NLP models
Abstract: I will present two projects that deal with evaluation and analysis of NLP models beyond cross-validation accuracy. First, I will talk about Errudite (ACL 2019), a tool and set of principles for model-agnostic error analysis that is scalable and reproducible. Instead of manually inspecting a small set of examples, we propose systematically grouping instances with filtering queries and, where possible, counterfactual analysis. Depending on time, I may also discuss ongoing work (on arXiv) on counterfactual analysis. Then, I will present CheckList (ACL 2020), a task-agnostic methodology and tool for testing NLP models, inspired by principles of behavioral testing in software engineering. I'll show a lot of fun bugs we discovered with CheckList, both in commercial models (Microsoft, Amazon, Google) and in research models (BERT and RoBERTa for sentiment analysis, QQP, and SQuAD). CheckList is a really helpful process and tool for testing and finding bugs in NLP models, for both practitioners and researchers.
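To make the behavioral-testing idea concrete, here is a minimal Python sketch of a negation test in the spirit of CheckList's Minimum Functionality Tests. It does not use the CheckList library's API; predict_sentiment is a hypothetical placeholder model and the test cases are illustrative.

from typing import Callable, List

def predict_sentiment(texts: List[str]) -> List[str]:
    # Hypothetical stand-in model: swap in your own model or API call.
    return ["positive" if "great" in t else "negative" for t in texts]

def negation_mft(predict: Callable[[List[str]], List[str]]) -> None:
    # Behavioral expectation: negating a positive statement should yield "negative".
    cases = [
        "I don't think this flight was great.",
        "The food was not good.",
        "I didn't enjoy the service.",
    ]
    preds = predict(cases)
    failures = [(c, p) for c, p in zip(cases, preds) if p != "negative"]
    print(f"Negation MFT failure rate: {len(failures)}/{len(cases)}")
    for text, pred in failures:
        print(f"  FAIL: {text!r} -> {pred}")

negation_mft(predict_sentiment)  # the toy model fails on the first case

A test like this probes one capability (handling negation) independently of held-out accuracy, which is how CheckList surfaces bugs that aggregate metrics hide.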
#NLProc #MachineLearning