LLMs can produce toxic or biased responses even to innocuous prompts, and bias benchmarks exist to evaluate model trustworthiness and identify at-risk subgroups. At 6 PM on February 21 in Amos Eaton 214, second-year computer science Ph.D. student Hannah Powers will present "The Need for Multifactor Bias Benchmarking of LLMs," proposing a way to identify gaps in existing benchmarks and a multifactor bias analysis of LLMs to pinpoint the key factors behind model behavior.
Remote URL: https://tw.rpi.edu/media/foci-genaillm-users-group-bias-bias-evaluation-need-multifactor-bias-benchmarking-llms-21-feb