
Researchers Reduce Bias in AI Models While Maintaining or Improving Accuracy

Machine-learning models can fail when they try to make predictions for individuals who were underrepresented in the datasets they were trained on.

For instance, a model that predicts the best treatment option for someone with a chronic disease may be trained using a dataset that contains mostly male patients. That model might make incorrect predictions for female patients when deployed in a hospital.

To improve outcomes, engineers can try balancing the training dataset by removing data points until all subgroups are represented equally. While dataset balancing is promising, it often requires removing a large amount of data, hurting the model's overall performance.
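To make the cost of balancing concrete, here is a minimal sketch of balancing by random subsampling. It is an illustration only, not code from the researchers: the function name `balance_by_subsampling`, the group labels, and the 900/100 split are all assumptions chosen for the example.

```python
import random
from collections import defaultdict

def balance_by_subsampling(groups, seed=0):
    """Return sorted indices of a subset in which every subgroup has
    as many examples as the smallest subgroup (hypothetical helper)."""
    rng = random.Random(seed)

    # Collect the indices of the examples that belong to each subgroup.
    by_group = defaultdict(list)
    for idx, group in enumerate(groups):
        by_group[group].append(idx)

    # Every subgroup is cut down to the size of the smallest one.
    target = min(len(indices) for indices in by_group.values())

    kept = []
    for indices in by_group.values():
        kept.extend(rng.sample(indices, target))
    return sorted(kept)

# Assumed toy dataset: 900 male patients, 100 female patients.
groups = ["male"] * 900 + ["female"] * 100
kept = balance_by_subsampling(groups)
removed = len(groups) - len(kept)
print(len(kept), removed)  # 200 800
```

In this toy case, equalizing the two subgroups discards 800 of the 1,000 examples, which is the kind of data loss that can hurt overall performance.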

MIT researchers developed a new technique that identifies and removes the specific points in a training dataset that contribute most to a model's failures on minority subgroups. By removing far fewer data points than other approaches, this technique maintains the overall accuracy of the model while improving its performance on underrepresented groups.
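The removal step can be sketched as follows, under a strong assumption: that each training point already has a score estimating how much it contributes to failures on a minority subgroup. The article does not say how such scores are computed, so `harm_scores` is a made-up input, and `drop_most_harmful` is a hypothetical helper rather than the researchers' implementation.

```python
def drop_most_harmful(harm_scores, k):
    """Return the indices to keep after removing the k training points
    with the highest (assumed) harm scores."""
    # Rank training indices from most to least harmful.
    ranked = sorted(range(len(harm_scores)),
                    key=lambda i: harm_scores[i],
                    reverse=True)
    to_drop = set(ranked[:k])
    # Preserve the original order of the remaining points.
    return [i for i in range(len(harm_scores)) if i not in to_drop]

# Made-up scores for six training points; higher means more harmful.
harm_scores = [0.1, 2.5, -0.3, 0.0, 1.7, 0.2]
kept = drop_most_harmful(harm_scores, k=2)
print(kept)  # [0, 2, 3, 5]
```

Only the two highest-scoring points are dropped here, in contrast to balancing, which removes points until subgroup sizes match regardless of how much each point matters.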

In addition, the technique can identify hidden sources of bias in a training dataset that lacks labels. Unlabeled data are far more common than labeled data for many applications.

This technique could also be combined with other approaches to improve the fairness of machine-learning models deployed in high-stakes situations. For example, it might one day help ensure underrepresented patients aren't misdiagnosed due to a biased AI model.

"Many other algorithms that try to resolve this concern presume each datapoint matters as much as every other datapoint. In this paper, we are revealing that assumption is not true. There specify points in our dataset that are contributing to this bias, and we can find those information points, eliminate them, and get much better efficiency," states Kimia Hamidieh, an electrical engineering and computer system science (EECS) graduate trainee at MIT and co-lead author of a paper on this strategy.

She wrote the paper with co-lead authors Saachi Jain PhD '24 and fellow EECS graduate student Kristian Georgiev