Researchers Reduce Bias in AI Models While Maintaining or Improving Accuracy


Machine-learning models can fail when they make predictions for individuals who were underrepresented in the datasets they were trained on.

For instance, a model that predicts the best treatment option for someone with a chronic disease might be trained using a dataset that contains mostly male patients. That model may then make incorrect predictions for female patients when deployed in a hospital.

To improve outcomes, engineers can try balancing the training dataset by removing data points until all subgroups are represented equally. While dataset balancing is promising, it often requires removing a large amount of data, hurting the model’s overall performance.
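To make that trade-off concrete, here is a minimal sketch of the balancing step described above, assuming NumPy arrays and a per-example subgroup label. The function name and signature are illustrative, not from the paper; the point is that every subgroup is cut down to the size of the rarest one, which can discard a large share of the data.

```python
import numpy as np

def balance_by_downsampling(X, y, groups, seed=0):
    """Downsample every subgroup to the size of the smallest one.

    X: (n, d) features; y: (n,) labels; groups: (n,) subgroup ids.
    Returns a balanced subset of the training data.
    """
    rng = np.random.default_rng(seed)
    unique_groups, counts = np.unique(groups, return_counts=True)
    target = counts.min()  # every subgroup shrinks to the rarest one's size
    keep = []
    for g in unique_groups:
        idx = np.flatnonzero(groups == g)
        keep.append(rng.choice(idx, size=target, replace=False))
    keep = np.concatenate(keep)
    return X[keep], y[keep], groups[keep]
```

If one subgroup is much smaller than the rest, `target` is tiny and most of the majority-group data is thrown away, which is exactly the performance cost the new technique tries to avoid.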

MIT researchers developed a new technique that identifies and removes the specific points in a training dataset that contribute most to a model’s failures on minority subgroups. By removing far fewer datapoints than other approaches, this technique maintains the overall accuracy of the model while improving its performance on underrepresented groups.
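The paper computes those per-example contributions with data-attribution tools; the sketch below shows only the selection step, assuming such scores are already available. `bias_scores`, `drop_most_harmful`, and the 2% removal fraction are hypothetical placeholders, not values or names from the paper.

```python
import numpy as np

def drop_most_harmful(bias_scores, removal_fraction=0.02):
    """Return indices of training points to keep after removing the
    examples most responsible for errors on the worst subgroup.

    bias_scores: hypothetical per-example array, assumed to come from
    some data-attribution or influence-style estimator; higher means
    more harmful to the minority subgroup.
    """
    n = len(bias_scores)
    n_remove = int(removal_fraction * n)  # far fewer points than balancing removes
    order = np.argsort(bias_scores)       # ascending; most harmful at the end
    return np.sort(order[: n - n_remove])

# Illustrative use: retrain on the filtered set, then re-check
# accuracy on the worst-performing subgroup.
# keep = drop_most_harmful(bias_scores)
# model.fit(X[keep], y[keep])
```

Unlike full balancing, this removes only a targeted handful of points, which is why overall accuracy is largely preserved.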

In addition, the technique can identify hidden sources of bias in a training dataset that lacks labels. Unlabeled data are far more prevalent than labeled data for many applications.

This method could also be combined with other approaches to improve the fairness of machine-learning models deployed in high-stakes situations. For example, it might someday help ensure underrepresented patients aren’t misdiagnosed due to a biased AI model.

"Many other algorithms that attempt to resolve this problem presume each datapoint matters as much as every other datapoint. In this paper, we are revealing that presumption is not true. There specify points in our dataset that are contributing to this bias, and we can find those information points, remove them, and improve performance,” states Kimia Hamidieh, an electrical engineering and computer technology (EECS) graduate trainee at MIT and co-lead author of a paper on this method.

She wrote the paper with co-lead authors Saachi Jain PhD ’24 and fellow EECS graduate student Kristian Georgiev