Title: Optimal selection of benchmarking datasets for unbiased and efficient machine learning algorithm evaluation

Speaker: Dr João Luiz Junho Pereira

Artificial Intelligence has revolutionized several areas of human knowledge and has become very popular in the last decade. Supervised Machine Learning (ML) algorithms are the main protagonists of this revolution, and whenever a new supervised ML algorithm is developed or presented, it is crucial to assess its predictive performance across diverse datasets to identify its strengths, its weaknesses, and the situations where it can be most useful. However, the current literature shows that these benchmark sets of datasets are typically gathered from public repositories, with the selection often ad hoc and lacking specific criteria. Studies are needed to propose benchmarking datasets that properly evaluate regression and classification algorithms. This can be done by uniting two major and complex areas: meta-learning and optimization. Meta-learning identifies the main characteristics and general aspects of each dataset and relates them to the predictive performance of ML models, a task for which Instance Space Analysis (ISA) has proven to be a powerful tool. Optimization can then be applied on top of this information to select a subset of datasets that best meets one or more defined objectives. Satisfactory results have already been obtained and presented at the European Conference on Machine Learning (ECML PKDD 2023), considering a fixed number of instances to be selected so as to maximize ISA coverage. We now want to go further by applying multi-objective optimization, where several so-called non-dominated solutions are found and choosing among them requires complex decision-making tools. The main goal now is not just to find datasets diverse enough to challenge the ML algorithms in different ways, but also to do so while minimizing, for example, the computational cost involved.

Dr João Luiz Junho Pereira is a Brazilian postdoctoral researcher from the Aeronautics Institute of Technology (ITA) visiting OPTIMA. He loves improving the human way of living and creating resources using Artificial Intelligence. His PhD, received from the Federal University of Itajubá, won the biggest award in Brazilian science, the 2023 CAPES, after he created the first metaheuristic for multi-objective optimization from Brazil, and the first in the world inspired by lightning, and applied it, together with other metaheuristics and machine learning, to solve the most diverse and complex problems in mechanical engineering and beyond. His main areas of activity are optimization with metaheuristics, machine learning, design of experiments (DOE), and the finite element method.

MEETING ID: 873 1557 5255; PASSWORD: 778635

WED 22 MAY 2024 16:00-17:00 (AEST, Melbourne Time)

This event is hybrid: meet and greet with light afternoon tea from 15:30, with the seminar starting at 16:00
In-person location: Melbourne Connect, Level M, Room03

Melbourne Connect
Level M, 700 Swanston St, Carlton

