Strategic data navigation: information value-based sample selection

Balogh, Csanád Levente and Pelenczei, Bálint and Kővári, Bálint and Bécsi, Tamás (2024) Strategic data navigation: information value-based sample selection. ARTIFICIAL INTELLIGENCE REVIEW, 57 (7). ISSN 0269-2821. DOI: 10.1007/s10462-024-10813-3

Abstract

Artificial Intelligence is a rapidly expanding field, with several industrial applications demonstrating its superiority over traditional techniques. Despite numerous advances, its subfield of Machine Learning still faces persistent challenges, which underlines the importance of ongoing research. Of its primary branches, this study considers two, Supervised and Reinforcement Learning, and addresses their shared problem of selecting data for training. Data points differ in informational content: some samples offer more valuable information to the neural network than others. Evaluating the significance of individual data points is nevertheless non-trivial, which creates the need for a robust method to prioritize samples effectively. Drawing inspiration from Reinforcement Learning principles, this paper introduces a novel sample prioritization approach for Supervised Learning, aimed at improving classification accuracy through strategic data navigation while exploring the boundary between Reinforcement and Supervised Learning techniques. We describe the methodology in detail, identify an optimal prioritization balance, and demonstrate its beneficial impact on model performance. Although classification accuracy serves as the primary validation metric, information density-based prioritization is more widely applicable. The paper also examines parallels and distinctions between Reinforcement and Supervised Learning methods, and argues that the foundational principle is equally relevant to both and can therefore be adapted to Supervised Learning, with adjustments that account for the different learning frameworks. The project page and source code are available at: https://csanadlb.github.io/sl_prioritized_sampling/
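
The abstract does not state the prioritization rule itself, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes proportional prioritization in the style of prioritized experience replay from Reinforcement Learning: each training sample is drawn with probability proportional to its last observed loss raised to an exponent. The names `priority_probabilities`, `sample_batch`, `alpha`, and `eps`, as well as the loss values, are hypothetical. Under this assumption, `alpha` is one possible way to express a prioritization balance between uniform and fully loss-driven sampling.

```python
import random


def priority_probabilities(losses, alpha=0.6, eps=1e-6):
    """Turn per-sample losses into sampling probabilities.

    alpha = 0 gives uniform sampling; alpha = 1 samples in direct
    proportion to the loss. alpha and eps are assumptions of this
    sketch, not parameters taken from the paper.
    """
    priorities = [(abs(loss) + eps) ** alpha for loss in losses]
    total = sum(priorities)
    return [p / total for p in priorities]


def sample_batch(losses, batch_size, alpha=0.6, rng=None):
    """Draw sample indices (with replacement) according to their priorities."""
    rng = rng or random.Random()
    probs = priority_probabilities(losses, alpha)
    return rng.choices(range(len(losses)), weights=probs, k=batch_size)


# Hypothetical per-sample losses from a previous pass over the data.
losses = [0.05, 0.10, 2.00, 0.02]

# Fully loss-proportional sampling favours the high-loss sample (index 2).
probs = priority_probabilities(losses, alpha=1.0)

# alpha = 0 falls back to uniform sampling over all four samples.
uniform = priority_probabilities(losses, alpha=0.0)

# A seeded generator keeps the drawn batch reproducible.
batch = sample_batch(losses, batch_size=8, alpha=1.0, rng=random.Random(0))
```

In a training loop, the stored losses would be refreshed after each update so that the priorities track what the network currently finds informative. How often to refresh them, and whether to correct for the sampling bias this introduces, are design choices the abstract does not address.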

Item Type: Article
Subjects: Q Science > QA Mathematics and Computer Science > QA75 Electronic computers. Computer science / számítástechnika, számítógéptudomány
Divisions: Systems and Control Lab
SWORD Depositor: MTMT Injector
Depositing User: MTMT Injector
Date Deposited: 06 Jul 2024 10:44
Last Modified: 06 Jul 2024 10:44
URI: https://eprints.sztaki.hu/id/eprint/10755
