Conservative Learning Method for Diabetic Issues Using Reinforcement Learning & Brain Storm Optimization


S. Anusuya, K. Anandapadmanabhan

Abstract

Attention-based sequential recommendation techniques have shown promising outcomes by effectively capturing the dynamic interests of users from their previous interactions. Recent work has begun incorporating reinforcement learning (RL) into these models to further improve user representations. By framing sequential recommendation as an RL problem with reward signals, we offer a recommender system (RS) that integrates direct user feedback into the reward formulation and accounts for significant features, delivering a more personalized experience. Recent RL-based recommender systems integrate Supervised Negative Q-Learning (SNQN) and Supervised Advantage Actor-Critic (SA2C), yet several issues remain. For instance, because positive reward signals are scarce, Q-value estimates are typically biased toward negative values. Additionally, the precise timestamp of the sequence has a significant influence on the Q-value. To resolve these problems, we propose contrastive objectives combined with Brain Storm Optimization, along with extensions for handling datasets with larger ranges. Furthermore, we recognize that negative sampling can lead to instability; we therefore present an enhanced strategy that addresses these issues more effectively.
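The negative-bias issue the abstract describes can be seen in a minimal tabular sketch of an SNQN-style update: the observed (positive) item receives a reward-driven TD target, while sampled unobserved items receive a zero-reward target. This is an illustrative toy, not the authors' implementation; the item space, reward scheme, and hyperparameters are all assumptions made for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_items = 5            # toy action space (candidate items to recommend)
q = np.zeros(n_items)  # Q-value estimate per item for a single state
gamma = 0.9            # discount factor
alpha = 0.1            # learning rate

def snqn_update(q, pos_item, reward, next_max_q, n_neg=2):
    """One SNQN-style step: a TD update on the observed positive item,
    plus zero-reward TD updates on sampled negative items."""
    # TD target for the item the user actually interacted with
    target = reward + gamma * next_max_q
    q[pos_item] += alpha * (target - q[pos_item])
    # Sample unobserved items as negatives; their targets assume zero
    # reward, which drags estimates down when positive signals are rare
    candidates = [i for i in range(n_items) if i != pos_item]
    negatives = rng.choice(candidates, size=n_neg, replace=False)
    for neg in negatives:
        neg_target = 0.0 + gamma * next_max_q
        q[neg] += alpha * (neg_target - q[neg])
    return q

# Simulate a stream where only item 2 ever yields positive reward
for _ in range(200):
    q = snqn_update(q, pos_item=2, reward=1.0, next_max_q=q.max())

print(q.argmax())  # the positively rewarded item dominates
```

Running the loop to convergence, the positive item's Q-value approaches 1/(1 - gamma) while the negatives settle below it, which is the asymmetry that contrastive objectives aim to counteract.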
