Show simple item record

dc.contributor.author	Pulari, Sini Raj
dc.contributor.author	Umadevi, Maramreddy
dc.contributor.author	Vasudevan, Shriram K.
dc.date	2025-03-01
dc.date.accessioned	2026-03-11T09:47:13Z
dc.date.available	2026-03-11T09:47:13Z
dc.identifier.citation	S. R. Pulari, M. Umadevi, S. K. Vasudevan. Improved Fine-Tuned Reinforcement Learning From Human Feedback Using Prompting Methods for News Summarization, International Journal of Interactive Multimedia and Artificial Intelligence, vol. 9, no. 2, pp. 59-67, 2025, http://dx.doi.org/10.9781/ijimai.2025.02.001
dc.identifier.uri	https://reunir.unir.net/handle/123456789/19232
dc.description.abstract	ChatGPT uses a generative pretrained transformer neural network model, which falls under the larger umbrella of generative models. One major boom after ChatGPT is the advent of prompt engineering, the most critical part of ChatGPT: it utilizes Large Language Models (LLMs) and helps ChatGPT provide the desired outputs based on the style and tone of the interactions carried out with it. Reinforcement learning from human feedback (RLHF) has been the major technique for fine-tuning LLM-based models. This work proposes a human selection strategy that is incorporated into the RLHF process to prevent the undesirable consequences that can arise from the choice of human reviewers for feedback. H-Rouge is a new metric proposed for humanized AI systems. A detailed evaluation of state-of-the-art summarization algorithms and prompt-based methods has been provided as part of the article. The proposed methods introduce a strategy for human selection in RLHF models which employs multi-objective optimization to balance, via H-Rouge, the various goals encountered during the process. This article will help readers conducting research in the field of text summarization to start with prompt engineering in the summarization field, and the future work outlined will help them proceed in the right direction of research.
dc.language.iso	eng
dc.publisher	UNIR
dc.relation.uri	https://www.ijimai.org/index.php/ijimai/article/view/259
dc.rights	openAccess
dc.subject	Abstractive Summarization
dc.subject	Extractive Summarization
dc.subject	Natural Language Processing
dc.subject	News Summarization
dc.subject	Prompt Engineering
dc.subject	Reinforcement Learning From Human Feedback (RLHF)
dc.title	Improved Fine-Tuned Reinforcement Learning From Human Feedback Using Prompting Methods for News Summarization
dc.type	article
reunir.tag	~IJIMAI
dc.identifier.doi	https://doi.org/10.9781/ijimai.2025.02.001


Files in this item


This item appears in the following collection(s)
