
    Improved Fine-Tuned Reinforcement Learning From Human Feedback Using Prompting Methods for News Summarization

    Authors: 
    Pulari, Sini Raj; Umadevi, Maramreddy; Vasudevan, Shriram K.
    Date: 
    1 March 2025
    Keywords: 
    Abstractive Summarization; Extractive Summarization; Natural Language Processing; News Summarization; Prompt Engineering; Reinforcement Learning From Human Feedback (RLHF)
    Journal / publisher: 
    UNIR
    Citation: 
    S. R. Pulari, M. Umadevi, S. K. Vasudevan. Improved Fine-Tuned Reinforcement Learning From Human Feedback Using Prompting Methods for News Summarization, International Journal of Interactive Multimedia and Artificial Intelligence, vol. 9, no. 2, pp. 59-67, 2025, http://dx.doi.org/10.9781/ijimai.2025.02.001
    Item type: 
    article
    URI: 
    https://reunir.unir.net/handle/123456789/19232
    DOI: 
    https://doi.org/10.9781/ijimai.2025.02.001
    Web address: 
    https://www.ijimai.org/index.php/ijimai/article/view/259
    Open Access
    Abstract:
    ChatGPT is built on a generative pretrained transformer neural network, which falls under the larger umbrella of generative models. One major development following ChatGPT is the advent of prompt engineering, a critical part of working with ChatGPT and other systems based on Large Language Models (LLMs), which helps the model provide the desired outputs based on the style and tone of the interactions carried out with it. Reinforcement learning from human feedback (RLHF) is a major technique for fine-tuning LLM-based models. This work proposes a human selection strategy, incorporated into the RLHF process, to prevent the undesirable consequences that can arise from the choice of human reviewers for feedback. H-Rouge is a new metric proposed for humanized AI systems. A detailed evaluation of state-of-the-art summarization algorithms and prompt-based methods is provided as part of the article. The proposed strategy for human selection in RLHF models employs multi-objective optimization to balance the various goals encountered during the process with H-Rouge. This article will help novice readers researching text summarization to get started with prompt engineering in the summarization field, and the future work outlined will help them proceed in the right direction of research.
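The abstract names H-Rouge but does not define it. For orientation only, the standard ROUGE-1 score (clipped unigram overlap between a candidate summary and a reference summary) can be sketched as below. This is not the authors' H-Rouge metric; the function name `rouge1`, the lowercasing, and the whitespace tokenization are assumptions made for illustration.

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict:
    """Standard ROUGE-1 (unigram overlap). NOT the paper's H-Rouge."""
    # Assumption: simple lowercase + whitespace tokenization.
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())

    # Counter intersection keeps the minimum count per token,
    # i.e. each candidate unigram is matched at most as often
    # as it occurs in the reference (clipped matching).
    overlap = sum((cand & ref).values())

    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    scores = rouge1("the cat sat", "the cat sat on the mat")
    print(scores)
```

Every candidate word in the usage example appears in the reference, so precision is 1.0, while only three of the six reference words are covered, giving a recall of 0.5.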
    Files in this item
    Name: Improved Fine-Tuned Reinforcement Learning From Human Feedback Using Prompting Methods for News Summarization.pdf
    Size: 1.372Mb
    Format: application/pdf
    This item appears in the following collection(s)
    • vol. 9, no. 2, March 2025

    Usage statistics

    Year    Views   Downloads
    2012    0       0
    2013    0       0
    2014    0       0
    2015    0       0
    2016    0       0
    2017    0       0
    2018    0       0
    2019    0       0
    2020    0       0
    2021    0       0
    2022    0       0
    2023    0       0
    2024    0       0
    2025    0       0
    2026    6       0

    Related items

    Showing items related by title, author or subject.

    • A Fault-Tolerant Mobile Computing Model Based On Scalable Replica 

      Sati, Meenakshi; Vikash, Vivek; Bijalwan, Vishwanath; Kumari, Pinki; Raj, Manish; Balodhi, Meenu; Gairola, Priya; Bhaskar Semwal, Vijay (International Journal of Interactive Multimedia and Artificial Intelligence (IJIMAI), 06/2014)
      The most frequent challenge faced by mobile user is stay connected with online data, while disconnected or poorly connected store the replica of critical data. Nomadic users require replication to store copies of critical ...
    • Feature based video stabilization based on boosted HAAR Cascade and representative point matching algorithm 

      Raj, Rohit; Rajiv, Pooshkar; Kumar, Prabhat; Khari, Manju; Verdú, Elena ; González-Crespo, Rubén ; Manogarane, Gunasekaran (Image and Vision Computing, 09/2020)
      The success of handheld video capturing devices has further fueled the need of improved video stabilization. The videos often contain many foreground facial features like eyes, nose etc. These foreground features can be ...
    • Hybrid Model for Passive Locomotion Control of a Biped Humanoid: The Artificial Neural Network Approach 

      Bhaskar-Semwal, Vijay; Raj, Manish; Nandi, G C (International Journal of Interactive Multimedia and Artificial Intelligence (IJIMAI), 06/2018)
      Developing a correct model for a biped robot locomotion is extremely challenging due to its inherently unstable structure because of the passive joint located at the unilateral foot-ground contact and varying configurations ...

    © UNIR - Universidad Internacional de La Rioja
     