
    Comparative analysis of paraphrasing performance of ChatGPT, GPT-3, and T5 language models using a new ChatGPT generated dataset: ParaGPT

    Authors:
    Pehlivanoglu, Meltem Kurt; Abdan Syakura, Muhammad; de-la-Fuente-Valentín, Luis; Tadesse Gobosho, Robera; Shanmuganathan, Vimal
    Date:
    2024
    Keywords:
    ChatGPT; generative artificial intelligence; large language models; machine learning
    Journal / publisher:
    Expert Systems
    Citation:
    Kurt Pehlivanoğlu, M., Gobosho, R. T., Syakura, M. A., Shanmuganathan, V., & de-la-Fuente-Valentín, L. (2024). Comparative analysis of paraphrasing performance of ChatGPT, GPT-3, and T5 language models using a new ChatGPT generated dataset: ParaGPT. Expert Systems, 41(11), e13699. https://doi.org/10.1111/exsy.13699
    Item type:
    article
    URI: 
    https://reunir.unir.net/handle/123456789/17523
    DOI: 
    https://doi.org/10.1111/exsy.13699
    Web address:
    https://onlinelibrary.wiley.com/doi/10.1111/exsy.13699
    Open Access
    Abstract:
    Paraphrase generation is a fundamental natural language processing (NLP) task that refers to the process of generating a well-formed and coherent output sentence that exhibits syntactic and/or lexical diversity from the input sentence, while simultaneously ensuring that the semantic similarity between the two sentences is preserved. However, the availability of high-quality paraphrase datasets has been limited, particularly for machine-generated sentences. In this paper, we present ParaGPT, a new paraphrase dataset of 81,000 machine-generated sentence pairs, including 27,000 reference sentences (ChatGPT-generated sentences), and 81,000 paraphrases obtained by using three different large language models (LLMs): ChatGPT, GPT-3, and T5. We used ChatGPT to generate 27,000 sentences that cover a diverse array of topics and sentence structures, thus providing diverse inputs for the models. In addition, we evaluated the quality of the generated paraphrases using various automatic evaluation metrics. Furthermore, we provide insights into the strengths and drawbacks of each LLM in generating paraphrases by conducting a comparative analysis of the paraphrasing performance of the three LLMs. According to our findings, ChatGPT's performance, as per the evaluation metrics provided, was deemed impressive and commendable, owing to its higher-than-average scores for semantic similarity, which implies a higher degree of similarity between the generated paraphrase and the reference sentence, and its relatively lower scores for syntactic diversity, indicating a greater diversity of syntactic structures in the generated paraphrase. ParaGPT is a valuable resource for researchers working on NLP tasks like paraphrasing, text simplification, and text generation. We make the ParaGPT dataset publicly accessible to researchers, and as far as we are aware, this is the first paraphrase dataset produced based on ChatGPT.
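    The dataset arithmetic and the idea of scoring a paraphrase against its reference can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the abstract does not name the evaluation metrics used, so the word-overlap (Jaccard) measure and the helper names `tokens`, `jaccard_overlap`, and `lexical_diversity` are hypothetical stand-ins, not the authors' method.

    ```python
    # Illustrative sketch only: the metrics below are NOT the ones used in the
    # paper (the abstract does not name them). All helper names are hypothetical.

    REFERENCE_SENTENCES = 27_000            # ChatGPT-generated references (from the abstract)
    MODELS = ("ChatGPT", "GPT-3", "T5")     # the three LLMs compared (from the abstract)

    # One paraphrase per model per reference sentence gives the stated 81,000 pairs.
    total_paraphrases = REFERENCE_SENTENCES * len(MODELS)


    def tokens(sentence):
        """Lowercase a sentence and return its set of punctuation-stripped words."""
        words = {w.strip(".,;:!?\"'()").lower() for w in sentence.split()}
        words.discard("")
        return words


    def jaccard_overlap(reference, paraphrase):
        """Share of distinct words that the two sentences have in common (0..1)."""
        ref, par = tokens(reference), tokens(paraphrase)
        if not ref and not par:
            return 1.0
        return len(ref & par) / len(ref | par)


    def lexical_diversity(reference, paraphrase):
        """Crude lexical-diversity proxy: 1 minus the word overlap."""
        return 1.0 - jaccard_overlap(reference, paraphrase)


    if __name__ == "__main__":
        ref = "The committee approved the budget on Monday."
        par = "On Monday, the budget was approved by the committee."
        print(total_paraphrases)
        print(round(lexical_diversity(ref, par), 3))
    ```

    A word-overlap proxy like this captures only lexical change; it says nothing about semantic similarity or syntactic structure, which would need separate measures.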
    Files in this item
    Name: Comparative analysis of.pdf
    Size: 6.025Mb
    Format: application/pdf
    This item appears in the following collection(s)
    • Other Publications: articles, books...

    Usage statistics

    Year   Views   Downloads
    2012   0       0
    2013   0       0
    2014   0       0
    2015   0       0
    2016   0       0
    2017   0       0
    2018   0       0
    2019   0       0
    2020   0       0
    2021   0       0
    2022   0       0
    2023   0       0
    2024   32      33
    2025   116     102

    © UNIR - Universidad Internacional de La Rioja
     