GPTEval: A Survey on Assessments of ChatGPT and GPT-4

Rui Mao; Guanyi Chen; Xulang Zhang; Frank Guerin; Erik Cambria

doi:10.48550/arxiv.2308.12488

Preprint

Open access

arXiv.org

24/08/2023

Computer Science - Artificial Intelligence

Computer Science - Computation and Language

Abstract

The emergence of ChatGPT has generated much speculation in the press about its potential to disrupt social and economic systems. Its astonishing language ability has aroused strong curiosity among scholars about its performance in different domains. Many studies have evaluated the ability of ChatGPT and GPT-4 in different tasks and disciplines. However, a comprehensive review summarizing the collective assessment findings is lacking. The objective of this survey is to thoroughly analyze prior assessments of ChatGPT and GPT-4, focusing on their language and reasoning abilities, scientific knowledge, and ethical considerations. The survey also examines existing evaluation methods and offers several recommendations for future research in evaluating large language models.

Files and links (1)

url

https://arxiv.org/pdf/2308.12488.pdf

Preprint (Author's original), CC BY 4.0, Open access

Details

Title: GPTEval: A Survey on Assessments of ChatGPT and GPT-4
Creators: Rui Mao - Nanyang Technological University
Guanyi Chen - Central China Normal University
Xulang Zhang - Nanyang Technological University
Frank Guerin - University of Surrey
Erik Cambria - Nanyang Technological University
Publication Details: arXiv.org
Identifiers: 99817429002346
Academic Unit: School of Computer Science and Electronic Engineering
Language: English
Resource Type: Preprint
