Peer Review Analysis with LLM-Generated Feedback

This study investigates the potential role of Large Language Models (LLMs) in supporting the peer review process, using both human and LLM-generated reviews to assess consistency and overlap in feedback quality.

Project Overview

Peer review is crucial for maintaining publication quality standards, yet the increasing volume of research articles challenges traditional review systems. This project explores the feasibility of using LLM-generated reviews as a supportive tool to ease reviewer workload without compromising scientific rigor. Using the ReviewCritique dataset by Du et al., which contains human and LLM reviews for both accepted and rejected papers, we analyze the overlap and consistency between human- and LLM-provided feedback.

Process

This study evaluates review similarity across three LLMs: ChatGPT-4o, Claude Opus, and Gemini Pro. Key findings indicate that LLM-generated reviews are consistent with one another but differ from human feedback, suggesting a potential supplementary role for LLMs in peer review.

Figures: Matched Metrics and Shuffled Metrics (generated by plotting.py and saved in figures/).
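
The README does not spell out the overlap metric itself (that logic lives in analysis.py), so the snippet below is only a rough, assumed illustration of the idea: match each sentence of one review to its most similar sentence in another and average the best-match scores. The TF-IDF cosine similarity, the review_overlap helper, and the example reviews are all hypothetical stand-ins, not the project's actual implementation.

    # Illustrative only: analysis.py defines the real metric, which may differ from this sketch.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def review_overlap(review_a, review_b):
        """Mean best-match TF-IDF cosine similarity from review_a sentences to review_b."""
        sents_a = [s.strip() for s in review_a.split(".") if s.strip()]
        sents_b = [s.strip() for s in review_b.split(".") if s.strip()]
        vectorizer = TfidfVectorizer().fit(sents_a + sents_b)
        sim = cosine_similarity(vectorizer.transform(sents_a), vectorizer.transform(sents_b))
        return float(sim.max(axis=1).mean())

    human_review = "The method is novel. The experiments lack baselines."
    llm_review = "Baselines are missing from the experiments. The writing is clear."
    print(f"Overlap score: {review_overlap(human_review, llm_review):.2f}")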

Repository Structure

The repository is organized as follows:

  • data/: Contains the original ReviewCritique dataset, including human and LLM reviews for 20 research papers (accepted and rejected).

  • extracted/: Holds extracted and formatted data subsets specifically prepared for this research.

  • processed/: Stores evaluated metrics in JSON format, generated by analysis.py and used as input for plotting.py.

  • figures/: Contains visualization outputs produced by plotting.py, based on processed data metrics.

  • prompts/: Contains prompts from Liang et al., used for generating LLM reviews, comparing reviews, and summarizing key findings.

  • analysis.py: Script to evaluate overlap metrics between human and LLM reviews. Outputs metrics in JSON format to processed/ for use in subsequent plotting (see the sketch after this list).

  • plotting.py: Script to generate figures based on JSON metrics from processed/. Visualizes differences in overlap and consistency across review types.

  • requirements.txt: Specifies required packages for running the code. Use this file to install dependencies with pip.
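
Since analysis.py and plotting.py communicate through JSON files in processed/, the sketch below shows one way that handoff could look. The file name and keys ("matched", "shuffled", per-model labels) are hypothetical placeholders; the real schema is whatever analysis.py actually writes.

    # Hypothetical schema: the keys and values below are placeholders, not analysis.py's real output.
    import json
    from pathlib import Path

    metrics = {
        "matched": {"human_vs_chatgpt4o": 0.42, "human_vs_claude": 0.40, "human_vs_gemini": 0.39},
        "shuffled": {"human_vs_chatgpt4o": 0.18, "human_vs_claude": 0.17, "human_vs_gemini": 0.16},
    }

    Path("processed").mkdir(exist_ok=True)
    with open("processed/example_metrics.json", "w") as f:
        json.dump(metrics, f, indent=2)

    # plotting.py (or any downstream script) can then reload the same file.
    with open("processed/example_metrics.json") as f:
        loaded = json.load(f)
    print(loaded["matched"])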

Installation

  1. Clone the repository:

    git clone https://github.com/dhunstack/llms-for-paper-review.git
    cd llms-for-paper-review
  2. Install the required packages:

    pip install -r requirements.txt

Usage

  1. Data Extraction: Follow the comments and code in the notebooks to extract the relevant subset of the dataset into the extracted/ folder.

  2. Data Analysis: Run analysis.py to compute review overlap metrics and write them as JSON files to the processed/ folder.

    python analysis.py
  3. Visualization: Run plotting.py to generate figures from the processed data, saved in the figures/ folder.

    python plotting.py
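
As a rough illustration of step 3, the sketch below reads the hypothetical metrics file from the previous example and saves a grouped bar chart to figures/. The actual plotting.py may lay out its figures differently; only the read-metrics-then-save-figure pattern is the point here.

    # Assumes processed/example_metrics.json from the earlier sketch; plotting.py may differ.
    import json
    from pathlib import Path
    import matplotlib.pyplot as plt

    with open("processed/example_metrics.json") as f:
        metrics = json.load(f)

    labels = list(metrics["matched"].keys())
    matched = [metrics["matched"][k] for k in labels]
    shuffled = [metrics["shuffled"][k] for k in labels]
    x = range(len(labels))

    plt.bar([i - 0.2 for i in x], matched, width=0.4, label="matched")
    plt.bar([i + 0.2 for i in x], shuffled, width=0.4, label="shuffled")
    plt.xticks(list(x), labels, rotation=20)
    plt.ylabel("overlap score")
    plt.legend()

    Path("figures").mkdir(exist_ok=True)
    plt.savefig("figures/example_overlap.png", dpi=150, bbox_inches="tight")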

License

This project is licensed under the MIT License.

Contributing

Contributions are welcome! Feel free to open an issue for suggestions or bug reports, or submit pull requests for improvements.