3 changes: 2 additions & 1 deletion .gitignore
@@ -8,4 +8,5 @@ src/__pycache__
data_sample/CVE_label/extract.py
data_sample/CVE_label/raw_data.txt
data_full/CVE_label/extract.py
data_full/CVE_label/raw_data.txt
data_full/CVE_label/raw_data.txt
gptlens_env/
4 changes: 2 additions & 2 deletions .streamlit/config.toml
@@ -1,3 +1,3 @@
[server]
port = 8501
[server]
port = 8501
enableStaticServing = true
1 change: 0 additions & 1 deletion GPTLens
Submodule GPTLens deleted from 7e5a20
161 changes: 91 additions & 70 deletions README.md
@@ -1,97 +1,124 @@
# GPTLens
# GPTLens with OpenAI GPT and Deepseek Reasoner

This is the repo for the code and datasets used in the paper [Large Language Model-Powered Smart Contract Vulnerability Detection: New Perspectives](https://arxiv.org/pdf/2310.01152.pdf), accepted by the IEEE Trust, Privacy and Security (TPS) conference 2023.
This is a fork of the original GPTLens repository, which accompanies the paper [Large Language Model-Powered Smart Contract Vulnerability Detection: New Perspectives](https://arxiv.org/pdf/2310.01152.pdf), presented at the IEEE Trust, Privacy and Security (TPS) conference 2023.

If you find this repository useful, please give us a star! Thank you: )
This fork extends the original authors' work to support both OpenAI GPT models (GPT-3.5-Turbo, GPT-4, GPT-4-Turbo) and Deepseek Reasoner models in the auditor and critic stages, with the aim of improving the accuracy and precision of vulnerability detection in smart contracts.

If you wish to run your own dataset, please switch to the "release" branch:
```sh
git checkout release
## Getting Started

### Prerequisites

1. Python 3.8+ environment
2. OpenAI API key (for GPT models)
3. Deepseek API key (for Deepseek models)

### Installation

```bash
# Clone the repository
git clone https://github.com/yourusername/GPTLens.git
cd GPTLens

# Create and activate a virtual environment
python -m venv gptlens_env
source gptlens_env/bin/activate # On Windows: gptlens_env\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

## Getting Start
### Setting Up Your API Keys

Set your API keys in the environment:

### Step 0: Set up your GPT-4 API
```bash
# On Linux/macOS
export OPENAI_API_KEY="your-openai-api-key-here"
export DEEPSEEK_API_KEY="your-deepseek-api-key-here"

Get GPT-4 API from https://platform.openai.com/account/api-keys
# On Windows
set OPENAI_API_KEY=your-openai-api-key-here
set DEEPSEEK_API_KEY=your-deepseek-api-key-here
```

## Running the Vulnerability Detection Pipeline

Replace OPENAI_API_KEY = "Enter your openai API key" in src/model.py (line 4) with your API key.
The vulnerability detection process consists of three stages:

Set up Python environment by importing environment.yml as a Conda env.
1. **Auditor**: Identifies potential vulnerabilities in smart contracts
2. **Critic**: Evaluates the vulnerabilities identified by the auditor
3. **Ranker**: Combines auditor and critic outputs to produce final vulnerability scores

### Step 1: Run Auditor
### Model Options

Stay on GPTLens base folder
You can choose from the following models:
- OpenAI models: `gpt-3.5-turbo`, `gpt-4`, `gpt-4-turbo-preview`
- Deepseek models: `deepseek-r1`, `deepseek-reasoner`

```sh
python src/run_auditor.py --backend=gpt-4 --temperature=0.7 --topk=3 --num_auditor=1
### Step 1: Run the Auditor

```bash
python src/run_auditor.py --backend=deepseek-r1 --temperature=0.7 --topk=3 --num_auditor=1 --data_dir=data_full/CVE_clean
```

| Parameter | Description |
|-----------------|-----------------------------------------------------------------|
| `backend` | The version of GPT |
| `temperature` | The hyper-parameter that controls the randomness of generation. |
| `topk` | Identify k vulnerabilities per each auditor |
| `num_auditor` | The total number of independent auditors. |

| `backend` | The model to use (deepseek-r1 maps to deepseek-reasoner) |
| `temperature` | Controls randomness (passed for compatibility) |
| `topk` | Number of vulnerabilities to identify per contract |
| `num_auditor` | Number of independent auditors to run |
| `data_dir` | Directory containing the smart contracts to analyze |

### Step 2: Run Critic
### Step 2: Run the Critic

```sh
python src/run_critic.py --backend=gpt-4 --temperature=0 --auditor_dir="auditor_gpt-4_0.7_top3_1" --num_critic=1 --shot=few
```bash
python src/run_critic.py --backend=deepseek-r1 --temperature=0 --auditor_dir="auditor_deepseek-r1_0.7_top3_1" --num_critic=1 --shot=few
```

| Parameter | Description |
|---------------|-----------------------------------------------------------------|
| `backend` | The version of GPT |
| `temperature` | The hyper-parameter that controls the randomness of generation. |
| `auditor_dir` | The directory of logs outputted by the auditor. |
| `num_critic` | The total number of independent critics. |
| `shot` | Whether few shot or zero shot prompt. |


| `backend` | The model to use (deepseek-r1 maps to deepseek-reasoner) |
| `temperature` | Controls randomness (0 for deterministic outputs) |
| `auditor_dir` | Directory containing auditor results |
| `num_critic` | Number of independent critics |
| `shot` | Whether to use few-shot or zero-shot prompting |

### Step 3: Run Ranker
### Step 3: Run the Ranker

```sh
python src/run_rank.py --auditor_dir="auditor_gpt-4_0.7_top3_1" --critic_dir="critic_gpt-4_0_1_few" --strategy="default"
```bash
python src/run_rank.py --auditor_dir="auditor_deepseek-r1_0.7_top3_1" --critic_dir="critic_deepseek-r1_0.0_1_few" --strategy="default"
```

| Parameter | Description |
|---------------|-------------------------------------------------|
| `auditor_dir` | The directory of logs outputted by the auditor. |
| `critic_dir` | The directory of logs outputted by the critic. |
| `strategy` | The strategy for generating the final score. |


Some updates:

**09/28**: We observed that the outputs of auditors can drift largely at different time periods.
For instance, GPT-4 could easily identify the vulnerability in the CVE-2018-19830.sol at Sep. 16 but had difficulty detecting it at Sep. 28.
```sh
{
"function_name": "UBSexToken",
"vulnerability": "Unexpected Behaviour",
"criticism": "The reasoning is correct. The function name does not match the contract name, which means it is not the constructor and can be called by anyone at any time. This can lead to the totalSupply and owner of the token being reset, which is a serious vulnerability.",
"correctness": 9,
"severity": 9,
"profitability": 9,
"reason": "The function name does not match the contract name. This indicates that this function is intended to be the constructor, but it is not. This means that anyone can call the function at any time and reset the totalSupply and owner of the token.",
"code": "function UBSexToken() {\n owner = msg.sender;\n totalSupply = 1.9 * 10 ** 26;\n balances[owner] = totalSupply;\n}",
"label": "Access Control",
"file_name": "2018-19830.sol",
"description": "The UBSexToken() function of a smart contract implementation for Business Alliance Financial Circle (BAFC), an tradable Ethereum ERC20 token, allows attackers to change the owner of the contract, because the function is public (by default) and does not check the caller's identity."
},
```
We uploaded a set of results that we obtained on Sep. 28 using GPT-4 with 1 auditor, 1 critic and 3 outputs per each contract (see src/logs/auditor_gpt-4_0.7_top3_1/critic_gpt-4_0_1_zero_0928).
The composite score less than 5 can be deemed as not being a vulnerability.
| `auditor_dir` | Directory containing auditor results |
| `critic_dir` | Directory containing critic results |
| `strategy` | Strategy for generating the final score |
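
As a rough illustration of what the ranker stage computes, the sketch below folds a critic's three scores into one composite value and sorts findings by it. The field names `correctness`, `severity`, and `profitability` appear in the critic output; the weights and the names `WEIGHTS`, `composite_score`, and `rank_findings` are assumptions made for illustration, and may not match what `src/run_rank.py` does under the `default` strategy.

```python
from typing import Dict, List

# Hypothetical weights, chosen only for illustration; they sum to 1.
WEIGHTS = {"correctness": 0.5, "severity": 0.25, "profitability": 0.25}


def composite_score(finding: Dict) -> float:
    """Weighted sum of the critic's three scores for one finding."""
    return sum(weight * float(finding[key]) for key, weight in WEIGHTS.items())


def rank_findings(findings: List[Dict]) -> List[Dict]:
    """Return copies of the findings with a final_score, highest first."""
    scored = [dict(f, final_score=composite_score(f)) for f in findings]
    return sorted(scored, key=lambda f: f["final_score"], reverse=True)


if __name__ == "__main__":
    sample = [
        {"function_name": "transfer", "correctness": 2, "severity": 4, "profitability": 4},
        {"function_name": "UBSexToken", "correctness": 9, "severity": 9, "profitability": 9},
    ]
    for finding in rank_findings(sample):
        print(finding["function_name"], finding["final_score"])
```

Because the weights sum to 1, the composite stays on the same 0 to 9 scale as the individual scores, which keeps a fixed cut-off (such as treating low composites as non-vulnerabilities) easy to interpret.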

## Technical Notes

1. **Model Mapping**: The command-line argument `deepseek-r1` is internally mapped to the Deepseek API model name `deepseek-reasoner`.

2. **File Locations**: The smart contracts should be placed in the `data_full/CVE_clean/` directory. If your files are in a different location, use the `--data_dir` parameter to specify the correct path.

3. **JSON Parsing Errors**: You may see "Expecting value" errors during the auditor stage. These are normal and occur when the model's output doesn't conform exactly to the expected JSON format.

4. **Results Location**: The final results will be stored in the `src/logs/` directory, organized by auditor, critic, and ranker directories.
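
To make note 3 concrete: "Expecting value" is the message Python's `json` module raises when asked to decode text that does not begin with valid JSON, for example when the model wraps its answer in extra prose. The sketch below shows one way a caller could tolerate such replies; `parse_model_json` is a hypothetical helper, not a function from this repository.

```python
import json
import re
from typing import Any, Optional


def parse_model_json(raw: str) -> Optional[Any]:
    """Return decoded JSON from a model reply, or None if nothing parses."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass  # Fall through to the more lenient attempt below.

    # Fallback: take the span from the first '{' or '[' to the last '}' or ']'.
    match = re.search(r"[\{\[].*[\}\]]", raw, flags=re.DOTALL)
    if match is None:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None


if __name__ == "__main__":
    reply = 'Here is the audit: {"output_list": []} Let me know if you need more.'
    print(parse_model_json(reply))
    print(parse_model_json("No structured output was produced."))
```

Returning `None` rather than raising lets the calling loop skip or retry a contract without aborting the whole run.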

**10/26**: We observed that the output of critic can also be different (-.-) at different time periods, even with the same input and the temperature set to 0 (deterministic generation). This might be caused by the update of GPT-4 (?). To make scoring consistent, we added few shot examples for critic prompt.
We uploaded a set of results of the critic with few-shot prompt that obtained on Oct. 26 using GPT-4 (see src/logs/auditor_gpt-4_0.7_top3_1/critic_gpt-4_0_1_few_1026).
## Troubleshooting

This repo will be continuously updated to make generation more consistent and robust.
- **API Key Issues**: Ensure the API key for your chosen backend (`OPENAI_API_KEY` or `DEEPSEEK_API_KEY`) is correctly set in the environment.
- **Missing Files**: If you encounter "File not found" errors, check that your smart contracts are in the correct directory.
- **Model Errors**: If you see "Model Not Exist" errors, ensure you're using the correct model name (`deepseek-r1`).
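
The API key and model name checks can be automated before any API call is made. The sketch below is one possible pre-flight check: the environment variable names, the model list, and the `deepseek-r1` to `deepseek-reasoner` mapping come from this README, while `MODEL_ALIASES`, `KNOWN_BACKENDS`, `resolve_backend`, `required_key`, and `check_environment` are hypothetical names that do not necessarily exist in `src/model.py`.

```python
import os
from typing import Mapping

# Alias taken from the README; the dictionary itself is illustrative.
MODEL_ALIASES = {"deepseek-r1": "deepseek-reasoner"}
KNOWN_BACKENDS = {
    "gpt-3.5-turbo", "gpt-4", "gpt-4-turbo-preview",
    "deepseek-r1", "deepseek-reasoner",
}


def resolve_backend(backend: str) -> str:
    """Map a --backend value to the model name sent to the API."""
    if backend not in KNOWN_BACKENDS:
        raise ValueError(f"Unknown backend {backend!r}; expected one of {sorted(KNOWN_BACKENDS)}")
    return MODEL_ALIASES.get(backend, backend)


def required_key(backend: str) -> str:
    """Name of the environment variable the chosen backend needs."""
    if resolve_backend(backend).startswith("deepseek"):
        return "DEEPSEEK_API_KEY"
    return "OPENAI_API_KEY"


def check_environment(backend: str, env: Mapping[str, str] = os.environ) -> None:
    """Raise RuntimeError if the key for this backend is missing or empty."""
    key = required_key(backend)
    if not env.get(key):
        raise RuntimeError(f"{key} is not set; export it before running the pipeline")


if __name__ == "__main__":
    print(resolve_backend("deepseek-r1"))
    print(required_key("gpt-4"))
```

Passing the environment as a parameter keeps the check easy to exercise in tests without touching real credentials.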

## Benchmarking Results

The original paper benchmarked GPT-4 on 13 CVE smart contracts. This fork extends that work by benchmarking Deepseek Reasoner on the same dataset. Our preliminary results show that Deepseek Reasoner performs competitively with GPT-4 while offering better cost efficiency.

-----
## Citation

If you use this work in your research, please cite the original paper:

```
@misc{hu2023large,
title={Large Language Model-Powered Smart Contract Vulnerability Detection: New Perspectives},
@@ -101,10 +128,4 @@ This repo will be continuously updated to make generation more consistent and ro
archivePrefix={arXiv},
primaryClass={cs.CR}
}
```

-----
## Q&A

If you have any questions, you can either open an issue or contact me (sihaohu@gatech.edu), and I will reply as soon as I see the issue or email.

```