awesome-code-generation-security

Awesome Code Generation & Security is a curated collection of scholarly papers and resources at the intersection of code generation, testing, and security. This repository aims to serve as a central hub for researchers, practitioners, and enthusiasts interested in advancing state-of-the-art code generation techniques and understanding their implications for software security.

To stay updated with the latest additions and discussions, star or watch this repository. Your involvement can make a significant difference in building a robust community around secure code generation!

Note that some papers are published in conferences or journals that do not offer free public downloads, and some authors have not posted their work on preprint servers such as arXiv. For these papers, we link directly to the official publication wherever possible. Because access may require an institutional subscription or an individual purchase, we recommend checking with your library or using academic networking resources to obtain them.

This repository is organized as follows:

1. Advancements in Code Generation Models: Techniques and Application
- 1.1 Code Generation Models
- 1.2 Code Generation Evaluation
- 1.3 Others
2. Security in Code Generation: Approaches and Case Studies

If you would like to suggest changes or contribute new resources, you can submit a request. For more information, please contact us at thebinking66@gmail.com.

1. Advancements in Code Generation Models: Techniques and Application

1.1 Code Generation Models

2019

[ACL'19] ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for

[paper] [project]

Click to see the abstract!

> Neural language representation models such as BERT pre-trained on large-scale corpora can well capture rich semantic patterns from plain text, and be fine-tuned to consistently improve the performance of various NLP tasks. However, the existing pre-trained language models rarely consider incorporating knowledge graphs (KGs), which can provide rich structured knowledge facts for better language understanding. We argue that informative entities in KGs can enhance language representation with external knowledge. In this paper, we utilize both large-scale textual corpora and KGs to train an enhanced language representation model (ERNIE), which can take full advantage of lexical, syntactic, and knowledge information simultaneously. The experimental results have demonstrated that ERNIE achieves significant improvements on various knowledge-driven tasks, and meanwhile is comparable with the state-of-the-art model BERT on other common NLP tasks. The code and datasets will be available in the future.

2020

[ESEC/FSE'20] IntelliCode Compose: Code Generation using Transformer

[paper] [talk]

Click to see the abstract!

> In software development through integrated development environments (IDEs), code completion is one of the most widely used features. Nevertheless, majority of integrated development environments only support completion of methods and APIs, or arguments.
>
> In this paper, we introduce IntelliCode Compose – a general-purpose multilingual code completion tool which is capable of predicting sequences of code tokens of arbitrary types, generating up to entire lines of syntactically correct code. It leverages state-of-the-art generative transformer model trained on 1.2 billion lines of source code in Python, C#, JavaScript and TypeScript programming languages. IntelliCode Compose is deployed as a cloud-based web service. It makes use of client-side tree-based caching, efficient parallel implementation of the beam search decoder, and compute graph optimizations to meet edit-time completion suggestion requirements in the Visual Studio Code IDE and Azure Notebook.
>
> Our best model yields an average edit similarity of 86.7% and a perplexity of 1.82 for Python programming language.

[AAAI'20] ERNIE 2.0: A Continual Pre-Training Framework for Language Understanding

[paper] [project]

[GITHUB] GPT-J

[project]

[paper] [project]

[paper]

Click to see the abstract!

> Large Language Models (LLMs), such as ChatGPT and Bard, have revolutionized natural language understanding and generation. They possess deep language comprehension, human-like text generation capabilities, contextual awareness, and robust problem-solving skills, making them invaluable in various domains (e.g., search engines, customer support, translation). In the meantime, LLMs have also gained traction in the security community, revealing security vulnerabilities and showcasing their potential in security-related tasks. This paper explores the intersection of LLMs with security and privacy. Specifically, we investigate how LLMs positively impact security and privacy, potential risks and threats associated with their use, and inherent vulnerabilities within LLMs. Through a comprehensive literature review, the paper categorizes the papers into “The Good” (beneficial LLM applications), “The Bad” (offensive applications), and “The Ugly” (vulnerabilities of LLMs and their defenses). We have some interesting findings. For example, LLMs have proven to enhance code security (code vulnerability detection) and data privacy (data confidentiality protection), outperforming traditional methods. However, they can also be harnessed for various attacks (particularly user-level attacks) due to their human-like reasoning abilities. We have identified areas that require further research efforts. For example, Research on model and parameter extraction attacks is limited and often theoretical, hindered by LLM parameter scale and confidentiality. Safe instruction tuning, a recent development, requires more exploration. We hope that our work can shed light on the LLMs’ potential to both bolster and jeopardize cybersecurity.

2021

[arxiv'21] Evaluating Large Language Models Trained on Code

[ACL'21] CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation

[arxiv'21] ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation

2022

[ACL'22] Efficient Large Scale Language Modeling with Mixtures of Experts

[ACL'22] GPT-NeoX-20B: An Open-Source Autoregressive Language Model

[arxiv'22] Efficient Training of Language Models to Fill in the Middle

[Science'22] Competition-level code generation with AlphaCode

[ICLR'22 & PLDI'22] A Systematic Evaluation of Large Language Models of Code

2023

[ICLR'23] InCoder: A Generative Model for Code Infilling and Synthesis

[arxiv'23] CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis

[arxiv'23] Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation

[arxiv'23] CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Evaluations on HumanEval-X

[arxiv'23] BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

[arxiv'23] PanGu-Coder2: Boosting Large Language Models for Code with Ranking Feedback

[arxiv'23] MoTCoder: Elevating Large Language Models with Modular of Thought for Challenging Programming Tasks

[arxiv'23] Code Llama: Open Foundation Models for Code

[arxiv'23] CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation

[arxiv'23] SQL-PaLM: Improved large language model adaptation for Text-to-SQL (extended)

[arxiv'23] CodeT5+: Open Code Large Language Models for Code Understanding and Generation

[OpenReview ICLR'23] CodeT5Mix: A Pretrained Mixture of Encoder-Decoder Transformers for Code Understanding and Generation

1.2 Code Generation Evaluation

[NIPS'21] Measuring Coding Challenge Competence with APPS

[arXiv’23] Evaluating Instruction-Tuned Large Language Models on Code Comprehension and Generation

[ICLR'23] Multilingual Evaluation of Code Generation Models

[PMLR'23] DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation

[arXiv'23] ClassEval: A Manually-Crafted Benchmark for Evaluating LLMs on Class-level Code Generation

[AAAI'24] Can LLM Replace Stack Overflow? A Study on Robustness and Reliability of Large Language Model Code Generation

[NeurIPS'23] Large Language Models of Code Fail at Completing Code with Potential Bugs

[arXiv'24] CodeApex: A Bilingual Programming Evaluation Benchmark for Large Language Models

[ICSE'24] CoderEval: A Benchmark of Pragmatic Code Generation with Generative Pre-trained Models

[ICSE'24] Lost in Translation: A Study of Bugs Introduced by Large Language Models while Translating Code

[arXiv'24] BioCoder: A Benchmark for Bioinformatics Code Generation with Large Language Models

[ICML'24] Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models

[arXiv’24] OctoPack: Instruction Tuning Code Large Language Models

[EACL'24] ICE-Score: Instructing Large Language Models to Evaluate Code

1.3 Others

[GITHUB] GPT-J

[arXiv’22] Large Language Models Meet NL2Code: A Survey

[arXiv'23] OctoPack: Instruction Tuning Code Large Language Models

[arXiv’23] Private-Library-Oriented Code Generation with Large Language Models

[arXiv’23] A Lightweight Framework for High-Quality Code Generation

[arXiv’23] ToolCoder: Teach Code Generation Models to Use API Search Tools

[TSE'23] An Empirical Evaluation of Using Large Language Models for Automated Unit Test Generation

[arXiv’23] Exploring the Robustness of Large Language Models for Solving Programming Problems

[arXiv’23] Automatic Code Summarization via ChatGPT: How Far Are We?

[ICSE’22] Jigsaw: Large Language Models Meet Program Synthesis

[ICML'24] Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models

2. Security in Code Generation: Approaches and Case Studies

[arxiv'23] Security Weaknesses of Copilot Generated Code in GitHub

[arxiv'21] Evaluating Large Language Models Trained on Code

[CCS'23] Do Users Write More Insecure Code with AI Assistants?

[USENIX Security'23] CodexLeaks: Privacy Leaks from Code Generation Language Models in GitHub Copilot

[CCS'23] Large Language Models for Code: Security Hardening and Adversarial Testing

[USENIX Security'23] Lost at C: A User Study on the Security Implications of Large Language Model Code Assistants

[arxiv'24] Robustness, Security, Privacy, Explainability, Efficiency, and Usability of Large Language Models for Code

[arxiv'24] Codexity: Secure AI-assisted Code Generation

[arxiv'24] DeVAIC: A Tool for Security Assessment of AI-generated Code

[arxiv'24] Constrained Decoding for Secure Code Generation

[arxiv'24] Enhancing Security of AI-Based Code Synthesis with GitHub Copilot via Cheap and Efficient Prompt-Engineering

[arxiv'24] Just another copy and paste? Comparing the security vulnerabilities of ChatGPT generated code and StackOverflow answers

[ICSE-SEIP '24] PrivacyCAT: Privacy-Aware Code Analysis at Scale
