GitHub - princeton-nlp/SWE-agent: [NeurIPS 2024] SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges.

Documentation | Discord | Paper | EnIGMA preprint

SWE-agent turns LMs (e.g. GPT-4) into software engineering agents that can resolve issues in real GitHub repositories and more.

On SWE-bench, SWE-agent resolves 12.47% of the issues in the full test set and 23% of the issues in SWE-bench Lite. SWE-agent EnIGMA solves more than three times as many challenges on the offensive cybersecurity NYU CTF benchmark as the previous state-of-the-art (SOTA) agent.

We accomplish our results by designing simple LM-centric commands and feedback formats to make it easier for the LM to browse the repository, view, edit and execute code files. We call this an Agent-Computer Interface (ACI). Read more about it in our paper!
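To give a feel for what an LM-centric command with a compact feedback format might look like, here is a minimal sketch of a windowed file viewer. It is an illustration only and not SWE-agent's actual implementation: the class name `FileWindow`, its methods, and the exact output format are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FileWindow:
    """Hypothetical ACI-style viewer that shows a fixed-size window of a file."""

    lines: list[str]
    window: int = 5
    top: int = 0  # zero-based index of the first visible line

    def scroll_down(self) -> None:
        # Clamp so the window never runs past the end of the file.
        max_top = max(len(self.lines) - self.window, 0)
        self.top = min(self.top + self.window, max_top)

    def render(self) -> str:
        # Number each visible line and report how much is hidden,
        # so the LM always knows where it is in the file.
        end = min(self.top + self.window, len(self.lines))
        body = [f"{i + 1}:{self.lines[i]}" for i in range(self.top, end)]
        header = f"({self.top} more lines above)"
        footer = f"({len(self.lines) - end} more lines below)"
        return "\n".join([header, *body, footer])


if __name__ == "__main__":
    viewer = FileWindow([f"line{i}" for i in range(12)], window=5)
    print(viewer.render())
    viewer.scroll_down()
    print(viewer.render())
```

The design choice illustrated here is that each command returns a small, self-describing view (line numbers plus a count of hidden lines) rather than dumping the whole file into the LM's context.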

SWE-agent is built and maintained by researchers from Princeton University.

🚀 Get started!

👉 Try SWE-agent in your browser: (more information)

Read our documentation to learn more:

Our most recent lecture touches on the project's motivation, showcases our research findings and provides a hands-on tutorial on how to install, use, and configure SWE-agent:

🕵️ SWE-agent for offensive cybersecurity (EnIGMA)

SWE-agent: EnIGMA is a mode for solving offensive cybersecurity (capture the flag) challenges. EnIGMA achieves state-of-the-art results on multiple cybersecurity benchmarks (see leaderboard). The EnIGMA project introduced multiple features that are available in all modes of SWE-agent, such as the debugger and server connection tools and a summarizer to handle long outputs.
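As a rough illustration of why long outputs need special handling, the sketch below condenses an oversized command output before it would be shown to the LM. It makes no claim about how EnIGMA's summarizer works: it uses plain head-and-tail truncation as a stand-in, and the function name `condense_output` and its parameters are hypothetical.

```python
def condense_output(output: str, max_lines: int = 20, keep: int = 5) -> str:
    """Return `output` unchanged if short; otherwise keep only its head and tail.

    Hypothetical helper: a simple truncation stand-in, not EnIGMA's summarizer.
    """
    if keep < 1 or 2 * keep > max_lines:
        raise ValueError("keep must be >= 1 and 2 * keep must not exceed max_lines")

    lines = output.splitlines()
    if len(lines) <= max_lines:
        return output

    # Replace the middle of the output with a one-line notice.
    omitted = len(lines) - 2 * keep
    notice = f"[... {omitted} lines omitted ...]"
    return "\n".join(lines[:keep] + [notice] + lines[-keep:])


if __name__ == "__main__":
    long_output = "\n".join(str(i) for i in range(100))
    print(condense_output(long_output))
```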

💫 Contributions

If you'd like to ask questions, learn about upcoming features, and participate in future development, join our Discord community!
If you'd like to contribute to the codebase, we welcome issues and pull requests!

Contact persons: John Yang and Carlos E. Jimenez (Email: [email protected], [email protected]).

📝 Citation

If you found this work helpful, please consider citing it using the following:

@misc{yang2024sweagent,
      title={SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering},
      author={John Yang and Carlos E. Jimenez and Alexander Wettig and Kilian Lieret and Shunyu Yao and Karthik Narasimhan and Ofir Press},
      year={2024},
      eprint={2405.15793},
      archivePrefix={arXiv},
      primaryClass={cs.SE}
}

If you used the summarizer, interactive commands or the offensive cybersecurity capabilities in SWE-agent, please also consider citing:

@misc{abramovich2024enigmaenhancedinteractivegenerative,
      title={EnIGMA: Enhanced Interactive Generative Model Agent for CTF Challenges},
      author={Talor Abramovich and Meet Udeshi and Minghao Shao and Kilian Lieret and Haoran Xi and Kimberly Milner and Sofija Jancheska and John Yang and Carlos E. Jimenez and Farshad Khorrami and Prashanth Krishnamurthy and Brendan Dolan-Gavitt and Muhammad Shafique and Karthik Narasimhan and Ramesh Karri and Ofir Press},
      year={2024},
      eprint={2409.16165},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2409.16165},
}

🪪 License

MIT. Check LICENSE.
