Carlton Brewster, CISSP, CEH, B.Sc’s Post

From Chatbots to Cyberattacks: LLMs Begin Hacking Websites Autonomously

Assistant professor at UIUC CS

8mo

We recently showed that LLM agents can autonomously hack mock websites, but can they exploit real-world vulnerabilities?

In our new work, we created LLM agents that can autonomously exploit one-day vulnerabilities. Only GPT-4 succeeds, while other models and open-source vulnerability scanners fail.

One-day vulnerabilities are vulnerabilities that have been disclosed but not yet patched in a system. These vulnerabilities can have real-world implications, especially in hard-to-patch environments.

We constructed a benchmark of 15 real-world vulnerabilities. These vulnerabilities span types (web, container management, Python package) and include those of high and critical severity. GPT-4 can exploit 87% of the vulnerabilities in our benchmark, but every other model and open-source vulnerability scanner (ZAP, Metasploit) we tested achieves 0%.

We hope that our findings encourage the deployers and developers of LLMs to consider their dual-use nature!

Paper: https://lnkd.in/eWTPX-Uh
Medium: https://lnkd.in/ey6X7CjP

LLM Agents can Autonomously Exploit One-day Vulnerabilities

arxiv.org


More Relevant Posts

Daniel Kang

Assistant professor at UIUC CS
8mo
We recently showed that LLM agents can autonomously hack mock websites, but can they exploit real-world vulnerabilities?

In our new work, we created LLM agents that can autonomously exploit one-day vulnerabilities. Only GPT-4 succeeds, while other models and open-source vulnerability scanners fail.

One-day vulnerabilities are vulnerabilities that have been disclosed but not yet patched in a system. These vulnerabilities can have real-world implications, especially in hard-to-patch environments.

We constructed a benchmark of 15 real-world vulnerabilities. These vulnerabilities span types (web, container management, Python package) and include those of high and critical severity. GPT-4 can exploit 87% of the vulnerabilities in our benchmark, but every other model and open-source vulnerability scanner (ZAP, Metasploit) we tested achieves 0%.

We hope that our findings encourage the deployers and developers of LLMs to consider their dual-use nature!

Paper: https://lnkd.in/eWTPX-Uh
Medium: https://lnkd.in/ey6X7CjP

LLM Agents can Autonomously Exploit One-day Vulnerabilities

arxiv.org

Jamey Hinkelman

MSc* Program Manager | Active Security Clearance | Sec+ | Google Cybersecurity Professional | USAF Veteran
6mo
Day 22, exercise 3 of my TryHackMe challenge: completing daily challenges until I'm in the top 1% of hackers on TryHackMe.

Vulnerability Capstone

In the capstone, I had to achieve remote code execution on a given target. I started by researching the exploitable vulnerabilities of the website based on its server. After some failed attempts with exploit code, I found code that worked.

Successful code: https://lnkd.in/ecp9NdyV

I used a Python remote code execution script to set up a reverse shell with netcat on port 8787 and gain a foothold on the target. After searching around the database, I found the flag I needed.

Flag - THM{ACKME_BLOG_HACKED}

TryHackMe | Cyber Security Training

tryhackme.com

Ashwin Chhetri

Information Security Engineer@SOPHOS | Cyber Security Enthusiast | CTF Player | CEH v12
5mo
🔍 Excited to introduce my latest project: Real-Time CVE Information Extractor 🛡️

In today's digital landscape, staying ahead of vulnerabilities is critical. That's why I've developed a tool that fetches and compiles up-to-date CVE details from the EPSS API and the NVD API, so organizations can promptly assess and mitigate potential risks to their systems and data.

What it Does:
🚀 Keeps You Updated: Automatically gathers the latest vulnerability information from the past 24 hours.
📊 Provides Complete Insights: Extracts all crucial details from CVE entries, like severity scores, descriptions, and references.

Behind the Scenes: Using Python, I've built a robust backend that handles data efficiently and securely. This ensures reliable performance while safeguarding sensitive information.

Why it Matters: My goal is to empower organizations with timely and comprehensive CVE data, enabling them to make informed decisions and protect their digital assets effectively.

👨‍💻 I'm passionate about improving cybersecurity resilience. Interested in learning more or collaborating? Let's connect and discuss how we can strengthen our defenses together.

#Cybersecurity #CVE #DataProtection #APIIntegration #Python #InformationSecurity
https://lnkd.in/dD-FPQ3C

GitHub - Ashwinchhetri/Real-Time-Vuln-Feed: A script for fetching real-time vulnerabilities with all necessary details.

github.com
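The post describes pulling the last 24 hours of CVEs from NVD and enriching them with EPSS data. A minimal sketch of that idea follows; it is not the linked repository's code. It assumes the public NVD CVE API 2.0 and FIRST EPSS endpoints, timestamps without a UTC offset, and CVSS v3.1 as the only scoring scheme. The function names (`nvd_query_url`, `summarize`, `attach_epss`, `latest`) are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime, timedelta, timezone

NVD_URL = "https://services.nvd.nist.gov/rest/json/cves/2.0"
EPSS_URL = "https://api.first.org/data/v1/epss"


def nvd_query_url(now, hours=24):
    """Build an NVD query for CVEs published in the `hours` before `now`."""
    fmt = "%Y-%m-%dT%H:%M:%S.000"  # assumption: timestamps sent without an offset
    start = now - timedelta(hours=hours)
    params = {"pubStartDate": start.strftime(fmt), "pubEndDate": now.strftime(fmt)}
    return NVD_URL + "?" + urllib.parse.urlencode(params)


def summarize(nvd_payload):
    """Pull id, English description, CVSS v3.1 score/severity, and references."""
    rows = []
    for item in nvd_payload.get("vulnerabilities", []):
        cve = item["cve"]
        desc = next(
            (d["value"] for d in cve.get("descriptions", []) if d.get("lang") == "en"),
            "",
        )
        metrics = cve.get("metrics", {}).get("cvssMetricV31", [])
        cvss = metrics[0]["cvssData"] if metrics else {}
        rows.append({
            "id": cve["id"],
            "description": desc,
            "score": cvss.get("baseScore"),        # None when not yet scored
            "severity": cvss.get("baseSeverity"),
            "references": [r["url"] for r in cve.get("references", [])],
        })
    return rows


def attach_epss(rows, epss_payload):
    """Add the EPSS probability to each row, or None when EPSS has no entry."""
    scores = {d["cve"]: float(d["epss"]) for d in epss_payload.get("data", [])}
    for row in rows:
        row["epss"] = scores.get(row["id"])
    return rows


def fetch_json(url):
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def latest():
    """Fetch and merge both feeds (performs network calls)."""
    rows = summarize(fetch_json(nvd_query_url(datetime.now(timezone.utc))))
    if not rows:
        return rows
    ids = ",".join(row["id"] for row in rows)
    epss = fetch_json(EPSS_URL + "?" + urllib.parse.urlencode({"cve": ids}))
    return attach_epss(rows, epss)
```

Keeping the parsing functions separate from `fetch_json` means they can be exercised against saved responses without touching either API. A real tool would also need paging and rate-limit handling, which this sketch leaves out.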

John D. Johnson

CISO, CTO, CEO, Board Member, Community Builder - Cybersecurity, IoT/OT, AI, Blockchain, Quantum Computing and advancing technology - PhD, CISSP, CRISC, SMIEEE, SMISSA, Board certified technical expert (DDN QTE)
7mo
No, LLM Agents can not Autonomously Exploit One-day Vulnerabilities: This blog post takes a critical look at an academic paper that received widespread media attention and claimed to have built an LLM agent that can exploit one-day vulnerabilities. The author points out that the paper's agent had Internet access and that the authors of the paper chose CVEs that had very detailed exploits and PoCs available. The post concludes that the paper demonstrated the capabilities of GPT-4 as an intelligent scanner and crawler but, based on the data provided, didn't demonstrate its ability to rediscover these vulnerabilities or generate novel exploit code. https://lnkd.in/gSasyK2s

Root Cause

struct.github.io
Vusala Alakbarova

Senior Application Security Specialist | Penetration Tester | OSWE | Security Researcher | Ethical Hacking and Programming Instructor
4mo
Hello everyone, I'm excited to share a new hacking tool I've developed: ParaMutator. This Python-based fuzzer is designed to send various payloads to API entry points, proxying them through Burp Suite to reveal potential error messages and anomalies. ParaMutator aims to help penetration testers save time by automating some repetitive tasks during API security testing.

Key Features:
- Provide a list of URLs or request details in JSON format.
- All requests are proxied through Burp Suite for further analysis.
- Handles a wide range of payloads, from emojis to overlong UTF-8 encoding.
- Customize headers to suit your testing needs.
- Skip a request at any time.

Currently, ParaMutator tests only URL and body parameters. Future enhancements will include support for headers and more advanced testing scenarios. Feel free to explore this tool and share your insights with me. Happy hacking!

https://lnkd.in/eJfqu492
#cybersecurity #fuzzing #python

GitHub - vuusale/ParaMutator: API fuzzer that exposes security flaws by sending malformed inputs

github.com
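The core loop of a parameter fuzzer like the one described can be sketched in a few lines. This is an illustration of the general technique and not ParaMutator's implementation. The payload list, the function names `mutate_url` and `send`, and the proxy address (Burp Suite's default listener, 127.0.0.1:8080) are assumptions. Only run something like this against systems you are authorized to test.

```python
import urllib.error
import urllib.parse
import urllib.request

# Assumption: Burp Suite's proxy listener is on its default address.
BURP_PROXY = "http://127.0.0.1:8080"

# Illustrative payloads only: a quote, an emoji, an overlong UTF-8
# encoding of "/" (raw bytes C0 AF), and an oversized string.
PAYLOADS = ["'", "\U0001F600", b"\xc0\xaf", "A" * 1024]


def mutate_url(url, payloads=PAYLOADS):
    """Yield (param, payload, mutated_url), replacing one query parameter at a time."""
    parts = urllib.parse.urlsplit(url)
    params = urllib.parse.parse_qsl(parts.query, keep_blank_values=True)
    for i, (name, _) in enumerate(params):
        for payload in payloads:
            mutated = list(params)
            mutated[i] = (name, payload)
            # urlencode percent-encodes str values as UTF-8 and bytes values as-is
            query = urllib.parse.urlencode(mutated)
            yield name, payload, urllib.parse.urlunsplit(parts._replace(query=query))


def send(url, headers=None, proxy=BURP_PROXY):
    """Send one GET request through the proxy; return (status, body length)."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    )
    request = urllib.request.Request(url, headers=headers or {})
    try:
        with opener.open(request, timeout=10) as resp:
            return resp.status, len(resp.read())
    except urllib.error.HTTPError as err:
        # Error responses are often the interesting result when fuzzing
        return err.code, 0
```

Passing the overlong sequence as `bytes` matters: a `str` such as `"%c0%af"` would be percent-encoded a second time and never reach the server in its intended form. For HTTPS targets, the client also has to trust Burp's CA certificate, which is not handled here.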

Gianluca Varisco

Security @ Google Cloud
2mo
This blog post delves into the analysis of a control flow obfuscation technique employed by recent LummaC2 (LUMMAC.V2) stealer samples. In addition to the traditional control flow flattening technique used in older versions, the malware now leverages customized control flow indirection to manipulate the execution of the malware. This technique thwarts all binary analysis tools including IDA Pro and Ghidra, significantly hindering not only the reverse engineering process, but also automation tooling designed to capture execution artifacts and generate detections. To provide insights to Google and Mandiant security teams, we developed an automated method for removing this protection layer through symbolic backward slicing. By leveraging the recovered control flow, we are able to rebuild and deobfuscate the samples into a format readily consumable for any static binary analysis platform.

LummaC2: Obfuscation Through Indirect Control Flow | Google Cloud Blog

cloud.google.com
Anurag Tiwari

Advisory Analyst - Trainee @DeloitteUSI | Cybersec Student |
8mo
Just completed this room on TryHackMe. It was based on AJP Ghostcat (CVE-2020-1938). I got to learn a few new things:
- The Nmap scan showed that SSH, port 8009, and port 8080 were open.
- Port 8009 was running Apache Jserv v1.3.
- I searched Google for a public exploit and found one on Exploit-DB.
- Unfortunately, it was based on Python 2, which is deprecated. 😒
- A little more research led to a GitHub repo with a Python 3 version of the exploit. 😁
- The exploit reads the contents of the file "WEB-INF/web.xml", which gave me the credentials that led to initial access.
- After logging in via SSH, no further privilege escalation method worked. (Even LinPEAS and exploit-suggester couldn't find anything.)
- There were a .PGP file and an .ASC file. (I was clueless about what to do with them.)
- I searched Google and found that the .PGP file may contain credentials and the .ASC file is the key to that .PGP file. So I transferred them to my local machine via scp.
- I used John the Ripper to decrypt the file and finally got the credentials.
- For privilege escalation, "sudo -l" showed that /usr/bin/zip can be run as root.
- I immediately headed to GTFOBins and searched for "zip sudo". Executing those commands gave me the root shell. 😀

#ethicalhacking #cybersecurity #infosecjourney #iaspire100 #cyberfrat CyberFrat #learningcontinues #learningandgrowing

tomghost

tryhackme.com
Kimberly Nyenga

Cybersecurity Analyst | Information Security Consultant | Ethical hacker in training
5mo
Completed the File Inclusion room on TryHackMe, diving into vulnerabilities like Local File Inclusion (LFI), Remote File Inclusion (RFI), and directory traversal. File inclusion vulnerabilities are frequently exploited in web applications written in languages like PHP when file handling is poorly implemented, allowing attackers to leak data or gain Remote Code Execution. I also explored path traversal vulnerabilities, which arise when user input is mishandled. Addressing Local File Inclusion is crucial; these flaws often stem from a lack of security awareness among developers. Remote File Inclusion, in turn, involves injecting external URLs because of improper input sanitization. Understanding Remote Code Execution is vital for preventing attackers from executing arbitrary code. The room also covered the remediation process, emphasizing the significance of recognizing and preventing web application vulnerabilities. Excited to continue learning in this field!

TryHackMe | Cyber Security Training

tryhackme.com
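One common remediation for the path traversal and local file inclusion issues described above is to resolve the requested path and refuse anything that lands outside an allowed base directory. The sketch below shows that check in Python, not PHP and not the room's own material. The function name `resolve_inside` is hypothetical.

```python
from pathlib import Path


def resolve_inside(base_dir, user_path):
    """Return the resolved path for user_path, or raise if it escapes base_dir."""
    base = Path(base_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check;
    # an absolute user_path replaces base entirely and is rejected below
    target = (base / user_path).resolve()
    try:
        target.relative_to(base)
    except ValueError:
        raise ValueError(f"path escapes base directory: {user_path!r}") from None
    return target
```

Checking the resolved path, instead of searching the raw input for `../`, avoids the usual bypasses such as nested or encoded traversal sequences that survive a naive string filter.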
Poseidon

522 followers
9mo
Quicmap: Fast, open-source QUIC protocol scanner

Quicmap is a fast, open-source QUIC service scanner that streamlines the process by eliminating multiple tool requirements. It effectively identifies QUIC services, the protocol version, and the supported ALPNs. “As I started researching the QUIC protocol, I noticed that my favorite scanner had issues identifying QUIC-enabled services. This is not too surprising, as QUIC used UDP, and anyone who has scanned UDP services knows how difficult this is. I wanted to have a simple tool …

@Poseidon-US #HelpNetSecurity #Cybersecurity

Quicmap: Fast, open-source QUIC protocol scanner - Help Net Security

https://www.helpnetsecurity.com
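One way a scanner can detect a QUIC service and list its versions is to send a long-header packet carrying a reserved version number and parse the Version Negotiation packet that a QUIC endpoint is expected to send back. The sketch below follows the packet layout from RFC 8999 and RFC 9000. Whether Quicmap works this way is an assumption, and ALPN discovery, which needs a TLS handshake, is not covered. The function names are hypothetical.

```python
import os
import socket
import struct

# Versions matching 0x?a?a?a?a are reserved for exercising version negotiation
PROBE_VERSION = 0x1A2A3A4A


def build_probe(version=PROBE_VERSION, dcid=None, scid=b""):
    """Build a long-header packet padded to 1200 bytes, the minimum servers respond to."""
    dcid = os.urandom(8) if dcid is None else dcid
    header = (
        bytes([0xC0])                      # long header form + fixed bit
        + struct.pack("!I", version)
        + bytes([len(dcid)]) + dcid
        + bytes([len(scid)]) + scid
    )
    return header.ljust(1200, b"\x00")


def parse_version_negotiation(data):
    """Return the supported versions from a Version Negotiation packet, else None."""
    if len(data) < 7 or not data[0] & 0x80:
        return None                        # too short or not a long header
    if struct.unpack_from("!I", data, 1)[0] != 0:
        return None                        # Version Negotiation uses version 0
    pos = 5
    pos += 1 + data[pos]                   # skip Destination Connection ID
    if pos >= len(data):
        return None
    pos += 1 + data[pos]                   # skip Source Connection ID
    body = data[pos:]
    usable = len(body) - len(body) % 4
    return [struct.unpack_from("!I", body, i)[0] for i in range(0, usable, 4)]


def probe(host, port=443, timeout=2.0):
    """Send one probe over UDP; return supported versions, or None on no answer."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_probe(), (host, port))
        try:
            data, _ = sock.recvfrom(2048)
        except socket.timeout:
            return None
    return parse_version_negotiation(data)
```

A timeout does not prove that QUIC is absent, since UDP probes and replies can be dropped silently. That is the difficulty with scanning UDP services that the quoted author mentions.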
James Quilty

Global Enterprise Cybersecurity Strategies & Solutions Consultant
9mo
Quicmap: Fast, open-source QUIC protocol scanner

Quicmap is a fast, open-source QUIC service scanner that streamlines the process by eliminating multiple tool requirements. It effectively identifies QUIC services, the protocol version, and the supported ALPNs. “As I started researching the QUIC protocol, I noticed that my favorite scanner had issues identifying QUIC-enabled services. This is not too surprising, as QUIC used UDP, and anyone who has scanned UDP services knows how difficult this is. I wanted to have a simple tool …

#HelpNetSecurity #Cybersecurity

Quicmap: Fast, open-source QUIC protocol scanner - Help Net Security

https://www.helpnetsecurity.com