sfz is an experimental project exploring whether AI can generate fuzzing wordlists in real time instead of relying on large, static lists.
The original idea came from my previous project: https://github.com/ayushkr12/sfz
In that version, the workflow was:
- Crawl the target URL.
- Extract words and paths from the crawl.
- Generate a wordlist from those results to create smarter fuzzing entry points.
- Feed it into ffuf (a rough sketch of this pipeline follows below).
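
For reference, a minimal sketch of that pipeline might look like the following. The single-page fetch, regex extraction, and function names here are illustrative assumptions, not the actual sfz implementation; it also assumes ffuf is installed and on PATH.

```python
# Illustrative sketch of the crawl -> extract -> wordlist -> ffuf pipeline.
# The single-page crawl and function names are simplifications, not the real sfz code.
import re
import subprocess

import requests


def crawl_and_extract(url: str) -> set[str]:
    """Fetch one page and pull out path segments and word-like tokens."""
    html = requests.get(url, timeout=10).text
    words: set[str] = set()
    # Path segments from href/src attributes become wordlist candidates.
    for path in re.findall(r'(?:href|src)=["\']([^"\']+)["\']', html):
        for segment in path.strip("/").split("/"):
            if segment:
                words.add(segment)
    # Plain word-like tokens from the page body as extra candidates.
    words.update(re.findall(r"[A-Za-z_-]{3,20}", html))
    return words


def fuzz(target: str, wordlist_path: str) -> None:
    """Hand the generated wordlist to ffuf (assumes ffuf is on PATH)."""
    subprocess.run(["ffuf", "-u", target, "-w", wordlist_path], check=False)


if __name__ == "__main__":
    words = crawl_and_extract("https://example.com")
    with open("wordlist.txt", "w") as fh:
        fh.write("\n".join(sorted(words)))
    fuzz("https://example.com/FUZZ", "wordlist.txt")
```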
That worked, but using generic wordlists was slow and often inefficient. This project was an attempt to go one step further: let an AI infer likely endpoint structures directly from a FUZZ URL pattern and generate the wordlist dynamically.
The idea:
- Take a URL containing FUZZ.
- Ask a model to infer realistic endpoint hierarchies.
- Generate a wordlist on the fly and immediately fuzz with it (see the sketch after this list).
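
A minimal sketch of that loop, assuming a local Ollama instance on its default port; the prompt wording, model choice, and output parsing are placeholders for illustration and may differ from what the project actually does.

```python
# Sketch of the real-time idea: ask a local Ollama model to guess values for FUZZ,
# then fuzz immediately with the generated list. Prompt wording, model name, and
# output parsing are assumptions, not the project's exact logic.
import subprocess

import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default generate endpoint


def generate_wordlist(fuzz_url: str, model: str = "qwen2.5-coder:1.5b") -> list[str]:
    """Ask the model for likely FUZZ values, one per line."""
    prompt = (
        f"URL pattern: {fuzz_url}\n"
        "List 50 realistic values for FUZZ, one per line, no explanations."
    )
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    text = resp.json().get("response", "")
    # Keep only clean single-token lines; small models often emit noise here.
    return [
        line.strip() for line in text.splitlines()
        if line.strip() and " " not in line.strip()
    ]


if __name__ == "__main__":
    target = "https://example.com/api/FUZZ"
    with open("ai_wordlist.txt", "w") as fh:
        fh.write("\n".join(generate_wordlist(target)))
    subprocess.run(["ffuf", "-u", target, "-w", "ai_wordlist.txt"], check=False)
```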
In practice, this approach ran into three main issues:
- **Speed**: Real-time generation is too slow for most fuzzing workflows. Even local models introduce noticeable latency compared to static wordlists.
- **External APIs**: Using hosted models is not practical for fuzzing at scale due to rate limits and credit/cost constraints.
- **Local Models (Ollama)**:
  - Larger models were still slow for tight fuzzing loops.
  - Smaller models (e.g., `qwen2.5-coder:1.5b`) were fast enough but not capable enough, often producing low-quality or meaningless tokens instead of useful wordlists.
That said, this approach may be viable on high-end machines capable of running medium-to-large parameter models (e.g., 7B) at decent speeds.
TODO:
- Explore more efficient prompt engineering techniques to improve output quality.
- Add more target awareness to the system prompt, such as business model and industry context (possibly let the model automate this as well?).