Photon
Incredibly fast crawler designed for OSINT.
Features
Photon can extract the following data while crawling:
- URLs (in-scope & out-of-scope)
- URLs with parameters (example.com/gallery.php?id=2)
- Intel (emails, social media accounts, amazon buckets etc.)
- Files (pdf, png, xml etc.)
- Secret keys (auth/API keys & hashes)
- JavaScript files & Endpoints present in them
- Strings matching custom regex pattern
- Subdomains & DNS related data
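The in-scope/out-of-scope split and the "URLs with parameters" category can be illustrated with a small sketch. It assumes "in scope" means the root host or one of its subdomains; that rule, and the `classify` helper itself, are hypothetical and not taken from Photon's source.

```python
from urllib.parse import urlparse, parse_qs

def classify(url, root_host):
    """Label a URL as in-scope or out-of-scope and list its query parameters."""
    parts = urlparse(url)
    host = parts.hostname or ""
    # Assumption: in scope = the root host itself or any of its subdomains.
    in_scope = host == root_host or host.endswith("." + root_host)
    return {
        "in_scope": in_scope,
        "params": sorted(parse_qs(parts.query)),
    }

print(classify("http://example.com/gallery.php?id=2", "example.com"))
print(classify("http://cdn.other.net/app.js", "example.com"))
```

Matching on `"." + root_host` rather than a bare suffix keeps look-alike hosts such as `notexample.com` out of scope.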
Usage
Usage: photon [options]
| Option | Description |
|---|---|
| -u, --url | root url |
| -l, --level | levels to crawl |
| -t, --threads | number of threads |
| -d, --delay | delay between requests |
| -c, --cookie | cookie |
| -r, --regex | regex pattern |
| -s, --seeds | additional seed urls |
| -e, --export | export formatted result |
| -o, --output | specify output directory |
| -v, --verbose | verbose output |
| --keys | extract secret keys |
| --clone | clone the website locally |
| --exclude | exclude urls by regex |
| --stdout | print a variable to stdout |
| --timeout | http requests timeout |
| --ninja | ninja mode |
| --update | update photon |
| --headers | supply http headers |
| --dns | enumerate subdomains & dns data |
| --only-urls | only extract urls |
| --wayback | use urls from archive.org as seeds |
| --user-agent | specify user-agent(s) |
Examples
Crawl a Single Website
Option: -u or --url
Crawl a single website.
python photon.py -u "http://example.com"
Clone the Website Locally
Option: --clone
The crawled webpages can be saved locally for later use.
python photon.py -u "http://example.com" --clone
Depth of Crawling
Option: -l or --level
Default: 2
Set recursion limit for crawling. A depth of 2 means Photon will find URLs from the homepage (level 1), then crawl those pages as well (level 2).
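The level counting described above can be sketched as a breadth-first walk. This is an illustration only: `links` is a hypothetical in-memory map standing in for real HTTP requests, and the `crawl` function is not Photon's implementation.

```python
def crawl(root, links, level=2):
    """Return the pages fetched when crawling `level` levels from `root`.

    `links` is a hypothetical map of page -> URLs found on that page.
    """
    fetched = set()
    frontier = {root}
    for _ in range(level):
        discovered = set()
        for page in frontier:
            fetched.add(page)
            discovered.update(links.get(page, ()))
        # Only pages not fetched yet are queued for the next level.
        frontier = discovered - fetched
        if not frontier:
            break
    return fetched

site = {
    "/": ["/about", "/blog"],
    "/blog": ["/blog/post-1"],
    "/blog/post-1": ["/blog/post-1/comments"],
}
print(sorted(crawl("/", site, level=2)))  # ['/', '/about', '/blog']
```

With `level=2` the homepage and the pages it links to are fetched; `/blog/post-1` is discovered but only fetched at level 3.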
python photon.py -u "http://example.com" -l 3
Number of Threads
Option: -t or --threads
Default: 2
Specify the number of concurrent requests to make. Be cautious: higher values can trigger security mechanisms or overwhelm small sites.
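To make "concurrent requests" concrete, here is a minimal sketch of a capped worker pool using Python's standard library. The `fetch` function is a hypothetical stand-in that does no networking, and nothing here is claimed to mirror Photon's internals.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    """Hypothetical stand-in for an HTTP request; returns a fake status line."""
    return f"200 {url}"

def crawl_batch(urls, threads=2):
    # max_workers caps how many fetch() calls run at the same time.
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() yields results in input order, whichever call finishes first.
        return list(pool.map(fetch, urls))

pages = [f"http://example.com/page{i}" for i in range(5)]
print(crawl_batch(pages, threads=2))
```

Raising `threads` only raises the ceiling on simultaneous requests, which is why a large value can look like a burst of traffic to the target.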
python photon.py -u "http://example.com" -t 10
Delay Between Each HTTP Request
Option: -d or --delay
Default: 0
Delay (in seconds) between each HTTP(S) request.
python photon.py -u "http://example.com" -d 2
Timeout
Option: --timeout
Default: 5
Time (in seconds) to wait before considering a request timed out.
python photon.py -u "http://example.com" --timeout=4
Cookies
Option: -c or --cookie
Default: No cookie header sent
Add a Cookie header to HTTP requests. Useful for authenticated sessions.
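Before passing a session cookie to a crawl, it can help to check that the string parses as `name=value` pairs. The sketch below uses Python's standard `http.cookies` module; the `cookie_header` helper is hypothetical and only shows what the resulting request header would look like.

```python
from http.cookies import SimpleCookie

def cookie_header(raw):
    """Turn a raw cookie string into a headers dict after checking it parses."""
    jar = SimpleCookie()
    jar.load(raw)
    if not jar:
        raise ValueError("no valid name=value pairs found")
    # Re-serialize as "name=value; name2=value2" for the Cookie request header.
    value = "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())
    return {"Cookie": value}

print(cookie_header("PHPSESSID=u5423d78fqbaju9a0qke25ca87"))
```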
python photon.py -u "http://example.com" -c "PHPSESSID=u5423d78fqbaju9a0qke25ca87"
Specify Output Directory
Option: -o or --output
Default: Domain name of the target
Override the default output directory.
python photon.py -u "http://example.com" -o "mydir"
Verbose Output
Option: -v or --verbose
Show all discovered items (pages, keys, files, etc.) in real time.
python photon.py -u "http://example.com" -v
Exclude Specific URLs
Option: --exclude
Exclude URLs matching the provided regex pattern.
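An exclusion pattern is easy to get subtly wrong, so it is worth testing against a few sample URLs first. The sketch below assumes the pattern is searched anywhere in the URL with Python `re` semantics (an assumption about how matching is applied); the `keep` helper and the URL list are hypothetical.

```python
import re

urls = [
    "http://example.com/blog/2017/post",
    "http://example.com/blog/2018/post",
    "http://example.com/blog/2019/post",
    "http://example.com/about",
]

def keep(urls, pattern):
    """Drop every URL in which the exclusion pattern is found."""
    rx = re.compile(pattern)
    return [u for u in urls if not rx.search(u)]

# Alternation inside a group: excludes exactly the 2017 and 2018 posts.
print(keep(urls, r"/blog/20(17|18)"))
# A character class matches ONE character from the set 1, 7, |, 8,
# so "/blog/201" already matches and the 2019 post is excluded too.
print(keep(urls, r"/blog/20[17|18]"))
```

Use parentheses for "this or that" alternatives; square brackets only ever match a single character.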
python photon.py -u "http://example.com" --exclude="/blog/20(17|18)"
Specify Seed URL(s)
Option: -s or --seeds
Add custom seed URLs, separated by commas.
python photon.py -u "http://example.com" --seeds "http://example.com/blog/2018,http://example.com/portals.html"
Specify User-Agent(s)
Option: --user-agent
Default: Entries from user-agents.txt
Set custom user-agent(s), separated by commas.
python photon.py -u "http://example.com" --user-agent "curl/7.35.0,Wget/1.15 (linux-gnu)"
This helps simulate different clients without editing the default file.
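A comma-separated user-agent list can be pictured as follows. This is a sketch under two assumptions that are not confirmed by the text above: the value is split on commas, and one entry is picked at random per request. Both helper functions are hypothetical.

```python
import random

def parse_user_agents(option):
    """Split a comma-separated --user-agent value into a clean list."""
    return [ua.strip() for ua in option.split(",") if ua.strip()]

def headers_for_request(agents, rng=random):
    # Assumption: one user-agent is chosen at random for each request.
    return {"User-Agent": rng.choice(agents)}

agents = parse_user_agents("curl/7.35.0,Wget/1.15 (linux-gnu)")
print(agents)
print(headers_for_request(agents))
```

A plain comma split like this one would break browser strings that contain commas themselves, such as `(KHTML, like Gecko)`, so check how a given value is actually parsed before relying on it.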
Custom Regex Pattern
Option: -r or --regex
Extract strings during crawling by providing a regex pattern.
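A pattern can be tried on sample text with Python's `re` module before starting a crawl. The `extract` helper and the sample strings below are hypothetical; the sketch only shows how a pattern like `\d{10}` behaves.

```python
import re

def extract(pattern, text):
    """Return the unique matches of `pattern` in `text`, sorted."""
    return sorted(set(re.findall(pattern, text)))

page = "Call 9876543210 or 0123456789; order #12345 shipped."
print(extract(r"\d{10}", page))  # ['0123456789', '9876543210']

# Without word boundaries, the first ten digits of a longer number also match.
print(extract(r"\d{10}", "id 123456789012"))      # ['1234567890']
print(extract(r"\b\d{10}\b", "id 123456789012"))  # []
```

Adding `\b` on both sides restricts matches to standalone ten-digit numbers.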
python photon.py -u "http://example.com" --regex "\d{10}"
Export Formatted Result
Option: -e or --export
Specify the output format for saved data.
python photon.py -u "http://example.com" --export=json
Supported formats:
- json
- csv
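To show how the two formats differ, here is a sketch that serializes the same data both ways. The shape of `results` (category name mapped to a set of strings) is an assumption made for illustration, and the layout of the files Photon actually writes may differ.

```python
import csv
import io
import json

# Hypothetical result shape: category name -> set of found strings.
results = {
    "internal": {"http://example.com/", "http://example.com/about"},
    "files": {"http://example.com/report.pdf"},
}

def to_json(results):
    # Sets are not JSON-serializable, so convert each one to a sorted list.
    data = {name: sorted(values) for name, values in results.items()}
    return json.dumps(data, indent=2, sort_keys=True)

def to_csv(results):
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["category", "value"])
    for name in sorted(results):
        for value in sorted(results[name]):
            writer.writerow([name, value])
    return buf.getvalue()

print(to_json(results))
print(to_csv(results))
```

JSON keeps the grouping by category, while CSV flattens it into one row per item, which is easier to open in a spreadsheet.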
Use URLs from Archive.org as Seeds
Option: --wayback
Fetch archived URLs from archive.org (only from the current year).
python photon.py -u "http://example.com" --wayback
Skip Data Extraction
Option: --only-urls
Crawl only the URLs; skip data extraction like JS files or intel.
python photon.py -u "http://example.com" --only-urls
Update
Option: --update
Check for a newer version of Photon and update in place.
python photon.py --update
Extract Secret Keys
Option: --keys
Look for high-entropy strings that may be auth/API keys or hashes.
python photon.py -u "http://example.com" --keys
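"High-entropy" can be made concrete with Shannon entropy over a string's characters. The sketch below is one common way to score candidates; the length and entropy cut-offs are illustrative assumptions, and Photon's own heuristic may differ.

```python
import math
from collections import Counter

def shannon_entropy(s):
    """Bits of entropy per character, from the string's own character frequencies."""
    if not s:
        return 0.0
    n = len(s)
    return sum((c / n) * math.log2(n / c) for c in Counter(s).values())

def looks_like_key(s, min_length=16, threshold=3.5):
    # Both cut-offs are illustrative assumptions, not values taken from Photon.
    return len(s) >= min_length and shannon_entropy(s) >= threshold

for candidate in ["aaaaaaaaaaaaaaaaaaaa", "u5423d78fqbaju9a0qke25ca87"]:
    print(candidate, looks_like_key(candidate))
```

A repeated character scores 0 bits, while a string with many distinct, evenly spread characters scores high. Expect false positives either way, so treat matches as leads to review.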