Gospider

Fast web spider written in Go

Usage

Usage: gospider [options]

Option	Description
`-s`, `--site` string	Site to crawl
`-S`, `--sites` string	Site list to crawl
`-p`, `--proxy` string	Proxy (e.g., http://127.0.0.1:8080)
`-o`, `--output` string	Output folder
`-u`, `--user-agent` string	User Agent to use (`web`: random web, `mobi`: random mobile, or custom)
`--cookie` string	Cookie to use (e.g., testA=a; testB=b)
`-H`, `--header` stringArray	Header to use (Use multiple flags to set multiple headers)
`--burp` string	Load headers and cookies from a Burp raw HTTP request file
`--blacklist` string	Blacklist URL Regex
`--whitelist` string	Whitelist URL Regex
`--whitelist-domain` string	Whitelist Domain
`-L`, `--filter-length` string	Turn on length filter (comma-separated lengths to exclude, e.g., 6871,24432)
`-t`, `--threads` int	Number of threads (Run sites in parallel) (default 1)
`-c`, `--concurrent` int	Max allowed concurrent requests per domain (default 5)
`-d`, `--depth` int	Recursion depth for visited URLs (0 = infinite) (default 1)
`-k`, `--delay` int	Delay before creating new request to matching domains (seconds)
`-K`, `--random-delay` int	Extra randomized delay added to base delay (seconds)
`-m`, `--timeout` int	Request timeout in seconds (default 10)
`-B`, `--base`	Disable all and only use HTML content
`--js`	Enable linkfinder in JavaScript files (default true)
`--sitemap`	Try to crawl sitemap.xml
`--robots`	Try to crawl robots.txt (default true)
`-a`, `--other-source`	Find URLs from 3rd parties (Archive.org, CommonCrawl, etc.)
`-w`, `--include-subs`	Include subdomains crawled from 3rd party sources
`-r`, `--include-other-source`	Also include other-source URLs (still crawled and requested)
`--subs`	Include subdomains
`--debug`	Turn on debug mode
`--json`	Enable JSON output
`-v`, `--verbose`	Turn on verbose
`-q`, `--quiet`	Suppress all output except URLs
`--no-redirect`	Disable redirects
`--version`	Check version
`-l`, `--length`	Turn on length display in the output
`-R`, `--raw`	Enable raw output

Examples

Quiet output

gospider -q -s "https://google.com/"

Run with single site

gospider -s "https://google.com/" -o output -c 10 -d 1

Run with site list

gospider -S sites.txt -o output -c 10 -d 1
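
The site list can be prepared ahead of the crawl. The sketch below assumes that sites.txt holds one URL per line (the file format is not stated here, so treat it as an assumption), and the two example.com/example.org targets are placeholders.

```shell
# Assumption: the site list is a plain text file with one URL per line.
cat > sites.txt <<'EOF'
https://example.com/
https://example.org/
EOF

# Count the non-empty lines, i.e. the targets that will be handed to gospider.
site_count=$(grep -c . sites.txt)
echo "sites to crawl: $site_count"

# The crawl itself is then started as in the example above:
# gospider -S sites.txt -o output -c 10 -d 1
```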

Run 20 sites in parallel, with up to 10 concurrent requests per site

gospider -S sites.txt -o output -c 10 -d 1 -t 20
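
Combining -t and -c multiplies the load. As a rough sketch, and assuming each site in the list maps to a single domain, the upper bound on simultaneous requests is the product of the two values:

```shell
threads=20      # -t: number of sites crawled in parallel
concurrent=10   # -c: max concurrent requests per domain

# Rough upper bound, assuming one domain per site.
max_in_flight=$((threads * concurrent))
echo "upper bound on simultaneous requests: $max_in_flight"
```

If that figure is too aggressive for the targets or the network, lowering either flag or adding a delay with -k reduces it.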

Also get URLs from third-party sources (Archive.org, CommonCrawl.org, VirusTotal.com, AlienVault.com)

gospider -s "https://google.com/" -o output -c 10 -d 1 --other-source

Also get URLs from third-party sources (Archive.org, CommonCrawl.org, VirusTotal.com, AlienVault.com) and include subdomains

gospider -s "https://google.com/" -o output -c 10 -d 1 --other-source --include-subs

Use custom headers/cookies (set directly, or loaded from a Burp raw HTTP request)

gospider -s "https://google.com/" -o output -c 10 -d 1 --other-source -H "Accept: */*" -H "Test: test" --cookie "testA=a; testB=b"

gospider -s "https://google.com/" -o output -c 10 -d 1 --other-source --burp burp_req.txt

Blacklist URLs/file extensions

Note: by default, gospider blacklists .(jpg|jpeg|gif|css|tif|tiff|png|ttf|woff|woff2|ico)

gospider -s "https://google.com/" -o output -c 10 -d 1 --blacklist ".(woff|pdf)"
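
Because the blacklist is a regular expression, an unescaped dot matches any character, so the pattern above can block more than file extensions. The sketch below uses grep -E as a stand-in for gospider's matcher (an assumption; only basic syntax shared by both is used), and the matches helper and the URLs are hypothetical.

```shell
# Hypothetical helper: succeeds if the URL ($2) matches the pattern ($1).
matches() { printf '%s\n' "$2" | grep -Eq -- "$1"; }

loose='.(woff|pdf)'      # pattern from the example: "." matches any character
strict='\.(woff|pdf)$'   # escaped dot, anchored at the end of the URL

# The loose pattern also hits "/pdf" inside a path segment.
matches "$loose"  "https://example.com/docs/pdfs/index.html" && echo "loose: blocked"

# The strict pattern only hits a real extension at the end of the URL.
matches "$strict" "https://example.com/docs/pdfs/index.html" || echo "strict: allowed"
matches "$strict" "https://example.com/font.woff" && echo "strict: blocked"
```

The stricter form can be passed the same way, for example --blacklist "\.(woff|pdf)$". Note that the end anchor will not match URLs that carry a query string after the extension.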

Show response lengths and blacklist by length

gospider -s "https://google.com/" -o output -c 10 -d 1 --length --filter-length "6871,24432"
