htmlq
Like jq, but for HTML. Uses CSS selectors to extract bits of content from HTML files.
Usage
Usage: htmlq [FLAGS] [OPTIONS] [--] [selector]...
Options
| Option | Description |
|---|---|
| `-B`, `--detect-base` | Try to detect the base URL from the `<base>` tag in the document. If not found, default to the value of `--base`, if supplied |
| `-w`, `--ignore-whitespace` | When printing text nodes, ignore those that consist entirely of whitespace |
| `-p`, `--pretty` | Pretty-print the serialised output |
| `-t`, `--text` | Output only the contents of text nodes inside selected elements |
| `-a`, `--attribute <attribute>` | Only return this attribute (if present) from selected elements |
| `-b`, `--base <base>` | Use this URL as the base for links |
| `-f`, `--filename <FILE>` | The input file. Defaults to stdin |
| `-o`, `--output <FILE>` | The output file. Defaults to stdout |
| `-r`, `--remove-nodes <SELECTOR>...` | Remove nodes matching this expression before output. May be specified multiple times |
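
As a rough illustration of how the options above combine, the sketch below writes a small sample document and runs a few queries against it. The file name `sample.html` and its contents are made up for this example, and the `command -v` guard is only there so the snippet does nothing harmful on a machine where `htmlq` is not installed.

```shell
# Create a tiny sample document to query (hypothetical content).
cat > sample.html <<'EOF'
<html>
  <body>
    <h1 id="title">Hello</h1>
    <a href="/docs">Docs</a>
    <script>console.log("noise")</script>
  </body>
</html>
EOF

# Only run the queries if htmlq is actually on the PATH.
if command -v htmlq >/dev/null 2>&1; then
  # Print only the text nodes inside the element with id "title".
  htmlq --filename sample.html --text '#title'

  # Print the href attribute of every <a> element.
  htmlq --filename sample.html --attribute href a

  # Pretty-print <body>, dropping <script> nodes first. The selector comes
  # before --remove-nodes because that option accepts several values.
  htmlq --filename sample.html --pretty body --remove-nodes script
fi
```

Because `--filename` defaults to stdin, the same queries should also work at the end of a pipeline, for example with the document piped in from another command.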