docs: Add parsing custom HTML to README.md (#326)

pull/336/head^2
Toufic Mouallem 5 years ago committed by Adam Pash
parent b3e2a0ffd1
commit da9606a4cb

@@ -64,6 +64,8 @@ If Mercury is unable to find a field, that field will return `null`.
#### `parse()` Options
##### Content Formats
By default, Mercury Parser returns the `content` field as HTML. However, you can override this behavior by passing in options to the `parse` function, specifying whether or not to scrape all pages of an article, and what type of output to return (valid values are `'html'`, `'markdown'`, and `'text'`). For example:
```javascript
@@ -78,6 +80,19 @@ This returns the the page's `content` as GitHub-flavored Markdown:
"content": "...**Thunder** is the [stage name](https://en.wikipedia.org/wiki/Stage_name) for the..."
```
##### Pre-fetched HTML
You can use Mercury Parser to parse custom or pre-fetched HTML by passing an HTML string to the `parse` function as follows:
```javascript
Mercury.parse(url, {
html:
'<html><body><article><h1>Thunder (mascot)</h1><p>Thunder is the stage name for the horse who is the official live animal mascot for the Denver Broncos</p></article></body></html>',
}).then(result => console.log(result));
```
Note that the URL argument is still supplied: Mercury Parser uses it to identify the website and select that site's custom parser, if one exists, but does not use it to fetch content.
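The `html` option can likely be combined with the content-format option described earlier. The sketch below assumes that option is named `contentType`, which the text shown here does not spell out. `buildPrefetchedOptions` and `parsePrefetched` are hypothetical helpers written for illustration and are not part of Mercury Parser. The parser object is passed in as an argument, so the sketch runs without fetching anything.

```javascript
// Output formats listed in the "Content Formats" section above.
const VALID_CONTENT_TYPES = ['html', 'markdown', 'text'];

// Hypothetical helper (not part of Mercury Parser): validates the inputs
// and builds the options object passed as the second argument to parse().
// Assumption: the output-format option is named `contentType`.
function buildPrefetchedOptions(html, contentType = 'html') {
  if (typeof html !== 'string' || html.trim() === '') {
    throw new TypeError('html must be a non-empty string');
  }
  if (!VALID_CONTENT_TYPES.includes(contentType)) {
    throw new RangeError(
      `contentType must be one of: ${VALID_CONTENT_TYPES.join(', ')}`
    );
  }
  return { html, contentType };
}

// Hypothetical wrapper: `parser` is any object exposing parse(url, options),
// for example the module returned by require('@postlight/mercury-parser').
// The URL is still passed so the site's custom parser can be selected.
function parsePrefetched(parser, url, html, contentType) {
  return parser.parse(url, buildPrefetchedOptions(html, contentType));
}
```

With the real module loaded as `Mercury`, a call such as `parsePrefetched(Mercury, url, html, 'markdown').then(result => console.log(result.content));` would be expected to print the pre-fetched page's content as Markdown, provided the two options combine as assumed here.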
#### The command-line parser
Mercury Parser also ships with a CLI, meaning you can use the Mercury Parser
