docs: Add parsing custom HTML to README.md (#326)

5 years ago · da9606a4cb
parent b3e2a0ffd1
commit da9606a4cb
1 changed file with 15 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -64,6 +64,8 @@ If Mercury is unable to find a field, that field will return `null`.

 #### `parse()` Options

+##### Content Formats
+
 By default, Mercury Parser returns the `content` field as HTML. However, you can override this behavior by passing in options to the `parse` function, specifying whether or not to scrape all pages of an article, and what type of output to return (valid values are `'html'`, `'markdown'`, and `'text'`). For example:

 ```javascript
@@ -78,6 +80,19 @@ This returns the page's `content` as GitHub-flavored Markdown:
 "content": "...**Thunder** is the [stage name](https://en.wikipedia.org/wiki/Stage_name) for the..."
 ```

+##### Pre-fetched HTML
+
+You can use Mercury Parser to parse custom or pre-fetched HTML by passing an HTML string to the `parse` function as follows:
+
+```javascript
+Mercury.parse(url, {
+  html:
+    '<html><body><article><h1>Thunder (mascot)</h1><p>Thunder is the stage name for the horse who is the official live animal mascot for the Denver Broncos</p></article></body></html>',
+}).then(result => console.log(result));
+```
+
+Note that you still supply the URL argument: Mercury Parser uses it to identify the website and apply that site's custom parser, if one exists, but it is not used to fetch content.
+
 #### The command-line parser

 Mercury Parser also ships with a CLI, meaning you can use the Mercury Parser