Commit Graph

204 Commits (master)
 

Author SHA1 Message Date
Yuri Baburov 73c598df81 Updated version to 0.8.1.1 4 years ago
Yuri Baburov 67f46604dd A fix for mac version test. 4 years ago
Yuri Baburov 449dc2066b Releases packaging improvements. 4 years ago
Yuri Baburov bb81dc2c74
Merge pull request #148 from nabinkhadka/master
Changed log level of doc candidates from .info to .debug. #141
4 years ago
Nabin Khadka 531ecc7a29
Changes log level #141 4 years ago
Yuri Baburov 1e3b8504bb
Merge pull request #147 from anekos/fix/UnicodeDecodeError-on-python2
Fix UnicodeDecodeError on python2
4 years ago
anekos 667114463d Fix UnicodeDecodeError on python2 4 years ago
Yuri Baburov e4a699bbb0
Update README.rst 4 years ago
Yuri Baburov 1997b80eaf
Update __init__.py 4 years ago
Yuri Baburov 9e0fb6ec77
Update README.rst 4 years ago
Yuri Baburov a4dbaee02e
Merge pull request #145 from anekos/fix/causing-lxml-error
Fixed lxml error on some Chinese texts.
4 years ago
anekos 6842ea906e Fix causing lxml error 4 years ago
Yuri Baburov ede4d015ab
Merge pull request #142 from tim77/add-license-1
Add LICENSE file
4 years ago
Artem Polishchuk 5916527898
Add LICENSE file 4 years ago
Yuri Baburov 5800210e99
Merge pull request #136 from adbar/master
add coverage testing
4 years ago
Yuri Baburov 4b864d6306
Merge pull request #131 from azmeuk/black
Used black to format the code
4 years ago
Adrien Barbaresi 14d4474f33 add coverage tests 4 years ago
Éloi Rivard e9acdd091b Use black to format the code 4 years ago
Yuri Baburov 5a74140fdb
Merge pull request #132 from azmeuk/readme
Syntax highlight the README
4 years ago
Yuri Baburov 07f6861ece
Merge pull request #135 from adbar/master
unnecessary imports removed
added lines for conformity and readability
linted code parts
4 years ago
Adrien Barbaresi bd8293eb63 code linting 4 years ago
Yuri Baburov 17ffad5a26
Merge pull request #134 from adbar/patch-1
Extended travis config:
 - Python versions added (3.9, pypy)
 - OS added (MacOS, 2 different versions)
4 years ago
Yuri Baburov baf03e0d8e
Update .travis.yml 4 years ago
Yuri Baburov 8c122cc862
Update .travis.yml 4 years ago
Yuri Baburov 28db33a1ad
Update .travis.yml 4 years ago
Yuri Baburov 44ee1c4a87
Update .travis.yml 4 years ago
Adrien Barbaresi 9a85102555
Set TOXENV for macOS tests 4 years ago
Adrien Barbaresi 8ea6a20e01
Skip missing interpreters in tox.ini 4 years ago
Adrien Barbaresi a98151e6dd
Extended travis config
- Python versions added (3.9, pypy)
- OS added (MacOS, 2 different versions)
4 years ago
Éloi Rivard 0556abb794 Syntax highlight the README 4 years ago
Yuri Baburov 615ce803c6
Merge pull request #124 from dariobig/patch-1
Catch LookupError in case of bad encoding string
4 years ago
Yuri Baburov 52f767c812
Update __init__.py 4 years ago
Yuri Baburov c24808fbb2
Update README.rst 4 years ago
Yuri Baburov da9e285f73
Merge pull request #128 from azmeuk/self-closing
Replaced XHTML output with HTML5 output in summary for empty elements (a, br), issue #125
4 years ago
Yuri Baburov 5032e2d3ab
Merge pull request #127 from azmeuk/warnings
Fixed a few regex warnings, thanks azmeuk!
4 years ago
Yuri Baburov 471d89dde9
Merge pull request #126 from azmeuk/py38
Added official python 3.8 support, dropped python 3.4 support.
Thanks Éloi Rivard (@azmeuk)!
4 years ago
Yuri Baburov 4980b0c141
Merge branch 'master' into py38 4 years ago
Yuri Baburov 331b58ef50
Merge pull request #129 from azmeuk/doc
Added basic documentation
4 years ago
Éloi Rivard f9977b727d Documentation draft 4 years ago
Éloi Rivard 0846955dd7 Fixed issue with self-closing tags. Fix #125 4 years ago
Éloi Rivard 6c1c6391e2 Fixed a few regex warnings 4 years ago
Éloi Rivard 326fb43b4c Drop support for python 3.4 - Add support for python 3.8 4 years ago
Dario 0442358942
Catch LookupError in case of bad encoding string
I've seen cases where bad encoding strings result in errors. Catching LookupError should solve the problem by falling back to `chardet` or `utf-8`.

Here's one case:

```
 textPayload: "Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/readability/readability.py", line 189, in summary
    self._html(True)
  File "/opt/conda/lib/python3.7/site-packages/readability/readability.py", line 132, in _html
    self.html = self._parse(self.input)
  File "/opt/conda/lib/python3.7/site-packages/readability/readability.py", line 141, in _parse
    doc, self.encoding = build_doc(input)
  File "/opt/conda/lib/python3.7/site-packages/readability/htmls.py", line 17, in build_doc
    encoding = get_encoding(page) or 'utf-8'
  File "/opt/conda/lib/python3.7/site-packages/readability/encoding.py", line 46, in get_encoding
    page.decode(encoding)
LookupError: unknown encoding: utf-8, ie=edge, chrome=1
```
5 years ago
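The fallback described in the commit above can be sketched roughly as follows. This is an illustration only, not the code from `readability/encoding.py`: the function name `decode_with_fallback` and its parameters are hypothetical, and the `chardet` detection step mentioned in the commit message is left out so the sketch needs only the standard library.

```python
def decode_with_fallback(page, declared_encoding, fallback="utf-8"):
    """Decode bytes with the declared encoding, falling back if it is unusable.

    Hypothetical helper for illustration; not part of the readability package.
    Returns a (text, encoding_used) tuple.
    """
    for enc in (declared_encoding, fallback):
        if not enc:
            continue
        try:
            return page.decode(enc), enc
        except LookupError:
            # Unknown codec name, e.g. "utf-8, ie=edge, chrome=1"
            continue
        except UnicodeDecodeError:
            # Codec exists but the bytes are not valid in it
            continue
    # Last resort: decode with the fallback and replace undecodable bytes
    return page.decode(fallback, errors="replace"), fallback


text, used = decode_with_fallback(b"caf\xc3\xa9", "utf-8, ie=edge, chrome=1")
```

Here the malformed encoding string from the traceback raises `LookupError` on the first attempt, so the bytes are decoded with the `utf-8` fallback instead of aborting.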
Yuri Baburov de20908e57
Update README.rst 5 years ago
Yuri Baburov 4fa85d2778
Merge pull request #116 from baby5/master
Fixed compile_pattern to support uppercase.
5 years ago
baby5 0ac3c5bbc6 Fix compile_pattern not support uppercase 5 years ago
Yuri Baburov a4ac1c7704
Merge pull request #115 from johnklee/Issue99
Fix #99 - Hiding exception raised during "a href" normalization, added handle_failures parameter defaulting to "discard" bad urls.
5 years ago
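A minimal sketch of what "discarding" a bad URL during link normalization might look like, assuming the failure is a `ValueError` raised while joining the link with a base URL. The helper `normalize_href` is hypothetical and is not the project's implementation; only the parameter name `handle_failures` and its `"discard"` default come from the commit message.

```python
from urllib.parse import urljoin


def normalize_href(base_url, href, handle_failures="discard"):
    """Make href absolute; on a malformed URL either drop it or re-raise.

    Hypothetical helper for illustration only.
    """
    try:
        return urljoin(base_url, href)
    except ValueError:
        if handle_failures == "discard":
            # Drop the unusable link instead of propagating the exception
            return None
        raise


good = normalize_href("http://example.com/a/", "b.html")
bad = normalize_href("http://example.com/", "http://[bad")
```

With the default setting the malformed link yields `None` rather than an exception, so one bad `a href` does not abort processing of the whole document.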
jkclee bac691a0a4 Fix #99 5 years ago
Yuri Baburov 3cbede6be4
Update README.rst 5 years ago
Yuri Baburov d40c4dd34d
Update README.rst 6 years ago