anekos
6842ea906e
Fix causing lxml error
4 years ago
Yuri Baburov
ede4d015ab
Merge pull request #142 from tim77/add-license-1
Add LICENSE file
4 years ago
Artem Polishchuk
5916527898
Add LICENSE file
4 years ago
Yuri Baburov
5800210e99
Merge pull request #136 from adbar/master
add coverage testing
4 years ago
Yuri Baburov
4b864d6306
Merge pull request #131 from azmeuk/black
Used black to format the code
4 years ago
Adrien Barbaresi
14d4474f33
add coverage tests
4 years ago
Éloi Rivard
e9acdd091b
Use black to format the code
4 years ago
Yuri Baburov
5a74140fdb
Merge pull request #132 from azmeuk/readme
Syntax highlight the README
4 years ago
Yuri Baburov
07f6861ece
Merge pull request #135 from adbar/master
unnecessary imports removed
added lines for conformity and readability
linted code parts
4 years ago
Adrien Barbaresi
bd8293eb63
code linting
4 years ago
Yuri Baburov
17ffad5a26
Merge pull request #134 from adbar/patch-1
Extended travis config:
- Python versions added (3.9, pypy)
- OS added (MacOS, 2 different versions)
4 years ago
Yuri Baburov
baf03e0d8e
Update .travis.yml
4 years ago
Yuri Baburov
8c122cc862
Update .travis.yml
4 years ago
Yuri Baburov
28db33a1ad
Update .travis.yml
4 years ago
Yuri Baburov
44ee1c4a87
Update .travis.yml
4 years ago
Adrien Barbaresi
9a85102555
Set TOXENV for macOS tests
4 years ago
Adrien Barbaresi
8ea6a20e01
Skip missing interpreters in tox.ini
4 years ago
Adrien Barbaresi
a98151e6dd
Extended travis config
- Python versions added (3.9, pypy)
- OS added (MacOS, 2 different versions)
4 years ago
Éloi Rivard
0556abb794
Syntax highlight the README
4 years ago
Yuri Baburov
615ce803c6
Merge pull request #124 from dariobig/patch-1
Catch LookupError in case of bad encoding string
4 years ago
Yuri Baburov
52f767c812
Update __init__.py
4 years ago
Yuri Baburov
c24808fbb2
Update README.rst
4 years ago
Yuri Baburov
da9e285f73
Merge pull request #128 from azmeuk/self-closing
Replaced XHTML output with HTML5 output in summary for empty elements (a, br), issue #125
4 years ago
Yuri Baburov
5032e2d3ab
Merge pull request #127 from azmeuk/warnings
Fixed a few regex warnings, thanks azmeuk !
4 years ago
Yuri Baburov
471d89dde9
Merge pull request #126 from azmeuk/py38
Added official python 3.8 support, dropped python 3.4 support.
Thanks Éloi Rivard (@azmeuk) !
4 years ago
Yuri Baburov
4980b0c141
Merge branch 'master' into py38
4 years ago
Yuri Baburov
331b58ef50
Merge pull request #129 from azmeuk/doc
Added basic documentation
4 years ago
Éloi Rivard
f9977b727d
Documentation draft
4 years ago
Éloi Rivard
0846955dd7
Fixed issue with self-closing tags. Fix #125
4 years ago
Éloi Rivard
6c1c6391e2
Fixed a few regex warnings
4 years ago
Éloi Rivard
326fb43b4c
Drop support for python 3.4 - Add support for python 3.8
4 years ago
Dario
0442358942
Catch LookupError in case of bad encoding string
I've seen cases where a bad encoding string results in an error. Catching LookupError should solve the problem by falling back to `chardet` or `utf-8`.
Here's one case:
```
textPayload: "Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/readability/readability.py", line 189, in summary
self._html(True)
File "/opt/conda/lib/python3.7/site-packages/readability/readability.py", line 132, in _html
self.html = self._parse(self.input)
File "/opt/conda/lib/python3.7/site-packages/readability/readability.py", line 141, in _parse
doc, self.encoding = build_doc(input)
File "/opt/conda/lib/python3.7/site-packages/readability/htmls.py", line 17, in build_doc
encoding = get_encoding(page) or 'utf-8'
File "/opt/conda/lib/python3.7/site-packages/readability/encoding.py", line 46, in get_encoding
page.decode(encoding)
LookupError: unknown encoding: utf-8, ie=edge, chrome=1
```
5 years ago
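The fallback described in the commit above can be illustrated with a minimal sketch. This is not the code from `readability/encoding.py`: the helper name `decode_with_fallback` and its parameters are hypothetical, and the `chardet` step mentioned in the commit is left out so the example depends only on the standard library.

```python
from typing import Iterable, Tuple


def decode_with_fallback(
    page: bytes,
    declared_encodings: Iterable[str],
    default: str = "utf-8",
) -> Tuple[str, str]:
    """Hypothetical helper: try each declared encoding, then fall back.

    Returns the decoded text and the encoding name that was used.
    """
    for encoding in declared_encodings:
        try:
            return page.decode(encoding), encoding
        except LookupError:
            # Unknown codec name, e.g. "utf-8, ie=edge, chrome=1"
            # taken from a malformed meta tag.
            continue
        except UnicodeDecodeError:
            # Known codec, but the bytes do not match it.
            continue
    # Assumption: replace undecodable bytes instead of raising.
    return page.decode(default, errors="replace"), default


text, used = decode_with_fallback(b"<p>hello</p>", ["utf-8, ie=edge, chrome=1"])
print(used)
```

The point of the sketch is that `bytes.decode` raises `LookupError` for an unknown codec name rather than `UnicodeDecodeError`, so both need to be caught for the fallback to be reached.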
Yuri Baburov
de20908e57
Update README.rst
5 years ago
Yuri Baburov
4fa85d2778
Merge pull request #116 from baby5/master
Fixed compile_pattern to support uppercase.
5 years ago
baby5
0ac3c5bbc6
Fix compile_pattern not support uppercase
5 years ago
Yuri Baburov
a4ac1c7704
Merge pull request #115 from johnklee/Issue99
Fix #99 - Hiding exception raised during "a href" normalization, added handle_failures parameter defaulting to "discard" bad urls.
5 years ago
jkclee
bac691a0a4
Fix #99
5 years ago
Yuri Baburov
3cbede6be4
Update README.rst
6 years ago
Yuri Baburov
d40c4dd34d
Update README.rst
6 years ago
Yuri Baburov
9aba330e68
Update README.rst
6 years ago
Yuri Baburov
0b28643f0d
Update README.rst
6 years ago
Yuri Baburov
59b99ffa0b
Merge pull request #105 from pypt/many_repeated_spaces_timeout
Trim many repeated spaces to make clean() faster
6 years ago
Yuri Baburov
494b19ed4e
Merge branch 'master' into many_repeated_spaces_timeout
6 years ago
Yuri Baburov
dca6e2197a
Merge pull request #107 from pypt/module_version_constant
Add __version__ constant to __init__.py, read it in setup.py
6 years ago
Yuri Baburov
5215ab657b
Merge pull request #106 from pypt/python_3_7
Improvements for Python 3.7 support and CI
6 years ago
Linas Valiukas
68fb5ad4c0
Try a workaround to make build work on 3.7
https://github.com/travis-ci/travis-ci/issues/9815
6 years ago
Linas Valiukas
34fce7664d
Update Python version in .travis.yml
6 years ago
Linas Valiukas
0233936e72
Add __version__ constant to __init__.py, read it in setup.py
Users wouldn't need to install, import and use Pip ("pkg_resources") to
find out which version of readability-lxml is being used.
6 years ago
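One common way for `setup.py` to read a `__version__` constant without importing the package is to parse the source text. The sketch below assumes that approach; the commit does not show how the project actually does it, and `read_version` and `VERSION_RE` are hypothetical names.

```python
import re

# Matches a line such as: __version__ = "1.2.3"
VERSION_RE = re.compile(
    r"""^__version__\s*=\s*['"]([^'"]+)['"]""",
    re.MULTILINE,
)


def read_version(source: str) -> str:
    """Hypothetical helper: extract __version__ from module source text."""
    match = VERSION_RE.search(source)
    if match is None:
        raise RuntimeError("no __version__ assignment found")
    return match.group(1)


# In setup.py this string would come from reading the package's __init__.py.
example_source = '"""Package docstring."""\n__version__ = "1.2.3"\n'
print(read_version(example_source))
```

Parsing the text avoids importing the package at install time, when its dependencies may not be present yet. Users can then check the installed version through the module's `__version__` attribute instead of `pkg_resources`.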
Linas Valiukas
63fbc36cb8
Close sample input file after reading it
Otherwise tests spit out:
ResourceWarning: unclosed file <_io.TextIOWrapper name='/Users/pypt/Dropbox/etc-MediaCloud/python-readability/tests/samples/si-game.sample.html' mode='r' encoding='UTF-8'>
return open(os.path.join(SAMPLES, filename)).read()
6 years ago
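The warning quoted above comes from calling `open(...).read()` without closing the file. A minimal sketch of the usual fix uses a context manager. The function name `load_sample`, the `samples_dir` parameter, and the default directory are assumptions for illustration, not the project's test helper.

```python
import os

# Assumption: sample files live in a "samples" directory.
SAMPLES = "samples"


def load_sample(filename: str, samples_dir: str = SAMPLES) -> str:
    """Read a sample file and close it, even if reading fails."""
    path = os.path.join(samples_dir, filename)
    # The with-block closes the file on exit, which avoids the
    # "ResourceWarning: unclosed file" reported by the test run.
    with open(path, encoding="utf-8") as sample_file:
        return sample_file.read()
```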
Linas Valiukas
bdb6d671d8
Test with Python 3.7 on Travis
6 years ago