-
509aed0d9f
Move the module into the readable_lxml space so that we can actually import it nicely.
Richard Harding
2012-04-17 21:49:59 -0400
-
273878214f
Move version to package
Richard Harding
2012-04-17 21:40:32 -0400
-
674e5f9ef2
Work on adding client.py to pull out cmd line code
Richard Harding
2012-04-17 21:39:33 -0400
-
1c1cbaefa5
Move the readability files into src dir
Richard Harding
2012-04-17 21:34:48 -0400
-
7e57767070
Update tests to src dir
Richard Harding
2012-04-17 21:29:04 -0400
-
62e153eaf8
Start to update the module for a better directory layout
Richard Harding
2012-04-17 21:28:25 -0400
-
d11b928504
Add credits file
Richard Harding
2012-04-17 14:37:34 -0400
-
a6361854a9
Update the setup.py, tweak makefile for building/cleaning/tests
Richard Harding
2012-04-17 14:35:50 -0400
-
b498df200b
More pep8, almost there
Richard Harding
2012-04-17 14:14:02 -0400
-
bbb60ed077
Update the name of the summary option to
Richard Harding
2012-04-17 13:59:02 -0400
-
a19e766900
Update version so we can upload new tar.gz to pypi
Richard Harding
2012-04-17 13:40:25 -0400
-
b9f6f6777f
Merge branch 'master' of github.com:buriy/python-readability
Richard Harding
2012-04-17 13:36:00 -0400
-
873562cfba
Update setup.py for finding the package correctly
Richard Harding
2012-04-17 13:35:54 -0400
-
e9a5cbfe7f
Remove pdb dummy
Richard Harding
2012-04-17 11:33:09 -0400
-
f1a79fb8f8
Update to make sure we don't drop the html tag when ditching elements
Richard Harding
2012-04-17 11:04:36 -0400
-
46f0302ebc
rename the document_only flag to html_partial
Richard Harding
2012-04-17 10:17:14 -0400
-
6e8a1f5ce2
Merge pull request #18 from mitechie/add_makefile
Rick Harding
2012-04-17 06:22:52 -0700
-
312fd55b4d
Merge b8fc399fac into 7338e9ef63
GitHub Merge Button
2012-04-17 06:22:00 -0700
-
b8fc399fac
Fix rebase issue in the Makefile
Richard Harding
2012-04-17 09:20:23 -0400
-
1986b5dcaf
Merge 82804b664d into 7338e9ef63
GitHub Merge Button
2012-04-17 05:49:42 -0700
-
82804b664d
Update .gitignore file for venv and nosetests.
Richard Harding
2012-04-17 08:37:51 -0400
-
4376eedc13
Add makefile testing, building, uploading.
Richard Harding
2012-04-17 08:39:45 -0400
-
7338e9ef63
Added test suite to setup.py. Bump to version 0.2.4
Yuri Baburov
2012-04-17 11:38:36 +0700
-
63840eedc1
Added test suite to setup.py. Bump to version 0.2.4
0.2.4
Yuri Baburov
2012-04-17 11:38:36 +0700
-
a1ae4eaf72
Merge pull request #15 from mitechie/master
Yuri Baburov
2012-04-16 21:26:19 -0700
-
7cf8883002
Merge 8d3e39f04e into ab783b25b7
GitHub Merge Button
2012-04-16 18:33:05 -0700
-
8d3e39f04e
Update readme
Richard Harding
2012-04-16 21:24:33 -0400
-
a46dc14251
Try to pep8 all the things but give up when I got close.
Richard Harding
2012-04-16 21:23:19 -0400
-
5a98e2c1b8
Correct appending and allow for document only
Richard Harding
2012-04-16 20:55:13 -0400
-
edccec5d3b
Work on why we have an empty <body/> tag
Richard Harding
2012-04-16 17:13:24 -0400
-
ab783b25b7
Merge pull request #11 from JanX2/master
Yuri Baburov
2012-03-25 22:39:29 -0700
-
1d2c22e6e9
Merge 3cdc3d67af into f9b604c9a8
GitHub Merge Button
2012-03-24 02:07:55 -0700
-
3cdc3d67af
Adding comment about oversight in transform_misused_divs_into_paragraphs().
Jan Weiß
2012-03-24 10:00:07 +0100
-
960f885edf
Continue early in remove_unlikely_candidates() in case there is neither a class nor an id attribute.
Jan Weiß
2012-03-24 09:56:08 +0100
-
6b3961cd30
Fixing gap in node_length coverage.
Jan Weiß
2012-03-24 09:54:41 +0100
-
f9b604c9a8
Merge pull request #10 from facundo/master
Yuri Baburov
2012-02-07 20:02:05 -0800
-
e9055afaf3
Merge bb93ae1e5f into fc6a500298
GitHub Merge Button
2012-02-06 20:08:29 -0800
-
bb93ae1e5f
fixed a small issue on the Document score_paragraphs method
facundo
2012-02-06 23:05:26 -0500
-
fc6a500298
Merge pull request #9 from Psycojoker/master
Yuri Baburov
2012-01-07 23:59:08 -0800
-
924f1e3246
Merge 1583d8a794 into 11c4d95411
GitHub Merge Button
2012-01-07 12:52:25 -0800
-
1583d8a794
add missing lxml dependency
Laurent Peuch
2012-01-07 21:48:46 +0100
-
11c4d95411
Fixed indentation, encoding issue and README bug. Thanks to Greg Jastrab. Bump version to 0.2.3
0.2.3
Yuri Baburov
2011-07-27 01:56:17 +0700
-
6bf4948e69
More README fixes for pypi and github. Bump to version 0.2.2
0.2.2
Yuri Baburov
2011-07-26 13:40:53 +0700
-
cc0af7a105
Add beginnings of regression tests
Jerry Charumilind
2011-07-08 15:55:30 -0700
-
82eabfc6b1
Bump version number
Jerry Charumilind
2011-07-05 17:23:05 -0700
-
cba19f209b
Fix issue with trying to drop root node
Jerry Charumilind
2011-07-05 17:17:38 -0700
-
18fa6b5146
Bump version number for external use
Jerry Charumilind
2011-07-05 13:36:58 -0700
-
cdd30f625e
Return confidence level when retrieving summary
Jerry Charumilind
2011-07-05 13:35:36 -0700
-
f189ab905d
Fixed README for pypi.
0.2.1
Yuri Baburov
2011-07-02 00:24:15 +0700
-
7aac0f0855
Return a div fragment instead of a whole HTML page
Jerry Charumilind
2011-06-30 14:25:19 -0700
-
ac517834e6
Convert tabs to spaces; put article in body
Jerry Charumilind
2011-06-30 11:04:31 -0700
-
61715dca0a
Bump to version 0.2
0.2
Yuri Baburov
2011-06-30 12:08:46 +0700
-
21906f1c44
Better setup.py, now we're "readability-lxml" in pypi. Thanks to Jerry Charumilind.
Yuri Baburov
2011-06-30 11:51:58 +0700
-
c2ec1d1c38
Sorted out unicode issues, thanks to Lee Semel.
Yuri Baburov
2011-06-30 11:51:16 +0700
-
45781a600f
Added command-line usage
Yuri Baburov
2011-06-30 11:47:14 +0700
-
97ba2a0369
Debug utilities.
Yuri Baburov
2011-06-30 11:46:37 +0700
-
f3d0a8d842
Allow passing unicode objects
Lee Semel
2011-06-28 00:54:36 +0800
-
ad38fac40a
Add chardet to installation requirements
Jerry Charumilind
2011-06-30 05:00:30 +0800
-
8c1adc5141
Expose Document in readability package
Jerry Charumilind
2011-06-30 04:57:14 +0800
-
bae87079e9
Change to automatically find packages
Jerry Charumilind
2011-06-30 04:50:51 +0800
-
5bf5192d03
Add version number to track changes more easily
Jerry Charumilind
2011-06-30 04:33:53 +0800
-
72b541c9a1
Merge fbc91add56 into 7a1e063c22
GitHub Merge Button
2011-06-29 21:11:22 -0700
-
01247903b8
Add chardet to installation requirements
Jerry Charumilind
2011-06-29 14:00:30 -0700
-
33f935e39a
Expose Document in readability package
Jerry Charumilind
2011-06-29 13:57:14 -0700
-
7ceb8e6d7b
Change to automatically find packages
Jerry Charumilind
2011-06-29 13:50:51 -0700
-
8877754d7e
Add version number to track changes more easily
Jerry Charumilind
2011-06-29 13:33:53 -0700
-
fbc91add56
Allow passing unicode objects
Lee Semel
2011-06-27 12:54:36 -0400
-
7a1e063c22
Updated setup.py to my fork, changed package name to lxml-readability
Yuri Baburov
2011-06-25 23:14:01 -0700
-
43c34bacc1
Renamed encodings to encoding to avoid conflicts with system module.
Yuri Baburov
2011-06-16 17:53:02 +0700
-
096d4db6ce
Added usage
Yuri Baburov
2011-06-14 04:33:15 -0700
-
f55f16baa1
Updated scoring algorithm to match readability.js v1.7.1
Yuri Baburov
2011-06-01 12:16:32 +0700
-
96f476181c
Improved title shortener method, and added it to the Document class.
Yuri Baburov
2011-05-11 19:58:27 +0700
-
f925e3ef05
Corrected README
Yuri Baburov
2011-05-02 21:45:23 -0700
-
dada82099b
Moved to lxml (based on decruft version); better encoding recognition.
Yuri Baburov
2011-05-03 11:34:29 +0700
-
b5639a0822
well that was quick; first fork added
gfxmonk
2011-01-20 23:03:30 +1100
-
324e280e16
added note to readme to make it clear that I'm not actively working on this library
gfxmonk
2011-01-20 22:28:01 +1100
-
7ebbcc03d2
made setup.py executable
Tim Cuthbertson
2010-09-16 22:01:13 +1000
-
a5d47a1129
added setup.py
Sean Brant
2010-09-14 19:18:35 -0500
-
2b6a2d3db4
removing empty paragraphs is not very useful, and can break some (stupid) websites
gfxmonk
2010-05-01 00:08:23 +1000
-
1d862a00c3
fixed bug where only immediate text was being considered for weights, instead of all nested text
gfxmonk
2010-05-01 00:07:30 +1000
-
0eacd959a4
failsafe parsing and more logging
gfxmonk
2010-04-30 22:33:22 +1000
-
87ad057706
unicode, dammit!
gfxmonk
2010-04-26 23:22:54 +1000
-
a224c5b759
minor
gfxmonk
2010-04-24 14:00:32 +1000
-
e42a39e1aa
modified readme
gfxmonk
2010-04-24 13:47:35 +1000
-
f73b5f05c4
split out into content and summary methods
gfxmonk
2010-04-24 00:37:42 +1000
-
c952f421b7
clean up content method and debug
gfxmonk
2010-04-23 21:14:13 +1000
-
c0ca60ee26
use a more lenient parser
gfxmonk
2010-04-23 20:51:56 +1000
-
ad3d52ade4
initial
gfxmonk
2010-04-22 21:55:00 +1000