Default Branch

master

d5eea06a00 · exclude additional elements based on their role (#619) · Updated 4 years ago

Branches

build-issues-node-js-update

d8cad379f9 · Switch to a newer node.js to fix build issues · Updated 5 years ago

33 behind · 1 ahead
plain-js-nodelist-foreach

71a0b32e41 · Fix #280 and #271 by being more careful about how we iterate over node lists in case they're live · Updated 8 years ago

210 behind · 1 ahead
france24

61dc9b64b4 · Fixes #220 - Use article content when a single one exists in the page. · Updated 9 years ago

235 behind · 1 ahead
limit-counted-commas-score

61d8e6213f · Fixes #216 - Limit counted commas score. · Updated 9 years ago

235 behind · 1 ahead
fix-excluded-linked-list-items

670322a4da · Fixes #198 - Avoid stripping linked list items. · Updated 9 years ago

235 behind · 1 ahead
fix-sfgate

2c5ba594dd · Refs #209 - Increase score for elements containing large amount of text. · Updated 9 years ago

235 behind · 1 ahead
support-dailymotion-videos

cc18cb5787 · Ref #195 - Add support for dailymotion videos. · Updated 9 years ago

246 behind · 0 ahead · Included
improved-author-meta-extraction

7aee44adb2 · Improved author metadata detection. · Updated 9 years ago

258 behind · 0 ahead · Included
better-detect-main-content-node

d188d89e02 · Added more weight to section tags. · Updated 9 years ago

265 behind · 1 ahead
getElementsByClassName

972924df80 · Added support for getElementsByClassName to JSDOMParser. · Updated 9 years ago

267 behind · 1 ahead
improved-detection

d59c89ce48 · Specialized tests between extraction and detection. · Updated 9 years ago

276 behind · 2 ahead
print-err-stack-on-generation-failure

0a9f9a6804 · Print exception stack when generating a test case fails. · Updated 9 years ago

315 behind · 1 ahead
keep-tabular-data

62fae22849 · Closes #66 - Keep tabular data as they're most likely to be part of content. · Updated 9 years ago

319 behind · 1 ahead