575 Commits (fix-remove-moment-js)
 

Author SHA1 Message Date
Janet c94eee7f92 feat: cinema blend parser (#105)
* feat: cinema blend parser

all systems go

* fix: timezone
7 years ago
Janet 64e3c205e8 feat: the political insider parser (#99)
* feat: the political insider parser with timezone
7 years ago
Janet 7b52d3d1fc feat: al.com parser (#110)
* feat: al.com parser

I think this is good but could you pls double check time zone on the
date? Thanks

* fix: date_published timezone
7 years ago
Janet 15df58496f feat: westernjournalism parser (#113)
* feat: westernjournalism parser

Adjacent sibling selector FTW!

Image not displaying in preview.

* feat: fix assertion, body does not include _Advertisement_ subtext
7 years ago
Janet ae12a1d701 feat: mental floss parser (#94)
* feat: mental floss parser
7 years ago
Janet bf29291395 feat: thepennyhoarder parser (#112)
* feat: thepennyhoarder parser

Looks good, although no image in preview!

* fix: adds selector for article lead image
7 years ago
Janet fadd198d04 feat: abcnewsgo parser (#90)
* feat: abcnewsgo parser
7 years ago
Adam Pash 25d9642ff9 feat: support cleaning and transforms for all fields (#138) 7 years ago
Janet 1054d854dd feat: america now parser (#114)
* feat: america now parser

Looks good but lead image did not display in preview.

* feat: adds selector for lead image
7 years ago
David A. Viramontes 7b3ad73282 Merge pull request #115 from postlight/feat-fusion-extractor
feat: fusion parser
7 years ago
dviramontes 93c8ba0e56 feat: adds selector for lead image 7 years ago
dviramontes f71fe7685d feat: adds video embed transform 7 years ago
dviramontes a77515d861 fix: author selector, less brittle 7 years ago
Janet 4c48acba59 feat: fusion parser
Looks okay — image did not load in preview.
7 years ago
David A. Viramontes fa71cacf5a Merge pull request #137 from postlight/feat-the-verge-polygon-supported-domain
feat: adds www.polygon.com to list of www.theverge.com supportedDomains
7 years ago
David A. Viramontes c679e493de Merge branch 'master' into feat-the-verge-polygon-supported-domain 7 years ago
Janet d292d8ef3a feat: ny daily news parser (#87)
* feat: ny daily news parser
7 years ago
dviramontes a53587acef feat: adds www.polygon.com to list of www.theverge.com supportedDomains 7 years ago
Janet 385b9d76a3 feat: sciencefly extractor (#116)
* feat: sciencefly extractor, use loading image rather than 404'ing meta
7 years ago
Adam Pash 601b0fac16 release: 1.0.5 (#136) 7 years ago
Adam Pash 6bd6278a07 feat: custom parser for wh blog (#130) 7 years ago
Adam Pash aa682d71e8 fix: medium bug (#129)
* fix: improved medium parser for images and multi-section content

* fix: duplicate video
7 years ago
Adam Pash 4e049de61a fix: i put a bad comment in .gitattributes (#125)
* marking html fixtures as "vendored"

* fix: bad comment
7 years ago
Adam Pash 8aa215c4c2 chore: marking html fixtures as "vendored" (#124) 7 years ago
Adam Pash 31eb4f9222 Feat: LinkedIn parser (#123)
* feat: rebuild custom parser

* feat: linkedin custom parser
7 years ago
Adam Pash dbc706410b release: 1.0.4 (#122) 7 years ago
Adam Pash 8662474d8a feat: changed user agent to latest chrome (#121)
* feat: changed user agent to latest chrome

* removed dead link
7 years ago
Janet 7709d69379 feat: npr parser (#86)
* feat: npr parser

Lead image appears in preview, but the test fails for some reason.

AssertionError: null ==
'https://media.npr.org/assets/img/2016/12/15/gettyimages-540681598_wide-8b160732b96c083dc115134c3c019f3ac73586ba.jpg?s=1400'

Looks okay otherwise.

* feat: transformed figures/figcaptions, improved date_published and
addressed NPR's bad image metadata
7 years ago
Janet 8a82f2c0ab feat: recode parser (#85)
* feat: recode parser

Thumbs up, as far as I can tell.

Note: No image appeared in the preview.

* feat: pulling in lead image
7 years ago
Janet ad29acd7b7 feat: fortune parser (#84)
* feat: fortune parser

For some reason, the dek doesn’t appear in the local version of the
article I selected. I tried parsing the meta tag containing
og:description but it’s not working, and the description is slightly
longer than the dek in the original article.

I’m not sure why, but for the lead image, the meta tag for og:image is
not parsing the image url.

:(

* feat: fortune redesigned, so re-did extractor

* fix: added timezone
7 years ago
Janet c133ddf614 feat: qz parser (#81)
* feat: qz parser

I couldn’t figure out how to parse the date, but otherwise should be
fine. I added a clean for the div.article-aside element based on what I
saw in how the chrome extension worked.

* feat: updated content to grab top image

test: date is null :/
7 years ago
Janet 84312b6ef1 feat: dmagazine parser (#80)
* feat: dmagazine parser

I’m sorry to have failed you. :-( These are the issues I encountered:

1) author - does not have a unique selector to distinguish it from the
date, couldn’t parse it
2) date - no meta data in the head
3) no meta og:image in the head (my go to), so I couldn’t get the image
test to pass, but it appears to be parsing. The caption below it is the
same size as the body copy in the preview. I couldn’t figure out how to
“transform” it to caption size.

* feat: update date, image, and author selectors and corresponding tests

* feat: generalized content selector
7 years ago
Janet e035f36361 feat: reuters parser (#78)
* feat: reuters parser

Date parses correctly but fails test because of format discrepancy.

Author tags are nested within the content, which is why the author
names are appearing twice. I wasn’t sure how to address this.

Additionally, the location appears twice, so I cleaned the location
tags from the content.

* test: fix date format

* transform .article-subtitle to h4; cleaning author but leaving location
7 years ago
Janet dec49ab073 feat: mashable parser (#76)
* feat: mashable parser

As usual the date is giving me issues because of formatting
discrepancies:
AssertionError: '2016-12-13T22:33:06.000Z' == '2016-12-14T03:33:06.000Z'

Not sure how we wanna deal with Twitter card embeds that don’t show up?

Also, image credits did not show up in preview.

* test: fixed date format

* transforming .image-credit to figcaption
7 years ago
Janet cddc1afb69 feat: chicago tribune parser (#75)
* feat: chicago tribune parser

Date is parsing but failing the test because:
AssertionError: '2016-12-13T21:45:00.000Z' == '2016-12-13T13:45:00-0800'

I tried to insert a line of code for Time Zone but I’m a n00b so I
don’t think I did it right.

No image showing up in the preview.

* fix: remove timezone from date_published extractor

* test: update unit tests to assert the correct value for date_published
7 years ago
Janet aff651c2d8 feat: hellogiggles parser (#107)
Looks good to me!
7 years ago
Janet 11ad7b9a92 feat: thought catalog parser (#102)
Looks good!
7 years ago
Janet aa43a6091c feat: cnbc parser (#96)
Should be good to go!
7 years ago
Janet cd245f7980 feat: popsugar parser (#93)
I think this one is good to go!
7 years ago
Janet a8ab7135e1 feat: observer parser (#91)
no problems
7 years ago
Janet 3bee7224cb feat: nbc news parser (#74) 7 years ago
Janet 88242dd233 feat: nj.com parser (#73) 7 years ago
Janet 1ac5670a54 feat: inquisitor parser (#72) 7 years ago
Janet 9e5b91ed8b feat: refinery29 parser (#71) 8 years ago
Janet b78c58c43a feat: miami herald parser (#69) 8 years ago
Janet aedf83edc6 feat: eonline parser (#68) 8 years ago
Janet a20da5eb31 uproxx extractor (#66) 8 years ago
Janet 87c42b6358 feat: 247sports.com extractor (#64) 8 years ago
Janet 22e6c884fb feat: rolling stone extractor (#65) 8 years ago
Janet 6337231697 feat: usmagazine extractor (#63) 8 years ago