Fixes #216 - Limit counted commas score.

pull/217/head
Nicolas Perriault 9 years ago
parent 8510106638
commit 61d8e6213f

@ -716,8 +716,8 @@ Readability.prototype = {
// Add a point for the paragraph itself as a base.
contentScore += 1;
- // Add points for any commas within this paragraph.
- contentScore += innerText.split(',').length;
+ // Add points for any commas within this paragraph. Up to 60 points.
+ contentScore += Math.min(innerText.split(',').length, 60);
// For every 100 characters in this paragraph, add another point. Up to 3 points.
contentScore += Math.min(Math.floor(innerText.length / 100), 3);
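
Note on the hunk above: a minimal sketch of how the capped comma score behaves for a single paragraph. The helper name `paragraphScore` and the sample strings are assumptions made for illustration and are not part of Readability.js; only the three scoring lines are taken from the diff. Since `split(',')` returns one more piece than there are commas, the comma term is really "commas + 1", which the sketch keeps as-is.

```javascript
// Hypothetical helper (not in Readability.js): applies the three scoring
// rules from the hunk above to one paragraph's text.
function paragraphScore(innerText) {
  let contentScore = 0;
  // Add a point for the paragraph itself as a base.
  contentScore += 1;
  // split(',') yields (number of commas + 1) pieces; capped at 60 by this commit.
  contentScore += Math.min(innerText.split(',').length, 60);
  // One point per 100 characters, capped at 3.
  contentScore += Math.min(Math.floor(innerText.length / 100), 3);
  return contentScore;
}

// Made-up inputs: a comma-saturated string and an ordinary short phrase.
const commaHeavy = ','.repeat(500);
const plain = 'a, b, c';

console.log(paragraphScore(commaHeavy)); // 64 (1 + 60 + 3)
console.log(paragraphScore(plain));      // 4  (1 + 3 + 0)
```

With the uncapped line `contentScore += innerText.split(',').length;`, the same comma-saturated input would score 1 + 501 + 3 = 505, so a single comma-heavy node could outweigh everything else on the page.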

@ -1,6 +1,5 @@
<div id="readability-page-1" class="page">
<div>
<td>
<h3 align="center ">Study Webtext</h3>
<h2 align="center "><span face="Lucida Handwriting " color="Maroon
">"Bartleby the Scrivener: A Story of Wall-Street " </span>(1853)&nbsp;<br>
@ -259,6 +258,5 @@
<p>* * * * * * * *</p>
<p>There would seem little need for proceeding further in this history. Imagination will readily supply the meagre recital of poor Bartleby's interment. But ere parting with the reader, let me say, that if this little narrative has sufficiently interested him, to awaken curiosity as to who Bartleby was, and what manner of life he led prior to the present narrator's making his acquaintance, I can only reply, that in such curiosity I fully share, but am wholly unable to gratify it. Yet here I hardly know whether I should divulge one little item of rumor, which came to my ear a few months after the scrivener's decease. Upon what basis it rested, I could never ascertain; and hence how true it is I cannot now tell. But inasmuch as this vague report has not been without a certain strange suggestive interest to me, however said, it may prove the same with some others; and so I will briefly mention it. The report was this: that Bartleby had been a subordinate clerk in the Dead Letter Office at <a href="http://raven.cc.ukans.edu/%7Ezeke/bartleby/parker.html" target="_blank">Washington</a>, from which he had been suddenly removed by a change in the administration. When I think over this rumor, I cannot adequately express the emotions which seize me. Dead letters! does it not sound like dead men? Conceive a man by nature and misfortune prone to a pallid hopelessness, can any business seem more fitted to heighten it than that of continually handling these dead letters and assorting them for the flames? For by the cart-load they are annually burned. Sometimes from out the folded paper the pale clerk takes a ring:--the bank-note sent in swiftest charity:--he whom it would relieve, nor eats nor hungers any more; pardon for those who died despairing; hope for those who died unhoping; good tidings for those who died stifled by unrelieved calamities. On errands of life, these letters speed to death. </p>
<p> Ah Bartleby! Ah humanity!</p>
</td>
</div>
</div>

@ -0,0 +1,6 @@
{
"title": "Hypertext Transfer Protocol version 2",
"byline": "Authors' Addresses",
"excerpt": "This specification describes an optimized expression of the semantics of the Hypertext Transfer Protocol (HTTP). HTTP/2 enables a more efficient use of network resources and a reduced perception of latency by introducing header field compression and allowing multiple concurrent exchanges on the same connection. It also introduces unsolicited push of representations from servers to clients. This specification is an alternative to, but does not obsolete, the HTTP/1.1 message syntax. HTTP's existing semantics remain unchanged.",
"readerable": true
}

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

@ -1,6 +1,4 @@
<div id="readability-page-1" class="page">
<div class="postField postField--body">
<section name="ef8c" class=" section--first section--last">
<div class="section-content">
<div class="section-inner u-sizeFullWidth">
<figure name="b9ad" id="b9ad" class="graf--figure postField--fillWidthImage graf--first">
@ -155,6 +153,4 @@
<p name="c30a" id="c30a" data-align="center" class="graf--p graf--last">Follow Backchannel: <a href="https://twitter.com/backchnnl" data-href="https://twitter.com/backchnnl" class="markup--anchor markup--p-anchor" rel="nofollow"><em class="markup--em markup--p-em">Twitter</em></a> <em class="markup--em markup--p-em">|</em><a href="https://www.facebook.com/pages/Backchannel/1488568504730671" data-href="https://www.facebook.com/pages/Backchannel/1488568504730671" class="markup--anchor markup--p-anchor" rel="nofollow"><em class="markup--em markup--p-em">Facebook</em></a> </p>
</div>
</div>
</section>
</div>
</div>

@ -1,6 +1,4 @@
<div id="readability-page-1" class="page">
<div>
<td class="MidColumn">
<div class="ArticleText">
<h2 class="SummaryHL"><a href="http://fakehost/Articles/637755/">A trademark battle in the Arduino community</a></h2>
<p>The <a href="https://en.wikipedia.org/wiki/Arduino">Arduino</a> has been one of the biggest success stories of the open-hardware movement, but that success does not protect it from internal conflict. In recent months, two of the project's founders have come into conflict about the direction of future efforts—and that conflict has turned into a legal dispute about who owns the rights to the Arduino trademark. </p>
@ -575,7 +573,4 @@ bark but the caravan moves on.</span>" That may be true, but, in this case, the
<p style="display: inline;" class="readability-styled">: </p><a href="http://fakehost/Articles/637395/">Security&gt;&gt;</a>
<br>
</div>
</td>
<td class="RightColumn"> </td>
</div>
</div>