David Ziegler's Internets - Latest Comments in A Python Script to Extract Excerpts From Articles

Re: A Python Script to Extract Excerpts From Articles

Sujit — Fri, 10 May 2013 23:21:37 -0000

Hey, I got this error message: global name 'SoupStrainer' is not defined
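
That error usually means the name was used without being imported. Here is a minimal sketch of the import and a typical use, assuming the `bs4` package (Beautiful Soup 4) is installed. The sample HTML and variable names are made up for illustration, and the original script may have targeted the older BeautifulSoup 3 package instead.

```python
# Assumption: Beautiful Soup 4 is installed as the `bs4` package.
# With the older BeautifulSoup 3 package the equivalent import was
# `from BeautifulSoup import BeautifulSoup, SoupStrainer`.
from bs4 import BeautifulSoup, SoupStrainer

# Hypothetical sample document, not taken from the original post.
html = "<html><body><h1>Title</h1><p>First paragraph.</p><p>Second.</p></body></html>"

# SoupStrainer restricts parsing to matching tags, here only <p> elements.
only_paragraphs = SoupStrainer("p")
soup = BeautifulSoup(html, "html.parser", parse_only=only_paragraphs)

# Collect the text of each paragraph that survived the strainer.
paragraphs = [p.get_text() for p in soup.find_all("p")]
```

If the import itself fails, the library is probably not installed for the interpreter running the script.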

Re: A Python Script to Extract Excerpts From Articles

Tahia — Fri, 24 Aug 2012 00:53:35 -0000

This is awesome! Thanks for sharing :)

-tk

Re: A Python Script to Extract Excerpts From Articles

Remove Antispy Safeguard Virus — Thu, 14 Oct 2010 10:48:54 -0000

Wow, thanks so much! I'm searching for such information!

Re: A Python Script to Extract Excerpts From Articles

Spelling Games — Mon, 20 Sep 2010 13:46:24 -0000

A slightly more computer-science-like approach for extracting text from websites is http://www.aidanf.net/softw.... It works pretty well for decently structured html although your cleaning suggestions for CSS etc. help, too. If I need a shorter version of the full text I just use the first N chars.
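
The "first N chars" idea can be sketched roughly as below. The function name `excerpt`, its parameters, and the step that backs up to the last space are assumptions added for illustration. A plain `text[:n]` slice is the literal version of what the comment describes.

```python
def excerpt(text, max_chars=200, suffix="..."):
    """Return roughly the first max_chars characters of text.

    Hypothetical helper: whitespace is collapsed first, and the cut is
    moved back to the last space so that a word is not split in half.
    The suffix is appended on top of max_chars when the text is truncated.
    """
    text = " ".join(text.split())  # collapse runs of whitespace and newlines
    if len(text) <= max_chars:
        return text
    cut = text[:max_chars]
    # Back up to the last space inside the cut, if there is one.
    head, sep, _ = cut.rpartition(" ")
    if sep:
        cut = head
    return cut + suffix


short = excerpt("The quick brown fox jumps", max_chars=12)
```

Cutting at a space avoids ending the excerpt in the middle of a word, at the cost of sometimes returning a little less than N characters of the original text.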

Re: A Python Script to Extract Excerpts From Articles

athikitie — Sun, 01 Aug 2010 22:03:19 -0000

Hemp is far more than a psychoactive drug. And indeed the perfect food, and when learned. Go to http://www.hempproteinguide... for great information.

Re: A Python Script to Extract Excerpts From Articles

aidanf — Tue, 20 Apr 2010 13:48:10 -0000

Actually that link is broken. The link to that post is http://www.aidanf.net/archi...

The latest code for BTE is on github: http://github.com/aidanf/BTE

Re: A Python Script to Extract Excerpts From Articles

Baby Gifts — Tue, 05 Jan 2010 16:58:29 -0000

I was hoping to find the equivalent of this in Ruby, but no luck so far. I'm not sure how easy it would be to do, so if anyone has a heads-up, feel free to let us know.
------------------------------------------
Tommy - Personalised Childrens Gifts

Re: A Python Script to Extract Excerpts From Articles

sordyl — Sat, 26 Sep 2009 21:06:56 -0000

@trifilij did you have any luck with a Ruby equivalent? I'm about to try this with RubyfulSoup and could use a head start.

Re: A Python Script to Extract Excerpts From Articles

Nail Arts Designer — Sun, 09 Aug 2009 13:09:38 -0000

What a useful post here. Very informative for me..TQ friends...

Cheers,
gadgettechblog.com

Re: A Python Script to Extract Excerpts From Articles

Michael — Tue, 16 Jun 2009 15:57:04 -0000

Re: A Python Script to Extract Excerpts From Articles

dziegler — Sat, 13 Jun 2009 11:37:33 -0000

Yeah, that was pretty ugly. I updated the post to reflect the changes.

Re: A Python Script to Extract Excerpts From Articles

Bartek — Sat, 13 Jun 2009 09:28:43 -0000

Your removeHeaders(soup) looks scary. Could you not rewrite it as something like this:

[tree.extract() for tree in [soup(arg) for arg in ['h1','h2','h3']]]

I didn't test it, and it's the morning, but that chunk should definitely be done differently. Regardless, nice little article :)

edit: Just realized you already did something similar in your github code. My bad!
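
For anyone trying the one-liner above: as far as I can tell, it would call extract() on each result list rather than on each tag, so it needs a small adjustment. Below is a minimal sketch of the same idea, assuming the `bs4` package (Beautiful Soup 4). The name `remove_headers` and the sample HTML are hypothetical, and this is not necessarily what the updated post or the github code does.

```python
# Assumption: Beautiful Soup 4 is installed as the `bs4` package.
from bs4 import BeautifulSoup


def remove_headers(soup, tags=("h1", "h2", "h3")):
    """Remove the given header tags from the parsed tree (hypothetical helper)."""
    # Calling soup(...) is shorthand for soup.find_all(...); passing a list
    # of names matches any of them in a single pass.
    for tag in soup(list(tags)):
        tag.extract()  # detach the tag and its contents from the tree
    return soup


# Hypothetical sample document for illustration.
html = "<div><h1>Title</h1><p>Body text.</p><h2>Sub</h2><p>More.</p></div>"
soup = remove_headers(BeautifulSoup(html, "html.parser"))

remaining = [t.name for t in soup.find_all(True)]  # names of the tags left over
text = soup.get_text(" ", strip=True)
```

A plain for loop is used instead of a list comprehension because extract() is called only for its side effect.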