Another huge commit.
Now uses OOP where it fits. Atom feeds are supported, but have not been properly tested yet. URLs can now be matched with Unix-style globbing. Caching is handled more cleanly. Feedburner links are also replaced. HTML is cleaned more efficiently. The code is now much cleaner: it uses lxml.objectify and a small wrapper that gives access to Atom feeds as if they were RSS feeds (and is much faster than feedparser). The README has been updated.
30
rules
@@ -1,15 +1,37 @@
TehranTimes
http://www.tehrantimes.com/component/ninjarsssyndicator/?feed_id=1&format=raw
http://www.tehrantimes.com/*
http://tehrantimes.com/*
//div[@class='article-indent']

FranceInfo
http://www.franceinfo.fr/rss.xml
http://www.franceinfo.fr/rss*
//h2[@class='chapo']/..

Les Echos
http://rss.feedsportal.com/c/499/f/413829/index.rss
http://syndication.lesechos.fr/rss/*
//h1/../..

Spiegel
http://www.spiegel.de/schlagzeilen/tops/index.rss
http://www.spiegel.de/schlagzeilen/*
//div[@id='spArticleSection']

Le Soir
http://www.lesoir.be/feed/La%20Une/destination_une_block/
http://www.lesoir.be/feed/*
//div[@class='article-content']

Stack Overflow
http://stackoverflow.com/feeds/*
//*[@id='question']

Daily Telegraph
http://www.telegraph.co.uk/*
//*[@id='mainBodyArea']

Cracked.com
http://feeds.feedburner.com/CrackedRSS
//div[@class='content']|//section[@class='body']

TheOnion
http://feeds.theonion.com/*
//article
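
The listing above suggests that each rule is a blank-line-separated block: a name, one or more URL patterns, and an XPath expression on the last line. The following is a minimal sketch of how such a file could be read and matched with Unix-style globbing. It is not the code from this commit. The names `parse_rules` and `find_rule` and the dictionary layout are hypothetical, and the diff does not show whether patterns are matched against the feed URL or the article URL.

```python
import re
from fnmatch import fnmatchcase

# Two blocks copied from the rules listing, used here as sample input.
RULES_TEXT = """\
FranceInfo
http://www.franceinfo.fr/rss.xml
http://www.franceinfo.fr/rss*
//h2[@class='chapo']/..

TheOnion
http://feeds.theonion.com/*
//article
"""


def parse_rules(text):
    """Parse blank-line-separated blocks into dictionaries.

    Assumed layout (inferred from the listing, not confirmed by the commit):
    first line = name, last line = XPath, lines in between = URL patterns.
    """
    rules = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        if len(lines) < 3:
            continue  # needs a name, at least one pattern, and an XPath
        rules.append({"name": lines[0], "patterns": lines[1:-1], "xpath": lines[-1]})
    return rules


def find_rule(rules, url):
    """Return the first rule with a glob pattern matching url, or None."""
    for rule in rules:
        # fnmatchcase avoids the platform-dependent case folding of fnmatch.
        if any(fnmatchcase(url, pattern) for pattern in rule["patterns"]):
            return rule
    return None


rules = parse_rules(RULES_TEXT)
match = find_rule(rules, "http://www.franceinfo.fr/rss/monde.xml")
print(match["name"] if match else "no rule")
```

With glob matching, `?` and `[` in a pattern are wildcards too, so a URL containing a literal `?` (as in the TehranTimes entry) still matches itself but would also match other single characters in that position.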