879 Commits (77159b99caab13d51b49bc89b0b5845b14757a2c)
 

Author SHA1 Message Date
pictuga bb48ee4518 Forgot one chunk of code in previous commit 2013-12-22 18:27:18 +01:00
pictuga a5e4775f68 Move some code into Gather()
Will be useful when playing with WSGIREF
2013-12-22 18:25:33 +01:00
pictuga b86b0f371a Remove silly debug print 2013-12-22 12:45:21 +01:00
pictuga 0badec72a9 Allow iteration over RSS objects
Useful for JSON
2013-12-22 12:44:33 +01:00
pictuga d97a72bc0b Use Cache dict for facebook storage 2013-12-22 12:43:03 +01:00
pictuga afe31e695a "json" arg to output json feed 2013-12-21 20:14:10 +01:00
pictuga 47013c7948 Use sys.stdout.write instead of print in progress
For some reason, this avoids having multiple "lines" on top of each other, which was always due to print()'s newlines. It also makes the client/JS part much more stable.
2013-12-21 20:11:20 +01:00
pictuga eafbc14a58 Allow "progress" in JSON 2013-12-21 20:09:14 +01:00
pictuga cbdb61a918 Allow real python dict in Cache
Makes it possible to store things in a nicer way
2013-12-21 17:05:12 +01:00
pictuga 2a8c94fd88 Work with unicode strings
In downloads at least (I don't quite remember what readability returns actually, but "it just works")
2013-12-12 23:12:33 +01:00
pictuga eac2e7a79a Try to avoid errors when saving
Mainly: do not create empty cache file, and accept empty cache file (doesn't sound logical but still)
2013-12-12 16:41:58 +01:00
pictuga 82f01acfb1 Fix types (int mostly) in facebook 2013-12-09 22:13:23 +01:00
pictuga fa14fa0ccb Fix encoding in feedify 2013-12-09 22:11:46 +01:00
pictuga d2345a94f8 Speed up a little html encoding detection
Not sure this is a smart move
2013-12-07 20:35:32 +01:00
pictuga 245ba99ae9 Always clean up downloads' encoding
TPB, fuck ye
2013-12-07 20:33:33 +01:00
pictuga e3f525ff2a Fix JSON related encoding issue 2013-12-04 20:26:39 +01:00
pictuga c43bf9f35f Feed: print roottree
Keeps doctype declaration, xml stylesheets and so on
2013-12-04 20:14:09 +01:00
pictuga 71e4d597f8 Atom: support for <link rel="relative">
Not very clean. Clean work would require full rewrite of xgetCreate & maybe the whole item "selection" system
2013-12-04 20:12:52 +01:00
pictuga e2512dba0c Catch all errors in threads
Makes it possible to empty the queue whatever happens
2013-12-01 19:45:00 +01:00
pictuga 1644a409ef Add application/xml as html mimetype 2013-12-01 15:48:17 +01:00
pictuga 8b3062eaa2 Make Accept headers py 2.6 compatible 2013-12-01 15:47:20 +01:00
pictuga 66c6dcf203 Store "as is" in JSON (int or str)
The json lib does all the work
2013-12-01 15:46:02 +01:00
pictuga 52fe67beb3 Create cache file when missing
Damn it...
2013-12-01 15:45:21 +01:00
pictuga 020c095194 Pretty-print JSON to cache
Easier for debugging, though not really needed, and it takes more space for nothing
2013-12-01 15:44:19 +01:00
pictuga c83f5bb23d URI based Cache filename
Problem: lengthy filenames, have to cut them after 255 chars, sucks with FB, can cause problems
2013-12-01 15:42:35 +01:00
pictuga 1a50cb390a Use JSON to save cache
Faster, cleaner, safer, unobfuscated, not 1/3 bigger
2013-11-30 19:59:19 +01:00
pictuga fde5d3e8c5 Argument "silent" prevents outputting the feed
Also fixes small indentation typo (non blocking though)
2013-11-30 19:39:27 +01:00
pictuga 1d10d30173 Reddit: check feed url instead of link url 2013-11-30 19:36:44 +01:00
pictuga b927a4d089 Use "str.startswith" instead of ugly hack 2013-11-30 19:36:15 +01:00
pictuga 8916216403 Add "txt" argument, to change HTTP Content-Type 2013-11-30 19:35:24 +01:00
pictuga 336d98230f Make sure final data is there in downloader 2013-11-30 19:34:34 +01:00
pictuga 2634dab40c Use HTTP Accept headers
Tries to tell servers what we need; hopefully this will avoid the useless downloading of PDFs, images and so on (especially on Hacker News)
2013-11-30 19:33:36 +01:00
pictuga 0de5adc505 Remove return value in one __init__
It was some kind of left-over from previous code
2013-11-30 19:30:54 +01:00
pictuga 363014ee4c Use IOError to catch errors
Most http related exceptions inherit from that one
2013-11-30 17:33:57 +01:00
pictuga 7156dd5522 Support for non-textual gzipped http content 2013-11-24 21:55:07 +01:00
pictuga ecc18dc4ab Thread Gather()
Twice as fast. Ctrl-C no longer works in the terminal. Don't know how to fix it. Using threading.active_count() or something similar didn't work as expected. New setting (THREADS) to change the number of threads to use.
2013-11-24 20:52:53 +01:00
pictuga 8832611a05 Fix MAX_ITEM condition 2013-11-24 20:49:14 +01:00
pictuga 1274df4e1d Yet another feedburner workaround
This one really drove me mad. When the "Referer" is feedburner itself, it just shows an html page (with Google Reader Add buttons) instead of the regular feed. Adding "?format=xml" at the end of the url also solves it, but users don't know about that...
2013-11-24 20:43:09 +01:00
pictuga d3db04332e Don't run Fix() on dropped elements 2013-11-24 17:36:38 +01:00
pictuga 35fba5ffad Toggle isPermaLink when changing item url 2013-11-16 19:08:08 +01:00
pictuga 91251ec306 isPermaLink in feeds.py (RSS only) 2013-11-16 19:07:25 +01:00
pictuga 68768dd6e4 Add FeedBool in feeds.py
To handle boolean values
2013-11-16 19:06:35 +01:00
pictuga 610334dca9 Use self.nname in descriptor to store preformatted 2013-11-16 19:05:57 +01:00
pictuga 03884be1bd Update README
LIM_TIME, arguments, CLI use case, facebook api, feedify, arguments explanation
2013-11-16 17:48:21 +01:00
pictuga f94bc4545e Add "LIM_TIME" to drop elements based on time
Also, this value and the similar ones all take -1 to mean unlimited
2013-11-16 16:02:35 +01:00
pictuga 417609c1f0 Fix crash on empty html pages
Was due to "chardet" returning "None", which then crashed "str.replace"
2013-11-16 16:00:07 +01:00
pictuga 6975772ef9 Revert "Improve facebook and feedburner workarounds"
This PARTIALLY reverts commit 8a5202efd5 (only the feedburner part).
2013-11-15 16:54:41 +01:00
pictuga 3a784be93e End of typo fix in cachePath 2013-11-11 00:34:32 +01:00
pictuga 8a367f2251 Feedify support for DuckDuckGo
Trying to work around Google's ban
2013-11-11 00:33:43 +01:00
pictuga ddc9e4e3cb Fix typo around cachePath
Indentation was wrong in if/else statement
2013-11-10 23:55:44 +01:00
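
Note on commits ecc18dc4ab and e2512dba0c: the log says Gather() was threaded and that all errors are caught in threads so the queue can always be emptied. The actual code is not shown here, so the following is only a minimal sketch of that general pattern using the standard queue and threading modules. The names run_all, handler and threads are hypothetical and are not taken from the project.

```python
import queue
import threading


def run_all(jobs, handler, threads=4):
    """Run handler(job) on worker threads; return a list of (job, error) pairs.

    Hypothetical helper for illustration only, not the project's Gather().
    """
    todo = queue.Queue()
    errors = []
    lock = threading.Lock()

    def worker():
        while True:
            job = todo.get()
            try:
                handler(job)
            except Exception as exc:
                # Catch everything so one bad job cannot kill the worker.
                with lock:
                    errors.append((job, exc))
            finally:
                # Always mark the job as done, otherwise join() would hang.
                todo.task_done()

    # Daemon threads do not keep the interpreter alive once the work is done.
    for _ in range(threads):
        threading.Thread(target=worker, daemon=True).start()
    for job in jobs:
        todo.put(job)

    todo.join()  # returns once every queued job has been marked done
    return errors
```

The point of the try/finally is that task_done() runs even when a job raises, so the queue drains regardless of failures, which seems to be what the commit message describes.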