Hello community,
here is the log from the commit of package urlwatch for openSUSE:Factory checked in at 2018-09-04 22:57:50
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/urlwatch (Old)
and /work/SRC/openSUSE:Factory/.urlwatch.new (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Package is "urlwatch"
Tue Sep 4 22:57:50 2018 rev:12 rq:632968 version:2.14
Changes:
--------
--- /work/SRC/openSUSE:Factory/urlwatch/urlwatch.changes 2018-06-08 23:16:24.210031361 +0200
+++ /work/SRC/openSUSE:Factory/.urlwatch.new/urlwatch.changes 2018-09-04 22:58:07.189400169 +0200
@@ -1,0 +2,14 @@
+Tue Sep 4 06:34:45 UTC 2018 - mvetter@suse.com
+
+- Update to 2.14:
+ * Added filter to pretty-print JSON data: format-json (by Niko Böckerman, PR#250)
+  * Added listing of active Telegram chats using --telegram-chats (with fixes by Georg Pichler, PR#270)
+ * Added support for HTTP ETag header in URL jobs and If-None-Match (by Karol Babioch, PR#256)
+  * Added support for filtering HTML using XPath expressions, with lxml (PR#274, Fixes #226)
+ * Added install_dependencies to setup.py commands for easy installing of dependencies
+ * Added ignore_connection_errors per-job configuration option (by Karol Babioch, PR#261)
+  * Improved code (HTTP status codes, by Karol Babioch, PR#258)
+ * Improved documentation for setting up Telegram chat bots
+ * Allow multiple chats for Telegram reporting (by Georg Pichler, PR#271)
+
+-------------------------------------------------------------------
Old:
----
urlwatch-2.13.tar.gz
New:
----
urlwatch-2.14.tar.gz
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Other differences:
------------------
++++++ urlwatch.spec ++++++
--- /var/tmp/diff_new_pack.wj9slx/_old 2018-09-04 22:58:07.541401369 +0200
+++ /var/tmp/diff_new_pack.wj9slx/_new 2018-09-04 22:58:07.541401369 +0200
@@ -17,7 +17,7 @@
Name: urlwatch
-Version: 2.13
+Version: 2.14
Release: 0
Summary: A tool for monitoring webpages for updates
License: BSD-3-Clause
++++++ urlwatch-2.13.tar.gz -> urlwatch-2.14.tar.gz ++++++
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/.travis.yml new/urlwatch-2.14/.travis.yml
--- old/urlwatch-2.13/.travis.yml 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/.travis.yml 2018-08-30 10:36:16.000000000 +0200
@@ -4,5 +4,5 @@
- "3.5"
- "3.6"
install:
- - pip install pyyaml minidb requests keyring pycodestyle appdirs
+ - python setup.py install_dependencies
script: nosetests -v
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/CHANGELOG.md new/urlwatch-2.14/CHANGELOG.md
--- old/urlwatch-2.13/CHANGELOG.md 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/CHANGELOG.md 2018-08-30 10:36:16.000000000 +0200
@@ -4,6 +4,22 @@
The format mostly follows [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
+## [2.14] -- 2018-08-30
+
+### Added
+- Filter to pretty-print JSON data: `format-json` (by Niko Böckerman, PR#250)
+- List active Telegram chats using `--telegram-chats` (with fixes by Georg Pichler, PR#270)
+- Support for HTTP `ETag` header in URL jobs and `If-None-Match` (by Karol Babioch, PR#256)
+- Support for filtering HTML using XPath expressions, with `lxml` (PR#274, Fixes #226)
+- `install_dependencies` command in `setup.py` for easy installation of dependencies
+- `ignore_connection_errors` per-job configuration option (by Karol Babioch, PR#261)
+
+### Changed
+- Improved code (HTTP status codes, by Karol Babioch, PR#258)
+- Improved documentation for setting up Telegram chat bots
+- Allow multiple chats for Telegram reporting (by Georg Pichler, PR#271)
+
+
## [2.13] -- 2018-06-03
### Added
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/README.md new/urlwatch-2.14/README.md
--- old/urlwatch-2.13/README.md 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/README.md 2018-08-30 10:36:16.000000000 +0200
@@ -26,28 +26,19 @@
* [requests](http://python-requests.org/)
* [keyring](https://github.com/jaraco/keyring/)
* [appdirs](https://github.com/ActiveState/appdirs)
- * [chump](https://github.com/karanlyons/chump/) (for Pushover support)
- * [pushbullet.py](https://github.com/randomchars/pushbullet.py) (for Pushbullet support)
+ * [lxml](https://lxml.de)
The dependencies can be installed with (add `--user` to install to `$HOME`):
-`python3 -m pip install pyyaml minidb requests keyring appdirs`
+`python3 -m pip install pyyaml minidb requests keyring appdirs lxml`
-For optional pushover support the chump package is required:
-`python3 -m pip install chump`
+Optional dependencies (install via `python3 -m pip install <packagename>`):
-For optional pushbullet support the pushbullet.py package is required:
-
-`python3 -m pip install pushbullet.py`
-
-For optional support for the "browser" job kind, Requests-HTML is needed:
-
-`python3 -m pip install requests-html`
-
-For unit tests, you also need to install pycodestyle:
-
-`python3 -m pip install pycodestyle`
+ * Pushover reporter: [chump](https://github.com/karanlyons/chump/)
+ * Pushbullet reporter: [pushbullet.py](https://github.com/randomchars/pushbullet.py)
+ * "browser" job kind: [requests-html](https://html.python-requests.org)
+ * Unit testing: [pycodestyle](http://pycodestyle.pycqa.org/en/latest/)
MIGRATION FROM URLWATCH 1.x
@@ -144,6 +135,30 @@
`brew install wdiff` on macOS). Coloring is supported for `wdiff`-style
output, but potentially not for other diff tools.
+To filter based on an [XPath](https://www.w3.org/TR/1999/REC-xpath-19991116/)
+expression, you can use the `xpath` filter like so (see Microsoft's
+[XPath Examples](https://msdn.microsoft.com/en-us/library/ms256086(v=vs.110).aspx)
+page for some other examples):
+
+```yaml
+url: https://example.net/
+filter: xpath://body
+```
+
+This filters only the `<body>` element of the HTML document, stripping
+out everything else.
+
+In some cases, it might be useful to ignore (temporary) network errors to
+avoid notifications being sent. While there is a `display.error` config
+option (defaulting to `True`) to control reporting of errors globally, to
+ignore network errors for specific jobs only, you can use the
+`ignore_connection_errors` key in the job list configuration file:
+
+```yaml
+url: https://example.com/
+ignore_connection_errors: true
+```
+
PUSHOVER
--------
@@ -168,6 +183,7 @@
Telegram notifications are configured using the Telegram Bot API.
For this, you'll need a Bot API token and a chat id (see https://core.telegram.org/bots).
Sample configuration:
+
```yaml
telegram:
bot_token: '999999999:3tOhy2CuZE0pTaCtszRfKpnagOG8IQbP5gf' # your bot api token
@@ -175,6 +191,28 @@
enabled: true
```
+To set up Telegram, from your Telegram app, chat up BotFather (New Message,
+Search, "BotFather"), then say `/newbot` and follow the instructions.
+Eventually it will tell you the bot token (in the form seen above,
+`<number>:<random string>`) - add this to your config file.
+
+You can then click on the link of your bot, which will send the message `/start`.
+At this point, you can use the command `urlwatch --telegram-chats` to list the
+private chats the bot is involved with. This is the chat ID that you need to put
+into the config file as `chat_id`. You may add multiple chat IDs as a YAML list:
+```yaml
+telegram:
+ bot_token: '999999999:3tOhy2CuZE0pTaCtszRfKpnagOG8IQbP5gf' # your bot api token
+ chat_id:
+ - '11111111'
+ - '22222222'
+ enabled: true
+```
+
+Don't forget to also enable the reporter.
+
+
+
BROWSER
-------
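As a standalone illustration of the `xpath` filter documented above, here is a minimal sketch of the same parse-match-serialize steps done directly with lxml (the HTML snippet is made up; the calls mirror the `XPathFilter` implementation further down in this diff):

```python
import io

from lxml import etree

# Hypothetical page content; a real job would fetch this over HTTP.
html = '<html><head><title>t</title></head><body><p>Hello</p></body></html>'

# Parse as HTML, evaluate the XPath expression, and re-serialize the matched
# elements, which is roughly what `filter: xpath://body` does to the page.
tree = etree.parse(io.StringIO(html), etree.HTMLParser())
print('\n'.join(etree.tostring(element, pretty_print=True, method='html',
                               encoding='unicode')
                for element in tree.xpath('//body')))
```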
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/lib/urlwatch/__init__.py new/urlwatch-2.14/lib/urlwatch/__init__.py
--- old/urlwatch-2.13/lib/urlwatch/__init__.py 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/lib/urlwatch/__init__.py 2018-08-30 10:36:16.000000000 +0200
@@ -12,5 +12,5 @@
__author__ = 'Thomas Perl <m@thp.io>'
__license__ = 'BSD'
__url__ = 'https://thp.io/2008/urlwatch/'
-__version__ = '2.13'
+__version__ = '2.14'
__user_agent__ = '%s/%s (+https://thp.io/2008/urlwatch/info.html)' % (pkgname, __version__)
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/lib/urlwatch/command.py new/urlwatch-2.14/lib/urlwatch/command.py
--- old/urlwatch-2.13/lib/urlwatch/command.py 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/lib/urlwatch/command.py 2018-08-30 10:36:16.000000000 +0200
@@ -33,6 +33,7 @@
import os
import shutil
import sys
+import requests
from .filters import FilterBase
from .handler import JobState
@@ -175,6 +176,41 @@
if self.urlwatch_config.edit_config:
sys.exit(self.urlwatcher.config_storage.edit())
+ def check_telegram_chats(self):
+ if self.urlwatch_config.telegram_chats:
+ config = self.urlwatcher.config_storage.config['report'].get('telegram', None)
+ if not config:
+ print('You need to configure telegram in your config first (see README.md)')
+ sys.exit(1)
+
+ bot_token = config.get('bot_token', None)
+ if not bot_token:
+ print('You need to set up your bot token first (see README.md)')
+ sys.exit(1)
+
+ info = requests.get('https://api.telegram.org/bot{}/getMe'.format(bot_token)).json()
+
+ chats = {}
+ for chat_info in requests.get('https://api.telegram.org/bot{}/getUpdates'.format(bot_token)).json()['result']:
+ chat = chat_info['message']['chat']
+ if chat['type'] == 'private':
+ chats[str(chat['id'])] = ' '.join((chat['first_name'], chat['last_name'])) if 'last_name' in chat else chat['first_name']
+
+ if not chats:
+ print('No chats found. Say hello to your bot at https://t.me/{}'.format(info['result']['username']))
+ sys.exit(1)
+
+ headers = ('Chat ID', 'Name')
+ maxchat = max(len(headers[0]), max((len(k) for k, v in chats.items()), default=0))
+ maxname = max(len(headers[1]), max((len(v) for k, v in chats.items()), default=0))
+ fmt = '%-' + str(maxchat) + 's %s'
+ print(fmt % headers)
+ print(fmt % ('-' * maxchat, '-' * maxname))
+ for k, v in sorted(chats.items(), key=lambda kv: kv[1]):
+ print(fmt % (k, v))
+ print('\nChat up your bot here: https://t.me/{}'.format(info['result']['username']))
+ sys.exit(0)
+
def check_smtp_login(self):
if self.urlwatch_config.smtp_login:
config = self.urlwatcher.config_storage.config['report']['email']
@@ -222,6 +258,7 @@
def run(self):
self.check_edit_config()
self.check_smtp_login()
+ self.check_telegram_chats()
self.handle_actions()
self.urlwatcher.run_jobs()
self.urlwatcher.close()
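For reference, this is roughly the payload shape that the `getUpdates` call in `check_telegram_chats()` walks through; the values below are invented, but the structure follows the Telegram Bot API documentation (https://core.telegram.org/bots/api):

```python
# Invented sample of a decoded getUpdates response; only 'private' chats
# are collected by check_telegram_chats().
updates = {
    'ok': True,
    'result': [
        {'update_id': 1,
         'message': {'message_id': 10,
                     'chat': {'id': 11111111, 'type': 'private',
                              'first_name': 'Alice'}}},
    ],
}

for chat_info in updates['result']:
    chat = chat_info['message']['chat']
    if chat['type'] == 'private':
        print(chat['id'], chat['first_name'])
```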
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/lib/urlwatch/config.py new/urlwatch-2.14/lib/urlwatch/config.py
--- old/urlwatch-2.13/lib/urlwatch/config.py 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/lib/urlwatch/config.py 2018-08-30 10:36:16.000000000 +0200
@@ -89,6 +89,7 @@
group = parser.add_argument_group('Authentication')
group.add_argument('--smtp-login', action='store_true', help='Enter password for SMTP (store in keyring)')
+ group.add_argument('--telegram-chats', action='store_true', help='List telegram chats the bot is joined to')
group = parser.add_argument_group('job list management')
group.add_argument('--list', action='store_true', help='list jobs')
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/lib/urlwatch/filters.py new/urlwatch-2.14/lib/urlwatch/filters.py
--- old/urlwatch-2.13/lib/urlwatch/filters.py 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/lib/urlwatch/filters.py 2018-08-30 10:36:16.000000000 +0200
@@ -32,11 +32,14 @@
import logging
import itertools
import os
+import io
import imp
import html.parser
import hashlib
+import json
from enum import Enum
+from lxml import etree
from .util import TrackSubClasses
@@ -183,6 +186,19 @@
return ical2text(data)
+class JsonFormatFilter(FilterBase):
+ """Convert to formatted json"""
+
+ __kind__ = 'format-json'
+
+ def filter(self, data, subfilter=None):
+ indentation = 4
+ if subfilter is not None:
+ indentation = int(subfilter)
+ parsed_json = json.loads(data)
+ return json.dumps(parsed_json, sort_keys=True, indent=indentation)
+
+
class GrepFilter(FilterBase):
"""Filter only lines matching a regular expression"""
@@ -349,3 +365,18 @@
return '\n'.join('%s %s' % (' '.join('%02x' % c for c in block),
''.join((chr(c) if (c > 31 and c < 127) else '.')
for c in block)) for block in blocks)
+
+
+class XPathFilter(FilterBase):
+ """Filter XML/HTML using XPath expressions"""
+
+ __kind__ = 'xpath'
+
+ def filter(self, data, subfilter=None):
+ if subfilter is None:
+ raise ValueError('Need an XPath expression for filtering')
+
+ parser = etree.HTMLParser()
+ tree = etree.parse(io.StringIO(data), parser)
+ return '\n'.join(etree.tostring(element, pretty_print=True, method='html', encoding='unicode')
+ for element in tree.xpath(subfilter))
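The two new filter classes can be exercised directly, in the same way the test suite further down constructs them (the input documents here are made up):

```python
from urlwatch.filters import JsonFormatFilter, XPathFilter

# format-json: parse, sort keys, pretty-print; the optional subfilter ('2')
# sets the indentation width and defaults to 4 when omitted.
print(JsonFormatFilter(None, None).filter('{"b": 1, "a": [2, 3]}', '2'))

# xpath: keep only the elements matched by the expression, as HTML.
print(XPathFilter(None, None).filter(
    '<html><body><h1>skip</h1><p>keep</p></body></html>', '//p'))
```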
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/lib/urlwatch/handler.py new/urlwatch-2.14/lib/urlwatch/handler.py
--- old/urlwatch-2.13/lib/urlwatch/handler.py 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/lib/urlwatch/handler.py 2018-08-30 10:36:16.000000000 +0200
@@ -51,9 +51,10 @@
self.exception = None
self.traceback = None
self.tries = 0
+ self.etag = None
def load(self):
- self.old_data, self.timestamp, self.tries = self.cache_storage.load(self.job, self.job.get_guid())
+ self.old_data, self.timestamp, self.tries, self.etag = self.cache_storage.load(self.job, self.job.get_guid())
if self.tries is None:
self.tries = 0
@@ -62,7 +63,7 @@
# If no new data has been retrieved due to an exception, use the old job data
self.new_data = self.old_data
- self.cache_storage.save(self.job, self.job.get_guid(), self.new_data, time.time(), self.tries)
+ self.cache_storage.save(self.job, self.job.get_guid(), self.new_data, time.time(), self.tries, self.etag)
def process(self):
logger.info('Processing: %s', self.job)
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/lib/urlwatch/jobs.py new/urlwatch-2.14/lib/urlwatch/jobs.py
--- old/urlwatch-2.13/lib/urlwatch/jobs.py 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/lib/urlwatch/jobs.py 2018-08-30 10:36:16.000000000 +0200
@@ -180,7 +180,7 @@
__required__ = ('url',)
__optional__ = ('cookies', 'data', 'method', 'ssl_no_verify', 'ignore_cached', 'http_proxy', 'https_proxy',
- 'headers')
+ 'headers', 'ignore_connection_errors')
CHARSET_RE = re.compile('text/(html|plain); charset=([^;]*)')
@@ -197,10 +197,14 @@
'https': os.getenv('HTTPS_PROXY'),
}
+ if job_state.etag is not None:
+ headers['If-None-Match'] = job_state.etag
+
if job_state.timestamp is not None:
headers['If-Modified-Since'] = email.utils.formatdate(job_state.timestamp)
if self.ignore_cached:
+ headers['If-None-Match'] = None
headers['If-Modified-Since'] = email.utils.formatdate(0)
headers['Cache-Control'] = 'max-age=172800'
headers['Expires'] = email.utils.formatdate()
@@ -234,9 +238,12 @@
proxies=proxies)
response.raise_for_status()
- if response.status_code == 304:
+ if response.status_code == requests.codes.not_modified:
raise NotModifiedError()
+ # Save ETag from response into job_state, which will be saved in cache
+ job_state.etag = response.headers.get('ETag')
+
# If we can't find the encoding in the headers, requests gets all
# old-RFC-y and assumes ISO-8859-1 instead of UTF-8. Use the old
# urlwatch behavior and try UTF-8 decoding first.
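The ETag handling added above amounts to the standard HTTP conditional-request handshake; a minimal sketch with plain `requests` (the URL is a placeholder and would need a server that actually sends an `ETag` header):

```python
import requests

url = 'https://example.net/'  # placeholder

# First fetch: remember the validator the server hands out.
first = requests.get(url)
etag = first.headers.get('ETag')

# Next fetch: present the validator; 304 means the cached copy is current.
second = requests.get(url, headers={'If-None-Match': etag} if etag else {})
if second.status_code == requests.codes.not_modified:
    print('Unchanged, urlwatch would raise NotModifiedError')
else:
    print('Changed (or no ETag support), process the new data')
```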
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/lib/urlwatch/reporters.py new/urlwatch-2.14/lib/urlwatch/reporters.py
--- old/urlwatch-2.13/lib/urlwatch/reporters.py 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/lib/urlwatch/reporters.py 2018-08-30 10:36:16.000000000 +0200
@@ -485,7 +485,7 @@
try:
json_res = result.json()
- if (result.status_code == 200):
+ if (result.status_code == requests.codes.ok):
logger.info("Mailgun response: id '{0}'. {1}".format(json_res['id'], json_res['message']))
else:
logger.error("Mailgun error: {0}".format(json_res['message']))
@@ -506,7 +506,8 @@
def submit(self):
bot_token = self.config['bot_token']
- chat_id = self.config['chat_id']
+ chat_ids = self.config['chat_id']
+ chat_ids = [chat_ids] if isinstance(chat_ids, str) else chat_ids
text = '\n'.join(super().submit())
@@ -515,9 +516,11 @@
return
result = None
-
for chunk in self.chunkstring(text, self.MAX_LENGTH):
- result = self.submitToTelegram(bot_token, chat_id, chunk)
+ for chat_id in chat_ids:
+ res = self.submitToTelegram(bot_token, chat_id, chunk)
+                if res is None or res.status_code != requests.codes.ok:
+ result = res
return result
@@ -529,7 +532,7 @@
try:
json_res = result.json()
- if (result.status_code == 200):
+ if (result.status_code == requests.codes.ok):
logger.info("Telegram response: ok '{0}'. {1}".format(json_res['ok'], json_res['result']))
else:
logger.error("Telegram error: {0}".format(json_res['description']))
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/lib/urlwatch/storage.py new/urlwatch-2.14/lib/urlwatch/storage.py
--- old/urlwatch-2.13/lib/urlwatch/storage.py 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/lib/urlwatch/storage.py 2018-08-30 10:36:16.000000000 +0200
@@ -97,6 +97,11 @@
'enabled': False,
'api_key': '',
},
+ 'telegram': {
+ 'enabled': False,
+ 'bot_token': '',
+            'chat_id': '',
+ },
'mailgun': {
'enabled': False,
'api_key': '',
@@ -359,7 +364,7 @@
...
@abstractmethod
- def save(self, job, guid, data, timestamp, tries):
+ def save(self, job, guid, data, timestamp, tries, etag=None):
...
@abstractmethod
@@ -372,12 +377,12 @@
def backup(self):
for guid in self.get_guids():
- data, timestamp, tries = self.load(None, guid)
- yield guid, data, timestamp, tries
+ data, timestamp, tries, etag = self.load(None, guid)
+ yield guid, data, timestamp, tries, etag
def restore(self, entries):
- for guid, data, timestamp, tries in entries:
- self.save(None, guid, data, timestamp, tries)
+ for guid, data, timestamp, tries, etag in entries:
+ self.save(None, guid, data, timestamp, tries, etag)
def gc(self, known_guids):
for guid in set(self.get_guids()) - set(known_guids):
@@ -420,10 +425,10 @@
timestamp = os.stat(filename)[stat.ST_MTIME]
- return data, timestamp
+ return data, timestamp, None
- def save(self, job, guid, data, timestamp):
- # Timestamp is always ignored
+ def save(self, job, guid, data, timestamp, etag=None):
+ # Timestamp and ETag are always ignored
filename = self._get_filename(guid)
with open(filename, 'w+') as fp:
fp.write(data)
@@ -443,6 +448,7 @@
timestamp = int
data = str
tries = int
+ etag = str
class CacheMiniDBStorage(CacheStorage):
@@ -464,15 +470,15 @@
return (guid for guid, in CacheEntry.query(self.db, minidb.Function('distinct', CacheEntry.c.guid)))
def load(self, job, guid):
- for data, timestamp, tries in CacheEntry.query(self.db, CacheEntry.c.data // CacheEntry.c.timestamp // CacheEntry.c.tries,
- order_by=minidb.columns(CacheEntry.c.timestamp.desc, CacheEntry.c.tries.desc),
- where=CacheEntry.c.guid == guid, limit=1):
- return data, timestamp, tries
+ for data, timestamp, tries, etag in CacheEntry.query(self.db, CacheEntry.c.data // CacheEntry.c.timestamp // CacheEntry.c.tries // CacheEntry.c.etag,
+ order_by=minidb.columns(CacheEntry.c.timestamp.desc, CacheEntry.c.tries.desc),
+ where=CacheEntry.c.guid == guid, limit=1):
+ return data, timestamp, tries, etag
- return None, None, 0
+ return None, None, 0, None
- def save(self, job, guid, data, timestamp, tries):
- self.db.save(CacheEntry(guid=guid, timestamp=timestamp, data=data, tries=tries))
+ def save(self, job, guid, data, timestamp, tries, etag=None):
+ self.db.save(CacheEntry(guid=guid, timestamp=timestamp, data=data, tries=tries, etag=etag))
self.db.commit()
def delete(self, guid):
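With the new `etag` column, a cache record round-trips as a five-element save and a four-element load; a small sketch against `CacheMiniDBStorage` (the file path and values are arbitrary, and `close()` is assumed to be available as in the test suite):

```python
from urlwatch.storage import CacheMiniDBStorage

cache = CacheMiniDBStorage('/tmp/urlwatch-demo.db')  # arbitrary scratch path
cache.save(None, 'demo-guid', 'page content', 1535600000, 0, etag='"abc123"')

# load() returns (data, timestamp, tries, etag); entries written by older
# versions simply come back with etag == None.
print(cache.load(None, 'demo-guid'))
cache.close()
```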
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/lib/urlwatch/worker.py new/urlwatch-2.14/lib/urlwatch/worker.py
--- old/urlwatch-2.13/lib/urlwatch/worker.py 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/lib/urlwatch/worker.py 2018-08-30 10:36:16.000000000 +0200
@@ -70,6 +70,8 @@
if isinstance(job_state.exception, NotModifiedError):
logger.info('Job %s has not changed (HTTP 304)', job_state.job)
report.unchanged(job_state)
+ elif isinstance(job_state.exception, requests.exceptions.ConnectionError) and job_state.job.ignore_connection_errors:
+ logger.info('Connection error while executing job %s, ignored due to ignore_connection_errors', job_state.job)
elif job_state.tries < max_tries:
logger.debug('This was try %i of %i for job %s', job_state.tries,
max_tries, job_state.job)
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/setup.cfg new/urlwatch-2.14/setup.cfg
--- old/urlwatch-2.13/setup.cfg 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/setup.cfg 2018-08-30 10:36:16.000000000 +0200
@@ -1,2 +1,2 @@
-[pep8]
+[pycodestyle]
max-line-length = 120
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/setup.py new/urlwatch-2.14/setup.py
--- old/urlwatch-2.13/setup.py 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/setup.py 2018-08-30 10:36:16.000000000 +0200
@@ -1,6 +1,7 @@
#!/usr/bin/env python3
from setuptools import setup
+from distutils import cmd
import os
import re
@@ -16,7 +17,7 @@
m['name'] = 'urlwatch'
m['author'], m['author_email'] = re.match(r'(.*) <(.*)>', m['author']).groups()
m['description'], m['long_description'] = docs[0].strip().split('\n\n', 1)
-m['install_requires'] = ['minidb', 'PyYAML', 'requests', 'keyring', 'pycodestyle', 'appdirs']
+m['install_requires'] = ['minidb', 'PyYAML', 'requests', 'keyring', 'pycodestyle', 'appdirs', 'lxml']
m['scripts'] = ['urlwatch']
m['package_dir'] = {'': 'lib'}
m['packages'] = ['urlwatch']
@@ -29,5 +30,29 @@
]),
]
+
+class InstallDependencies(cmd.Command):
+ """Install dependencies only"""
+
+ description = 'Only install required packages using pip'
+ user_options = []
+
+ def initialize_options(self):
+ ...
+
+ def finalize_options(self):
+ ...
+
+ def run(self):
+ global m
+ try:
+ from pip._internal import main
+ except ImportError:
+ from pip import main
+ main(['install', '--upgrade'] + m['install_requires'])
+
+
+m['cmdclass'] = {'install_dependencies': InstallDependencies}
+
del m['copyright']
setup(**m)
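The new `install_dependencies` command is invoked as `python3 setup.py install_dependencies` (see the .travis.yml change above); a rough standalone equivalent of what its `run()` does, with the same fallback import because pip's entry point moved between releases:

```python
# Hand the install_requires list from setup.py to pip, upgrading as needed.
try:
    from pip._internal import main
except ImportError:
    from pip import main

main(['install', '--upgrade',
      'minidb', 'PyYAML', 'requests', 'keyring', 'pycodestyle', 'appdirs',
      'lxml'])
```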
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/test/test_filters.py new/urlwatch-2.14/test/test_filters.py
--- old/urlwatch-2.13/test/test_filters.py 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/test/test_filters.py 2018-08-30 10:36:16.000000000 +0200
@@ -1,5 +1,6 @@
from urlwatch.filters import GetElementById
from urlwatch.filters import GetElementByTag
+from urlwatch.filters import JsonFormatFilter
from nose.tools import eq_
@@ -35,3 +36,29 @@
""", 'div')
print(result)
eq_(result, """<div>foo</div><div>bar</div>""")
+
+
+def test_json_format_filter():
+ json_format_filter = JsonFormatFilter(None, None)
+ result = json_format_filter.filter(
+ """{"field1": {"f1.1": "value"},"field2": "value"}""")
+ print(result)
+ eq_(result, """{
+ "field1": {
+ "f1.1": "value"
+ },
+ "field2": "value"
+}""")
+
+
+def test_json_format_filter_subfilter():
+ json_format_filter = JsonFormatFilter(None, None)
+ result = json_format_filter.filter(
+ """{"field1": {"f1.1": "value"},"field2": "value"}""", "2")
+ print(result)
+ eq_(result, """{
+  "field1": {
+    "f1.1": "value"
+  },
+  "field2": "value"
+}""")
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/urlwatch-2.13/test/test_handler.py new/urlwatch-2.14/test/test_handler.py
--- old/urlwatch-2.13/test/test_handler.py 2018-06-03 14:42:56.000000000 +0200
+++ new/urlwatch-2.14/test/test_handler.py 2018-08-30 10:36:16.000000000 +0200
@@ -161,14 +161,14 @@
def test_number_of_tries_in_cache_is_increased():
urlwatcher, cache_storage = prepare_retry_test()
job = urlwatcher.jobs[0]
- old_data, timestamp, tries = cache_storage.load(job, job.get_guid())
+ old_data, timestamp, tries, etag = cache_storage.load(job, job.get_guid())
assert tries == 0
urlwatcher.run_jobs()
urlwatcher.run_jobs()
job = urlwatcher.jobs[0]
- old_data, timestamp, tries = cache_storage.load(job, job.get_guid())
+ old_data, timestamp, tries, etag = cache_storage.load(job, job.get_guid())
assert tries == 2
assert urlwatcher.report.job_states[-1].verb == 'error'
@@ -179,7 +179,7 @@
urlwatcher, cache_storage = prepare_retry_test()
job = urlwatcher.jobs[0]
- old_data, timestamp, tries = cache_storage.load(job, job.get_guid())
+ old_data, timestamp, tries, etag = cache_storage.load(job, job.get_guid())
assert tries == 0
urlwatcher.run_jobs()
@@ -194,13 +194,13 @@
urlwatcher, cache_storage = prepare_retry_test()
job = urlwatcher.jobs[0]
- old_data, timestamp, tries = cache_storage.load(job, job.get_guid())
+ old_data, timestamp, tries, etag = cache_storage.load(job, job.get_guid())
assert tries == 0
urlwatcher.run_jobs()
job = urlwatcher.jobs[0]
- old_data, timestamp, tries = cache_storage.load(job, job.get_guid())
+ old_data, timestamp, tries, etag = cache_storage.load(job, job.get_guid())
assert tries == 1
# use an url that definitely exists
@@ -210,5 +210,5 @@
urlwatcher.run_jobs()
job = urlwatcher.jobs[0]
- old_data, timestamp, tries = cache_storage.load(job, job.get_guid())
+ old_data, timestamp, tries, etag = cache_storage.load(job, job.get_guid())
assert tries == 0