Hello community,

here is the log from the commit of package youtube-dl for openSUSE:Factory checked in at 2019-04-28 20:15:13
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/youtube-dl (Old)
 and      /work/SRC/openSUSE:Factory/.youtube-dl.new.5536 (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Package is "youtube-dl"

Sun Apr 28 20:15:13 2019 rev:104 rq:698682 version:2019.04.24

Changes:
--------
--- /work/SRC/openSUSE:Factory/youtube-dl/python-youtube-dl.changes 2019-04-22 12:25:58.392971569 +0200
+++ /work/SRC/openSUSE:Factory/.youtube-dl.new.5536/python-youtube-dl.changes 2019-04-28 20:15:39.214336440 +0200
@@ -2 +2 @@
-Fri Apr 19 19:53:24 UTC 2019 - Luigi Baldoni <aloisio@gmx.com>
+Wed Apr 24 06:29:42 UTC 2019 - Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>
@@ -4 +4,25 @@
-- Fix runtime requirements
+- Update to new upstream release 2019.04.24
+  * youtube: Fix extraction (#20758, #20759, #20761, #20762, #20764, #20766,
+    #20767, #20769, #20771, #20768, #20770)
+  * toutv: Fix extraction and extract series info (#20757)
+  * vrv: Add support for movie listings (#19229)
+  * youtube: Print error when no data is available (#20737)
+  * soundcloud: Add support for new rendition and improve extraction (#20699)
+  * ooyala: Add support for geo verification proxy
+  * nrl: Add support for nrl.com (#15991)
+  * vimeo: Extract live archive source format (#19144)
+  * vimeo: Add support for live streams and improve info extraction (#19144)
+  * ntvcojp: Add support for cu.ntv.co.jp
+  * nhk: Extract RTMPT format
+  * nhk: Add support for audio URLs
+  * udemy: Add another course id extraction pattern (#20491)
+  * openload: Add support for oload.services (#20691)
+  * openload: Add support for openloed.co (#20691, #20693)
+  * bravotv: Fix extraction (#19213)
+- Unify previous changelogs so that pre_checkin.sh do not break them
+
+-------------------------------------------------------------------
+Fri Apr 19 19:54:19 UTC 2019 - Luigi Baldoni <aloisio@gmx.com>
+
+- youtube-dl: Switch build to python3
+- python-youtube-dl: Fix runtime requirements
@@ -30,0 +55 @@
+- Require full python [boo#1121694, boo#1120842]
--- /work/SRC/openSUSE:Factory/youtube-dl/youtube-dl.changes 2019-04-22 12:25:58.464971541 +0200
+++ /work/SRC/openSUSE:Factory/.youtube-dl.new.5536/youtube-dl.changes 2019-04-28 20:15:39.402336324 +0200
@@ -1,0 +2,23 @@
+Wed Apr 24 06:29:42 UTC 2019 - Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>
+
+- Update to new upstream release 2019.04.24
+  * youtube: Fix extraction (#20758, #20759, #20761, #20762, #20764, #20766,
+    #20767, #20769, #20771, #20768, #20770)
+  * toutv: Fix extraction and extract series info (#20757)
+  * vrv: Add support for movie listings (#19229)
+  * youtube: Print error when no data is available (#20737)
+  * soundcloud: Add support for new rendition and improve extraction (#20699)
+  * ooyala: Add support for geo verification proxy
+  * nrl: Add support for nrl.com (#15991)
+  * vimeo: Extract live archive source format (#19144)
+  * vimeo: Add support for live streams and improve info extraction (#19144)
+  * ntvcojp: Add support for cu.ntv.co.jp
+  * nhk: Extract RTMPT format
+  * nhk: Add support for audio URLs
+  * udemy: Add another course id extraction pattern (#20491)
+  * openload: Add support for oload.services (#20691)
+  * openload: Add support for openloed.co (#20691, #20693)
+  * bravotv: Fix extraction (#19213)
+- Unify previous changelogs so that pre_checkin.sh do not break them
+
+-------------------------------------------------------------------
@@ -4 +27,2 @@
-- Switch build to python3
+- youtube-dl: Switch build to python3
+- python-youtube-dl: Fix runtime requirements
@@ -30,0 +55 @@
+- Require full python [boo#1121694, boo#1120842]

Old:
----
  youtube-dl-2019.04.17.tar.gz
  youtube-dl-2019.04.17.tar.gz.sig

New:
----
  youtube-dl-2019.04.24.tar.gz
youtube-dl-2019.04.24.tar.gz.sig ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Other differences: ------------------ ++++++ python-youtube-dl.spec ++++++ --- /var/tmp/diff_new_pack.SmF4kd/_old 2019-04-28 20:15:40.178335841 +0200 +++ /var/tmp/diff_new_pack.SmF4kd/_new 2019-04-28 20:15:40.178335841 +0200 @@ -19,7 +19,7 @@ %define modname youtube-dl %{?!python_module:%define python_module() python-%{**} python3-%{**}} Name: python-youtube-dl -Version: 2019.04.17 +Version: 2019.04.24 Release: 0 Summary: A python module for downloading from video sites for offline watching License: SUSE-Public-Domain AND CC-BY-SA-3.0 ++++++ youtube-dl.spec ++++++ --- /var/tmp/diff_new_pack.SmF4kd/_old 2019-04-28 20:15:40.202335826 +0200 +++ /var/tmp/diff_new_pack.SmF4kd/_new 2019-04-28 20:15:40.206335823 +0200 @@ -17,7 +17,7 @@ Name: youtube-dl -Version: 2019.04.17 +Version: 2019.04.24 Release: 0 Summary: A tool for downloading from video sites for offline watching License: SUSE-Public-Domain AND CC-BY-SA-3.0 ++++++ youtube-dl-2019.04.17.tar.gz -> youtube-dl-2019.04.24.tar.gz ++++++ diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/ChangeLog new/youtube-dl/ChangeLog --- old/youtube-dl/ChangeLog 2019-04-16 19:20:04.000000000 +0200 +++ new/youtube-dl/ChangeLog 2019-04-24 05:05:46.000000000 +0200 @@ -1,3 +1,25 @@ +version 2019.04.24 + +Extractors +* [youtube] Fix extraction (#20758, #20759, #20761, #20762, #20764, #20766, + #20767, #20769, #20771, #20768, #20770) +* [toutv] Fix extraction and extract series info (#20757) ++ [vrv] Add support for movie listings (#19229) ++ [youtube] Print error when no data is available (#20737) ++ [soundcloud] Add support for new rendition and improve extraction (#20699) ++ [ooyala] Add support for geo verification proxy ++ [nrl] Add support for nrl.com (#15991) ++ [vimeo] Extract live archive source format (#19144) ++ [vimeo] Add support for live streams and improve info 
extraction (#19144) ++ [ntvcojp] Add support for cu.ntv.co.jp ++ [nhk] Extract RTMPT format ++ [nhk] Add support for audio URLs ++ [udemy] Add another course id extraction pattern (#20491) ++ [openload] Add support for oload.services (#20691) ++ [openload] Add support for openloed.co (#20691, #20693) +* [bravotv] Fix extraction (#19213) + + version 2019.04.17 Extractors diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/docs/supportedsites.md new/youtube-dl/docs/supportedsites.md --- old/youtube-dl/docs/supportedsites.md 2019-04-16 19:20:09.000000000 +0200 +++ new/youtube-dl/docs/supportedsites.md 2019-04-24 05:05:54.000000000 +0200 @@ -201,6 +201,7 @@ - **CSpan**: C-SPAN - **CtsNews**: 華視新聞 - **CTVNews** + - **cu.ntv.co.jp**: Nippon Television Network - **Culturebox** - **CultureUnplugged** - **curiositystream** @@ -624,6 +625,7 @@ - **NRKTVEpisodes** - **NRKTVSeason** - **NRKTVSeries** + - **NRLTV** - **ntv.ru** - **Nuvid** - **NYTimes** Binary files old/youtube-dl/youtube-dl and new/youtube-dl/youtube-dl differ diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/bravotv.py new/youtube-dl/youtube_dl/extractor/bravotv.py --- old/youtube-dl/youtube_dl/extractor/bravotv.py 2019-04-08 01:24:14.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/bravotv.py 2019-04-16 20:08:06.000000000 +0200 @@ -1,6 +1,8 @@ # coding: utf-8 from __future__ import unicode_literals +import re + from .adobepass import AdobePassIE from ..utils import ( smuggle_url, @@ -12,16 +14,16 @@ class BravoTVIE(AdobePassIE): _VALID_URL = r'https?://(?:www\.)?bravotv\.com/(?:[^/]+/)+(?P<id>[^/?#]+)' _TESTS = [{ - 'url': 'http://www.bravotv.com/last-chance-kitchen/season-5/videos/lck-ep-12-fishy-f...', - 'md5': '9086d0b7ef0ea2aabc4781d75f4e5863', + 'url': 'https://www.bravotv.com/top-chef/season-16/episode-15/videos/the-top-chef-se...', + 'md5': 
'e34684cfea2a96cd2ee1ef3a60909de9', 'info_dict': { - 'id': 'zHyk1_HU_mPy', + 'id': 'epL0pmK1kQlT', 'ext': 'mp4', - 'title': 'LCK Ep 12: Fishy Finale', - 'description': 'S13/E12: Two eliminated chefs have just 12 minutes to cook up a delicious fish dish.', + 'title': 'The Top Chef Season 16 Winner Is...', + 'description': 'Find out who takes the title of Top Chef!', 'uploader': 'NBCU-BRAV', - 'upload_date': '20160302', - 'timestamp': 1456945320, + 'upload_date': '20190314', + 'timestamp': 1552591860, } }, { 'url': 'http://www.bravotv.com/below-deck/season-3/ep-14-reunion-part-1', @@ -32,30 +34,38 @@ display_id = self._match_id(url) webpage = self._download_webpage(url, display_id) settings = self._parse_json(self._search_regex( - r'jQuery\.extend\(Drupal\.settings\s*,\s*({.+?})\);', webpage, 'drupal settings'), + r'<script[^>]+data-drupal-selector="drupal-settings-json"[^>]*>({.+?})</script>', webpage, 'drupal settings'), display_id) info = {} query = { 'mbr': 'true', } account_pid, release_pid = [None] * 2 - tve = settings.get('sharedTVE') + tve = settings.get('ls_tve') if tve: query['manifest'] = 'm3u' - account_pid = 'HNK2IC' - release_pid = tve['release_pid'] + mobj = re.search(r'<[^>]+id="pdk-player"[^>]+data-url=["\']?(?:https?:)?//player\.theplatform\.com/p/([^/]+)/(?:[^/]+/)*select/([^?#&"\']+)', webpage) + if mobj: + account_pid, tp_path = mobj.groups() + release_pid = tp_path.strip('/').split('/')[-1] + else: + account_pid = 'HNK2IC' + tp_path = release_pid = tve['release_pid'] if tve.get('entitlement') == 'auth': - adobe_pass = settings.get('adobePass', {}) + adobe_pass = settings.get('tve_adobe_auth', {}) resource = self._get_mvpd_resource( adobe_pass.get('adobePassResourceId', 'bravo'), tve['title'], release_pid, tve.get('rating')) query['auth'] = self._extract_mvpd_auth( url, release_pid, adobe_pass.get('adobePassRequestorId', 'bravo'), resource) else: - shared_playlist = settings['shared_playlist'] + shared_playlist = settings['ls_playlist'] 
account_pid = shared_playlist['account_pid'] metadata = shared_playlist['video_metadata'][shared_playlist['default_clip']] - release_pid = metadata['release_pid'] + tp_path = release_pid = metadata.get('release_pid') + if not release_pid: + release_pid = metadata['guid'] + tp_path = 'media/guid/2140479951/' + release_pid info.update({ 'title': metadata['title'], 'description': metadata.get('description'), @@ -67,7 +77,7 @@ '_type': 'url_transparent', 'id': release_pid, 'url': smuggle_url(update_url_query( - 'http://link.theplatform.com/s/%s/%s' % (account_pid, release_pid), + 'http://link.theplatform.com/s/%s/%s' % (account_pid, tp_path), query), {'force_smil_url': True}), 'ie_key': 'ThePlatform', }) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/common.py new/youtube-dl/youtube_dl/extractor/common.py --- old/youtube-dl/youtube_dl/extractor/common.py 2019-04-08 01:24:14.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/common.py 2019-04-16 20:08:06.000000000 +0200 @@ -2019,6 +2019,8 @@ if res is False: return [] mpd_doc, urlh = res + if mpd_doc is None: + return [] mpd_base_url = base_url(urlh.geturl()) return self._parse_mpd_formats( diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/extractors.py new/youtube-dl/youtube_dl/extractor/extractors.py --- old/youtube-dl/youtube_dl/extractor/extractors.py 2019-04-08 01:24:15.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/extractors.py 2019-04-16 20:08:06.000000000 +0200 @@ -808,6 +808,8 @@ NRKTVSeasonIE, NRKTVSeriesIE, ) +from .nrl import NRLTVIE +from .ntvcojp import NTVCoJpCUIE from .ntvde import NTVDeIE from .ntvru import NTVRuIE from .nytimes import ( diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/nhk.py new/youtube-dl/youtube_dl/extractor/nhk.py --- 
old/youtube-dl/youtube_dl/extractor/nhk.py 2019-04-08 01:24:15.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/nhk.py 2019-04-16 20:08:07.000000000 +0200 @@ -1,54 +1,81 @@ from __future__ import unicode_literals +import re + from .common import InfoExtractor -from ..utils import ExtractorError class NhkVodIE(InfoExtractor): - _VALID_URL = r'https?://www3\.nhk\.or\.jp/nhkworld/en/(?:vod|ondemand)/(?P<id>[^/]+/[^/?#&]+)' + _VALID_URL = r'https?://www3\.nhk\.or\.jp/nhkworld/(?P<lang>[a-z]{2})/ondemand/(?P<type>video|audio)/(?P<id>\d{7}|[a-z]+-\d{8}-\d+)' + # Content available only for a limited period of time. Visit + # https://www3.nhk.or.jp/nhkworld/en/ondemand/ for working samples. _TESTS = [{ - # Videos available only for a limited period of time. Visit - # http://www3.nhk.or.jp/nhkworld/en/vod/ for working samples. - 'url': 'http://www3.nhk.or.jp/nhkworld/en/vod/tokyofashion/20160815', - 'info_dict': { - 'id': 'A1bnNiNTE6nY3jLllS-BIISfcC_PpvF5', - 'ext': 'flv', - 'title': 'TOKYO FASHION EXPRESS - The Kimono as Global Fashion', - 'description': 'md5:db338ee6ce8204f415b754782f819824', - 'series': 'TOKYO FASHION EXPRESS', - 'episode': 'The Kimono as Global Fashion', - }, - 'skip': 'Videos available only for a limited period of time', - }, { 'url': 'https://www3.nhk.or.jp/nhkworld/en/ondemand/video/2015173/', 'only_matching': True, + }, { + 'url': 'https://www3.nhk.or.jp/nhkworld/en/ondemand/audio/plugin-20190404-1/', + 'only_matching': True, + }, { + 'url': 'https://www3.nhk.or.jp/nhkworld/fr/ondemand/audio/plugin-20190404-1/', + 'only_matching': True, }] - _API_URL = 'http://api.nhk.or.jp/nhkworld/vodesdlist/v1/all/all/all.json?apikey=EJfK8jdS...' 
+ _API_URL_TEMPLATE = 'https://api.nhk.or.jp/nhkworld/%sodesdlist/v7/episode/%s/%s/all%s.json' def _real_extract(self, url): - video_id = self._match_id(url) - - data = self._download_json(self._API_URL, video_id) + lang, m_type, episode_id = re.match(self._VALID_URL, url).groups() + if episode_id.isdigit(): + episode_id = episode_id[:4] + '-' + episode_id[4:] + + is_video = m_type == 'video' + episode = self._download_json( + self._API_URL_TEMPLATE % ('v' if is_video else 'r', episode_id, lang, '/all' if is_video else ''), + episode_id, query={'apikey': 'EJfK8jdS57GqlupFgAfAAwr573q01y6k'})['data']['episodes'][0] + title = episode.get('sub_title_clean') or episode['sub_title'] - try: - episode = next( - e for e in data['data']['episodes'] - if e.get('url') and video_id in e['url']) - except StopIteration: - raise ExtractorError('Unable to find episode') + def get_clean_field(key): + return episode.get(key + '_clean') or episode.get(key) - embed_code = episode['vod_id'] + series = get_clean_field('title') - title = episode.get('sub_title_clean') or episode['sub_title'] - description = episode.get('description_clean') or episode.get('description') - series = episode.get('title_clean') or episode.get('title') + thumbnails = [] + for s, w, h in [('', 640, 360), ('_l', 1280, 720)]: + img_path = episode.get('image' + s) + if not img_path: + continue + thumbnails.append({ + 'id': '%dp' % h, + 'height': h, + 'width': w, + 'url': 'https://www3.nhk.or.jp' + img_path, + }) - return { - '_type': 'url_transparent', - 'ie_key': 'Ooyala', - 'url': 'ooyala:%s' % embed_code, + info = { + 'id': episode_id + '-' + lang, 'title': '%s - %s' % (series, title) if series and title else title, - 'description': description, + 'description': get_clean_field('description'), + 'thumbnails': thumbnails, 'series': series, 'episode': title, } + if is_video: + info.update({ + '_type': 'url_transparent', + 'ie_key': 'Ooyala', + 'url': 'ooyala:' + episode['vod_id'], + }) + else: + audio = 
episode['audio'] + audio_path = audio['audio'] + info['formats'] = self._extract_m3u8_formats( + 'https://nhks-vh.akamaihd.net/i%s/master.m3u8' % audio_path, + episode_id, 'm4a', m3u8_id='hls', fatal=False) + for proto in ('rtmpt', 'rtmp'): + info['formats'].append({ + 'ext': 'flv', + 'format_id': proto, + 'url': '%s://flv.nhk.or.jp/ondemand/mp4:flv%s' % (proto, audio_path), + 'vcodec': 'none', + }) + for f in info['formats']: + f['language'] = lang + return info diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/nrl.py new/youtube-dl/youtube_dl/extractor/nrl.py --- old/youtube-dl/youtube_dl/extractor/nrl.py 1970-01-01 01:00:00.000000000 +0100 +++ new/youtube-dl/youtube_dl/extractor/nrl.py 2019-04-16 20:08:07.000000000 +0200 @@ -0,0 +1,30 @@ +# coding: utf-8 +from __future__ import unicode_literals + +from .common import InfoExtractor + + +class NRLTVIE(InfoExtractor): + _VALID_URL = r'https?://(?:www\.)?nrl\.com/tv(/[^/]+)*/(?P<id>[^/?]+)' + _TEST = { + 'url': 'https://www.nrl.com/tv/news/match-highlights-titans-v-knights-862805/', + 'info_dict': { + 'id': 'YyNnFuaDE6kPJqlDhG4CGQ_w89mKTau4', + 'ext': 'mp4', + 'title': 'Match Highlights: Titans v Knights', + }, + 'params': { + # m3u8 download + 'skip_download': True, + 'format': 'bestvideo', + }, + } + + def _real_extract(self, url): + display_id = self._match_id(url) + webpage = self._download_webpage(url, display_id) + q_data = self._parse_json(self._search_regex( + r"(?s)q-data='({.+?})'", webpage, 'player data'), display_id) + ooyala_id = q_data['videoId'] + return self.url_result( + 'ooyala:' + ooyala_id, 'Ooyala', ooyala_id, q_data.get('title')) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/ntvcojp.py new/youtube-dl/youtube_dl/extractor/ntvcojp.py --- old/youtube-dl/youtube_dl/extractor/ntvcojp.py 1970-01-01 01:00:00.000000000 +0100 +++ 
new/youtube-dl/youtube_dl/extractor/ntvcojp.py 2019-04-16 20:08:07.000000000 +0200 @@ -0,0 +1,49 @@ +# coding: utf-8 +from __future__ import unicode_literals + +from .common import InfoExtractor +from ..utils import ( + js_to_json, + smuggle_url, +) + + +class NTVCoJpCUIE(InfoExtractor): + IE_NAME = 'cu.ntv.co.jp' + IE_DESC = 'Nippon Television Network' + _VALID_URL = r'https?://cu\.ntv\.co\.jp/(?!program)(?P<id>[^/?]+)' + _TEST = { + 'url': 'https://cu.ntv.co.jp/televiva-chill-gohan_181031/', + 'info_dict': { + 'id': '5978891207001', + 'ext': 'mp4', + 'title': '桜エビと炒り卵がポイント! 「中華風 エビチリおにぎり」──『美虎』五十嵐美幸', + 'upload_date': '20181213', + 'description': 'md5:211b52f4fd60f3e0e72b68b0c6ba52a9', + 'uploader_id': '3855502814001', + 'timestamp': 1544669941, + }, + 'params': { + # m3u8 download + 'skip_download': True, + }, + } + BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/%s/default_default/index.html?videoId=%s' + + def _real_extract(self, url): + display_id = self._match_id(url) + webpage = self._download_webpage(url, display_id) + player_config = self._parse_json(self._search_regex( + r'(?s)PLAYER_CONFIG\s*=\s*({.+?})', + webpage, 'player config'), display_id, js_to_json) + video_id = player_config['videoId'] + account_id = player_config.get('account') or '3855502814001' + return { + '_type': 'url_transparent', + 'id': video_id, + 'display_id': display_id, + 'title': self._search_regex(r'<h1[^>]+class="title"[^>]*>([^<]+)', webpage, 'title').strip(), + 'description': self._html_search_meta(['description', 'og:description'], webpage), + 'url': smuggle_url(self.BRIGHTCOVE_URL_TEMPLATE % (account_id, video_id), {'geo_countries': ['JP']}), + 'ie_key': 'BrightcoveNew', + } diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/ooyala.py new/youtube-dl/youtube_dl/extractor/ooyala.py --- old/youtube-dl/youtube_dl/extractor/ooyala.py 2019-04-08 01:24:15.000000000 +0200 +++ 
new/youtube-dl/youtube_dl/extractor/ooyala.py 2019-04-16 20:08:07.000000000 +0200 @@ -36,7 +36,7 @@ 'domain': domain, 'supportedFormats': supportedformats or 'mp4,rtmp,m3u8,hds,dash,smooth', 'embedToken': embed_token, - }), video_id) + }), video_id, headers=self.geo_verification_headers()) cur_auth_data = auth_data['authorization_data'][embed_code] diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/openload.py new/youtube-dl/youtube_dl/extractor/openload.py --- old/youtube-dl/youtube_dl/extractor/openload.py 2019-04-08 01:24:15.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/openload.py 2019-04-16 20:08:07.000000000 +0200 @@ -244,7 +244,7 @@ class OpenloadIE(InfoExtractor): - _DOMAINS = r'(?:openload\.(?:co|io|link|pw)|oload\.(?:tv|stream|site|xyz|win|download|cloud|cc|icu|fun|club|info|pw|live|space)|oladblock\.(?:services|xyz|me))' + _DOMAINS = r'(?:openload\.(?:co|io|link|pw)|oload\.(?:tv|stream|site|xyz|win|download|cloud|cc|icu|fun|club|info|pw|live|space|services)|oladblock\.(?:services|xyz|me)|openloed\.co)' _VALID_URL = r'''(?x) https?:// (?P<host> @@ -352,6 +352,9 @@ 'url': 'https://oload.space/f/IY4eZSst3u8/', 'only_matching': True, }, { + 'url': 'https://oload.services/embed/bs1NWj1dCag/', + 'only_matching': True, + }, { 'url': 'https://oladblock.services/f/b8NWEgkqNLI/', 'only_matching': True, }, { @@ -360,6 +363,9 @@ }, { 'url': 'https://oladblock.me/f/b8NWEgkqNLI/', 'only_matching': True, + }, { + 'url': 'https://openloed.co/f/b8NWEgkqNLI/', + 'only_matching': True, }] _USER_AGENT_TPL = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/{major}.0.{build}.{patch} Safari/537.36' diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/soundcloud.py new/youtube-dl/youtube_dl/extractor/soundcloud.py --- old/youtube-dl/youtube_dl/extractor/soundcloud.py 2019-04-08 
01:24:16.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/soundcloud.py 2019-04-16 20:08:08.000000000 +0200 @@ -15,7 +15,12 @@ ) from ..utils import ( ExtractorError, + float_or_none, int_or_none, + KNOWN_EXTENSIONS, + merge_dicts, + mimetype2ext, + str_or_none, try_get, unified_timestamp, update_url_query, @@ -57,7 +62,7 @@ 'uploader': 'E.T. ExTerrestrial Music', 'timestamp': 1349920598, 'upload_date': '20121011', - 'duration': 143, + 'duration': 143.216, 'license': 'all-rights-reserved', 'view_count': int, 'like_count': int, @@ -100,7 +105,7 @@ 'uploader': 'jaimeMF', 'timestamp': 1386604920, 'upload_date': '20131209', - 'duration': 9, + 'duration': 9.927, 'license': 'all-rights-reserved', 'view_count': int, 'like_count': int, @@ -120,7 +125,7 @@ 'uploader': 'jaimeMF', 'timestamp': 1386604920, 'upload_date': '20131209', - 'duration': 9, + 'duration': 9.927, 'license': 'all-rights-reserved', 'view_count': int, 'like_count': int, @@ -140,7 +145,7 @@ 'uploader': 'oddsamples', 'timestamp': 1389232924, 'upload_date': '20140109', - 'duration': 17, + 'duration': 17.346, 'license': 'cc-by-sa', 'view_count': int, 'like_count': int, @@ -160,7 +165,7 @@ 'uploader': 'Ori Uplift Music', 'timestamp': 1504206263, 'upload_date': '20170831', - 'duration': 7449, + 'duration': 7449.096, 'license': 'all-rights-reserved', 'view_count': int, 'like_count': int, @@ -180,7 +185,7 @@ 'uploader': 'garyvee', 'timestamp': 1488152409, 'upload_date': '20170226', - 'duration': 207, + 'duration': 207.012, 'thumbnail': r're:https?://.*\.jpg', 'license': 'all-rights-reserved', 'view_count': int, @@ -192,9 +197,31 @@ 'skip_download': True, }, }, + # not avaialble via api.soundcloud.com/i1/tracks/id/streams + { + 'url': 'https://soundcloud.com/giovannisarani/mezzo-valzer', + 'md5': 'e22aecd2bc88e0e4e432d7dcc0a1abf7', + 'info_dict': { + 'id': '583011102', + 'ext': 'mp3', + 'title': 'Mezzo Valzer', + 'description': 'md5:4138d582f81866a530317bae316e8b61', + 'uploader': 'Giovanni Sarani', + 
'timestamp': 1551394171, + 'upload_date': '20190228', + 'duration': 180.157, + 'thumbnail': r're:https?://.*\.jpg', + 'license': 'all-rights-reserved', + 'view_count': int, + 'like_count': int, + 'comment_count': int, + 'repost_count': int, + }, + 'expected_warnings': ['Unable to download JSON metadata'], + } ] - _CLIENT_ID = 'NmW1FlPaiL94ueEu7oziOWjYEzZzQDcK' + _CLIENT_ID = 'FweeGBOOEOYJWLJN3oEyToGLKhmSz0I7' @staticmethod def _extract_urls(webpage): @@ -202,10 +229,6 @@ r'<iframe[^>]+src=(["\'])(?P<url>(?:https?://)?(?:w\.)?soundcloud\.com/player.+?)\1', webpage)] - def report_resolve(self, video_id): - """Report information extraction.""" - self.to_screen('%s: Resolving id' % video_id) - @classmethod def _resolv_url(cls, url): return 'https://api.soundcloud.com/resolve.json?url=' + url + '&client_id=' + cls._CLIENT_ID @@ -224,6 +247,10 @@ def extract_count(key): return int_or_none(info.get('%s_count' % key)) + like_count = extract_count('favoritings') + if like_count is None: + like_count = extract_count('likes') + result = { 'id': track_id, 'uploader': username, @@ -231,15 +258,17 @@ 'title': title, 'description': info.get('description'), 'thumbnail': thumbnail, - 'duration': int_or_none(info.get('duration'), 1000), + 'duration': float_or_none(info.get('duration'), 1000), 'webpage_url': info.get('permalink_url'), 'license': info.get('license'), 'view_count': extract_count('playback'), - 'like_count': extract_count('favoritings'), + 'like_count': like_count, 'comment_count': extract_count('comment'), 'repost_count': extract_count('reposts'), 'genre': info.get('genre'), } + + format_urls = set() formats = [] query = {'client_id': self._CLIENT_ID} if secret_token is not None: @@ -248,6 +277,7 @@ # We can build a direct link to the song format_url = update_url_query( 'https://api.soundcloud.com/tracks/%s/download' % track_id, query) + format_urls.add(format_url) formats.append({ 'format_id': 'download', 'ext': info.get('original_format', 'mp3'), @@ -256,44 +286,91 
@@ 'preference': 10, }) - # We have to retrieve the url + # Old API, does not work for some tracks (e.g. + # https://soundcloud.com/giovannisarani/mezzo-valzer) format_dict = self._download_json( 'https://api.soundcloud.com/i1/tracks/%s/streams' % track_id, - track_id, 'Downloading track url', query=query) + track_id, 'Downloading track url', query=query, fatal=False) - for key, stream_url in format_dict.items(): - ext, abr = 'mp3', None - mobj = re.search(r'_([^_]+)_(\d+)_url', key) - if mobj: - ext, abr = mobj.groups() - abr = int(abr) - if key.startswith('http'): - stream_formats = [{ - 'format_id': key, - 'ext': ext, - 'url': stream_url, - }] - elif key.startswith('rtmp'): - # The url doesn't have an rtmp app, we have to extract the playpath - url, path = stream_url.split('mp3:', 1) - stream_formats = [{ - 'format_id': key, - 'url': url, - 'play_path': 'mp3:' + path, - 'ext': 'flv', - }] - elif key.startswith('hls'): - stream_formats = self._extract_m3u8_formats( - stream_url, track_id, ext, entry_protocol='m3u8_native', - m3u8_id=key, fatal=False) - else: + if format_dict: + for key, stream_url in format_dict.items(): + if stream_url in format_urls: + continue + format_urls.add(stream_url) + ext, abr = 'mp3', None + mobj = re.search(r'_([^_]+)_(\d+)_url', key) + if mobj: + ext, abr = mobj.groups() + abr = int(abr) + if key.startswith('http'): + stream_formats = [{ + 'format_id': key, + 'ext': ext, + 'url': stream_url, + }] + elif key.startswith('rtmp'): + # The url doesn't have an rtmp app, we have to extract the playpath + url, path = stream_url.split('mp3:', 1) + stream_formats = [{ + 'format_id': key, + 'url': url, + 'play_path': 'mp3:' + path, + 'ext': 'flv', + }] + elif key.startswith('hls'): + stream_formats = self._extract_m3u8_formats( + stream_url, track_id, ext, entry_protocol='m3u8_native', + m3u8_id=key, fatal=False) + else: + continue + + if abr: + for f in stream_formats: + f['abr'] = abr + + formats.extend(stream_formats) + + # New API + 
transcodings = try_get( + info, lambda x: x['media']['transcodings'], list) or [] + for t in transcodings: + if not isinstance(t, dict): continue - - if abr: - for f in stream_formats: - f['abr'] = abr - - formats.extend(stream_formats) + format_url = url_or_none(t.get('url')) + if not format_url: + continue + stream = self._download_json( + update_url_query(format_url, query), track_id, fatal=False) + if not isinstance(stream, dict): + continue + stream_url = url_or_none(stream.get('url')) + if not stream_url: + continue + if stream_url in format_urls: + continue + format_urls.add(stream_url) + protocol = try_get(t, lambda x: x['format']['protocol'], compat_str) + if protocol != 'hls' and '/hls' in format_url: + protocol = 'hls' + ext = None + preset = str_or_none(t.get('preset')) + if preset: + ext = preset.split('_')[0] + if ext not in KNOWN_EXTENSIONS: + mimetype = try_get( + t, lambda x: x['format']['mime_type'], compat_str) + ext = mimetype2ext(mimetype) or 'mp3' + format_id_list = [] + if protocol: + format_id_list.append(protocol) + format_id_list.append(ext) + format_id = '_'.join(format_id_list) + formats.append({ + 'url': stream_url, + 'format_id': format_id, + 'ext': ext, + 'protocol': 'm3u8_native' if protocol == 'hls' else 'http', + }) if not formats: # We fallback to the stream_url in the original info, this @@ -303,11 +380,11 @@ 'url': update_url_query(info['stream_url'], query), 'ext': 'mp3', }) + self._check_formats(formats, track_id) for f in formats: f['vcodec'] = 'none' - self._check_formats(formats, track_id) self._sort_formats(formats) result['formats'] = formats @@ -319,6 +396,7 @@ raise ExtractorError('Invalid URL: %s' % url) track_id = mobj.group('track_id') + new_info = {} if track_id is not None: info_json_url = 'https://api.soundcloud.com/tracks/' + track_id + '.json?client_id=' + self._CLIENT_ID @@ -344,13 +422,31 @@ if token: resolve_title += '/%s' % token - self.report_resolve(full_title) - - url = 'https://soundcloud.com/%s' % 
resolve_title
-            info_json_url = self._resolv_url(url)
-        info = self._download_json(info_json_url, full_title, 'Downloading info JSON')
+            webpage = self._download_webpage(url, full_title, fatal=False)
+            if webpage:
+                entries = self._parse_json(
+                    self._search_regex(
+                        r'var\s+c\s*=\s*(\[.+?\])\s*,\s*o\s*=Date\b', webpage,
+                        'data', default='[]'), full_title, fatal=False)
+                if entries:
+                    for e in entries:
+                        if not isinstance(e, dict):
+                            continue
+                        if e.get('id') != 67:
+                            continue
+                        data = try_get(e, lambda x: x['data'][0], dict)
+                        if data:
+                            new_info = data
+                            break
+            info_json_url = self._resolv_url(
+                'https://soundcloud.com/%s' % resolve_title)
+
+        # Contains some additional info missing from new_info
+        info = self._download_json(
+            info_json_url, full_title, 'Downloading info JSON')
 
-        return self._extract_info_dict(info, full_title, secret_token=token)
+        return self._extract_info_dict(
+            merge_dicts(info, new_info), full_title, secret_token=token)
 
 
 class SoundcloudPlaylistBaseIE(SoundcloudIE):
@@ -396,8 +492,6 @@
             full_title += '/' + token
             url += '/' + token
 
-        self.report_resolve(full_title)
-
         resolv_url = self._resolv_url(url)
         info = self._download_json(resolv_url, full_title)
 
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/toutv.py new/youtube-dl/youtube_dl/extractor/toutv.py
--- old/youtube-dl/youtube_dl/extractor/toutv.py 2019-04-08 01:24:16.000000000 +0200
+++ new/youtube-dl/youtube_dl/extractor/toutv.py 2019-04-16 20:08:08.000000000 +0200
@@ -66,7 +66,12 @@
 
     def _real_extract(self, url):
         path = self._match_id(url)
-        metadata = self._download_json('http://ici.tou.tv/presentation/%s' % path, path)
+        metadata = self._download_json(
+            'https://services.radio-canada.ca/toutv/presentation/%s' % path, path, query={
+                'client_key': self._CLIENT_KEY,
+                'device': 'web',
+                'version': 4,
+            })
         # IsDrm does not necessarily mean the video is DRM protected (see
         # https://github.com/ytdl-org/youtube-dl/issues/13994).
         if metadata.get('IsDrm'):
@@ -77,6 +82,12 @@
         return merge_dicts({
             'id': video_id,
             'title': details.get('OriginalTitle'),
+            'description': details.get('Description'),
             'thumbnail': details.get('ImageUrl'),
             'duration': int_or_none(details.get('LengthInSeconds')),
+            'series': metadata.get('ProgramTitle'),
+            'season_number': int_or_none(metadata.get('SeasonNumber')),
+            'season': metadata.get('SeasonTitle'),
+            'episode_number': int_or_none(metadata.get('EpisodeNumber')),
+            'episode': metadata.get('EpisodeTitle'),
         }, self._extract_info(metadata.get('AppCode', 'toutv'), video_id))
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/udemy.py new/youtube-dl/youtube_dl/extractor/udemy.py
--- old/youtube-dl/youtube_dl/extractor/udemy.py 2019-04-08 01:24:16.000000000 +0200
+++ new/youtube-dl/youtube_dl/extractor/udemy.py 2019-04-16 20:08:08.000000000 +0200
@@ -76,7 +76,10 @@
                 webpage, 'course', default='{}')),
             video_id, fatal=False) or {}
         course_id = course.get('id') or self._search_regex(
-            r'data-course-id=["\'](\d+)', webpage, 'course id')
+            [
+                r'data-course-id=["\'](\d+)',
+                r'"courseId"\s*:\s*(\d+)'
+            ], webpage, 'course id')
         return course_id, course.get('title')
 
     def _enroll_course(self, base_url, webpage, course_id):
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/vimeo.py new/youtube-dl/youtube_dl/extractor/vimeo.py
--- old/youtube-dl/youtube_dl/extractor/vimeo.py 2019-04-08 01:24:16.000000000 +0200
+++ new/youtube-dl/youtube_dl/extractor/vimeo.py 2019-04-16 20:08:08.000000000 +0200
@@ -109,23 +109,9 @@
 
     def _parse_config(self, config, video_id):
         video_data = config['video']
-        # Extract title
         video_title = video_data['title']
-
-        # Extract uploader, uploader_url and uploader_id
-        video_uploader = video_data.get('owner', {}).get('name')
-        video_uploader_url = video_data.get('owner', {}).get('url')
-        video_uploader_id = video_uploader_url.split('/')[-1] if video_uploader_url else None
-
-        # Extract video thumbnail
-        video_thumbnail = video_data.get('thumbnail')
-        if video_thumbnail is None:
-            video_thumbs = video_data.get('thumbs')
-            if video_thumbs and isinstance(video_thumbs, dict):
-                _, video_thumbnail = sorted((int(width if width.isdigit() else 0), t_url) for (width, t_url) in video_thumbs.items())[-1]
-
-        # Extract video duration
-        video_duration = int_or_none(video_data.get('duration'))
+        live_event = video_data.get('live_event') or {}
+        is_live = live_event.get('status') == 'started'
 
         formats = []
         config_files = video_data.get('files') or config['request'].get('files', {})
@@ -142,6 +128,7 @@
                 'tbr': int_or_none(f.get('bitrate')),
             })
 
+        # TODO: fix handling of 308 status code returned for live archive manifest requests
         for files_type in ('hls', 'dash'):
             for cdn_name, cdn_data in config_files.get(files_type, {}).get('cdns', {}).items():
                 manifest_url = cdn_data.get('url')
@@ -151,7 +138,7 @@
                 if files_type == 'hls':
                     formats.extend(self._extract_m3u8_formats(
                         manifest_url, video_id, 'mp4',
-                        'm3u8_native', m3u8_id=format_id,
+                        'm3u8' if is_live else 'm3u8_native', m3u8_id=format_id,
                         note='Downloading %s m3u8 information' % cdn_name,
                         fatal=False))
                 elif files_type == 'dash':
@@ -164,6 +151,10 @@
                     else:
                         mpd_manifest_urls = [(format_id, manifest_url)]
                     for f_id, m_url in mpd_manifest_urls:
+                        if 'json=1' in m_url:
+                            real_m_url = (self._download_json(m_url, video_id, fatal=False) or {}).get('url')
+                            if real_m_url:
+                                m_url = real_m_url
                         mpd_formats = self._extract_mpd_formats(
                             m_url.replace('/master.json', '/master.mpd'), video_id, f_id,
                             'Downloading %s MPD information' % cdn_name,
@@ -175,6 +166,15 @@
                                 f['preference'] = -40
                         formats.extend(mpd_formats)
 
+        live_archive = live_event.get('archive') or {}
+        live_archive_source_url = live_archive.get('source_url')
+        if live_archive_source_url and live_archive.get('status') == 'done':
+            formats.append({
+                'format_id': 'live-archive-source',
+                'url': live_archive_source_url,
+                'preference': 1,
+            })
+
         subtitles = {}
         text_tracks = config['request'].get('text_tracks')
         if text_tracks:
@@ -184,15 +184,33 @@
                     'url': 'https://vimeo.com' + tt['url'],
                 }]
 
+        thumbnails = []
+        if not is_live:
+            for key, thumb in video_data.get('thumbs', {}).items():
+                thumbnails.append({
+                    'id': key,
+                    'width': int_or_none(key),
+                    'url': thumb,
+                })
+            thumbnail = video_data.get('thumbnail')
+            if thumbnail:
+                thumbnails.append({
+                    'url': thumbnail,
+                })
+
+        owner = video_data.get('owner') or {}
+        video_uploader_url = owner.get('url')
+
         return {
-            'title': video_title,
-            'uploader': video_uploader,
-            'uploader_id': video_uploader_id,
+            'title': self._live_title(video_title) if is_live else video_title,
+            'uploader': owner.get('name'),
+            'uploader_id': video_uploader_url.split('/')[-1] if video_uploader_url else None,
             'uploader_url': video_uploader_url,
-            'thumbnail': video_thumbnail,
-            'duration': video_duration,
+            'thumbnails': thumbnails,
+            'duration': int_or_none(video_data.get('duration')),
             'formats': formats,
             'subtitles': subtitles,
+            'is_live': is_live,
         }
 
     def _extract_original_format(self, url, video_id):
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/vrv.py new/youtube-dl/youtube_dl/extractor/vrv.py
--- old/youtube-dl/youtube_dl/extractor/vrv.py 2019-04-08 01:24:17.000000000 +0200
+++ new/youtube-dl/youtube_dl/extractor/vrv.py 2019-04-16 20:08:08.000000000 +0200
@@ -102,6 +102,15 @@
             # m3u8 download
             'skip_download': True,
         },
+    }, {
+        # movie listing
+        'url': 'https://vrv.co/watch/G6NQXZ1J6/Lily-CAT',
+        'info_dict': {
+            'id': 'G6NQXZ1J6',
+            'title': 'Lily C.A.T',
+            'description': 'md5:988b031e7809a6aeb60968be4af7db07',
+        },
+        'playlist_count': 2,
     }]
     _NETRC_MACHINE = 'vrv'
 
@@ -123,23 +132,23 @@
     def _extract_vrv_formats(self, url, video_id, stream_format, audio_lang, hardsub_lang):
         if not url or stream_format not in ('hls', 'dash'):
             return []
-        assert audio_lang or hardsub_lang
         stream_id_list = []
         if audio_lang:
             stream_id_list.append('audio-%s' % audio_lang)
         if hardsub_lang:
             stream_id_list.append('hardsub-%s' % hardsub_lang)
-        stream_id = '-'.join(stream_id_list)
-        format_id = '%s-%s' % (stream_format, stream_id)
+        format_id = stream_format
+        if stream_id_list:
+            format_id += '-' + '-'.join(stream_id_list)
         if stream_format == 'hls':
             adaptive_formats = self._extract_m3u8_formats(
                 url, video_id, 'mp4', m3u8_id=format_id,
-                note='Downloading %s m3u8 information' % stream_id,
+                note='Downloading %s information' % format_id,
                 fatal=False)
         elif stream_format == 'dash':
             adaptive_formats = self._extract_mpd_formats(
                 url, video_id, mpd_id=format_id,
-                note='Downloading %s MPD information' % stream_id,
+                note='Downloading %s information' % format_id,
                 fatal=False)
         if audio_lang:
             for f in adaptive_formats:
@@ -155,6 +164,23 @@
         resource_path = object_data['__links__']['resource']['href']
         video_data = self._call_cms(resource_path, video_id, 'video')
         title = video_data['title']
+        description = video_data.get('description')
+
+        if video_data.get('__class__') == 'movie_listing':
+            items = self._call_cms(
+                video_data['__links__']['movie_listing/movies']['href'],
+                video_id, 'movie listing').get('items') or []
+            if len(items) != 1:
+                entries = []
+                for item in items:
+                    item_id = item.get('id')
+                    if not item_id:
+                        continue
+                    entries.append(self.url_result(
+                        'https://vrv.co/watch/' + item_id,
+                        self.ie_key(), item_id, item.get('title')))
+                return self.playlist_result(entries, video_id, title, description)
+            video_data = items[0]
 
         streams_path = video_data['__links__'].get('streams', {}).get('href')
         if not streams_path:
@@ -198,7 +224,7 @@
             'formats': formats,
             'subtitles': subtitles,
             'thumbnails': thumbnails,
-            'description': video_data.get('description'),
+            'description': description,
             'duration': float_or_none(video_data.get('duration_ms'), 1000),
             'uploader_id': video_data.get('channel_id'),
             'series': video_data.get('series_title'),
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/youtube.py new/youtube-dl/youtube_dl/extractor/youtube.py
--- old/youtube-dl/youtube_dl/extractor/youtube.py 2019-04-08 01:24:17.000000000 +0200
+++ new/youtube-dl/youtube_dl/extractor/youtube.py 2019-04-16 20:08:08.000000000 +0200
@@ -1652,7 +1652,8 @@
                         view_count = extract_view_count(get_video_info)
                     if not video_info:
                         video_info = get_video_info
-                    if 'token' in get_video_info:
+                    get_token = get_video_info.get('token') or get_video_info.get('account_playback_token')
+                    if get_token:
                         # Different get_video_info requests may report different results, e.g.
                         # some may report video unavailability, but some may serve it without
                         # any complaint (see https://github.com/ytdl-org/youtube-dl/issues/7362,
@@ -1662,7 +1663,8 @@
                         # due to YouTube measures against IP ranges of hosting providers.
                         # Working around by preferring the first succeeded video_info containing
                         # the token if no such video_info yet was found.
-                        if 'token' not in video_info:
+                        token = video_info.get('token') or video_info.get('account_playback_token')
+                        if not token:
                             video_info = get_video_info
                         break
 
@@ -1671,7 +1673,15 @@
                 r'(?s)<h1[^>]+id="unavailable-message"[^>]*>(.+?)</h1>',
                 video_webpage, 'unavailable message', default=None)
 
-        if 'token' not in video_info:
+        if not video_info:
+            unavailable_message = extract_unavailable_message()
+            if not unavailable_message:
+                unavailable_message = 'Unable to extract video data'
+            raise ExtractorError(
+                'YouTube said: %s' % unavailable_message, expected=True, video_id=video_id)
+
+        token = video_info.get('token') or video_info.get('account_playback_token')
+        if not token:
             if 'reason' in video_info:
                 if 'The uploader has not made this video available in your country.' in video_info['reason']:
                     regions_allowed = self._html_search_meta(
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/version.py new/youtube-dl/youtube_dl/version.py
--- old/youtube-dl/youtube_dl/version.py 2019-04-16 19:20:04.000000000 +0200
+++ new/youtube-dl/youtube_dl/version.py 2019-04-24 05:05:46.000000000 +0200
@@ -1,3 +1,3 @@
 from __future__ import unicode_literals
 
-__version__ = '2019.04.17'
+__version__ = '2019.04.24'