From patchwork Thu Dec 20 06:56:26 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Arun Isaac X-Patchwork-Id: 494 Return-Path: X-Original-To: patchwork@mira.cbaines.net Delivered-To: patchwork@mira.cbaines.net Received: by mira.cbaines.net (Postfix, from userid 113) id 2BB1F1695A; Thu, 20 Dec 2018 06:57:16 +0000 (GMT) X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on mira.cbaines.net X-Spam-Level: X-Spam-Status: No, score=-4.8 required=5.0 tests=BAYES_00, RCVD_IN_BL_SPAMCOP_NET,RCVD_IN_DNSWL_HI,RCVD_IN_SORBS_WEB,T_DKIM_INVALID, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) by mira.cbaines.net (Postfix) with ESMTP id 8F5EE16897 for ; Thu, 20 Dec 2018 06:57:15 +0000 (GMT) Received: from localhost ([::1]:35568 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gZsGg-0000ih-IS for patchwork@mira.cbaines.net; Thu, 20 Dec 2018 01:57:14 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:50675) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gZsGe-0000ia-5o for guix-patches@gnu.org; Thu, 20 Dec 2018 01:57:13 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gZsGZ-00089I-03 for guix-patches@gnu.org; Thu, 20 Dec 2018 01:57:12 -0500 Received: from debbugs.gnu.org ([208.118.235.43]:51111) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1gZsGT-00085v-Vr for guix-patches@gnu.org; Thu, 20 Dec 2018 01:57:03 -0500 Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1gZsGT-0004Gl-RP for guix-patches@gnu.org; Thu, 20 Dec 2018 01:57:01 -0500 X-Loop: help-debbugs@gnu.org Subject: [bug#33801] import: github: Support source URIs that redirect to GitHub Resent-From: Arun Isaac Original-Sender: "Debbugs-submit" Resent-CC: guix-patches@gnu.org Resent-Date: Thu, 20 Dec 2018 06:57:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 33801 X-GNU-PR-Package: guix-patches X-GNU-PR-Keywords: To: Ludovic =?utf-8?q?Court=C3=A8s?= Received: via spool by 33801-submit@debbugs.gnu.org id=B33801.154528900216381 (code B ref 33801); Thu, 20 Dec 2018 06:57:01 +0000 Received: (at 33801) by debbugs.gnu.org; 20 Dec 2018 06:56:42 +0000 Received: from localhost ([127.0.0.1]:55369 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gZsG8-0004G8-QO for submit@debbugs.gnu.org; Thu, 20 Dec 2018 01:56:41 -0500 Received: from vultr.systemreboot.net ([45.77.148.100]:49746) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gZsG5-0004Fx-TM for 33801@debbugs.gnu.org; Thu, 20 Dec 2018 01:56:38 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=systemreboot.net; s=default; h=Content-Type:MIME-Version:Message-ID:Date: References:In-Reply-To:Subject:Cc:To:From:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id: List-Help:List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=v0/WUaQWTJUrZwfhWimL0xYzpPPFZs4QKlsKBwUV25Y=; b=OkXA0RwSFZynmzlfnx4oC0Pp/ ywrdkNHv3MEVWZihU3m8hfb3Til0d+4khLjhMcjXb3utTa7WgtWvvAbwAXojwjh26xJh/DmMe2rYO hrQ7jIdBgAXAjwwwLWAlb3lk+On1mUzF/PPbSN2F96LI0ucqyavvyj6N4eSSHpWSeWe5c=; Received: from [103.5.134.173] (helo=steel) by systemreboot.net with esmtpsa (TLSv1.2:ECDHE-RSA-AES256-GCM-SHA384:256) (Exim 4.91) (envelope-from ) id 1gZsG1-0002DE-Ke; Thu, 20 Dec 2018 12:26:34 +0530 From: Arun Isaac In-Reply-To: <87d0pxtciz.fsf@gnu.org> References: <87d0pxtciz.fsf@gnu.org> Date: Thu, 20 Dec 2018 12:26:26 +0530 Message-ID: MIME-Version: 1.0 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 208.118.235.43 X-BeenThere: guix-patches@gnu.org List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: 33801@debbugs.gnu.org Errors-To: guix-patches-bounces+patchwork=mira.cbaines.net@gnu.org Sender: "Guix-patches" X-getmail-retrieved-from-mailbox: Patches > Do you know how many packages fall into that category? With this patch, we have a problem estimating the coverage using `guix refresh -L'. Now, to estimate coverage, we need to make HTTP requests for every single source tarball in Guix to determine if it redirects to GitHub. This is an enormous number of HTTP requests! When I ran `guix refresh -L', it took a very long time to finish coverage estimation. So, I cancelled the command. Any better way to handle this? >> +(define (follow-redirects-to-github uri) >> + "Follow redirects of URI until a GitHub URI is found. Return that GitHub >> +URI. If no GitHub URI is found, return #f." > > Perhaps add the yt-dl.org example as a comment here. I added a reference to the youtube-dl package in the comments. I also added a few more comments in other places. >> + (define (follow-redirect uri) >> + (receive (response body) (http-get uri #:streaming? #t) > > Add: (close-port body). I switched to using (http-head uri) instead of (http-get uri #:streaming? #t). So, (close-port body) should no longer be required. I also modified follow-redirects-to-github to avoid following redirects on mirror and file URIs. Please find attached a new patch. From 7fa1daaf44720fa31813e4f07a2c49a2540a0526 Mon Sep 17 00:00:00 2001 From: Arun Isaac Date: Wed, 19 Dec 2018 15:59:52 +0530 Subject: [PATCH] import: github: Support source URIs that redirect to GitHub. * guix/import/github.scm (follow-redirects-to-github): New function. (updated-github-url)[updated-url]: For source URIs on other domains, replace all instances of the old version with the new version. (latest-release)[origin-github-uri]: If necessary, follow redirects to find the GitHub URI. --- guix/import/github.scm | 41 +++++++++++++++++++++++++++++++++++++---- 1 file changed, 37 insertions(+), 4 deletions(-) diff --git a/guix/import/github.scm b/guix/import/github.scm index af9f56e1d..8db7db305 100644 --- a/guix/import/github.scm +++ b/guix/import/github.scm @@ -1,6 +1,7 @@ ;;; GNU Guix --- Functional package management for GNU ;;; Copyright © 2016 Ben Woodcroft ;;; Copyright © 2017, 2018 Ludovic Courtès +;;; Copyright © 2018 Arun Isaac ;;; ;;; This file is part of GNU Guix. ;;; @@ -19,6 +20,8 @@ (define-module (guix import github) #:use-module (ice-9 match) + #:use-module (ice-9 receive) + #:use-module (ice-9 regex) #:use-module (srfi srfi-1) #:use-module (srfi srfi-26) #:use-module (srfi srfi-34) @@ -29,6 +32,8 @@ #:use-module (guix packages) #:use-module (guix upstream) #:use-module (guix http-client) + #:use-module (web client) + #:use-module (web response) #:use-module (web uri) #:export (%github-updater)) @@ -39,12 +44,30 @@ false if none is recognized" (list ".tar.gz" ".tar.bz2" ".tar.xz" ".zip" ".tar" ".tgz" ".tbz" ".love"))) +(define (follow-redirects-to-github uri) + "Follow redirects of URI until a GitHub URI is found. Return that GitHub +URI. If no GitHub URI is found, return #f." + (define (follow-redirect uri) + (receive (response body) (http-head uri) + (case (response-code response) + ((301 302) + (uri->string (assoc-ref (response-headers response) 'location))) + (else #f)))) + + (cond + ((string-prefix? "https://github.com/" uri) uri) + ((string-prefix? "http" uri) + (and=> (follow-redirect uri) follow-redirects-to-github)) + ;; Do not attempt to follow redirects on URIs other than http and https + ;; (such as mirror, file) + (else #f))) + (define (updated-github-url old-package new-version) ;; Return a url for the OLD-PACKAGE with NEW-VERSION. If no source url in ;; the OLD-PACKAGE is a GitHub url, then return false. (define (updated-url url) - (if (string-prefix? "https://github.com/" url) + (if (follow-redirects-to-github url) (let ((ext (or (find-extension url) "")) (name (package-name old-package)) (version (package-version old-package)) @@ -83,7 +106,14 @@ false if none is recognized" url) (string-append "/releases/download/" repo "-" version "/" repo "-" version ext)) - (#t #f))) ; Some URLs are not recognised. + ;; As a last resort, attempt to replace all instances of the old + ;; version with the new version. This is necessary to handle URIs + ;; hosted on other domains that redirect to GitHub (for an example, + ;; see the youtube-dl package). We do not know the internal + ;; structure of these URIs and cannot handle them more + ;; intelligently. + (else (regexp-substitute/global + #f version url 'pre new-version 'post)))) #f)) (let ((source-url (and=> (package-source old-package) origin-uri)) @@ -210,11 +240,14 @@ https://github.com/settings/tokens")) (define (latest-release pkg) "Return an for the latest release of PKG." (define (origin-github-uri origin) + ;; We follow redirects to GitHub because the origin URI might appear to be + ;; hosted on some other domain but just redirects to GitHub. For example, + ;; see the youtube-dl package. (match (origin-uri origin) ((? string? url) - url) ;surely a github.com URL + (follow-redirects-to-github url)) ((urls ...) - (find (cut string-contains <> "github.com") urls)))) + (find follow-redirects-to-github urls)))) (let* ((source-uri (origin-github-uri (package-source pkg))) (name (package-name pkg)) -- 2.19.2