templater: add subsetparents(rev, revset) function
authorYuya Nishihara <yuya@tcha.org>
Sun, 15 Mar 2020 16:11:58 +0900
changeset 44592 7cd5c0968139
parent 44591 1f81f680912f
child 44593 496868f1030c
templater: add subsetparents(rev, revset) function Naming suggestions are welcome. And this could be flagged as an (ADVANCED) function since the primary use case is to draw a graph. This provides all data needed for drawing revisions graph filtered by revset, and allows us to implement a GUI graph viewer in some languages better than Python. A frontend grapher will be quite similar to our graphmod since subsetparents() just returns parent-child relations in the filtered sub graph. Frontend example: https://hg.sr.ht/~yuja/hgv/browse/default/core/hgchangesetgrapher.cpp However, the resulting graph will be simpler than the one "hg log -G" would generate because redundant edges are eliminated. This should be the same graph rendering strategy as TortoiseHg. This function could be implemented as a revset predicate, but that would mean the scanning state couldn't be cached and thus slow. Test cases are split to new file since test-template-functions.t is quite big and we'll need a new branchy repository anyway.
mercurial/dagop.py
mercurial/templatefuncs.py
tests/test-template-graph.t
--- a/mercurial/dagop.py	Sun Mar 15 16:00:45 2020 +0900
+++ b/mercurial/dagop.py	Sun Mar 15 16:11:58 2020 +0900
@@ -274,6 +274,238 @@
                 break
 
 
+class subsetparentswalker(object):
+    r"""Scan adjacent ancestors in the graph given by the subset
+
+    This computes parent-child relations in the sub graph filtered by
+    a revset. Primary use case is to draw a revisions graph.
+
+    In the following example, we consider that the node 'f' has edges to all
+    ancestor nodes, but redundant paths are eliminated. The edge 'f'->'b'
+    is eliminated because there is a path 'f'->'c'->'b' for example.
+
+          - d - e -
+         /         \
+        a - b - c - f
+
+    If the node 'c' is filtered out, the edge 'f'->'b' is activated.
+
+          - d - e -
+         /         \
+        a - b -(c)- f
+
+    Likewise, if 'd' and 'e' are filtered out, this edge is fully eliminated
+    since there is a path 'f'->'c'->'b'->'a' for 'f'->'a'.
+
+           (d) (e)
+
+        a - b - c - f
+
+    Implementation-wise, 'f' is passed down to 'a' as unresolved through the
+    'f'->'e'->'d'->'a' path, whereas we do also remember that 'f' has already
+    been resolved while walking down the 'f'->'c'->'b'->'a' path. When
+    processing the node 'a', the unresolved 'f'->'a' path is eliminated as
+    the 'f' end is marked as resolved.
+
+    Ancestors are searched from the tipmost revision in the subset so the
+    results can be cached. You should specify startrev to narrow the search
+    space to ':startrev'.
+    """
+
+    def __init__(self, repo, subset, startrev=None):
+        if startrev is not None:
+            subset = repo.revs(b'%d:null', startrev) & subset
+
+        # equivalent to 'subset = subset.sorted(reverse=True)', but there's
+        # no such function.
+        fastdesc = subset.fastdesc
+        if fastdesc:
+            desciter = fastdesc()
+        else:
+            if not subset.isdescending() and not subset.istopo():
+                subset = smartset.baseset(subset)
+                subset.sort(reverse=True)
+            desciter = iter(subset)
+
+        self._repo = repo
+        self._changelog = repo.changelog
+        self._subset = subset
+
+        # scanning state (see _scanparents):
+        self._tovisit = []
+        self._pendingcnt = {}
+        self._pointers = {}
+        self._parents = {}
+        self._inputhead = nullrev  # reassigned by self._advanceinput()
+        self._inputtail = desciter
+        self._bottomrev = nullrev
+        self._advanceinput()
+
+    def parentsset(self, rev):
+        """Look up parents of the given revision in the subset, and returns
+        as a smartset"""
+        return smartset.baseset(self.parents(rev))
+
+    def parents(self, rev):
+        """Look up parents of the given revision in the subset
+
+        The returned revisions are sorted by parent index (p1/p2).
+        """
+        self._scanparents(rev)
+        return [r for _c, r in sorted(self._parents.get(rev, []))]
+
+    def _parentrevs(self, rev):
+        try:
+            revs = self._changelog.parentrevs(rev)
+            if revs[-1] == nullrev:
+                return revs[:-1]
+            return revs
+        except error.WdirUnsupported:
+            return tuple(pctx.rev() for pctx in self._repo[None].parents())
+
+    def _advanceinput(self):
+        """Advance the input iterator and set the next revision to _inputhead"""
+        if self._inputhead < nullrev:
+            return
+        try:
+            self._inputhead = next(self._inputtail)
+        except StopIteration:
+            self._bottomrev = self._inputhead
+            self._inputhead = nullrev - 1
+
+    def _scanparents(self, stoprev):
+        """Scan ancestors until the parents of the specified stoprev are
+        resolved"""
+
+        # 'tovisit' is the queue of the input revisions and their ancestors.
+        # It will be populated incrementally to minimize the initial cost
+        # of computing the given subset.
+        #
+        # For to-visit revisions, we keep track of
+        # - the number of the unresolved paths: pendingcnt[rev],
+        # - dict of the unresolved descendants and chains: pointers[rev][0],
+        # - set of the already resolved descendants: pointers[rev][1].
+        #
+        # When a revision is visited, 'pointers[rev]' should be popped and
+        # propagated to its parents accordingly.
+        #
+        # Once all pending paths have been resolved, 'pendingcnt[rev]' becomes
+        # 0 and 'parents[rev]' contains the unsorted list of parent revisions
+        # and p1/p2 chains (excluding linear paths.)
+
+        subset = self._subset
+        tovisit = self._tovisit  # heap queue of [-rev]
+        pendingcnt = self._pendingcnt  # {rev: count} for visited revisions
+        pointers = self._pointers  # {rev: [{unresolved_rev: chain}, resolved]}
+        parents = self._parents  # {rev: [(chain, rev)]}
+
+        while tovisit or self._inputhead >= nullrev:
+            if pendingcnt.get(stoprev) == 0:
+                return
+
+            # feed greater revisions from input set to queue
+            if not tovisit:
+                heapq.heappush(tovisit, -self._inputhead)
+                self._advanceinput()
+            while self._inputhead >= -tovisit[0]:
+                heapq.heappush(tovisit, -self._inputhead)
+                self._advanceinput()
+
+            rev = -heapq.heappop(tovisit)
+            if rev < self._bottomrev:
+                return
+            if rev in pendingcnt and rev not in pointers:
+                continue  # already visited
+
+            curactive = rev in subset
+            pendingcnt.setdefault(rev, 0)  # mark as visited
+            if curactive:
+                assert rev not in parents
+                parents[rev] = []
+            unresolved, resolved = pointers.pop(rev, ({}, set()))
+
+            if curactive:
+                # reached to active rev, resolve pending descendants' parents
+                for r, c in unresolved.items():
+                    pendingcnt[r] -= 1
+                    assert pendingcnt[r] >= 0
+                    if r in resolved:
+                        continue  # eliminate redundant path
+                    parents[r].append((c, rev))
+                    # mark the descendant 'r' as resolved through this path if
+                    # there are still pending pointers. the 'resolved' set may
+                    # be concatenated later at a fork revision.
+                    if pendingcnt[r] > 0:
+                        resolved.add(r)
+                unresolved.clear()
+                # occasionally clean resolved markers. otherwise the set
+                # would grow indefinitely.
+                resolved = {r for r in resolved if pendingcnt[r] > 0}
+
+            parentrevs = self._parentrevs(rev)
+            bothparentsactive = all(p in subset for p in parentrevs)
+
+            # set up or propagate tracking pointers if
+            # - one of the parents is not active,
+            # - or descendants' parents are unresolved.
+            if not bothparentsactive or unresolved or resolved:
+                if len(parentrevs) > 1:
+                    # 'rev' is a merge revision. increment the pending count
+                    # as the 'unresolved' dict will be duplicated.
+                    for r in unresolved:
+                        pendingcnt[r] += 1
+                reusable = True  # can we avoid copying the tracking pointer?
+                for i, p in enumerate(parentrevs):
+                    assert p < rev
+                    heapq.heappush(tovisit, -p)
+                    if p in pointers:
+                        # 'p' is a fork revision. concatenate tracking pointers
+                        # and decrement the pending count accordingly.
+                        knownunresolved, knownresolved = pointers[p]
+                        for r, c in unresolved.items():
+                            c += [b'1', b'2'][i]
+                            if r in knownunresolved:
+                                # unresolved at both paths
+                                pendingcnt[r] -= 1
+                                assert pendingcnt[r] > 0
+                                # take shorter chain
+                                knownunresolved[r] = min(c, knownunresolved[r])
+                            else:
+                                knownunresolved[r] = c
+                        # simply propagate the 'resolved' set as deduplicating
+                        # 'unresolved' here would be slightly complicated.
+                        knownresolved.update(resolved)
+                    elif reusable:
+                        pointers[p] = (unresolved, resolved)
+                        reusable = False
+                    else:
+                        pointers[p] = (unresolved.copy(), resolved.copy())
+
+            # then, populate the active parents directly and add the current
+            # 'rev' to the tracking pointers of the inactive parents.
+            # 'pointers[p]' may be optimized out if both parents are active.
+            if curactive and bothparentsactive:
+                for i, p in enumerate(parentrevs):
+                    c = [b'1', b'2'][i]
+                    parents[rev].append((c, p))
+                    # no need to mark 'rev' as resolved since the 'rev' should
+                    # be fully resolved (i.e. pendingcnt[rev] == 0)
+                assert pendingcnt[rev] == 0
+            elif curactive:
+                for i, p in enumerate(parentrevs):
+                    unresolved, resolved = pointers[p]
+                    assert rev not in unresolved
+                    c = [b'1', b'2'][i]
+                    if p in subset:
+                        parents[rev].append((c, p))
+                        # mark 'rev' as resolved through this path
+                        resolved.add(rev)
+                    else:
+                        pendingcnt[rev] += 1
+                        unresolved[rev] = c
+                assert 0 < pendingcnt[rev] <= 2
+
+
 def _reachablerootspure(pfunc, minroot, roots, heads, includepath):
     """See revlog.reachableroots"""
     if not roots:
--- a/mercurial/templatefuncs.py	Sun Mar 15 16:00:45 2020 +0900
+++ b/mercurial/templatefuncs.py	Sun Mar 15 16:11:58 2020 +0900
@@ -16,6 +16,7 @@
 )
 from . import (
     color,
+    dagop,
     diffutil,
     encoding,
     error,
@@ -842,6 +843,45 @@
     return b''
 
 
+@templatefunc(
+    b'subsetparents(rev, revset)',
+    argspec=b'rev revset',
+    requires={b'repo', b'cache'},
+)
+def subsetparents(context, mapping, args):
+    """Look up parents of the rev in the sub graph given by the revset."""
+    if b'rev' not in args or b'revset' not in args:
+        # i18n: "subsetparents" is a keyword
+        raise error.ParseError(_(b"subsetparents expects two arguments"))
+
+    repo = context.resource(mapping, b'repo')
+
+    rev = templateutil.evalinteger(context, mapping, args[b'rev'])
+
+    # TODO: maybe subsetparents(rev) should be allowed. the default revset
+    # will be the revisions specified by -rREV argument.
+    q = templateutil.evalwrapped(context, mapping, args[b'revset'])
+    if not isinstance(q, templateutil.revslist):
+        # i18n: "subsetparents" is a keyword
+        raise error.ParseError(_(b"subsetparents expects a queried revset"))
+    subset = q.tovalue(context, mapping)
+    key = q.cachekey
+
+    if key:
+        # cache only if revset query isn't dynamic
+        cache = context.resource(mapping, b'cache')
+        walkercache = cache.setdefault(b'subsetparentswalker', {})
+        if key in walkercache:
+            walker = walkercache[key]
+        else:
+            walker = dagop.subsetparentswalker(repo, subset)
+            walkercache[key] = walker
+    else:
+        # for one-shot use, specify startrev to limit the search space
+        walker = dagop.subsetparentswalker(repo, subset, startrev=rev)
+    return templateutil.revslist(repo, walker.parentsset(rev))
+
+
 @templatefunc(b'word(number, text[, separator])')
 def word(context, mapping, args):
     """Return the nth word from a string."""
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/tests/test-template-graph.t	Sun Mar 15 16:11:58 2020 +0900
@@ -0,0 +1,337 @@
+Test graph-related template functions
+=====================================
+
+  $ cat <<'EOF' >> $HGRCPATH
+  > [extensions]
+  > drawdag = $RUNTESTDIR/drawdag.py
+  > EOF
+
+  $ hg init a
+  $ cd a
+
+  $ hg debugdrawdag <<'EOF'
+  >   l
+  >  / \
+  > |   k
+  > |   |\
+  > |   | j
+  > |   | |
+  > i   | |
+  > |\  | |
+  > h | | |
+  > | | | |
+  > | g | |
+  > | | | |
+  > f | | |
+  > | |/ /
+  > | e |
+  > | |/
+  > | d
+  > |/|
+  > c |
+  > | |
+  > b |
+  >   |
+  >   a
+  > EOF
+
+  $ hg log -Gq -T'{rev} {tags}\n'
+  o    11 l tip
+  |\
+  | o    10 i
+  | |\
+  o \ \    9 k
+  |\ \ \
+  +-----o  8 g
+  | | |
+  | o |  7 j
+  | | |
+  | | o  6 h
+  | | |
+  o | |  5 e
+  |/ /
+  | o  4 f
+  | |
+  o |  3 d
+  |\|
+  | o  2 c
+  | |
+  | o  1 b
+  |
+  o  0 a
+  
+
+  $ cd ..
+
+subsetparents
+-------------
+
+  $ cd a
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("c+i"))}\n' -r 'c+i'
+  o  10 i: 2
+  :
+  o  2 c:
+  |
+  ~
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("c+h+i"))}\n' -r 'c+h+i'
+  o    10 i: 6
+  |\
+  o :  6 h: 2
+  :/
+  o  2 c:
+  |
+  ~
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("c+h+l"))}\n' -r 'c+h+l'
+  o    11 l tip: 6
+  :\
+  : o  6 h: 2
+  :/
+  o  2 c:
+  |
+  ~
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("c+f+l"))}\n' -r 'c+f+l'
+  o    11 l tip: 4
+  :\
+  : o  4 f: 2
+  :/
+  o  2 c:
+  |
+  ~
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("c+h+i+k"))}\n' -r 'c+h+i+k'
+  o    10 i: 6
+  |\
+  | : o  9 k: 2
+  | :/
+  o :  6 h: 2
+  :/
+  o  2 c:
+  |
+  ~
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("c+d+h+i+k"))}\n' -r 'c+d+h+i+k'
+  o    10 i: 6 3
+  |\
+  | : o  9 k: 3
+  | :/
+  o :  6 h: 2
+  : :
+  : o  3 d: 2
+  :/|
+  : ~
+  o  2 c:
+  |
+  ~
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("c+j+k+i"))}\n' -r 'c+j+k+i'
+  o  10 i: 2
+  :
+  : o  9 k: 7
+  :/|
+  : o  7 j: 2
+  :/
+  o  2 c:
+  |
+  ~
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("c+e+f+j"))}\n' -r 'c+e+f+j'
+  o  7 j: 2
+  :
+  : o  5 e: 2
+  :/
+  : o  4 f: 2
+  :/
+  o  2 c:
+  |
+  ~
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("b+e+f+j"))}\n' -r 'b+e+f+j'
+  o  7 j: 1
+  :
+  : o  5 e: 1
+  :/
+  : o  4 f: 1
+  :/
+  o  1 b:
+  
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("a+c+f+g+j+l"))}\n' -r 'a+c+f+g+j+l'
+  o    11 l tip: 4 8 7
+  :\
+  : \
+  : :\
+  : : \
+  : : :\
+  : : : \
+  : : : :\
+  : o---+ :  8 g: 0 2
+  : :/ / /
+  : +---o  7 j: 0 2
+  : : :/
+  o---+  4 f: 2
+   / /
+  : o  2 c:
+  : |
+  : ~
+  o  0 a:
+  
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("b+i+l"))}\n' -r 'b+i+l'
+  o    11 l tip: 10
+  |\
+  o :  10 i: 1
+  :/
+  o  1 b:
+  
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("b+i+j+l"))}\n' -r 'b+i+j+l'
+  o    11 l tip: 10 7
+  |\
+  | \
+  | :\
+  o : :  10 i: 1
+  :/ /
+  : o  7 j: 1
+  :/
+  o  1 b:
+  
+
+null in subset:
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("null+a+c+f"))}\n' -r 'null+a+c+f'
+  o  4 f: 2
+  |
+  o  2 c: -1
+  :
+  : o  0 a: -1
+  :/
+  @  -1 : -1
+  
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("null+a+b+c+f"))}\n' -r 'null+a+b+c+f'
+  o  4 f: 2
+  |
+  o  2 c: 1
+  |
+  o  1 b: -1
+  |
+  | o  0 a: -1
+  |/
+  @  -1 : -1
+  
+
+wdir in subset:
+
+  $ hg update -qC i
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("f+k+wdir()"))}\n' -r 'f+k+wdir()'
+  o  2147483647 : 4
+  :
+  : o    9 k:
+  : |\
+  : ~ ~
+  o  4 f:
+  |
+  ~
+
+  $ hg update -qC null
+
+Revisions not in subset:
+
+  $ hg log -T '{rev} {tags}: {subsetparents(rev, revset("a+c+f+g+j+l"))}\n'
+  11 l tip: 4 8 7
+  10 i: 
+  9 k: 
+  8 g: 0 2
+  7 j: 0 2
+  6 h: 
+  5 e: 
+  4 f: 2
+  3 d: 
+  2 c: 
+  1 b: 
+  0 a: 
+
+  $ hg log -T '{rev} {tags}: {subsetparents(rev, revset("b+c"))}\n'
+  11 l tip: 
+  10 i: 
+  9 k: 
+  8 g: 
+  7 j: 
+  6 h: 
+  5 e: 
+  4 f: 
+  3 d: 
+  2 c: 1
+  1 b: 
+  0 a: 
+
+  $ hg log -T '{rev} {tags}: {subsetparents(rev, revset("b+c"))}\n' -r'reverse(null:2)'
+  2 c: 1
+  1 b: 
+  0 a: 
+  -1 : 
+
+Nothing excluded:
+
+  $ hg log -T '{rev} {tags}: {subsetparents(rev, revset("null:wdir()"))}\n' -r'reverse(null:wdir())'
+  2147483647 : -1
+  11 l tip: 10 9
+  10 i: 6 8
+  9 k: 5 7
+  8 g: 5
+  7 j: 3
+  6 h: 4
+  5 e: 3
+  4 f: 2
+  3 d: 0 2
+  2 c: 1
+  1 b: -1
+  0 a: -1
+  -1 : -1
+
+Uncachable query:
+
+  $ hg log -Gq -T '{rev} {tags}: {subsetparents(rev, revset("%d:%d", rev, rev - 1))}\n'
+  o    11 l tip: 10
+  |\
+  | o    10 i:
+  | |\
+  o \ \    9 k:
+  |\ \ \
+  +-----o  8 g:
+  | | |
+  | o |  7 j:
+  | | |
+  | | o  6 h:
+  | | |
+  o | |  5 e:
+  |/ /
+  | o  4 f:
+  | |
+  o |  3 d: 2
+  |\|
+  | o  2 c: 1
+  | |
+  | o  1 b:
+  |
+  o  0 a: -1
+  
+
+Invalid arguments:
+
+  $ hg log -T '{subsetparents()}\n'
+  hg: parse error: subsetparents expects two arguments
+  [255]
+  $ hg log -T '{subsetparents("a")}\n'
+  hg: parse error: subsetparents expects two arguments
+  [255]
+  $ hg log -T '{subsetparents(rev, extras)}\n'
+  hg: parse error: subsetparents expects a queried revset
+  [255]
+
+  $ cd ..