changelog: lazily parse description
author Gregory Szorc <gregory.szorc@gmail.com>
Sun, 06 Mar 2016 14:28:46 -0800
changeset 28489 8939a95064f1
parent 28488 437c32dcec7d
child 28490 959eadae589a
changelog: lazily parse description

Before, the description field was converted to a localstr at parse time. With this patch, we store the raw description and convert to a localstr when it is first accessed.

We see a revset speedup for revsets that don't access the description:

author(mpm)
    0.896565  0.914234  0.869085
date(2015)
    0.878797  0.891980  0.862525
extra(rebase_source)
    0.865446  0.912514  0.871500
author(mpm) or author(greg)
    1.801832  1.860402  1.791589
date(2015) or branch(default)
    0.968276  0.994673  0.974027
author(mpm) or desc(bug) or date(2015) or extra(rebase_source)
    3.656193  3.721032  3.643593

As you can see, most of these revsets are already faster than from before this refactoring: we have already offset the performance loss from the introduction of the new class representing parsed changelog entries!
mercurial/changelog.py
--- a/mercurial/changelog.py	Sun Mar 06 13:26:37 2016 -0800
+++ b/mercurial/changelog.py	Sun Mar 06 14:28:46 2016 -0800
@@ -152,7 +152,7 @@
 
     __slots__ = (
         'date',
-        'description',
+        '_rawdesc',
         'extra',
         'files',
         'manifest',
@@ -185,9 +185,10 @@
         #
         # changelog v0 doesn't use extra
 
-        last = text.index("\n\n")
-        self.description = encoding.tolocal(text[last + 2:])
-        l = text[:last].split('\n')
+        doublenl = text.index('\n\n')
+        self._rawdesc = text[doublenl + 2:]
+
+        l = text[:doublenl].split('\n')
         self.manifest = bin(l[0])
         self.user = encoding.tolocal(l[1])
 
@@ -209,6 +210,10 @@
 
         return self
 
+    @property
+    def description(self):
+        return encoding.tolocal(self._rawdesc)
+
 class changelog(revlog.revlog):
     def __init__(self, opener):
         revlog.revlog.__init__(self, opener, "00changelog.i")