revlog: avoid string slice when decompressing u* chunks
author Gregory Szorc <gregory.szorc@gmail.com>
Sun, 20 Dec 2015 16:00:27 -0800
changeset 27475 a2e2a8fa5fd1
parent 27474 e517a89c24e1
child 27476 9ebc02157bf3
Revlog chunks can be stored uncompressed. If the first byte of the raw data is \0, we store the data as is. Else we prefix it with 'u'.

Before, we performed a string slice to strip out the 'u' prefix. With this patch, we use a buffer to avoid an extra memory copy and associated garbage collection overhead.

I was unable to verify any performance impact of this patch. For both mozilla-central and the hg repos, the number of manifest revisions with 'u' prefixes is very small - under 1%. So this change likely isn't called enough to have an impact on manifest reading. However, the reasoning behind this change is solid, so it should be safe.
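
For illustration only, the dispatch described above can be sketched in Python 3. This is not the code in mercurial/revlog.py: the name decompress_chunk is hypothetical, memoryview stands in for util.buffer, and ValueError stands in for RevlogError.

```python
import zlib


def decompress_chunk(chunk: bytes):
    """Hypothetical Python 3 analogue of the revlog chunk dispatch."""
    if not chunk:
        return chunk
    t = chunk[0:1]
    if t == b"\0":
        # Raw data starting with \0 was stored as is.
        return chunk
    if t == b"x":
        # zlib streams written with default settings start with 'x'.
        return zlib.decompress(chunk)
    if t == b"u":
        # Zero-copy view past the 'u' prefix. chunk[1:] would give the
        # same bytes but allocate and copy a new bytes object.
        return memoryview(chunk)[1:]
    raise ValueError("unknown compression type %r" % t)


stored = b"u" + b"plain text"
payload = decompress_chunk(stored)
```

Here payload compares equal to stored[1:] but still references the original object, so no second copy of the data exists. Callers that need a real bytes object have to convert explicitly, which reintroduces the copy.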
mercurial/revlog.py
--- a/mercurial/revlog.py	Thu Dec 17 23:53:09 2015 +0900
+++ b/mercurial/revlog.py	Sun Dec 20 16:00:27 2015 -0800
@@ -110,7 +110,7 @@
         except zlib.error as e:
             raise RevlogError(_("revlog decompress error: %s") % str(e))
     if t == 'u':
-        return bin[1:]
+        return util.buffer(bin, 1)
     raise RevlogError(_("unknown compression type %r") % t)
 
 # index v0: