is described at https://www.mercurial-scm.org/wiki/DirState.
It is made of a flat sequence of unordered variable-size entries,
so accessing any information in it requires parsing all of it.
Similarly, saving changes requires rewriting the entire file.

The newer `dirstate-v2` file format is designed to fix these limitations
and make `hg status` faster.

User guide
==========

`dirstate-v2` does not affect compatibility with remote Mercurial versions.

When `share-safe` is enabled, different repositories sharing the same store
can use different dirstate formats.

Enabling `dirstate-v2` for new local repositories
-------------------------------------------------

When creating a new local repository such as with `hg init` or `hg clone`,
the `exp-rc-dirstate-v2` boolean in the `format` configuration section
controls whether to use this file format.
This is disabled by default as of this writing.
To enable it for a single repository, run for example::

    $ hg init my-project --config format.exp-rc-dirstate-v2=1

Checking the format of an existing local repository
---------------------------------------------------

The `debugformat` command prints information about
which of multiple optional formats are used in the current repository,
including `dirstate-v2`::
-----------------

The `.hg/requires` file indicates which of various optional file formats
are used by a given repository.
Mercurial aborts when seeing a requirement it does not know about,
which prevents older versions from accidentally messing up a repository
that uses a format that was introduced later.
For versions that do support a format, the presence or absence of
the corresponding requirement indicates whether to use that format.

When the file contains a `dirstate-v2` line,
the `dirstate-v2` format is used.
With no such line `dirstate-v1` is used.

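The requirement check described above can be sketched in a few lines of Python. This is only an illustration, not Mercurial's own code: the function name `uses_dirstate_v2` is hypothetical, and it assumes the content of `.hg/requires` has already been read and decoded, with one requirement per line.

```python
def uses_dirstate_v2(requires_text):
    """Return True if a `dirstate-v2` line appears in the given text.

    `requires_text` is assumed to be the decoded content of `.hg/requires`,
    with one requirement per line (an assumption made for this sketch).
    """
    requirements = {line.strip() for line in requires_text.splitlines()}
    return "dirstate-v2" in requirements


# Made-up file contents, for illustration only.
print(uses_dirstate_v2("dirstate-v2\nexample-requirement\n"))  # True
print(uses_dirstate_v2("example-requirement\n"))  # False
```

Matching whole lines rather than substrings avoids mistaking a longer requirement name that merely contains `dirstate-v2` for the requirement itself.
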

High level description
----------------------

Whereas `dirstate-v1` uses a single `.hg/dirstate` file,
in `dirstate-v2` that file is a "docket" file
that only contains some metadata
and points to a separate data file named `.hg/dirstate.{ID}`,
where `{ID}` is a random identifier.

be inlined here by adding 76 bytes to each offset.

* Offset 120:
  The used size of the data file, as a 32-bit big-endian integer.
  The actual size of the data file may be larger
  (if another Mercurial process is appending to it
  but has not updated the docket yet).
  That extra data must be ignored.

* Offset 124:
  The length of the data file identifier, as an 8-bit integer.
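
As a rough illustration of the two fields above, the following Python sketch reads them with the standard `struct` module and truncates a data file to its used size. The function names and the synthetic docket bytes are hypothetical. Only the two offsets and field widths come from the description above, and the rest of the docket is filled with zero bytes here.

```python
import struct

USED_SIZE_OFFSET = 120  # 32-bit big-endian integer, per the list above
ID_LENGTH_OFFSET = 124  # 8-bit integer, per the list above


def read_docket_sizes(docket):
    """Return (used_size, id_length) read from the docket bytes.

    The two fields are adjacent, so one big-endian format string
    (">LB": 4-byte unsigned, then 1-byte unsigned) covers both.
    """
    return struct.unpack_from(">LB", docket, USED_SIZE_OFFSET)


def used_data(data_file, used_size):
    """Drop any bytes past the used size; they must be ignored."""
    return data_file[:used_size]


# Synthetic docket: 120 placeholder bytes, then the two fields.
docket = bytes(120) + struct.pack(">LB", 10, 8)
used_size, id_length = read_docket_sizes(docket)
print(used_size, id_length)  # 10 8
print(used_data(b"0123456789EXTRA", used_size))  # b'0123456789'
```
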

Paths and nodes can be organized in any order in the file, except that sibling
nodes must be next to each other and sorted by their path.
Contiguity lets the parent refer to them all
by their count and a single pseudo-pointer,
instead of storing one pseudo-pointer per child node.
Sorting allows using binary search to find a child node with a given name
in `O(log(n))` byte sequence comparisons.

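The lookup that sorting enables can be sketched with Python's `bisect` module. This is a simplified model rather than the actual reader: the sibling nodes are represented as a plain sorted list of byte strings, and the function name `find_child` and the sample names are hypothetical.

```python
import bisect


def find_child(sorted_names, name):
    """Return the index of `name` among sorted siblings, or None.

    `bisect_left` performs a binary search, so this needs
    O(log(n)) byte sequence comparisons.
    """
    index = bisect.bisect_left(sorted_names, name)
    if index < len(sorted_names) and sorted_names[index] == name:
        return index
    return None


# Siblings must already be sorted, as the format requires.
children = [b"README", b"setup.py", b"src", b"tests"]
print(find_child(children, b"src"))   # 2
print(find_child(children, b"docs"))  # None
```
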

The current implementation writes paths and child nodes before a given node
for ease of figuring out the value of pseudo-pointers by the time they are to be
written, but this is not an obligation and readers must not rely on it.

A path is stored as a byte string anywhere in the file, without delimiter.
It is referred to by one or more nodes by a pseudo-pointer to its start, and its
length in bytes. Since there is no delimiter,
when a path is a substring of another the same bytes could be reused,
although the implementation does not exploit this as of this writing.

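To make the substring remark concrete, here is a small Python sketch. It assumes, for illustration only, that a pseudo-pointer can be treated as a byte offset into the data file contents. The helper name `read_path` and the sample bytes are hypothetical.

```python
def read_path(data, start, length):
    """Return the path stored at `start` with the given length in bytes.

    `start` stands in for a pseudo-pointer, treated here as a plain
    byte offset into `data` (an assumption made for this sketch).
    """
    return data[start:start + length]


# One stored byte string can serve two nodes, since no delimiter is needed.
data = b"src/main.rs"
print(read_path(data, 0, 11))  # b'src/main.rs'
print(read_path(data, 0, 3))   # b'src'
```

Both results come from the same stored bytes, which is the kind of reuse the format allows.
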

A node is stored on 43 bytes with components at fixed offsets. Paths and

* Offset 40:
  The nanoseconds component of an `mtime` field described below,
  as a 32-bit integer.
  When `mtime` is used,
  this is the number of nanoseconds since `mtime.seconds`,
  always strictly less than one billion.

  This may be zero if more precision is not available.
  (This can happen because of limitations in any of Mercurial, Python,
  libc, the operating system, …)

`DIRECTORY`
    When set, this entry will match a directory that exists or existed on the
    file system.

    * When `HAS_MTIME` is set a directory has been seen on the file system and
      `mtime` matches its last modification time. However, `HAS_MTIME` not
      being set does not indicate the absence of a directory on the file system.

    * When not tracked anywhere, this node does not represent an ignored or
      unknown file on disk.

    If `HAS_MTIME` is set

    Note that the number of recorded "unknown" children can still be zero if none
    were present.

    Also note that having this flag unset does not imply that no "unknown"
    children have been recorded. Some might be present, but there is
    no guarantee that it will be all of them.

`ALL_IGNORED_RECORDED`
    If set, all "ignored" children existing on disk (at the time of the last
    status) have been recorded and the `mtime` associated with
    `DIRECTORY` can be used for optimization even when "ignored" file

    Note that the number of recorded "ignored" children can still be zero if none
    were present.

    Also note that having this flag unset does not imply that no "ignored"
    children have been recorded. Some might be present, but there is
    no guarantee that it will be all of them.

`HAS_FALLBACK_EXEC`
    If this flag is set, the entry carries "fallback" information for the
    executable bit in the `FALLBACK_EXEC` flag.


`MTIME_SECOND_AMBIGUOUS`
    This flag is relevant only when `HAS_FILE_MTIME` is set. When set, the
    `mtime` stored in the entry is only valid for comparison with timestamps
    that have nanosecond information. If the available timestamp does not carry
    nanosecond information, the `mtime` should be ignored and no optimization
    can be applied.
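
The rule for `MTIME_SECOND_AMBIGUOUS` can be written as a small decision function. This Python sketch covers only the case where `HAS_FILE_MTIME` is set, and the function and parameter names are hypothetical, chosen for illustration rather than taken from Mercurial.

```python
def stored_mtime_is_usable(second_ambiguous, fs_has_nanoseconds):
    """Decide whether the stored `mtime` may be compared at all.

    `second_ambiguous` models the MTIME_SECOND_AMBIGUOUS flag.
    `fs_has_nanoseconds` says whether the timestamp obtained from the
    file system carries nanosecond information.
    """
    if second_ambiguous and not fs_has_nanoseconds:
        # The stored value must be ignored: no optimization is possible.
        return False
    return True


print(stored_mtime_is_usable(True, False))  # False
print(stored_mtime_is_usable(True, True))   # True
print(stored_mtime_is_usable(False, False)) # True
```
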