mod_storage_xmlarchive/README.markdown
author Matthew Wild <mwild1@gmail.com>
Tue, 16 Apr 2024 12:41:52 +0100
changeset 5888 49b0873932ca
parent 5745 5232d12eb74d
permissions -rw-r--r--
mod_pubsub_serverinfo: Don't default to non-local pubsub servers (thanks roughnecks)
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
1807
4d73a1a6ba68 Convert all wiki pages to Markdown
Kim Alvefur <zash@zash.se>
parents: 1786
diff changeset
     1
---
4d73a1a6ba68 Convert all wiki pages to Markdown
Kim Alvefur <zash@zash.se>
parents: 1786
diff changeset
     2
labels:
4d73a1a6ba68 Convert all wiki pages to Markdown
Kim Alvefur <zash@zash.se>
parents: 1786
diff changeset
     3
- 'Stage-Beta'
4d73a1a6ba68 Convert all wiki pages to Markdown
Kim Alvefur <zash@zash.se>
parents: 1786
diff changeset
     4
- 'Type-Storage'
4d73a1a6ba68 Convert all wiki pages to Markdown
Kim Alvefur <zash@zash.se>
parents: 1786
diff changeset
     5
- ArchiveStorage
2363
33e85070dc60 mod_storage_xmlarchive: Expand the summary a little bit
Kim Alvefur <zash@zash.se>
parents: 2295
diff changeset
     6
summary: XML file based archive storage
2821
f052b62c653c mod_storage_xmlarchive/README: Minor tweak
Kim Alvefur <zash@zash.se>
parents: 2819
diff changeset
     7
---
1786
29f3d6b7ad16 Import wiki pages
Kim Alvefur <zash@zash.se>
parents:
diff changeset
     8
1807
4d73a1a6ba68 Convert all wiki pages to Markdown
Kim Alvefur <zash@zash.se>
parents: 1786
diff changeset
     9
Introduction
4d73a1a6ba68 Convert all wiki pages to Markdown
Kim Alvefur <zash@zash.se>
parents: 1786
diff changeset
    10
============
1786
29f3d6b7ad16 Import wiki pages
Kim Alvefur <zash@zash.se>
parents:
diff changeset
    11
1807
4d73a1a6ba68 Convert all wiki pages to Markdown
Kim Alvefur <zash@zash.se>
parents: 1786
diff changeset
    12
This module implements stanza archives using files, similar to the
3438
a6722a35f35e mod_storage_xmlarchive: Expand description in README
Kim Alvefur <zash@zash.se>
parents: 2859
diff changeset
    13
default "internal" storage. Unlike "internal", it saves messages in two
a6722a35f35e mod_storage_xmlarchive: Expand description in README
Kim Alvefur <zash@zash.se>
parents: 2859
diff changeset
    14
files per day (and per user), one containing metadata and one containing
a6722a35f35e mod_storage_xmlarchive: Expand description in README
Kim Alvefur <zash@zash.se>
parents: 2859
diff changeset
    15
the actual messages in XML format (hence the name).
a6722a35f35e mod_storage_xmlarchive: Expand description in README
Kim Alvefur <zash@zash.se>
parents: 2859
diff changeset
    16
a6722a35f35e mod_storage_xmlarchive: Expand description in README
Kim Alvefur <zash@zash.se>
parents: 2859
diff changeset
    17
Splitting data per day improves performance for larger archives as it
a6722a35f35e mod_storage_xmlarchive: Expand description in README
Kim Alvefur <zash@zash.se>
parents: 2859
diff changeset
    18
does not have to look through data from other days.
1786
29f3d6b7ad16 Import wiki pages
Kim Alvefur <zash@zash.se>
parents:
diff changeset
    19
1944
453eee191e18 mod_storage_xmlarchive/README: Add example configuration
Kim Alvefur <zash@zash.se>
parents: 1807
diff changeset
    20
Configuration
453eee191e18 mod_storage_xmlarchive/README: Add example configuration
Kim Alvefur <zash@zash.se>
parents: 1807
diff changeset
    21
=============
453eee191e18 mod_storage_xmlarchive/README: Add example configuration
Kim Alvefur <zash@zash.se>
parents: 1807
diff changeset
    22
453eee191e18 mod_storage_xmlarchive/README: Add example configuration
Kim Alvefur <zash@zash.se>
parents: 1807
diff changeset
    23
To use this with [mod\_mam] add this to your config:
453eee191e18 mod_storage_xmlarchive/README: Add example configuration
Kim Alvefur <zash@zash.se>
parents: 1807
diff changeset
    24
453eee191e18 mod_storage_xmlarchive/README: Add example configuration
Kim Alvefur <zash@zash.se>
parents: 1807
diff changeset
    25
``` lua
453eee191e18 mod_storage_xmlarchive/README: Add example configuration
Kim Alvefur <zash@zash.se>
parents: 1807
diff changeset
    26
storage = {
3455
7b97abf26612 mod_storage_xmlarchive/README: Change example to match Prosodys mod_mam
Kim Alvefur <zash@zash.se>
parents: 3438
diff changeset
    27
    archive = "xmlarchive"
1944
453eee191e18 mod_storage_xmlarchive/README: Add example configuration
Kim Alvefur <zash@zash.se>
parents: 1807
diff changeset
    28
}
453eee191e18 mod_storage_xmlarchive/README: Add example configuration
Kim Alvefur <zash@zash.se>
parents: 1807
diff changeset
    29
```
453eee191e18 mod_storage_xmlarchive/README: Add example configuration
Kim Alvefur <zash@zash.se>
parents: 1807
diff changeset
    30
1973
e63dba236a2a mod_storage_xmlarchive: Use datamanager.append_raw (had that code duplicated here)
Kim Alvefur <zash@zash.se>
parents: 1944
diff changeset
    31
To use it with [mod\_mam\_muc] or [mod\_http\_muc\_log]:
1944
453eee191e18 mod_storage_xmlarchive/README: Add example configuration
Kim Alvefur <zash@zash.se>
parents: 1807
diff changeset
    32
453eee191e18 mod_storage_xmlarchive/README: Add example configuration
Kim Alvefur <zash@zash.se>
parents: 1807
diff changeset
    33
``` lua
453eee191e18 mod_storage_xmlarchive/README: Add example configuration
Kim Alvefur <zash@zash.se>
parents: 1807
diff changeset
    34
storage = {
453eee191e18 mod_storage_xmlarchive/README: Add example configuration
Kim Alvefur <zash@zash.se>
parents: 1807
diff changeset
    35
    muc_log = "xmlarchive"
453eee191e18 mod_storage_xmlarchive/README: Add example configuration
Kim Alvefur <zash@zash.se>
parents: 1807
diff changeset
    36
}
453eee191e18 mod_storage_xmlarchive/README: Add example configuration
Kim Alvefur <zash@zash.se>
parents: 1807
diff changeset
    37
```
1786
29f3d6b7ad16 Import wiki pages
Kim Alvefur <zash@zash.se>
parents:
diff changeset
    38
1973
e63dba236a2a mod_storage_xmlarchive: Use datamanager.append_raw (had that code duplicated here)
Kim Alvefur <zash@zash.se>
parents: 1944
diff changeset
    39
Refer to [Prosodys data storage documentation][doc:storage] for more
e63dba236a2a mod_storage_xmlarchive: Use datamanager.append_raw (had that code duplicated here)
Kim Alvefur <zash@zash.se>
parents: 1944
diff changeset
    40
information.
1807
4d73a1a6ba68 Convert all wiki pages to Markdown
Kim Alvefur <zash@zash.se>
parents: 1786
diff changeset
    41
4d73a1a6ba68 Convert all wiki pages to Markdown
Kim Alvefur <zash@zash.se>
parents: 1786
diff changeset
    42
Note that this module does not implement the "keyval" storage method and
1944
453eee191e18 mod_storage_xmlarchive/README: Add example configuration
Kim Alvefur <zash@zash.se>
parents: 1807
diff changeset
    43
can't be used by anything other than archives.
1786
29f3d6b7ad16 Import wiki pages
Kim Alvefur <zash@zash.se>
parents:
diff changeset
    44
1807
4d73a1a6ba68 Convert all wiki pages to Markdown
Kim Alvefur <zash@zash.se>
parents: 1786
diff changeset
    45
Compatibility
4d73a1a6ba68 Convert all wiki pages to Markdown
Kim Alvefur <zash@zash.se>
parents: 1786
diff changeset
    46
=============
1786
29f3d6b7ad16 Import wiki pages
Kim Alvefur <zash@zash.se>
parents:
diff changeset
    47
2295
1eeda34d668b mod_storage_xmlarchive/README: Should work in 0.9 now
Kim Alvefur <zash@zash.se>
parents: 2052
diff changeset
    48
  ------ ---------------
3991
dddcf094af5c mod_storage_xmlarchive: Update compat table
Kim Alvefur <zash@zash.se>
parents: 3464
diff changeset
    49
  trunk  Should work
dddcf094af5c mod_storage_xmlarchive: Update compat table
Kim Alvefur <zash@zash.se>
parents: 3464
diff changeset
    50
  0.11   Works
4542
591c643d55b2 mod_storage_xmlarchive: Insert micropauses in long-running queries
Kim Alvefur <zash@zash.se>
parents: 3991
diff changeset
    51
  0.10   Should work
591c643d55b2 mod_storage_xmlarchive: Insert micropauses in long-running queries
Kim Alvefur <zash@zash.se>
parents: 3991
diff changeset
    52
  0.9    Does not work
2295
1eeda34d668b mod_storage_xmlarchive/README: Should work in 0.9 now
Kim Alvefur <zash@zash.se>
parents: 2052
diff changeset
    53
  ------ ---------------
2819
d48d4d9ccae7 mod_storage_xmlarchive: Add a prosodyctl command for migrating to/from the internal storage format
Kim Alvefur <zash@zash.se>
parents: 2363
diff changeset
    54
d48d4d9ccae7 mod_storage_xmlarchive: Add a prosodyctl command for migrating to/from the internal storage format
Kim Alvefur <zash@zash.se>
parents: 2363
diff changeset
    55
Conversion to or from internal storage
d48d4d9ccae7 mod_storage_xmlarchive: Add a prosodyctl command for migrating to/from the internal storage format
Kim Alvefur <zash@zash.se>
parents: 2363
diff changeset
    56
--------------------------------------
d48d4d9ccae7 mod_storage_xmlarchive: Add a prosodyctl command for migrating to/from the internal storage format
Kim Alvefur <zash@zash.se>
parents: 2363
diff changeset
    57
d48d4d9ccae7 mod_storage_xmlarchive: Add a prosodyctl command for migrating to/from the internal storage format
Kim Alvefur <zash@zash.se>
parents: 2363
diff changeset
    58
This module stores data in a way that overlaps with the more recent
2821
f052b62c653c mod_storage_xmlarchive/README: Minor tweak
Kim Alvefur <zash@zash.se>
parents: 2819
diff changeset
    59
archive support in `mod_storage_internal`, meaning e.g. [mod_migrate]
2819
d48d4d9ccae7 mod_storage_xmlarchive: Add a prosodyctl command for migrating to/from the internal storage format
Kim Alvefur <zash@zash.se>
parents: 2363
diff changeset
    60
will not be able to cleanly convert to or from the `xmlarchive` format.
d48d4d9ccae7 mod_storage_xmlarchive: Add a prosodyctl command for migrating to/from the internal storage format
Kim Alvefur <zash@zash.se>
parents: 2363
diff changeset
    61
d48d4d9ccae7 mod_storage_xmlarchive: Add a prosodyctl command for migrating to/from the internal storage format
Kim Alvefur <zash@zash.se>
parents: 2363
diff changeset
    62
To mitigate this, an migration command has been added to
d48d4d9ccae7 mod_storage_xmlarchive: Add a prosodyctl command for migrating to/from the internal storage format
Kim Alvefur <zash@zash.se>
parents: 2363
diff changeset
    63
`mod_storage_xmlarchive`:
d48d4d9ccae7 mod_storage_xmlarchive: Add a prosodyctl command for migrating to/from the internal storage format
Kim Alvefur <zash@zash.se>
parents: 2363
diff changeset
    64
d48d4d9ccae7 mod_storage_xmlarchive: Add a prosodyctl command for migrating to/from the internal storage format
Kim Alvefur <zash@zash.se>
parents: 2363
diff changeset
    65
``` bash
5745
5232d12eb74d mod_storage_xmlarchive: Pass hostname to converter for converting all users
Kim Alvefur <zash@zash.se>
parents: 5744
diff changeset
    66
prosodyctl mod_storage_xmlarchive convert $DIR internal $STORE $JID+
2819
d48d4d9ccae7 mod_storage_xmlarchive: Add a prosodyctl command for migrating to/from the internal storage format
Kim Alvefur <zash@zash.se>
parents: 2363
diff changeset
    67
```
d48d4d9ccae7 mod_storage_xmlarchive: Add a prosodyctl command for migrating to/from the internal storage format
Kim Alvefur <zash@zash.se>
parents: 2363
diff changeset
    68
2822
88474dd1af48 Various READMEs: s/eg/e.g.g/
Kim Alvefur <zash@zash.se>
parents: 2821
diff changeset
    69
Where `$DIR` is `to` or `from`, `$STORE` is e.g. `archive` or `archive2`
5744
100110d539d3 mod_storage_xmlarchive: Migrate all users/rooms if no JID argument given
Kim Alvefur <zash@zash.se>
parents: 4542
diff changeset
    70
for MAM and `muc_log` for MUC logs. Finally, `$JID` is one or more JID
5745
5232d12eb74d mod_storage_xmlarchive: Pass hostname to converter for converting all users
Kim Alvefur <zash@zash.se>
parents: 5744
diff changeset
    71
of the users or MUC rooms to be migrated.
5232d12eb74d mod_storage_xmlarchive: Pass hostname to converter for converting all users
Kim Alvefur <zash@zash.se>
parents: 5744
diff changeset
    72
5232d12eb74d mod_storage_xmlarchive: Pass hostname to converter for converting all users
Kim Alvefur <zash@zash.se>
parents: 5744
diff changeset
    73
To migrate all users/rooms on a particular host, pass a bare hostname.
2858
687b19cad4f5 mod_storage_xmlarchive/README: Add description of how data is stored
Kim Alvefur <zash@zash.se>
parents: 2822
diff changeset
    74
3463
471855552f49 mod_storage_xmlarchive/README: Add reminder about backups
Kim Alvefur <zash@zash.se>
parents: 3455
diff changeset
    75
::: {.alert .alert-danger}
471855552f49 mod_storage_xmlarchive/README: Add reminder about backups
Kim Alvefur <zash@zash.se>
parents: 3455
diff changeset
    76
Since this is a destructive command, don't forget to backup your data
471855552f49 mod_storage_xmlarchive/README: Add reminder about backups
Kim Alvefur <zash@zash.se>
parents: 3455
diff changeset
    77
first.
3464
f25b0623b276 mod_storage_xmlarchive/README: And don't have prosody running at the same time
Kim Alvefur <zash@zash.se>
parents: 3463
diff changeset
    78
f25b0623b276 mod_storage_xmlarchive/README: And don't have prosody running at the same time
Kim Alvefur <zash@zash.se>
parents: 3463
diff changeset
    79
Prosody should *not* be running while converting data.
3463
471855552f49 mod_storage_xmlarchive/README: Add reminder about backups
Kim Alvefur <zash@zash.se>
parents: 3455
diff changeset
    80
:::
471855552f49 mod_storage_xmlarchive/README: Add reminder about backups
Kim Alvefur <zash@zash.se>
parents: 3455
diff changeset
    81
471855552f49 mod_storage_xmlarchive/README: Add reminder about backups
Kim Alvefur <zash@zash.se>
parents: 3455
diff changeset
    82
2858
687b19cad4f5 mod_storage_xmlarchive/README: Add description of how data is stored
Kim Alvefur <zash@zash.se>
parents: 2822
diff changeset
    83
Data structure
687b19cad4f5 mod_storage_xmlarchive/README: Add description of how data is stored
Kim Alvefur <zash@zash.se>
parents: 2822
diff changeset
    84
==============
687b19cad4f5 mod_storage_xmlarchive/README: Add description of how data is stored
Kim Alvefur <zash@zash.se>
parents: 2822
diff changeset
    85
687b19cad4f5 mod_storage_xmlarchive/README: Add description of how data is stored
Kim Alvefur <zash@zash.se>
parents: 2822
diff changeset
    86
Data is split in three kinds of files and messages are grouped by day.
687b19cad4f5 mod_storage_xmlarchive/README: Add description of how data is stored
Kim Alvefur <zash@zash.se>
parents: 2822
diff changeset
    87
Prosodys `util.datamanager` is used, so all special characters in these
687b19cad4f5 mod_storage_xmlarchive/README: Add description of how data is stored
Kim Alvefur <zash@zash.se>
parents: 2822
diff changeset
    88
filenames are escaped and reside under `hostname/store` in Prosodys Data
687b19cad4f5 mod_storage_xmlarchive/README: Add description of how data is stored
Kim Alvefur <zash@zash.se>
parents: 2822
diff changeset
    89
directory, commonly `/var/lib/prosody`.
687b19cad4f5 mod_storage_xmlarchive/README: Add description of how data is stored
Kim Alvefur <zash@zash.se>
parents: 2822
diff changeset
    90
687b19cad4f5 mod_storage_xmlarchive/README: Add description of how data is stored
Kim Alvefur <zash@zash.se>
parents: 2822
diff changeset
    91
`username.list`
687b19cad4f5 mod_storage_xmlarchive/README: Add description of how data is stored
Kim Alvefur <zash@zash.se>
parents: 2822
diff changeset
    92
:   A list of dates in `YYYY-MM-DD` format.
687b19cad4f5 mod_storage_xmlarchive/README: Add description of how data is stored
Kim Alvefur <zash@zash.se>
parents: 2822
diff changeset
    93
687b19cad4f5 mod_storage_xmlarchive/README: Add description of how data is stored
Kim Alvefur <zash@zash.se>
parents: 2822
diff changeset
    94
`username@YYYY-MM-DD.list`
687b19cad4f5 mod_storage_xmlarchive/README: Add description of how data is stored
Kim Alvefur <zash@zash.se>
parents: 2822
diff changeset
    95
:   Index containing metadata for messages stored on that day.
687b19cad4f5 mod_storage_xmlarchive/README: Add description of how data is stored
Kim Alvefur <zash@zash.se>
parents: 2822
diff changeset
    96
687b19cad4f5 mod_storage_xmlarchive/README: Add description of how data is stored
Kim Alvefur <zash@zash.se>
parents: 2822
diff changeset
    97
`username@YYYY-MM-DD.xml`
687b19cad4f5 mod_storage_xmlarchive/README: Add description of how data is stored
Kim Alvefur <zash@zash.se>
parents: 2822
diff changeset
    98
:   Messages in textual XML format, separated by newlines.
687b19cad4f5 mod_storage_xmlarchive/README: Add description of how data is stored
Kim Alvefur <zash@zash.se>
parents: 2822
diff changeset
    99
687b19cad4f5 mod_storage_xmlarchive/README: Add description of how data is stored
Kim Alvefur <zash@zash.se>
parents: 2822
diff changeset
   100
This makes it fairly simple and fast to find messages by timestamp.
687b19cad4f5 mod_storage_xmlarchive/README: Add description of how data is stored
Kim Alvefur <zash@zash.se>
parents: 2822
diff changeset
   101
Queries that are not time based, but limited to a specific contact may
687b19cad4f5 mod_storage_xmlarchive/README: Add description of how data is stored
Kim Alvefur <zash@zash.se>
parents: 2822
diff changeset
   102
be expensive as potentially the entire archive will be read.
687b19cad4f5 mod_storage_xmlarchive/README: Add description of how data is stored
Kim Alvefur <zash@zash.se>
parents: 2822
diff changeset
   103
687b19cad4f5 mod_storage_xmlarchive/README: Add description of how data is stored
Kim Alvefur <zash@zash.se>
parents: 2822
diff changeset
   104
Each archive ID is of the form `YYYY-MM-DD-random`, making lookups by
687b19cad4f5 mod_storage_xmlarchive/README: Add description of how data is stored
Kim Alvefur <zash@zash.se>
parents: 2822
diff changeset
   105
archive id just as simple as time based queries.
3438
a6722a35f35e mod_storage_xmlarchive: Expand description in README
Kim Alvefur <zash@zash.se>
parents: 2859
diff changeset
   106
a6722a35f35e mod_storage_xmlarchive: Expand description in README
Kim Alvefur <zash@zash.se>
parents: 2859
diff changeset
   107
## Limitations
a6722a35f35e mod_storage_xmlarchive: Expand description in README
Kim Alvefur <zash@zash.se>
parents: 2859
diff changeset
   108
a6722a35f35e mod_storage_xmlarchive: Expand description in README
Kim Alvefur <zash@zash.se>
parents: 2859
diff changeset
   109
-   Only XML stanzas can be stored.
a6722a35f35e mod_storage_xmlarchive: Expand description in README
Kim Alvefur <zash@zash.se>
parents: 2859
diff changeset
   110
-   The deletion method only supports removing entire days at a time.