Matrix Archive Tools

Import messages from a matrix.org room, for research, archival, and preservation.

Developed at Dinacon 2018, for use by the documentation team.

Use this responsibly and ethically. Don't re-publish people's messages without their knowledge and consent.

Setup

Install Pipenv. Run pipenv install.

Set these environment variables: MATRIX_USER, MATRIX_PASSWORD, MATRIX_ROOM_IDS.

MATRIX_ROOM_IDS should be a comma-separated list of Matrix room IDs (or a single ID). Run pipenv run list_rooms.py to list the room IDs.

Set MONGODB_URI to a MongoDB connection URL, or install a local MongoDB instance.
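As a rough illustration of how these variables might be read, here is a minimal sketch. It is not taken from this repository: the helper names parse_room_ids and load_settings are hypothetical, and treating an unset MONGODB_URI as "use a local instance" is an assumption.

```python
import os

REQUIRED = ("MATRIX_USER", "MATRIX_PASSWORD", "MATRIX_ROOM_IDS")


def parse_room_ids(raw):
    """Split a comma-separated MATRIX_ROOM_IDS value into a list of room IDs.

    A single ID with no commas yields a one-element list.
    """
    return [part.strip() for part in raw.split(",") if part.strip()]


def load_settings(environ=os.environ):
    """Collect the settings named in this README (hypothetical helper)."""
    missing = [name for name in REQUIRED if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "user": environ["MATRIX_USER"],
        "password": environ["MATRIX_PASSWORD"],
        "room_ids": parse_room_ids(environ["MATRIX_ROOM_IDS"]),
        # Assumption: None means "fall back to a local MongoDB instance".
        "mongodb_uri": environ.get("MONGODB_URI"),
    }
```

Failing early with the names of the missing variables tends to be easier to debug than a login error later on.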

Usage

Import Messages

pipenv run import imports the messages into the database.

Export Messages

pipenv run export filename.html exports a text, HTML, JSON, or YAML file; the format is chosen from the name of the output file (here, filename.html). The file contains links to the image download URLs on the Matrix server.
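One plausible way to pick the format from the filename is to look at its extension. The sketch below illustrates that idea only; the extension table and the function name format_for are assumptions, not the mapping used by export_messages.py.

```python
import os

# Assumed mapping from file extension to export format (illustrative only).
FORMATS = {
    ".txt": "text",
    ".html": "html",
    ".json": "json",
    ".yaml": "yaml",
    ".yml": "yaml",
}


def format_for(filename):
    """Return the export format implied by the filename's extension."""
    extension = os.path.splitext(filename)[1].lower()
    if extension not in FORMATS:
        raise ValueError("Unsupported export extension: %r" % extension)
    return FORMATS[extension]
```

For example, format_for("filename.html") selects the HTML format, while an unrecognized extension raises an error instead of silently guessing.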

Download Images

pipenv run download_images.py downloads all the thumbnail images in the database into a download directory (default thumbnails), skipping images that have already been downloaded.

Use the --no-thumbnails option to download full-size images instead of thumbnails. In this case, the default directory is images instead of thumbnails.
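The skip-if-already-downloaded behavior described above can be sketched as follows. This is a simplified illustration, not the code in download_images.py: the functions default_directory and download_missing are hypothetical, and the fetch callable stands in for whatever performs the actual HTTP request.

```python
import os


def default_directory(no_thumbnails):
    """Default download directory, following the names given in this README."""
    return "images" if no_thumbnails else "thumbnails"


def download_missing(images, directory, fetch):
    """Write each (filename, url) pair into directory unless the file exists.

    fetch is any callable that takes a URL and returns the image bytes.
    Returns the list of paths that were newly written.
    """
    os.makedirs(directory, exist_ok=True)
    written = []
    for filename, url in images:
        path = os.path.join(directory, filename)
        if os.path.exists(path):
            continue  # already downloaded on a previous run
        with open(path, "wb") as handle:
            handle.write(fetch(url))
        written.append(path)
    return written
```

Checking for the file before fetching means a repeated run only requests images that are new since the last run.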

References

Matrix Client-Server API

License

MIT