Git and GitHub for Obsidian Users
I've been doing some repair work around my house recently and have been reflecting on how important the choice of tools is in making a project go quickly and smoothly. More specifically, I've been thinking about how important it is to choose a tool designed for a specific purpose rather than a general-purpose one. For example, while you can screw a nut on a bolt with vise grips, you're much better off using a ratchet wrench, which is designed for that specific purpose. It is faster, easier, and less likely to damage the nut.
I've been seeing quite a few articles about git and GitHub in the context of backups and synchronization. In this article I would like to explain exactly what they are and why they are not good tools to use for backing up or synchronizing an Obsidian vault. I will then propose best-in-class solutions for these needs. At the end, I will show how git and GitHub can, in fact, be useful for some special purposes.
Git
Git is a version control system designed for collaborative software development. Using git, developers are able to manage a project's code base throughout development cycles, allowing multiple authors to contribute code to a single project, everyone keeping up to date with the latest version. It allows for maintaining separate branches for production and development, branches for features, and branches for each developer. The various branches can be merged as they are completed, and thereby update the production or main development branch.
If you are wondering how any of this relates to Obsidian, which is not a software development project and does not typically have multiple contributors, well, it doesn't, which is kind of my point. It's a vise grips solution for Obsidian.
Its design is clever, though, and worth taking a moment to understand. The information generated by git is stored in a repository. A repository is simply a hidden directory created in the main directory of your project, or in this case, your vault. The repository itself keeps track of commits. A commit is a snapshot of the project, from which git can tell exactly which files have been added, removed, or edited since the previous commit. In the case of edits, git can show the specific changes made to each file, line by line. Each snapshot is stored together with information such as the author of the change and a description of what was changed. This granularity allows developers to identify the specific lines which introduced a bug. They can then reset the project to the prior, bug-free state while someone fixes the bug, or to a point where a deleted file still existed.
Before moving on, I will just point out that the git repository has nothing to do with GitHub. It's just a hidden directory on your local file system.
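The ideas above can be tried out in a throwaway directory. The following is a minimal sketch, not a recommended workflow: the directory, the file name note.md, and the author identity are all hypothetical, and it assumes git is installed. It creates a repository, makes two commits, and then brings back a file that the second commit deleted, all without touching GitHub.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a throwaway directory so no real vault is touched.
demo_dir="$(mktemp -d)"
cd "$demo_dir"

# Creating the repository adds a hidden .git directory; nothing leaves this machine.
git init --quiet
# Identity used for these demo commits only (hypothetical values).
git config user.name "Demo Author"
git config user.email "demo@example.com"

# First commit: add a note.
printf 'first line\n' > note.md
git add note.md
git commit --quiet -m "Add note"

# Second commit: delete the note.
git rm --quiet note.md
git commit --quiet -m "Remove note"

# Each commit carries its author and description.
git log --format='%an: %s'

# Recover the deleted file from the commit just before the deletion.
git checkout --quiet HEAD~1 -- note.md
cat note.md
```

Listing the directory with ls -a afterwards shows the hidden .git directory sitting next to the restored note.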
GitHub
At its core, GitHub is a service like Dropbox which provides cloud storage. But there are important differences. It was specifically created to promote sharing of code and collaboration among software developers. As such, it has always favored public repositories; private ones, once a paid feature, are now available on free accounts as well. A limitation of GitHub is file size. There is a maximum size of 100 MB per file. This is not a problem for most people, but should you have any videos in your vault, for example, GitHub will reject them unless you add yet another tool, Git LFS. In any case, unless you need tools provided by GitHub, you might just as well copy your directory, or your local git repository, to Dropbox instead of pushing to GitHub.
In addition to storage, GitHub provides a whole suite of tools which developers can use to design automated workflows and even deploy projects directly from GitHub. None of this really applies to Obsidian either, with potential exceptions which I'll describe below.
Synchronization
As we have seen, GitHub is designed for keeping multiple developers and project branches in sync, so it might seem appropriate for keeping Obsidian vaults in sync. But, since there is only one author and only one branch, it is overkill at the very least. A developer pulls to see what others have done and pushes to share their work...neither is relevant for Obsidian.
But my main objection is practical. Using GitHub requires manual steps every time you switch devices, namely a push from the device you are leaving and a pull on the one you are moving to. Ideally, synchronization across devices should be automatic and happen in real time. I wrote an article on synchronizing your vault across different devices using a tool called Syncthing. It is a free, fit-for-purpose tool which uses direct, device-to-device synchronization, requiring no intervention, and changes are immediately reflected across all devices.
Backups
Be honest: do you back up your computer regularly? If you are like the majority of people the answer is no. Given the number of articles I've seen on solutions for backing up vaults I can only conclude that many Obsidian users, like others, don't have a regular backup system, because, if you are backing up your computer, you are backing up your vault. Obsidian is just another directory, and doesn't need anything special. If you already perform regular backups, and have tested file recovery, then you don't need to read the next bit...unless you are using GitHub for your backups.
To understand why GitHub should not be used to back up your vault, consider the main features required for a good backup system:
- Ease of recovery - even people who actually do backups often do not test how quick and easy it is to restore files from them. Ideally, you should be able to navigate through backed-up files and directories just like ordinary ones, and then simply copy what you want. Recovering lost or old versions of files is possible using git, but the process is much more cumbersome.
- Rotation of daily, weekly, monthly backups - a good backup system will automatically rotate your backups, removing unnecessary versions as they age and new ones are made. That way you can find something from two days, two weeks, two months or two years ago. Git provides no such functionality.
- Speed and space efficiency - over time, the amount of data you need to back up can be many gigabytes, especially if you have videos or many images. When you do a backup, the tool must check for changes across the entire directory, so it needs to be fast. At any point, you will have dozens of backups, each representing a snapshot in time. In order to maintain so many "copies", the system must store the information efficiently. Git does compress what it stores, but it was designed for text: media files such as videos and images barely compress, and every changed version stays in the history.
- Off-site backups - best practice for backups means storing at least one copy of your backups in a different physical location. This could be a cloud server. A good tool should make off-site backups just as easy as on-site backups. GitHub does fulfill this requirement.
The best backup system around is called Restic. It is free, open source, cross-platform, and can be easily managed with a few simple commands. Most importantly, it is blazing fast, and creates surprisingly small repositories. This is basically because it breaks up your files into variable-length blobs, or chunks of bytes. I will explain the details in a subsequent article.
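To give a feel for those few simple commands, here is a minimal sketch of a Restic routine covering the requirements listed above. The vault path, repository location, password file, retention numbers, and the four function names are assumptions made for illustration, not a recommended policy. The functions only wrap standard restic subcommands (init, backup, forget, restore), and nothing runs until you call one of them.

```shell
#!/usr/bin/env bash
# Hypothetical locations; adjust them to your own setup.
VAULT_DIR="${VAULT_DIR:-$HOME/vault}"
export RESTIC_REPOSITORY="${RESTIC_REPOSITORY:-/mnt/backup/restic-repo}"
export RESTIC_PASSWORD_FILE="${RESTIC_PASSWORD_FILE:-$HOME/.restic-password}"

# One-time setup: create the empty, encrypted backup repository.
init_backup_repo() {
    restic init
}

# Take a snapshot of the vault.
backup_vault() {
    restic backup "$VAULT_DIR"
}

# Rotation: keep a limited number of daily, weekly, monthly and yearly
# snapshots (illustrative numbers) and delete data no snapshot still needs.
rotate_backups() {
    restic forget --keep-daily 7 --keep-weekly 4 --keep-monthly 12 --keep-yearly 2 --prune
}

# Recovery: restore the most recent snapshot into the directory given as $1.
restore_latest() {
    restic restore latest --target "$1"
}
```

A typical session would call init_backup_repo once, then backup_vault followed by rotate_backups on a schedule, and restore_latest /tmp/vault-restore from time to time to check that recovery really works.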
Sharing - use cases for git and GitHub
I have found some very good use cases for git and GitHub. They involve sharing content of my vault. Before describing them, let me point out that sharing of vault content should generally only be one way. Obsidian is not meant to be a collaborative tool. So you, and only you, control what goes into your vault.
Presentation
Git, and especially GitHub itself, are convenient for creating live, interactive presentations from content in my vault. I can create a formal presentation using the Advanced Slides plugin, or just make a section of my vault with specific content and put that part on GitHub. GitHub allows others to browse the vault or view the presentation on-line. Alternatively, they can download the content, open it locally with Obsidian, or simply copy it into their own vault.
In the context of a presentation, git itself is useful, because a presentation is a product. Like any product, version control is useful. A presentation can change and evolve. Sometimes one wants to see something from a prior iteration of a project, and git makes this simple. In other words, presentations have versions.
I'll give some tips on the mechanics of how to do this at the end, but it's simply a matter of collecting all the necessary files, including attachments, in a sub-directory of my vault. I copy this directory to a different location, open it as a vault and enable necessary plugins. GitHub can directly serve HTML files, so I convert the entire new vault to HTML with the Webpage HTML Export plugin. I create a README.md with some sort of linked table of contents or at least a link to enter the HTML files. With that done, I can just push to GitHub, and everything goes live in minutes on a URL GitHub creates for me.
At this point I can make the presentation and the audience can follow along on their own computers and engage with the content, either on-line or locally by downloading the vault from GitHub. If they happen to be Obsidian users, they could also copy the files into their own vault to further interact with the material.
Class Management
As a teacher, it didn't take long to consider how Obsidian might be used in the classroom. The idea of making and distributing course content which is easily navigable, visually interesting, and incorporates multi-media, graphs and charts, and external resources is very attractive. Sharing of content can be done the same way as the presentation above, and students could get it either "live" or by making their own copy.
But, as I do go on about, Obsidian is a database too, so why not push this a step further, and run the whole course with Obsidian, including design, distribution of materials, receiving assignments from students, applying rubrics if appropriate, grading and evaluations? This sounds like a project, and git and GitHub are perfect for this use. In addition to facilitating the distribution of the course materials, you can update the materials from time to time, and students will always have access to the latest version. I split the course into public/ and private/ directories, and only push public/ to GitHub. All completed work, grades, evaluations, and any identifying material is kept confidential. Students can submit their responses by emailing the single note, which I simply place in the private/ directory. The Properties take care of everything else (except the actual grading).
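A minimal sketch of one way to keep private/ out of anything that gets pushed, assuming a .gitignore entry is used (this is an illustration, not necessarily the setup described above; making public/ its own repository would work too). The directory, file names, and author identity are hypothetical, and the demo runs in a temporary directory.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stand-in for the course vault (throwaway directory).
course_dir="$(mktemp -d)"
cd "$course_dir"
mkdir public private
printf '# Week 1\n' > public/week-1.md
printf 'grade: A\n' > private/student-grades.md

git init --quiet
# Identity used for this demo commit only (hypothetical values).
git config user.name "Demo Teacher"
git config user.email "teacher@example.com"

# Tell git to ignore everything under private/.
printf 'private/\n' > .gitignore

git add .
git commit --quiet -m "Add public course materials"

# List what git tracks; nothing under private/ should appear.
git ls-files
```

Because git never tracks the ignored directory, a later push cannot upload it. The files in private/ still exist locally, so they need to be covered by your regular backups.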
This is obviously a more complex example, involving metadata (properties), Dataview, and various templates to provide the metadata and compile grades. A full description would take too long for this article, but I intend to write a detailed article with a sample vault in the near future.
Usage of git and GitHub
This article is already long, so I can't go into details about using git and GitHub, but I want to show how simple it is for this purpose. You need to install git itself, and I use another program called gh (the GitHub CLI), which allows me to manage everything from the command line. With these installed and a free GitHub account, all I need to do is create the repository locally with the command
git init
I then create the repository on GitHub itself with
gh repo create my-vault-name --public --source=. --remote=upstream
After that, whenever I add, delete or change content, I just do
git add .
git commit -m "Some message"
git push
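One detail worth knowing: with the remote named upstream, a plain git push may not know where to send a brand-new branch until a default has been set, so the very first push usually names the remote. The sketch below shows that step under stated assumptions: a local bare repository stands in for GitHub so that it runs without a network connection, and the paths, file name, and identity are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

work="$(mktemp -d)"
# A local bare repository stands in for the GitHub remote in this sketch.
git init --quiet --bare "$work/remote.git"

git init --quiet "$work/vault"
cd "$work/vault"
# Identity used for these demo commits only (hypothetical values).
git config user.name "Demo Author"
git config user.email "demo@example.com"
git remote add upstream "$work/remote.git"

printf '# Home\n' > Home.md
git add .
git commit --quiet -m "First commit"

# First push: name the remote and make it the default for this branch (-u).
git push --quiet -u upstream HEAD

# Later pushes: plain "git push" now knows where to go.
printf 'More content\n' >> Home.md
git add .
git commit --quiet -m "Second commit"
git push --quiet
```

After the first push with -u, the three-command routine above works unchanged.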
On the GitHub website, under the Settings menu there is a Pages option. Simply go there and you can deploy your vault (with HTML rendered) with a couple of clicks. It will provide you with a live URL, where the content will be kept up to date every time you push.
That's all there is to it.