superposition-based-CRDT

Let's say Alice produces a version of our document. What state do we need to store?

If you ever used git or any other VCS, you know that having the content hash is a really good thing to have.

State(A_1) { hash: "8eeb4b4bde521" }

But what if Bob also produces a version without having a internet connection?

State(B_1) { hash: "5852850165af3" }

Now what if Bob reconnects?

We could merge them. But doesn't that loose information?

If we need a merged version, can't we produce one when we need it?

Superposition [
  State(A_1) { hash: "8eeb4b4bde521" },
  State(B_1) { hash: "5852850165af3" }
]

Now what if Alice wants to make a new version?
First we merge State(A_1) and State(B_1) (or let her manually do it).
Then she can make her changes.

Now what does her state look like?

Superposition [
  State(A_1) { hash: "8eeb4b4bde521" },
  State(A_2) { hash: "67d95b3b8c16a" },
  State(B_1) { hash: "5852850165af3" }
]

Wait what? Obviously State(A_1) has replaced State(A_1)!
(and also State(B_1) since she knew of it and edited the merged version)

Superposition [
  State(A_1) { hash: "8eeb4b4bde521" },
  State(B_1) { hash: "5852850165af3" },
  State(A_2) { after: [A_1, B_1], hash: "67d95b3b8c16a" }
]

If you want to clean this up a bit, you can move State(A_1) and State(B_1) to the annals of history, since they are implied by State(A_2).
(but lets keep them, so you dont have to scroll up to see the whole state.)

Let's say Bob does so too at the same time:
(Bob's internet is nasty, I know)

Superposition [
  State(A_1) { hash: "8eeb4b4bde521" },
  State(B_1) { hash: "5852850165af3" },
  State(A_2) { after: [A_1, B_1], hash: "67d95b3b8c16a" }
  State(B_2) { after: [A_1, B_1], hash: "c7da0ff0a39c2" }
]

And on and on it goes...

Except, we don't want to merge these manually all the time, do we?

So, what is this merge() function?
Let's write down our requirements.
merge() needs to:

Therefore its better to think of merging actions than states.
What are does actions, you might ask...
Well, it depends... on the type of the value we are trying to built an CRDT for.

Let's go with strings for now, since they are the most interesting.

Event then there are many options.

For instance, we could just have "set the string to a value".
That is stupid though, since that would defeat the whole purpose of using a CRDT.

So, let's go with insertion and deletion:

Insertion { index: 0, value: "foo" }
Deletion { index: 0, length: 5 }

Or, you merge them into replacement, which can do both:

Replacement { index: 0, length: 5, value: "foo" }