Accepted
Summary
Replace the heart of Regatta, the github.com/lni/dragonboat library, with the github.com/hashicorp/raft library.
Motivation
As Regatta evolved, we ran into problems with the dragonboat library.
- Library is maintained by just a single person.
- It is a closed design library (most stuff is implemented in internal packages).
- Has an inadequate API, hides many details and takes the control away from the programmer.
  - Snapshots
  - Logging
  - Configuration
  - Transport
  - Node discovery
  - ID assignment
- Even though dragonboat has a lot of stars, it does not seem to be used much (just tens of imports); in contrast, hashicorp/raft has over 1k imports.
- The library is not really modular: replacing Transport is almost impossible, and the same goes for LogDB and the Snapshot store.
- hashicorp/raft is much simpler in design as well as in implementation (dragonboat vs raft SLOC).
Design
Challenges
- hashicorp/raft is not a multi-group Raft implementation
  - that could be mitigated by multiplexing over a custom transport (see raft-grpc-transport-mux)
  - the advantage is that we could pick a group label of our liking (like the table name)
- hashicorp/raft does not support an on-disk state machine implementation out of the box
  - every start of the FSM is accompanied by applying the most recent snapshot
  - in the case of our table FSM that would lead to large compute and space overhead
  - mitigation lies in implementing our own SnapshotManager that would serve a lightweight snapshot out of the persisted data
  - inspiration could be found in Vault
- hashicorp/raft does not automatically forward proposals to the leader node
  - that could be implemented using the same layer as the Raft multiplexing
  - inspiration could be drawn from Consul
In a nutshell
- Basics + Meta
  - Adapt Metadata FSM to satisfy raft.FSM
  - Add and expose raft-grpc-transport-mux on the internal GRPC server
  - Back the raft.Raft instance by tidwall/raft-wal or another implementation
  - Use the default file-based snapshot storage
- Table FSM
  - Adapt Table FSM to satisfy raft.FSM and (optionally) raft.BatchingFSM
  - Implement Raft leader forwarding over the internal GRPC server
  - Implement readIndex (serializable) read
  - Use the default file-based snapshot storage
- Optimization
  - Implement the on-disk snapshotting optimization
  - Implement readIndex and forwarding pipelining
  - Use the aforementioned snapshot store to provide user-requested snapshots
Alternatives
- Replace dragonboat with github.com/etcd-io/raft. etcd raft is the base of the Dragonboat library (Dragonboat is to some degree a wrapper/fork of it) and as such would not make matters simpler; on the flip side, it is more powerful than hashicorp/raft.
- Fork dragonboat. An attempt was made in github.com/coufalja/tugboat, but it was soon discovered that a major overhaul would be needed nevertheless.