288 points by KirinDave a year ago
Member of the SyncFree Consortium (and committer to Antidote) here.
Here's a video from J On The Beach this year on the details around the Just-Right Consistency approach that might help answer some questions:
Here's links to other things in the SyncFree group.
Lasp (previously on HN):
SyncFree project review results:
Further follow-up: we're (members of the SyncFree consortium and it's follow up project for moving CRDTs, Antidote, and Lasp to the edge) considering raising money to industrialize Antidote (and it's associated tool chain) for building Just-Right Consistency applications.
If you'd be interested in any of the above, or just using the database and/or trying the tools, reach out to me via email christopher.meiklejohn at gmail.com or cmeik on Twitter.
That video is really good - perfect for the layperson such as myself!
See also this video comparing Antidote to other kinds of data stores: https://youtu.be/oWUNCsFy-r0
At first glance, this looks amazing. I truly believe CRDTs are the solution to lots of distributed systems problems, and that exposing their characteristics to developers directly, rather than trying to abstract them away in nicer, but leaky, abstractions, is the right way to go.
That said, a major part of why databases are hard is that reliable storage is hard. I see remarkably little about this on Antidote's homepage. I'd wish this was just a frontend to some battle tested storage engine but it appears that this is not the case.
(committer on Antidote, Riak and Lasp)
We're investigating a backend that works on LevelDB and RocksDB. We just haven't had the academic resources to get it implemented yet. However, it's largely an engineering resource problem and not a theoretical problem.
> However, it's largely an engineering resource problem and not a theoretical problem.
I appreciate your openness and honesty here, but the moment that an academic says that all the theory is solved and now "it's largely an engineering resource problem" is when people from industry tend to get nervous :-)
The majority of aspects that make databases robust aren't theoretical problems but "mere" engineering problems. Nevertheless there's an enormous variety in quality in this area. Pile sufficiently many engineering problems together and it becomes very hard to get right.
Sure, but if someone (or some company, rather) really cares, the "mere" engineering problems in an (early-stage, not-yet-a-disaster-of-a) codebase can be solved by throwing money at them—to hire, or contract, the best engineers.
But fundamental "academic" problems in the system architecture won't ever be solved, because the resulting codebase would be a different product targeting different use-cases.
CRDTs accumulate garbage, and need a global "sync" to GC. How do you mitigate this from having performance impacts?
[Disclaimer: I am an Antidote maintainer]
Some CRDTs support garbage collection directly - if you run them in a causally consistent environment. Antidote is causally consistent and has a Set and Map implementation that work like this; for these CRDTs you don't need a global sync.
Thanks for the reply. Don't you negate some of the advantages of CRDTs by mandating casually consistent environments? Can you speak to that more?
This was true of the early CRDT designs, but they have improved since then. You will find plenty of inspiration here: http://dblp.uni-trier.de/pers/hd/b/Baquero:Carlos
I think you know that’s a cliche in some circles and my intention is not to try call you on that without reason, but to ask about it practically speaking.
Would you agree that it seems for many disciplines, whether its cs or physics, the power of the statement “it’s solved theoretically” relates to the scope of the problem?
For example, if someone came up with a strong argument or proof that raises the upper bound of performance for a specific algorithm, I could be convinced to start celebrating right away, and would think nothing of the implementation being passed off to a student as a mere formality.
However, the scope doesn’t have to increase much before a formal proof, or even a conclusive argument, become impossible to make in a way that’s completely convincing.
It’s not a knock against either theory or engineering, it’s a simple matter of our inability to model or predict things accurately beyond a certain complexity.
I only wish I could explain this well to lay people, like if a friend asks me, “after over half a century of writing code why cant even the greatest software companies in the world reliably predict how long it takes to ship software?”
Maybe next time I’ll tell them, “it’s the same reason we need database reference implementations”. It will just add more confusion, but at least I can spit out some kind of answer.
It appears to use Riak as the storage layer.
It uses Riak Core as the underlying distribution layer, which is the same that Riak itself uses. For storage of the log, the built-in Erlang disk_log is used.
Ah great! That makes this very cool all of a sudden. How did I miss that?
This has that "too good to be true" vibe, and I can't find much information on the authors or the Syncfree Consortium organization that backs the project besides their own website.
Is this at the cost of fast writes or flexible schema? The pitch video doesn't seem to mention any cons, yet seems to avoid mentioning the type of data or mutations supported. I guess I'll go read their publications.
It's a government sponsored project so one would expect their publicity to be limited. That's part of the reason I posted it. Stuff like this is extremely exciting.
It's at the cost of playing by the rules of CRDTs. Making CRDTs consequence-free is ongoing research.
> Making CRDTs consequence-free is ongoing research.
What do you mean by that?
I found possibly related language on this page the other day :
CRDTs are "[t]ypically not suited for editing application with consequent UI."
Can you point me in the right direction? My googling got me nowhere. Thanks.
I dunno about that quote, Treedoc works pretty well.
In general there are tradeoffs to CRDTs and not everyone loves those tradeoffs. As time goes on we find CRDTs that make better tradeoffs.
For example, the earliest examples of CRDT sets grew linearly with the values added to the set, without regard for deletion. For all time. That's a pretty steep cost.
LSEQ appears to improve upon treedoc in ways, too: https://hal.archives-ouvertes.fr/hal-00921633/document
I get it, thank-you.
It's not described in the docs, but elswhere  in their github repository you can find the note about rga, which is Replicated Growable Array CRDT, that can be used for building indexed linear sequences - shortly speaking arrays of elements with insert/remove semantics.
An efficient implementation of RGA designed for concurrent editing: http://dx.doi.org/10.1145/2957276.2957300
I think the issue you're looking for is a limited number of supported datatypes. This doesn't appear to be a general-purpose database at the moment: don't expect to drop it in as a replacement for MySQL. Doesn't mean it can't be in the future, but it would have to fall back on other techniques for the datatypes that can't be built with CRDTs.
I think the data types look fine. If I'm using such a database, I don't think I want the CRDT data types to be abstracted away behind some general-purpose (SQL?) interface. Can that even be done? I'd rather just pay the small price of restructuring my data into maps and sets and have the CRDT database do the rest without any extra indirection.
Well, i've been working on an SQL interface for AntidoteDB for my master thesis, and yes, it can be done. However, there are some limitations. At this point we are mostly focusing in constraints (i.e. invariant preservation), and we've already figured out stuff like numeric bounds, entity integrity and referential integrity (this last one is quite interesting IMO).
I don't think it encodes a graph that you query (like, say, Neo4J or Postgres with an orm); it seems to give various replicated data structure primitives, like documents, basically a kv store? So for building something like google docs.
One glaring omission is indexes.
You can likely build something SQL-like to form complex queries coordinating related objects, but the efficiency of such queries is not a given. You can likely manually build something like an index to speed up lookups using e.g. the map primitive. With a plan for a complex query, you're on your own.
Another likely problematic part is PK generation. You can of course use a counter, but I'm not sure how efficient will it be with burst inserts. For that, you'd have to come up with client-generated PKs that are unlikely to clash, e.g. UUIDs.
(one of the maintainers of Antidote)
There is a lot of research still going on in the back, including indexing, access control, verification tools for apps, and some other really cool stuff.
Efficient support for queries is non-trivial, indeed.
A write is fast, because it happens directly at the closest replica, without any inter-replica synchronisation. There is no schema per se; rather the DB is object-oriented, and each application picks the object types it needs from the CRDT library. The library currently covers the basics: registers, flags, sets, maps, lists.
For anybody else wondering what CRDT means: conflict-free replication datatype.
CRDT: Conflict-Free Replicated Datatype
Related ideas exist in David Reed's 'Atomic Actions' which use pseudotime: http://www.cs.sfu.ca/~vaughan/teaching/431/papers/reed83.pdf:
"thinking about objects as sequences of unchangeable versions,
"the correct construction and
execution of a new atomic action can be accomplished without knowledge of all other atomic actions
in the system that might execute concurrently"
Is anyone aware of comparisons between these two streams of ideas (pseudo time based action and CRDT based)?
FYI, that paper was from 1983, CRDT was introduced 2007, conceptually they share the same idea, but most modern CRDT implementations tends to use Lamport timestamp to causally solve conflicts.
In short, antidote looks like a decent solution to this kind of problems.
There was also Datanet announced last year: http://highscalability.com/blog/2016/10/17/datanet-a-new-crd...
It's now rebranded as Kuhiro: http://highscalability.com/blog/2017/11/6/birth-of-the-nearc...
There's also CRDB, a CRDT based version of Redis (closed source, and disclosure - I work for Redis Labs) https://redislabs.com/redis-enterprise-documentation/adminis...
I wonder what it looks like in terms of resource usage.
I think there's a strong case for something like this for IoT type devices. Imagine the simple case of adding a name to a contact list on a mobile device, and wanting that to get synced up with a series of other devices.
(committer on Antidote, author of Lasp, both SyncFree projects)
Antidote is designed for DC deployments, in the hundreds of clusters. It provides causal consistency and transactions.
Lasp is designed for high-churn, edge deployments (with CRDTs) and provides eventual consistency: it's designed for 1,000+ nodes.
They share a bit of code, and you might want to evaluate both to determine which is a better use case: Lasp might be a better fit for the IoT application, because it was designed for that.
Here's a HN post on Lasp's scalability:
Here's a HN post on Lasp:
This is based on riak which is written in erlang.
Those nodes connect in a mesh, and actually the limit of scaling is about 100 nodes in one cluster.
See my comment above that you specifically replied to.
Lasp [in high-scalability mode] does not connect in a mesh, and the links I provided demonstrate that we do not connect in a mesh.
I specifically provided a link in the comment [with a link to an academic publication and a paper] you responded to about how we do non-full mesh and achieve 1000+ nodes.
Antidote only uses distributed Erlang inside of a single cluster limiting a single cluster to 100+ nodes. The multi-DC protocol is not based on distributed Erlang, and therefore, is not limited by distributed Erlang (but, is limited by other factors.)
Additionally, there are Riak Core modifications (authored by myself and others) that bypass distributed Erlang using the same semantics, and therefore, don't have the 100+ node cluster restriction.
And finally, the 100+ node cluster restriction is highly-debated, by myself and other academics, because the restrictions are based on use of the global module, where systems that don't use this module will experience higher scalability numbers (based on empirical evidence, not yet published.) For reference, Ericsson, has run 200+ node Erlang clusters.
It's written in erlang so should be better ram usage than a Java database.
I wouldn't say "better"; I'd probably say that the memory profile would be completely different. It all depends on the number of actors, the size of objects, and the workload. Remember: for large binary objects, references on a shared heap are sent and messages are not explicitly copied per actor. This means it's dependent on the workload and both the structure of how many processes are involved in a single request.
How would I do a "put if absent" in this database? Is it efficient?
If the thing you "put" into is a CRDT, then two concurrent "put" will be merged, if that's OK for your application. If however you want to disallow concurrent "put"s then you need to add some concurrency control to the CRDT layer. See the CISE tool https://youtu.be/HJjWqNDh-GA, http://dx.doi.org/10.1145/2911151.2911160. Antidote doesn't yet support concurrency control, it's work in progress.
The video lists pros/cons for strongly consistent and eventually consistent databases, but only has pros for a "just right consistency" database. What are the cons?
The idea of the "just right consistency" is that it brings the best of both worlds, without any drawbacks.
Your application works as well as if it was executed fully in strong consistency, but with improved scalability for the set of operations that can execute in an eventual consistent model.
The biggest drawback is that only some operations can be supported. E.g., without strong consistency you can detect double-spending from an account but you can't prevent it, because the validity of an operation can't depend on operations a datacenter hasn't seen.
Financial examples are bad because in fact the financial world IS eventually consistent. It's quite possible to withdraw the same $100 from an account via multiple ATM machines.
With ATMs and debit cards, I thinks it's generally not true, they seem to use the online mode and update the balance of a checking account within seconds.
With credit cards, you can indeed start more transactions against the same balance, and you're never sure in which order they will complete.
Gonna repeat, it is DEFINITELY possible to spend the same money multiple times with one ATM card, using old ATM machines with other payment methods.
No, I will not further detail how here.
I can verify this is possible.
Interesting! A bit or research in this area may literally pay :)
you can do this but the bank will know and the police will show up.
I don't mean to endorse such behavior, but the folks who used a loophole after stealing my ATM card details from a data breach never got caught using a variety of ApplePay based variants of this attack (now fixed, btw).
In general folks who get serious about it never get caught. Which is why folks who give a damn about the world don't talk about specifics on public forums.
When a payment processing system doesn't promptly cancel auth holds, customers definitely complain about being prevented from spending their money. This stuff should be table stakes but some retail banks are just way behind.
That is the reverse problem. Flippin' gas stations STILL have a problem with this that you feel acutely because the large holds. A similar problem exists with hotels, where they'll put a lock on a ton of money in your account.
Great video. I'm impressed you guys built the checker tool, that isn't an easy task - we built something similar called PANIC (https://github.com/gundb/panic-server) that lets us simulate failure cases and then run the test across a distributed system to see if it passes.
I had a question, in the video at a certain point you say that you must modify the code to disallow concurrent debits. This makes sense in theory, but wouldn't it fail in practice? If two machines in different regions are running this code, they would not know that there is a concurrent debit. How si that addressed?
Correct database implementations must be strongly consistent, unless your updates all commute with each other, in which case eventual consistency works.
E.g. db[key] += value and db[key].insert(value) commute with themselves, but db[key] = value obviously doesn't.
This just seems to be an attempt to implement a correct eventually consistent database, and the CRDTs are simply datatypes with commutative update operations.
Unfortunately, it seems to allow transactions that examine objects and then (conditionally) update them, which obviously are not guaranteed to be commutative even if the objects are commutative, so maybe it doesn't succeed at implementing a correct database.
Some conditional updates are safe; others require to add concurrency control. Our CISE analyser will tell precisely you which side a specific operation falls into. See https://youtu.be/HJjWqNDh-GA and http://dx.doi.org/10.1145/2911151.2911160. Antidote doesn't yet support concurrency control, it's work in progress.
Limited support for datatypes?
I wanted a database that would receive "events" asynchronously and stored that, but at the same time would process these events (from some piece of previously written code) to generate a queryable schema.
If I wanted to change the schema later, the database would let me just rewrite the code and it would reprocess all the received events since the beggining.
My use case is not anything high-performance or with thousands of writes -- it's the opposite.
Sounds like kappa architecture is something that will fit your request.
Don't you want a message queue like Sub/Pub?
One more idea about distributed consistencywithout synchronization: http://avodonosov.blogspot.com.by/2016/09/partitioned-availa...
Can anyone explain the advantages/disadvantages of using CRDTs over OT (operational transformation)?
OT requires a centralized server for intention resolution.
CRDT does not. It can be fully P2P.
This absolutely awesome!
I'm very excited for AntidoteDB, for its use cases but also for the underlying, pioneering work you're doing on CRDTs. Thank you for doing it! <3
Does anyone have this compiling on Erlang 20?
We tried a couple of weeks back, but then most dependencies have not been upgraded yet. The problem is that, to my knowledge, there is no riak_core for Erlang 20.
I suggest that you build on top of the Riak Core from Heinz Gies. That's what we are using in some of our new projects.
This is what we actually do.
I checked a couple of weeks ago with Heinz, but there wasn't a version available (but maybe I misunderstood him...).
Can you point me to an Erlang 20 compatible version?
Ah, we might be pulling a branch for our Riak Core work. I can look into it tomorrow and get back to you.
Looking forward to Kyle Kingsbury’s Jepsen review.
Jepsen goes into detail about general CRDTs here: https://aphyr.com/posts/285-jepsen-riak
Yes, but devil's in the details. The fact CRDTs are provably able to have specific consistency guarantees doesn't mean a particular implementation does it correctly in all contexts. Without testing, "We built this (city) on CRDTs" isn't very useful.
What's nice about building databases on compositions of CRDTs is that if you can validate the individual CRDTs via automated testing, you have a very high degree of confidence the composition of those CRDTs will do something similar.
No one's arguing that a Jepsen test shouldn't be done. Just that it'll probably be very different in character from more invented industry technology.
Yes; if you can prove a system was built atop an academically sound paradigm, it's more likely to adhere to the guarantees that paradigm is intended to provide than a system built atop an unproven paradigm. Kinda goes without saying, though. :P
Unsnarkily, I agree that implementation and composition of CRDTs is comparatively straightforward compared with other approaches people have taken. But if correct behavior is a requirement, full testing is too, regardless of how easy it -should- be in theory.
Sure. My point is that it's a rough idea of what you could expect.
If you look at the Jepsen posts for Cassandra and Riak, for example, you can see that the general characteristics are fairly similar.
yeah, this appears to be Riak, after Riak went kaboom, possibly with anti-customer CRDTs turned on by default rather than the other marketing-friendly modes.
Until we get John Galt to release his all-knowing database that can authoritatively resolve all conflicts and avoid the evil socialist resolution of AP read conflicts from his hidden valley of 1000 men and 1 woman, this will continue to be a hard problem.
(former Basho engineer and committer on Riak and Antidote)
No, it's not Riak.
[Full disclosure: work was started in 2013/2014, with Basho, as an alternative to Riak and shares Riak Core as a base. At the time, Basho was a member of SyncFree.]
It's built on Riak Core, but it's a lot different from Riak in many ways: no replication in the DC, transactions, more data-types, operation-based CRDTs and not state-based CRDTs, log vs. in-place replacement in the storage backend.
I mean, the analogy is poor: it's even less close than saying that Riak is Cassandra and vice-versa.
I don’t think it’s Riak. It just uses riak_core, which is a kind of distribution / sharding framework. Riak itself has new owners and is not going away.
Hey, author of GUN (the current most popular generalizable CRDT based database), and want to say I'm impressed. I'm often the first to nitpick things but this looks great:
- Built in Erlang
- Great explainer videos
- Well documented CRDTs that you accept
- Team of university related researches in CRDTs.
I'll be looking through your guys stuff more. But good job! We need more people like you guys out there.
We detached this comment from https://news.ycombinator.com/item?id=15863587 and marked it off-topic.
It's nice that there's some reasonable research going on and this stuff will probably help build some solid systems... but do you really have to call it "Cure" and "Antidote"? Think of what you are implying with those names...
It has this thick smell of academia arrogance all over it which I always find deeply disturbing. I think what these types of projects need the most is humility.
Please reconsider how you present yourself and your work...