Posts

Linearizability

Distributed/networked systems employ data replication to achieve availability and scalability. Consistency is concerned with the question of what should happen if a client modifies some data items and concurrently another client reads or modifies the same items possibly at a different replica. Linearizability is a strong form of consistency. (That is why it is also called as strong-consistency.) For a system to satisfy linearizability,  each operation must appear (from client perspective) to occur at an instantaneous point between its start time (when the client submits it) and finish time (when the client receives the response), and  execution at these instantaneous points should form a valid sequential execution (i.e., it should be as if operations are executed one at a time ---without concurrency, like they are being executed by a single node)  Let's simplify things further. In practice, consistency is often defined for systems that have two very specific operations: read and w

Using Lightweight Formal Methods to Validate a Key-Value Storage Node in Amazon S3 (SOSP21)

Image
This paper comes from my colleagues at AWS S3 Automated Reasoning Group, detailing their experience applying lightweight formal methods to a new class of storage node developed for S3 storage backend. Lightweight formal methods emphasize automation and usability. In this case, the approach involves three prongs: developing executable reference models as specifications, checking implementation conformance to those models, and building infrastructure to ensure the models remain accurate in the future. ShardStore ShardStore is a new append-only key-value storage node developed for AWS S3 backend. It is over 40K lines of Rust code. Shardstore is a log-structured merge tree (LSM tree) but with shard data stored outside the tree to reduce write amplification. ShardStore employs soft updates for avoiding the cost of redirecting writes through a write-ahead log while still being crash consistent. A soft updates implementation ensures that only writes whose dependencies are persisted are sent

Book review: Fountainhead

I had been hearing a lot of negative comments about Fountainhead on Twitter. These comments seemed politically motivated, how the book is a dumb libertarian propaganda. I had to check the book out myself to see what the deal is.  The book started out well. I was surprised to see that, first this is a novel, and second this is an easy read. I was waiting an impenetrable intellectual literary text. The book turned out to readable and initially charming, even though it was written in 1940.  The book felt like "The Goal" novel , communicating a concept/idea in story form. The main guy is this guy, Howard Roark, he is the best architect ever. Rather, he is the God of Architecture. He is architecture in flesh. He is like the best architect ever. The book is not at all subtle about this. He is 1000% dedicated to his craft, and he is the best ever, and he brings out his uncompromised gift in every building he designs. He is uncorruptable and willing to suffer for his art. The first

Matchmaker Paxos: A Reconfigurable Consensus Protocol

Image
Michael Whittaker , Joe Hellerstein's PhD student at UC Berkeley, has recently defended his PhD thesis on Compartmentalized Paxos . The thesis deconstructs Paxos and shows ways to reconstruct it to be more scalable by individually focusing on each component/role in Paxos. It is a simple but effective trick. Even after you learn the trick, you still keep getting surprised by how effective it is. When I say Michael defended his thesis, I am speaking loosely. Michael was not defensive about anything. He didn't have to be, his work spoke for him. Just watch his PhD presentation and you will understand what I mean.  Michael has a great talent for explaining things simply. Unfortunately this is an underappreciated talent, especially in academia. This may make it look too easy and effortless, and like you didn't do sophisticated work. This dude did not even prepare presentation slides for his PhD thesis (the heresy!), and winged it on the fly by drawing his protocols on an iPad. H

There is plenty of room at the bottom

Image
This is a pun on the saying "there is always room at the top". This is also the title of a famous Feynman lecture from 1959 , where he made a case for nanotechnology.  In this post, I will try to argue that there is plenty of room at the bottom for distributed algorithms. Most work on distributed algorithms are done at a high-level abstraction plane. These high level solutions do not transfer well to the implementation level, and this opens a lot of space to explore at the implementation level. But this is not just an opportunistic argument. It is imperative to target the implementation level with our distributed algorithms work, otherwise they remain as theoretical, unused, inapplicable.  Let me try to demonstrate using consensus protocols as examples. Someone else can make the same case using another subdomain. Be mindful of what is swept under the rug What is succinct at the high level could be very hard to implement and get right at the low level. Paxos seems simple , bu

Do tightly synchronized clocks help consensus?

Image
Let's start with the impossibility results, a good place to start distributed systems discussion. The coordinated attack result says that if the communication channels can drop messages arbitrarily and undetectably you cannot solve distributed consensus using a deterministic protocol in finite rounds. This result is all about *the curse of asymmetry of information*. Reliable channels are needed for two parties to agree on something. Otherwise the two parties are stuck in a Zeno's paradox of "You may not know that I know that you know that I know" kind of information chaining. A classic paper to checkout on the topic is " Knowledge and common knowledge in a distributed environment ". Assuming mostly reliable channels and demanding majority acks, it is possible to solve consensus in a partially synchronous system. Paxos shows us how to do this. Paxos uses quorum of acks to get around the problem of "the agreement having to occur in one shot" as in

Progress beats perfect

Image
This is a favorite saying of mine. I use it to motivate myself when I feel disheartened about how much I have to learn and improve. If I do a little every day or every week, I will get there. If I get one percent better each day for one year, I'll end up thirty-seven times better by the end of the year.   $1.01^{365}=37.78$ Years ago I had read this idea in one of John Ousterhouts life lessons, and it stuck with me.  "A little bit of slope makes up for a lot of y-intercept" Recently I noticed another advantage of progress over perfect. The emotional  advantage. Progress is better because it makes you feel better as you see improvement. You are getting there, you are making ... progress. Progress is growth mindset . You have an opportunity ahead of you. Perfect feels bad.. It puts you on defense. You have to defend the perfect, you have to keep the appearances. You can only go downwards from perfect, or maintain status quo. Progress gives you momentum. As long as you manag