Original Title: How I Learned to Stop Worrying & Love Execution Sharding
Video Link: https://www.youtube.com/watch?v=A0OXif6r-Qk
Speaker: Scott Sunarto (@smsunarto) on Research Day
Article Edited by: Justin Zhao (@hiCaptainZ)
Hello everyone, I am Scott (@smsunarto), the founder of Argus Labs (@ArgusLabs_). Today, I want to discuss a topic that we haven't touched on for a while. As rollups have become mainstream, execution sharding has received far less discussion than data sharding. So, let's revisit this somewhat overlooked topic: execution sharding.
This will be a light-hearted talk. I know everyone has been listening to complex concepts all day, so I will try to make this discussion as practical as possible. I have prepared an appropriate slide design for this presentation.
For those who don't know me, interestingly, I am known as an anime girl on Twitter. I also missed my college graduation to be here, which has made my parents very unhappy. Currently, I am the founder of Argus Labs. We consider ourselves a gaming company rather than an infrastructure or cryptocurrency company. One of my biggest frustrations is that everyone in the crypto gaming space wants to build tools, but no one wants to create content or applications. We need more applications that users can actually use.
Previously, I co-created Dark Forest (@darkforest_eth) with my smart friends Brian Gu (@gubsheep) and Alan Luo (@alanluo_0). Brian is now running 0xPARC (@0xPARC), and he is much smarter than I am.
Today's discussion will focus on execution sharding, but in a different context than the one most people are used to. We usually discuss execution sharding in the context of Layer 1, such as Ethereum sharding or Near sharding. Today, I want to change the frame: what would sharding look like in a rollup environment?
The fundamental question here is, why would a gaming company want to build its own roll-up, and what can we learn from World of Warcraft to design roll-ups? Additionally, we will explore how the design space of roll-ups far exceeds the current reality.
To answer these questions, let's go back to 2020 when the idea of Dark Forest was first conceived. We asked ourselves, what if we created a game where every game action is an on-chain transaction? At that time, this premise seemed absurd, and it still does for many today. But it was an interesting hypothesis, so we built it, and thus Dark Forest was born.
Dark Forest is a fully on-chain space-exploration MMORTS built on Ethereum and powered by zk-SNARKs. Back in 2020, ZK was not nearly as popular as it is today, and there was almost no documentation. The only documentation available for Circom was Jordi Baylina's (@jbaylina) Google Doc. Despite the challenges, we learned a lot from the process, and Dark Forest is a manifestation of that learning.
Dark Forest turned out to be a larger experiment than we imagined. We had over 10,000 players who spent trillions of gas, and the game was filled with chaos, with people backstabbing each other on-chain. The most intriguing aspect of Dark Forest and on-chain gaming is that the game becomes a platform. With a fully on-chain game, you open up a design space for emergent behavior: people can build smart contracts that interact with the game, as well as alternative clients and game modes, such as Dark Forest Arena and GPU miners.
However, with great power comes great responsibility. When we launched Dark Forest on what is now known as Gnosis Chain (formerly xDai), we ultimately filled the entire block space of the chain. This made the chain essentially unusable for anything else, including DeFi, NFTs, or any other xDAI-related activities.
So, what now? Are we at a dead end? Will fully on-chain games never become a reality? Or are we going to revert to making games that only have JPEGs on-chain and convince people that money grows on trees? The answer is that we let software do the work. Many of us have a very rigid view of blockchains and roll-ups, as if there is not much room for improvement. But I disagree. We can experiment and find new possibilities.
We asked ourselves a question: if we were to design a blockchain from scratch just for games, what would it look like? We need high throughput, so we need to scale reads and writes. Most blockchains are designed to handle a lot of writes. Transactions per second (TPS) is a metric that people boast about, but reading is equally important. If you can't read from a blockchain node, how do you know where the players are? This is actually the first bottleneck we discovered in blockchain construction.
Dark Forest encountered a problem where full nodes were heavily utilized, and I/O exploded because we needed to read data from the on-chain state. This led to thousands of dollars in server costs, which the xDAI team generously covered for us. However, this is not ideal in the long run. We need high throughput not only for transactions written per second but also for reads, such as fetching data from the blockchain itself.
We also need a horizontally scalable blockchain to avoid the Noisy Neighbor problem. We don't want a popular game to suddenly start having issues on the blockchain, halting all work. We also need flexibility and customizability so that we can modify the state machine to make it suitable for game design. This includes having a game loop that is self-executing, and so on.
Last but not least, for those unfamiliar with online game architecture, this may be a bit vague, but we need a high tick rate. Ticks are the atomic unit of time in a game world. In blockchains, the atomic unit of time is the block. The analogy becomes exact when you build a fully on-chain game: the block rate of your blockchain is the tick rate of the game itself.
So, what we need is a blockchain that is high throughput, horizontally scalable, flexible and customizable, and has a high tick rate. Such a design is necessary to meet the needs of a blockchain designed from scratch for games.
If you have a higher tick rate or more blocks per second, the game will feel more responsive. Conversely, if your tick rate is low, the game will feel sluggish. One key thing to remember is that if there is a delay in block generation, you will feel a noticeable delay in the game. This is a terrible experience. If you have ever dealt with angry players shouting at their computers because they lost a game, that is an absolutely terrible situation.
Currently, our rollups generate one block per second, which corresponds to one tick. If we want cooler games, we need a higher tick rate. For example, Minecraft, a simple pixel-art game, runs at 20 ticks per second. We still have a long way to go before we can build games as responsive as Minecraft.
One possible solution is to deploy your own rollup. While this may superficially seem to solve the problem, it does not address the root cause. You would get higher write throughput, but it still wouldn't fully meet the needs of the game. Of course, if your game has a hundred players, that would be sufficient. But if you want to build a game that requires higher throughput, you will hit hard limits because of how I/O is handled in current blockchain architectures.
In terms of reading, you don't really get a performance boost. You still need to rely on indexers. You don't have true horizontal scalability. If you try to launch a new rollup to horizontally scale your game, you will break your existing smart contract ecosystem. The markets deployed by players will not be able to work with other chains you launch to horizontally scale the game. This will raise a lot of issues.
Finally, achieving a high tick rate and blocks per second is still challenging. We can push for it, and we might get two blocks per second, maybe three, but that is about as far as these blockchains can go, because things like serialization and marshalling consume a lot of compute cycles.
To address this issue, we looked back to the early 21st century and the late 90s when online games like MMOs were just emerging. They had a concept called sharding. This is not a new concept; it has existed in the past. The term "sharding" that we use in database architecture actually comes from a reference in Ultima Online. They were the first to use the term "sharding" to explain their different servers.
So, how does sharding work in games? It is not a one-size-fits-all solution; it is a tool in the toolbox, and how you adapt it depends on your game. The first construct is what I like to call location-based sharding. A good mental model is to imagine a Cartesian plane divided into four quadrants, each with its own game shard. Every time you want to cross into another shard, you send a message to that shard saying, "Hey, I want to move there," and you are teleported over, leaving your previous player body behind. This distributes the server workload across multiple physical instances instead of forcing one server to do all the computation for the entire game world.

The second construct is more popular nowadays. It is called multi-universe sharding, where you have multiple game instances mirroring each other. You can choose whichever shard you want to join, and load is balanced by default so that no single server becomes overcrowded.
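As an illustration, location-based sharding can be sketched as a pure function from world coordinates to a shard. The quadrant layout and shard names below are made up for the example, not any game's actual scheme:

```go
package main

import "fmt"

// shardFor maps a world coordinate to one of four quadrant shards,
// the "location-based sharding" construct described above. Each shard
// runs on its own physical instance; crossing a quadrant boundary
// means messaging the destination shard and teleporting over.
func shardFor(x, y int) string {
	switch {
	case x >= 0 && y >= 0:
		return "shard-NE"
	case x < 0 && y >= 0:
		return "shard-NW"
	case x < 0 && y < 0:
		return "shard-SW"
	default:
		return "shard-SE"
	}
}

func main() {
	fmt.Println(shardFor(10, 5))  // shard-NE
	fmt.Println(shardFor(-3, -7)) // shard-SW
}
```

Multi-universe sharding would instead pick a shard by load rather than by position, since every instance mirrors the same world.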
Now, the key question is: how do we bring this concept into rollups? This is why we created World Engine. World Engine is our flagship infrastructure, essentially a sharded sequencer designed for rollups. Compared to many of the shared sequencer designs we have seen in previous talks, ours is different and better suited to our needs. We optimize for two things: (a) throughput, and (b) ensuring no locks block the runtime, so that tick rate and block time stay as consistent as possible. It is asynchronous by default, and we design the sequencer to be partially ordered rather than enforcing a total ordering (where every transaction must happen after another).
There are two key components. First, we have an EVM base shard, which is like a regular EVM chain where players can deploy smart contracts, compose with the game, create marketplaces with taxes, and so on. It works like a normal chain, at one block per second or so, which is enough for your typical DeFi and marketplace activity.
The secret ingredient is the game shard, which is essentially a mini-blockchain designed as a high-performance game server. We have a bring-your-own-implementation interface, so you can customize this shard however you like. You can build your own shard and plug it into the base shard; you just need to implement a standard set of interfaces, similar to what you may know from Cosmos, which has the ABCI interface. You implement a similar specification and bring your own shard into the World Engine stack.
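As a rough sketch of what such a standard interface might look like in Go: the method names here are my own invention, loosely modeled on ABCI's tick/transaction lifecycle, not the actual World Engine specification.

```go
package main

import "fmt"

// GameShard is a hypothetical bring-your-own-implementation interface,
// loosely modeled on Cosmos's ABCI. Implement it and the base shard can
// drive your custom game shard.
type GameShard interface {
	BeginTick(height uint64)    // called at the start of each tick (block)
	DeliverTx(tx []byte) error  // apply one player transaction to shard state
	EndTick() ([]byte, error)   // finalize the tick; returns a state commitment
}

// CounterShard is a toy implementation that just counts transactions per tick.
type CounterShard struct {
	height uint64
	txs    int
}

func (s *CounterShard) BeginTick(height uint64) { s.height, s.txs = height, 0 }

func (s *CounterShard) DeliverTx(tx []byte) error { s.txs++; return nil }

func (s *CounterShard) EndTick() ([]byte, error) {
	// A real shard would return a Merkle root; a string stands in here.
	return []byte(fmt.Sprintf("h=%d,txs=%d", s.height, s.txs)), nil
}

func main() {
	var shard GameShard = &CounterShard{}
	shard.BeginTick(1)
	shard.DeliverTx([]byte("move north"))
	shard.DeliverTx([]byte("fire laser"))
	root, _ := shard.EndTick()
	fmt.Println(string(root)) // h=1,txs=2
}
```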
The key here is the high tick rate, which we cannot achieve with existing rollup constructions. This is where I want to introduce Cardinal. Cardinal is the first game shard implementation for World Engine. It uses an entity-component-system (ECS) with a data-oriented architecture. This allows us to parallelize the game and increase the throughput of game computation. It has a configurable tick rate of up to 20 ticks per second; for the blockchain people here, that means 20 blocks per second.
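To give a feel for the data-oriented ECS layout, here is a minimal sketch in Go. The types and the movement system are illustrative, not Cardinal's actual API:

```go
package main

import "fmt"

// Components are plain data, stored in contiguous slices indexed by
// entity ID (a data-oriented layout). Systems iterate over those slices
// each tick, which is cache-friendly and easy to parallelize.

type Position struct{ X, Y float64 }
type Velocity struct{ DX, DY float64 }

type World struct {
	pos []Position
	vel []Velocity
}

// Spawn creates an entity with the given components and returns its ID.
func (w *World) Spawn(p Position, v Velocity) int {
	w.pos = append(w.pos, p)
	w.vel = append(w.vel, v)
	return len(w.pos) - 1
}

// MovementSystem advances every entity by one tick. Because each
// component type lives in its own flat array, chunks of entities can be
// processed on separate goroutines without touching unrelated state.
func (w *World) MovementSystem() {
	for i := range w.pos {
		w.pos[i].X += w.vel[i].DX
		w.pos[i].Y += w.vel[i].DY
	}
}

func main() {
	w := &World{}
	e := w.Spawn(Position{0, 0}, Velocity{1, 2})
	for t := 0; t < 20; t++ { // one second of game time at 20 TPS
		w.MovementSystem()
	}
	fmt.Println(w.pos[e]) // {20 40}
}
```

Contrast this with the object-oriented style typical of Solidity contracts, where state is scattered across storage slots and hard to batch-process.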
We can also geo-locate it to reduce latency. For example, if your sequencer is in the US, someone in Asia might incur 300 milliseconds of latency before a transaction even reaches it. That is a huge problem in gaming, because 300 milliseconds is a long time. If you have ever tried to play an FPS with 200 milliseconds of latency, you are basically already dead.
Another key point, and one that matters a lot to us, is that Cardinal is self-indexing. We no longer need external indexers, and we don't need those frameworks to cache game state. This lets us build more real-time games without the latency caused by an indexer still trying to catch up to the sequencer's blocks.
We also have a plugin system that allows people to parallelize ZK verification, and so on. The best part, at least for me, is that you can write your code in Go. No more wrestling with Solidity to make your game work; if you have ever tried to build a blockchain game in Solidity, you know it is a nightmare.
However, the key point of our sharding construct is that you can build anything as a shard. They are essentially an infinite design space for what a shard can be.
Suppose you don't like writing your game code in Go; you can absolutely choose another way. We are developing a Solidity game shard that lets you implement your game in Solidity while retaining many of Cardinal's advantages. You could also create an NFT minting shard with its own mempool and ordering construct to solve the Noisy Neighbor problem that large mints cause. You could even create a game identity shard that represents your game account as an NFT, so you can trade accounts simply by transferring the NFT instead of sharing private keys.
This is a high-level architecture, and due to time constraints, I won't go into too much depth today. The key is that we allow EVM smart contracts to compose with game shards through custom precompiles. We built a wrapper around Geth that lets the two communicate, opening up a lot of design space in both directions. We are asynchronous by default and can interoperate between shards without locking.
Our shared sequencer is unique because it does not use a shared sequencing construct that prioritizes global ordering with atomic bundles. That approach requires a locking mechanism, which blocks the main thread, leading to unstable tick rates and block times and therefore lag in games. It also caps each shard's block time and requires various cryptoeconomic constructs to prevent denial of service. And there is a big problem I haven't seen addressed in many shared sequencer designs: if shards depend on each other, how do you resolve a deadlock? With an asynchronous design, this is not an issue, because each shard proceeds on its own and messages settle when they settle.
In fact, atomic bundles across shards and rollups are often unnecessary. For our use case, nothing requires atomic bundles, and we don't think they are something we should design our rollups around. This also unlocks other interesting features. For example, each game shard can use a different DA layer from the base shard: the base shard can push data to Ethereum while a game shard pushes data to Celestia (or something like a data availability committee). You can also reduce the hardware requirements for running full nodes, because you can run the base shard's Geth full node on its own, without running the game shard node, making it easier to integrate with providers like Alchemy.
To summarize, I want to be candid: many people claim their construct solves every problem, but we won't. We believe our construct is useful for us, but it may not fit your use case; assuming it applies to everyone would be unrealistic. For us, it provides high throughput, horizontal scalability, flexibility, and a high tick rate, but it does not cure cancer. If you need a DeFi protocol with synchronous composability, this construct is probably not for you.
Overall, I genuinely believe in human-centric blockchain architecture. By designing around specific users and use cases, you can make better trade-offs instead of trying to solve everyone's problems. The Renaissance has arrived: everyone can design rollups for their specific needs instead of relying on generic solutions. I think we should embrace the Cambrian explosion. Don't build rollups like one-size-fits-all Layer 1s; they are not solving the same problems at all. Personally, I look forward to seeing more people explore rollup designs tailored to specific use cases. For example, what would a rollup designed specifically for asset exchange look like? Would it be intent-based? What would a rollup designed specifically for on-chain CLOBs (central limit order books) look like? With that, I will hand the microphone over to MJ. Thank you for having me.
English Version:
https://captainz.xlog.app/Why-does-Argus-Build-FOC-Gaming-INFRA-Using-Sharding