Until recently, I was a crypto skeptic and believed Bitcoin-like proof-of-work was as good as it could get in the crypto world, because it draws the closest parallels to the supply-demand dynamics of the gold, silver, and commodities markets. I was somewhat traditionally fiscally conservative and a believer in the Austrian School of Economics for a very long time. While I still believe the principles of the Austrian School will hold true in this physical Universe, I’ve extended my school of thought to explain network effects. I treat these extensions as "overrides" in this ever-so-dynamically-evolving world. I admire Ethereum as a network and have come to believe that networks are the core aspect of economics itself. Gold, Silver, Dollars, BTC, ETH—they all proliferate, but only because of networks.
Proposed in 2013 and launched in 2015, Ethereum allows organizations to create smart contracts, which enable transacting entities to codify agreements directly into the blockchain. Despite its many scaling challenges, Ethereum has remained relevant, clocking in at a ~$200B market cap at the time of writing. There’s already plenty out there on Ethereum, the Austrian School, money, and economics, so I won’t dwell on that here. But at some point, I’ll write about my own interpretation of economics, substantiating my networks-are-everything belief and adding my voice to the Internet’s collective ontology.
My focus in this article is on the Ethereum blockchain, and specifically on the point of view of a developer, not someone interested in running a node to earn staking rewards. As a developer:
1. I want to communicate with the network over APIs. Being a stingy man, I’m not keen on paid full-node solutions, of which there are many (Infura, Alchemy, QuickNode).
2. I like having control over the hardware and software updates.
3. I’m taking this as an opportunity to dive into the internals of go-ethereum (Geth) and Prysm.
When I say "full node", I’m deliberately excluding archive nodes. Those are a different beast, and may pose a different set of challenges. I only need a full node for now.
From a future-focused perspective, I think the current set of Ethereum APIs isn't exhaustive or ergonomic for power users. These APIs are designed to be foundational, and that’s fine, but it falls to operators and service providers to build composite solutions on top of them. By composite, I don’t mean L2s. In fact, I imagine this category of solutions as one that spans layers, giving me the ability to query not just Ethereum but also L2 chains like Base, Arbitrum, and others. What we lack today is a repertoire of composite tooling for reliable, performant, and exhaustive querying of blockchain state. Such tooling would take the form of materializations, map-reduce solutions, APIs, novel storage, and observability tools built on top of the current APIs. I’ll talk more about extra-API extensibility in a future post.
A quick primer before we jump into setup instructions. Since the shift to Proof of Stake, Ethereum has two main components: the execution layer and the consensus layer. go-ethereum (Geth) is the execution-layer client used here, and Prysm is the consensus-layer client.
Install the prerequisite software to run the Ethereum and Prysm code inside Docker containers.
sudo apt-get update
sudo apt-get install ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/debian/gpg | \
sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/debian \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install \
docker-ce \
docker-ce-cli \
containerd.io \
docker-buildx-plugin \
docker-compose-plugin
# Verify Installation
sudo systemctl status docker
Set up Geth and Prysm:
# Create the directories for Geth and Prysm data
mkdir -p $HOME/ethereum/geth $HOME/ethereum/prysm
# Generate JWT secret and copy it to both directories
# (no sudo here: a root-owned file would make the chmod below fail for a regular user)
openssl rand -hex 32 | tee $HOME/ethereum/geth/jwtsecret > /dev/null
cp $HOME/ethereum/geth/jwtsecret $HOME/ethereum/prysm/jwtsecret
# Set correct permissions for JWT secret files
chmod 644 $HOME/ethereum/geth/jwtsecret
chmod 644 $HOME/ethereum/prysm/jwtsecret
# Create Docker network for both containers
sudo docker network create eth-net
# Stop and remove existing containers if they exist
sudo docker stop eth-node prysm || true
sudo docker rm eth-node prysm || true
Run the eth-node and Prysm containers:
sudo docker run -d --name eth-node --network eth-net --restart unless-stopped \
-p 8545:8545 -p 8551:8551 -p 30303:30303 -p 30303:30303/udp \
-v $HOME/ethereum/geth:/root/.ethereum \
ethereum/client-go \
--syncmode "snap" \
--http \
--http.addr 0.0.0.0 \
--http.port 8545 \
--authrpc.port 8551 \
--authrpc.addr 0.0.0.0 \
--authrpc.vhosts="eth-node" \
--authrpc.jwtsecret /root/.ethereum/jwtsecret
# Wait for Geth to start up
sleep 30
# Run Prysm with appropriate settings
sudo docker run -d --name prysm --network eth-net --restart unless-stopped \
-v $HOME/ethereum/prysm:/data \
-p 4000:4000 -p 13000:13000 -p 12000:12000/udp \
gcr.io/prysmaticlabs/prysm/beacon-chain:latest \
--datadir=/data \
--jwt-secret=/data/jwtsecret \
--accept-terms-of-use \
--execution-endpoint=http://eth-node:8551
# Check if containers are up and running
echo "Checking container status:"
sudo docker ps --format "table {{.Names}}\t{{.Status}}"
# Check Geth's logs for confirmation of HTTP server start
echo "Geth's HTTP server status:"
sudo docker logs eth-node 2>&1 | grep "HTTP server started"
# Check Prysm's logs for connection confirmation
echo "Prysm's connection status:"
sudo docker logs prysm 2>&1 | grep "Connected to new endpoint"
# Monitor both logs for any errors or warnings
echo "Monitoring Geth logs for errors:"
sudo docker logs eth-node -f &
echo "Monitoring Prysm logs for errors:"
sudo docker logs prysm -f &
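Once Geth's HTTP server is up, the node can be queried over JSON-RPC, which is the API access this setup was built for. Here is a minimal Python sketch using only the standard library. It assumes the node is reachable at http://localhost:8545 (the port published above), and the helper names are hypothetical. `eth_syncing` and `eth_blockNumber` are standard Ethereum JSON-RPC methods.

```python
import json
import urllib.request

GETH_URL = "http://localhost:8545"  # assumption: port published by the eth-node container


def build_request(method: str, params=None, request_id: int = 1) -> bytes:
    """Serialize a JSON-RPC 2.0 request body."""
    payload = {
        "jsonrpc": "2.0",
        "method": method,
        "params": params or [],
        "id": request_id,
    }
    return json.dumps(payload).encode("utf-8")


def call(method: str, params=None, url: str = GETH_URL):
    """POST a JSON-RPC request to the node and return its `result` field."""
    request = urllib.request.Request(
        url,
        data=build_request(method, params),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        body = json.load(response)
    if "error" in body:
        raise RuntimeError(body["error"])
    return body["result"]


def describe_sync(result) -> str:
    """Summarize an eth_syncing result: False, or an object with hex block numbers."""
    if result is False:
        return "not syncing"
    current = int(result["currentBlock"], 16)
    highest = int(result["highestBlock"], 16)
    return f"syncing: block {current} of {highest}"
```

Against a running node, `describe_sync(call("eth_syncing"))` gives a one-line progress summary, and `int(call("eth_blockNumber"), 16)` gives the latest block number the node knows about.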
len() in CPython
The len() function in CPython returns the length of the container passed to it. len() works on all built-in containers. Comments on the Internet claim that the time complexity of len() is O(1), but they fail to substantiate it. This is not well documented officially either, which leaves room for disagreement. I've been on interview debriefs where interviewers have confidently claimed the time complexity to be O(n), which, of course, is wrong. For the built-in containers, the time complexity of len() is indeed O(1), always.
Under the hood, CPython is a C program, hence the name. (There's also Jython, which, as you can guess, is Python on the JVM; apart from following the Python language spec, it has little to do with the CPython implementation or runtime.) Its entry point looks like this:
int
Py_Main(int argc, wchar_t **argv)
{
…
return pymain_main(&args);
}
All native Python objects, including containers, have their representations and behaviours defined by C structs and C functions respectively. listobject.h, for instance, is the header file for the CPython list object. All objects in CPython are of type PyObject. The length of an object is returned by its Py*_Size() function. For the list object, that's PyList_Size().
Py_ssize_t
PyList_Size(PyObject *op)
{
…
return PyList_GET_SIZE(op);
}
PyList_Size() merely calls PyList_GET_SIZE(), which resolves to Py_SIZE(), defined in object.h. Py_SIZE() looks up the ob_size field on the generic PyVarObject.
static inline Py_ssize_t Py_SIZE(PyObject *ob) {
…
return var_ob->ob_size;
}
We can see from the definition of PyVarObject that ob_size is of type Py_ssize_t, which on most platforms is a typedef for ssize_t.
typedef struct {
PyObject ob_base;
Py_ssize_t ob_size; /* Number of items in variable part */
} PyVarObject;
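As a rough way to see this field from Python itself, the sketch below reads ob_size straight out of a list's memory with `ctypes`. It leans on two CPython implementation details, so treat it as an illustration only: `id()` returns the object's memory address, and ob_size sits immediately after the PyObject header, whose size is `object.__basicsize__`. The helper name `read_ob_size` is hypothetical.

```python
import ctypes


def read_ob_size(obj) -> int:
    """Read the ob_size field of a variable-size CPython object such as a list or tuple."""
    # Assumption (CPython only): id(obj) is the address of the object, and
    # PyVarObject is laid out as { PyObject ob_base; Py_ssize_t ob_size; }.
    header_size = object.__basicsize__  # sizeof(PyObject)
    return ctypes.c_ssize_t.from_address(id(obj) + header_size).value


items = [10, 20, 30]
assert read_ob_size(items) == len(items)

items.append(40)  # the list updates ob_size itself; len() only reads it back
assert read_ob_size(items) == len(items)
```

This only makes sense for types that use ob_size as an item count (lists, tuples, bytes). It is not a general replacement for len() and should not be used on arbitrary objects.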
Determining the length is therefore a field look-up, an O(1) operation, for lists. Strings behave the same way: a str object also stores its length in a field.
The time-complexity analysis for PyDict_Size() is similar.
Py_ssize_t
PyDict_Size(PyObject *mp)
{
…
return ((PyDictObject *)mp)->ma_used;
}
ma_used, which represents the number of items in the dict, is incremented every time a new item is added.
static int
insertdict(PyInterpreterState *interp, PyDictObject *mp,
PyObject *key, Py_hash_t hash, PyObject *value)
{
…
mp->ma_used++;
…
Similarly, item deletions decrement ma_used.
static int
delitem_common(PyDictObject *mp, Py_hash_t hash, Py_ssize_t ix,
PyObject *old_value, uint64_t new_version)
{
…
mp->ma_used--;
…
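The bookkeeping pattern in these two functions can be modelled in a few lines of Python. The toy class below is not how dict is implemented (it is a flat list of pairs with linear-time lookups, and the name `CountedPairs` is made up). It only illustrates the idea of maintaining a counter on every insert and delete so that `__len__` never has to count anything.

```python
class CountedPairs:
    """Toy key/value store that tracks its size the way dict tracks ma_used."""

    def __init__(self):
        self._pairs = []  # list of (key, value) tuples
        self._used = 0    # plays the role of ma_used

    def _find(self, key) -> int:
        """Return the index of key in the pair list, or -1 if absent."""
        for index, (existing, _) in enumerate(self._pairs):
            if existing == key:
                return index
        return -1

    def __setitem__(self, key, value):
        index = self._find(key)
        if index >= 0:
            self._pairs[index] = (key, value)  # overwrite: size unchanged
        else:
            self._pairs.append((key, value))
            self._used += 1  # mirrors mp->ma_used++ on insertion

    def __delitem__(self, key):
        index = self._find(key)
        if index < 0:
            raise KeyError(key)
        del self._pairs[index]
        self._used -= 1  # mirrors mp->ma_used-- on deletion

    def __len__(self):
        return self._used  # a field read, no iteration
```

The cost of keeping the count is paid in small constant increments at mutation time, which is what makes the later length query a plain read.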
So len() invoked on dict objects is an O(1) operation too. You can perform similar exercises with sets, tuples, bytes, and bytearrays. For the built-in containers, length lookups are always O(1).
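One caveat on "always": the guarantee comes from each built-in type storing its size, not from len() itself, which simply asks the object for its length. A user-defined class can implement `__len__` however it likes. The hypothetical linked list below (not from the standard library) counts its nodes on every call, so len() on it is O(n).

```python
class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


class LinkedList:
    """Singly linked list that does NOT cache its size."""

    def __init__(self, values=()):
        self.head = None
        # Build back to front so the list preserves the input order.
        for value in reversed(list(values)):
            self.head = Node(value, self.head)
        self.steps = 0  # nodes visited by __len__, kept only for illustration

    def __len__(self):
        count = 0
        node = self.head
        while node is not None:  # walks every node: O(n)
            count += 1
            self.steps += 1
            node = node.next
        return count
```

Storing a counter that is updated on insert and delete, as dict does with ma_used, would bring this back to O(1).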