<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Bruno Calza - Blog</title>
  <link href="https://bcalza.b-cdn.net/blog/" />
  <link type="application/atom+xml" rel="self" href="https://bcalza.b-cdn.net/blog-feed.xml" />
  <updated>2026-04-13T15:15:07+00:00</updated>
  <id>https://bcalza.b-cdn.net/blog/</id>
  <author>
    <name>Bruno Calza</name>
  </author>
  
    <entry>
      <title>Building a Grow-Only Counter on a Sequentially Consistent KV Store</title>
      <link href="https://bcalza.b-cdn.net/blog/2026/04/13/building-a-grow-only-counter-on-a-sequentially-consistent-kv-store.html" />
      <id>https://bcalza.b-cdn.net/blog/2026/04/13/building-a-grow-only-counter-on-a-sequentially-consistent-kv-store</id>
      <updated>2026-04-13T15:15:07+00:00</updated>
      <content type="html">
        &lt;blockquote class=&quot;callout&quot;&gt;
  &lt;p&gt;This post is part of a series on Fly.io’s &lt;a href=&quot;https://fly.io/dist-sys/&quot;&gt;distributed systems challenges&lt;/a&gt;:&lt;/p&gt;

  &lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;/blog/2026/04/07/implementing-snowflake-unique-id-generation&quot;&gt;Implementing Snowflake Unique ID Generation&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;/blog/2026/04/08/generating-unique-ids-with-raft-consensus&quot;&gt;Generating Unique IDs with Raft Consensus&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;/blog/2026/04/09/flyio-broadcast-challenges&quot;&gt;Fly.io’s Broadcast Challenges&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;Building a Grow-Only Counter on a Sequentially Consistent KV Store (this post)&lt;/li&gt;
  &lt;/ul&gt;
&lt;/blockquote&gt;

&lt;p&gt;We’re going to discuss &lt;a href=&quot;https://fly.io/dist-sys/4/&quot;&gt;Challenge #4: Grow-Only Counter&lt;/a&gt;. This challenge is particularly tricky. I wouldn’t say it’s hard, but if the goal is to learn, there’s a lot to unpack.&lt;/p&gt;

&lt;p&gt;The task is to build a grow-only counter. Nothing strange so far. However, the specification says to build the counter on top of Maelstrom’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SeqKV&lt;/code&gt; built-in service, and that’s where things get weird. In this post, I explore that weirdness to the best of my ability. I also briefly touch on CRDTs (&lt;a href=&quot;https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type&quot;&gt;conflict-free replicated data types&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;First, let’s understand what is going on. The following test needs to pass:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;./maelstrom &lt;span class=&quot;nb&quot;&gt;test&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-w&lt;/span&gt; g-counter &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--bin&lt;/span&gt; ~/go/bin/maelstrom-counter &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--node-count&lt;/span&gt; 3 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--rate&lt;/span&gt; 100 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--time-limit&lt;/span&gt; 20 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--nemesis&lt;/span&gt; partition
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;g-counter&lt;/code&gt; workload sends two types of requests, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;add&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;read&lt;/code&gt;, which our nodes need to handle.&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# add request&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;s2&quot;&gt;&quot;type&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;add&quot;&lt;/span&gt;,
  &lt;span class=&quot;s2&quot;&gt;&quot;delta&quot;&lt;/span&gt;: 123
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# add response&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;s2&quot;&gt;&quot;type&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;add_ok&quot;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# read request&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;s2&quot;&gt;&quot;type&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;read&quot;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# read response&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;s2&quot;&gt;&quot;type&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;read_ok&quot;&lt;/span&gt;,
  &lt;span class=&quot;s2&quot;&gt;&quot;value&quot;&lt;/span&gt;: 1234
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Essentially, the test checks that after all &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;add&lt;/code&gt;s are done, every node’s final read sees the full sum.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/jepsen-io/maelstrom/blob/main/doc/services.md#seq-kv&quot;&gt;SeqKV&lt;/a&gt; is a key-value store that the node can use to build the algorithm. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SeqKV&lt;/code&gt; offers the following API:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ReadInt(key string) -&amp;gt; int&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Write(key string, value any)&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CompareAndSwap(key string, from any, to any, createIfNotExists bool)&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And that’s all we need to get started.&lt;/p&gt;

&lt;h2 id=&quot;starting-simple&quot;&gt;Starting simple&lt;/h2&gt;

&lt;p&gt;Let’s use the key &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;counter&lt;/code&gt; in our key-value store to store the value of our counter. So we initialize our nodes with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;counter&lt;/code&gt; set to 0.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Handle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;init&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maelstrom&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Message&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;error&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kv&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Background&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;counter&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Then whenever an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;add&lt;/code&gt; request comes in, we read the value, update it, and write it back:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Handle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;add&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maelstrom&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Message&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;error&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;req&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;Type&lt;/span&gt;  &lt;span class=&quot;kt&quot;&gt;string&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;`json:&quot;type&quot;`&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;Delta&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;    &lt;span class=&quot;s&quot;&gt;`json:&quot;delta&quot;`&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;json&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Unmarshal&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Body&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;req&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;old&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kv&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ReadInt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Background&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;counter&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;nb&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;old&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;req&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Delta&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kv&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Background&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;counter&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;response&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;Type&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;string&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;`json:&quot;type&quot;`&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Reply&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;response&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;add_ok&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And when a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;read&lt;/code&gt; request arrives:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Handle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;read&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maelstrom&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Message&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;error&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kv&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ReadInt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Background&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;counter&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;response&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;Type&lt;/span&gt;  &lt;span class=&quot;kt&quot;&gt;string&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;`json:&quot;type&quot;`&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;    &lt;span class=&quot;s&quot;&gt;`json:&quot;value&quot;`&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Reply&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;response&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;read_ok&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If you’re experienced with this kind of thing, you probably already know this won’t work, and you’d be right. Still, it’s a good starting point and a way to validate that Maelstrom’s checks are solid. If you’re not experienced, the reason it doesn’t work is that the read-modify-write inside &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;add&lt;/code&gt; runs concurrently on multiple nodes. If two nodes both read &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;counter = 5&lt;/code&gt; at the same time, each writes &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;5 + delta&lt;/code&gt; using its own delta, and whichever write lands last overwrites the other, so one increment is lost.&lt;/p&gt;
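&lt;p&gt;To make the lost update concrete, here is a minimal sketch that replays that interleaving by hand. It does not talk to Maelstrom: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kv&lt;/code&gt; map and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;interleavedAdds&lt;/code&gt; helper are hypothetical names used only for illustration.&lt;/p&gt;

```go
package main

import "fmt"

// kv is a hypothetical in-memory stand-in for the shared store.
// It is an assumption for illustration, not Maelstrom's SeqKV client.
type kv map[string]int

// interleavedAdds replays the problematic schedule: both nodes read
// the counter before either one writes it back.
func interleavedAdds(store kv, deltaA, deltaB int) int {
	readA := store["counter"]         // node A reads 5
	readB := store["counter"]         // node B reads the same 5
	store["counter"] = readA + deltaA // node A writes 5 + deltaA
	store["counter"] = readB + deltaB // node B overwrites A's write
	return store["counter"]
}

func main() {
	store := kv{"counter": 5}
	got := interleavedAdds(store, 2, 3)
	// prints "expected: 10 got: 8"
	fmt.Println("expected:", 5+2+3, "got:", got)
}
```

&lt;p&gt;Node A’s increment of 2 disappears because node B computed its new value from a read that was already stale.&lt;/p&gt;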

&lt;p&gt;There are two solutions to this problem: one is to make that operation atomic, and the other is to make sure the nodes’ writes don’t conflict with one another. The latter approach takes us into the CRDT world.&lt;/p&gt;

&lt;h2 id=&quot;solving-read-modify-write-with-cas&quot;&gt;Solving read-modify-write with CAS&lt;/h2&gt;

&lt;p&gt;We saw that our key-value store offers a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CompareAndSwap&lt;/code&gt; method. With that, we can make our operation atomic:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;old&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kv&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ReadInt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Background&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;counter&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kv&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CompareAndSwap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Background&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;counter&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;old&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;old&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;req&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Delta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;c&quot;&gt;// CAS failed, so retry with a fresh read&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;continue&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;break&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I’m being a bit sloppy here with this infinite loop. Ideally, we’d have a timeout and would also check for specific &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CompareAndSwap&lt;/code&gt; errors, so we don’t retry all of them. But you get the idea.&lt;/p&gt;
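&lt;p&gt;As a rough sketch of what a more careful loop could look like, the example below caps the number of attempts and retries only when the compare-and-swap reports a stale value. Everything in it is an assumption made for illustration: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fakeKV&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;errPreconditionFailed&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;addWithRetry&lt;/code&gt; are hypothetical names, and the store is an in-memory stand-in rather than Maelstrom’s client. With the real client you would inspect the error it returns instead.&lt;/p&gt;

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errPreconditionFailed is a hypothetical error for a CAS whose
// "from" value no longer matches what is stored.
var errPreconditionFailed = errors.New("precondition failed")

// fakeKV is an in-memory stand-in for the store (an assumption,
// not Maelstrom's SeqKV client).
type fakeKV struct {
	mu   sync.Mutex
	data map[string]int
}

func newFakeKV() *fakeKV {
	s := new(fakeKV)
	s.data = map[string]int{}
	return s
}

func (s *fakeKV) read(key string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.data[key]
}

func (s *fakeKV) compareAndSwap(key string, from, to int) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.data[key] != from {
		return errPreconditionFailed
	}
	s.data[key] = to
	return nil
}

// addWithRetry retries only on a stale precondition and gives up
// after maxAttempts tries.
func addWithRetry(s *fakeKV, key string, delta, maxAttempts int) error {
	for attempt := maxAttempts; attempt > 0; attempt-- {
		old := s.read(key)
		err := s.compareAndSwap(key, old, old+delta)
		if err == nil {
			return nil
		}
		if !errors.Is(err, errPreconditionFailed) {
			return err // unexpected error: do not retry
		}
	}
	return fmt.Errorf("add to %q: gave up after %d attempts", key, maxAttempts)
}

func main() {
	s := newFakeKV()
	var wg sync.WaitGroup
	for i := 0; i != 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if err := addWithRetry(s, "counter", 1, 100); err != nil {
				fmt.Println("add failed:", err)
			}
		}()
	}
	wg.Wait()
	fmt.Println(s.read("counter")) // prints 10
}
```

&lt;p&gt;A cap on attempts is the simplest way to bound the loop. A context with a deadline would serve the same purpose when the store is reached over the network.&lt;/p&gt;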

&lt;p&gt;If you run the test, you may get a valid result. If you’re not curious about why it passed, you’ll probably move on to the next challenge and miss a lot of learning, because the same solution may also produce an invalid result. That’s where the weirdness starts: the outcome of this solution is not deterministic.&lt;/p&gt;

&lt;p&gt;Before exploring that weirdness, I’d like to discuss an alternative solution that touches a bit on what a CRDT looks like.&lt;/p&gt;

&lt;h2 id=&quot;crdt-like-solution&quot;&gt;CRDT-like solution&lt;/h2&gt;

&lt;p&gt;There are different kinds of CRDTs, and the G-Counter is one of them. If we look at the definition of the G-Counter CRDT on Wikipedia (&lt;a href=&quot;https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type#G-Counter_(Grow-only_Counter)&quot;&gt;link&lt;/a&gt;), we’ll see:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;payload integer[n] P
    initial &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;0,0,...,0]

update increment&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;nb&quot;&gt;let &lt;/span&gt;g &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; myId&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt;
    P[g] :&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; P[g] + 1

query value&lt;span class=&quot;o&quot;&gt;()&lt;/span&gt; : integer v
    &lt;span class=&quot;nb&quot;&gt;let &lt;/span&gt;v &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; Σi P[i]

compare &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;X, Y&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; : boolean b
    &lt;span class=&quot;nb&quot;&gt;let &lt;/span&gt;b &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;∀i ∈ &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;0, n - 1] : X.P[i] ≤ Y.P[i]&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;

merge &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;X, Y&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; : payload Z
    &lt;span class=&quot;nb&quot;&gt;let&lt;/span&gt; ∀i ∈ &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;0, n - 1] : Z.P[i] &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; max&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;X.P[i], Y.P[i]&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Let’s ignore &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;compare&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;merge&lt;/code&gt; for a moment. If we squint and think a bit, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;payload integer&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;update increment&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;query value&lt;/code&gt; kind of map to the responsibilities of our node: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;init&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;add&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;read&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Roughly, what it says is that the initial state is a vector &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[0,0,...,0]&lt;/code&gt;. When a node needs to increment the counter, it updates only its own entry (Wikipedia shows &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;+1&lt;/code&gt;; we generalize to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;+delta&lt;/code&gt;). See the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;myId&lt;/code&gt; and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;P[g] := P[g] + 1&lt;/code&gt; operation? And when we read the counter value, we sum all the entries of the vector. Interesting.&lt;/p&gt;

&lt;p&gt;Let’s map this to what we have. Our state is stored in a key-value store, which does not offer a vector data structure. However, since each node only updates its own part of the vector, we can create one key per node and treat that set of keys as the vector. Using the node ID as the key, our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;init&lt;/code&gt; becomes:&lt;/p&gt;
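&lt;p&gt;Before mapping this onto the key-value store, here is a minimal in-memory sketch of the same idea, with the vector held in a single map. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gCounter&lt;/code&gt; type and its methods are hypothetical names chosen for this example, and the node IDs are made up.&lt;/p&gt;

```go
package main

import "fmt"

// gCounter plays the role of the vector P: it maps a node ID to the
// total that node has added so far. The type is an illustration only.
type gCounter map[string]int

// add bumps only the entry owned by nodeID (P[g] := P[g] + delta).
func (c gCounter) add(nodeID string, delta int) {
	c[nodeID] += delta
}

// value sums every entry of the vector.
func (c gCounter) value() int {
	total := 0
	for _, v := range c {
		total += v
	}
	return total
}

func main() {
	c := gCounter{}
	c.add("n0", 3) // node n0 touches only its own entry
	c.add("n1", 4) // node n1 touches only its own entry
	c.add("n1", 1)
	fmt.Println(c.value()) // prints 8
}
```

&lt;p&gt;Because no two nodes ever write the same entry, there is nothing for concurrent increments to overwrite.&lt;/p&gt;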

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Handle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;init&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maelstrom&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Message&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;error&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kv&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Background&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Our read-modify-write becomes:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;old&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kv&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ReadInt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Background&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;old&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;req&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Delta&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kv&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Background&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Our counter is now spread across several keys, and the definition we saw on Wikipedia says that we should sum them to get the real value. Let’s do that:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sum&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nodeID&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;range&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;NodeIDs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kv&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ReadInt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Background&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nodeID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;sum&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And, as before, if we run the test, we may get a valid or an invalid result.&lt;/p&gt;

&lt;p&gt;What happened here is that, by making each node work on its own counter, nodes no longer conflict with one another when writing, which eliminates the race condition we had.&lt;/p&gt;

&lt;p&gt;So is this a CRDT? Not quite. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;merge&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;compare&lt;/code&gt; pieces, the part of the definition that we ignored, are what would make this a CRDT. But what are those? Well, a CRDT is meant to work in a decentralized setting. The fact that we’re using a centralized key-value store to build our solution means we have eliminated the need for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;merge&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;compare&lt;/code&gt;. If we were to remove the key-value store, we would need a way for nodes to know which values the other nodes have. Currently, it is the key-value store that is doing this job. A node can always read from the key-value store to know what the other nodes have; that’s how our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;read&lt;/code&gt; works. In a real CRDT, the key-value store would be replaced by a gossip algorithm that broadcasts each node’s local state, with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;compare&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;merge&lt;/code&gt; used to reconcile it with the state of the others.&lt;/p&gt;
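&lt;p&gt;To make &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;merge&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;compare&lt;/code&gt; concrete, here is a minimal sketch of a state-based G-Counter. It follows the general definition rather than the code from this challenge, and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GCounter&lt;/code&gt; type and its method names are hypothetical, chosen only for illustration:&lt;/p&gt;

```go
package main

// GCounter is a hypothetical state-based grow-only counter:
// one monotonically increasing slot per node ID.
type GCounter map[string]int

// Increment adds delta to the slot owned by nodeID.
func (c GCounter) Increment(nodeID string, delta int) {
	c[nodeID] += delta
}

// Value is the sum of all slots, like the read loop in the post.
func (c GCounter) Value() int {
	sum := 0
	for _, v := range c {
		sum += v
	}
	return sum
}

// Compare reports whether other has seen at least everything c has,
// that is, no slot in c is larger than the same slot in other.
func (c GCounter) Compare(other GCounter) bool {
	for id, v := range c {
		if v > other[id] {
			return false
		}
	}
	return true
}

// Merge folds other into c by taking the per-slot maximum.
func (c GCounter) Merge(other GCounter) {
	for id, v := range other {
		if v > c[id] {
			c[id] = v
		}
	}
}
```

&lt;p&gt;Because &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Merge&lt;/code&gt; takes a per-slot maximum, applying it twice or in a different order gives the same result, which is what lets nodes gossip their whole state freely and still converge.&lt;/p&gt;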

&lt;p&gt;I did not explore this approach in this challenge, but I wanted to share it to make the learnings more complete. In the future, I’ll see if I can use the approaches discussed in &lt;a href=&quot;/blog/2026/04/09/flyio-broadcast-challenges&quot;&gt;Fly.io’s Broadcast Challenges&lt;/a&gt;, add the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;merge&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;compare&lt;/code&gt;, and pass the test.&lt;/p&gt;

&lt;h2 id=&quot;on-consistency-models&quot;&gt;On consistency models&lt;/h2&gt;

&lt;h3 id=&quot;sequential-consistency&quot;&gt;Sequential consistency&lt;/h3&gt;

&lt;p&gt;So, we discussed two solutions, but neither of them passes the test deterministically. We need to figure out why and fix our code. The fact that the problem specification suggests the use of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SeqKV&lt;/code&gt; gives us a hint about where our issue may lie. And proposing its use may indicate something about the author’s pedagogical intentions.&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SeqKV&lt;/code&gt; is a &lt;strong&gt;sequentially consistent&lt;/strong&gt; key-value store. To understand what that means, we have to understand what a &lt;strong&gt;consistency model&lt;/strong&gt; is, because &lt;strong&gt;sequential consistency&lt;/strong&gt; is one of many. One way of thinking about the consistency model of a data store is as a contract between clients and the data store. If clients follow certain rules, the data store promises that it will behave in a certain way. You might be thinking, what are these behaviors, and why don’t all data stores behave the same way? Well, in a distributed system with concurrent clients where nodes might fail, messages can get delayed, and clocks are unreliable, weird things can happen, and different kinds of behaviors emerge. We can call these odd behaviors anomalies. These anomalies have been categorized, and a consistency model is essentially defined by the set of anomalies that it allows. The stronger the model, the fewer anomalies it allows. The reason not every data store picks the strongest model is that stronger guarantees come at a cost in availability and performance that you might not be willing to pay. &lt;a href=&quot;https://jepsen.io/consistency/models&quot;&gt;Consistency Models&lt;/a&gt; is a good reference on the different consistency models.&lt;/p&gt;

&lt;p&gt;Now we need to understand the behavior that a &lt;strong&gt;sequentially consistent&lt;/strong&gt; key-value store promises its clients.&lt;/p&gt;

&lt;p&gt;In a &lt;strong&gt;sequentially consistent model&lt;/strong&gt;, all operations must appear to take place in some total order, and that order must be consistent with each client’s program order. However, the total order does not have to match the order in which the operations happened in real time. This is the anomaly it allows.&lt;/p&gt;

&lt;p&gt;I’ll use Maelstrom’s &lt;a href=&quot;https://github.com/jepsen-io/maelstrom/blob/main/doc/services.md#seq-kv&quot;&gt;example&lt;/a&gt; to explain further. Suppose two clients execute the following operations in &lt;strong&gt;real-time order&lt;/strong&gt;:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;1. client1: write x = 1
2. client2: CAS x (1 -&amp;gt; 2)
3. client1: write x = 1
4. client2: read x = 2
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You might be thinking that there’s something wrong with the fact that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;client2&lt;/code&gt; read &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x = 2&lt;/code&gt;. But that anomaly is totally valid under &lt;strong&gt;sequential consistency&lt;/strong&gt;. Maybe the third operation got delayed, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;client2&lt;/code&gt; is doing a &lt;strong&gt;stale read&lt;/strong&gt;. Or the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CAS&lt;/code&gt; got delayed and happened after the second write. The following two orders are valid:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;1. client1: write x = 1
3. client1: write x = 1
2. client2: CAS x (1 -&amp;gt; 2)
4. client2: read x = 2

1. client1: write x = 1
2. client2: CAS x (1 -&amp;gt; 2)
4. client2: read x = 2
3. client1: write x = 1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
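&lt;p&gt;One way to convince yourself is to replay a candidate order against a single register and check two things: each client’s operations keep their program order, and every CAS and read observes the value the register holds at that point. The sketch below does that. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Op&lt;/code&gt; type and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;legal&lt;/code&gt; function are hypothetical helpers for illustration, not part of Maelstrom:&lt;/p&gt;

```go
package main

// Op is a hypothetical record of one operation from the example.
type Op struct {
	Client string // issuing client
	Step   int    // position in the real-time listing (1..4)
	Kind   string // "write", "cas", or "read"
	From   int    // expected current value (cas only)
	Value  int    // value written, swapped in, or read
}

// legal reports whether order is a valid sequentially consistent
// execution of the operations on a single register x.
func legal(order []Op) bool {
	x, set := 0, false
	last := map[string]int{} // last step seen per client
	for _, op := range order {
		// Program order: a client's steps must keep increasing.
		if prev, ok := last[op.Client]; ok {
			if prev >= op.Step {
				return false
			}
		}
		last[op.Client] = op.Step

		switch op.Kind {
		case "write":
			x, set = op.Value, true
		case "cas":
			if !set || x != op.From {
				return false
			}
			x = op.Value
		case "read":
			if !set || x != op.Value {
				return false
			}
		}
	}
	return true
}
```

&lt;p&gt;Feeding it the real-time listing (1, 2, 3, 4) returns &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;false&lt;/code&gt;, because the read of 2 would come right after the second write of 1, while both reorderings above return &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;true&lt;/code&gt;.&lt;/p&gt;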

&lt;p&gt;We now understand &lt;strong&gt;sequential consistency&lt;/strong&gt;. Let’s see if that is enough to understand our results.&lt;/p&gt;

&lt;h3 id=&quot;does-sequential-consistency-explain-our-results&quot;&gt;Does sequential consistency explain our results?&lt;/h3&gt;

&lt;p&gt;Here’s one of the results I got in one of my runs:&lt;/p&gt;

&lt;div class=&quot;language-json highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;err&quot;&gt;:workload&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;:valid?&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;:errors&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;(#jepsen.history.Op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;:index&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3938&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                                &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;:time&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;30006761427&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                                &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;:type&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;:ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                                &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;:process&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                                &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;:f&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;:read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                                &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;:value&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1343&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
                                &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;:final?&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;:final-reads&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1345&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1343&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1345&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;:acceptable&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1345&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1345&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Process &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0&lt;/code&gt; read &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1343&lt;/code&gt;, but &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1345&lt;/code&gt; was expected. Okay, given our understanding of sequential consistency, it might be the case that, for this run, the final read happened before the last write, so we’re getting a stale value. Reasonable. It’s a valid anomaly. And for some runs the anomaly does not happen, and the test passes.&lt;/p&gt;

&lt;p&gt;That kind of explains our results. But there’s something deeper going on, and to see it we need to look at how the final results are checked. If we look at the Maelstrom &lt;a href=&quot;https://github.com/jepsen-io/maelstrom/blob/main/src/maelstrom/core.clj#L74-L80&quot;&gt;source code of the test run&lt;/a&gt;, we’ll see the following pattern:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Run the main workload&lt;/li&gt;
  &lt;li&gt;Heal partitions&lt;/li&gt;
  &lt;li&gt;Sleep for 10 seconds. Waiting for recovery…&lt;/li&gt;
  &lt;li&gt;Send the final reads to all nodes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So, there’s a cooldown period of 10 seconds before the final reads happen. That is strange. Our read is still stale even after 10 seconds. You could be asking, how long do we have to wait for writes to converge?&lt;/p&gt;

&lt;p&gt;I asked that question. And the answer to that leads to a very important realization about consistency models. Maelstrom is a platform built for learning purposes. So &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SeqKV&lt;/code&gt; was built as a way to illustrate what sequential consistency might look like in practice. It is a key-value store that simulates a sequential-consistency-ish behavior. But the thing is that &lt;strong&gt;sequential consistency only defines what anomalies are allowed, not what anomalies must occur and how they occur.&lt;/strong&gt; Our final stale read that never seems to get the most recent value, although it’s a valid anomaly, is a behavior that has nothing to do with &lt;strong&gt;sequential consistency&lt;/strong&gt;, in the sense that different databases, under the same consistency model, might have a different manifestation of that anomaly.&lt;/p&gt;

&lt;h3 id=&quot;the-hacky-solution&quot;&gt;The hacky solution&lt;/h3&gt;

&lt;p&gt;So, the kind of unfortunate conclusion is that to solve the challenge, we really need to understand some implementation details of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SeqKV&lt;/code&gt;. Just understanding sequential consistency is not enough.&lt;/p&gt;

&lt;p&gt;By reading &lt;a href=&quot;https://github.com/jepsen-io/maelstrom/issues/39&quot;&gt;Compare-and-swap on seq-kv&lt;/a&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SeqKV&lt;/code&gt; &lt;a href=&quot;https://github.com/jepsen-io/maelstrom/blob/main/src/maelstrom/service.clj#L157-L210&quot;&gt;source code&lt;/a&gt;, you get a sense of what is going on. The two crucial parts: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SeqKV&lt;/code&gt; has some &lt;a href=&quot;https://github.com/jepsen-io/maelstrom/blob/main/src/maelstrom/service.clj#L169-L174&quot;&gt;randomness&lt;/a&gt; on reads allowing the node to pick state from history, and &lt;a href=&quot;https://github.com/jepsen-io/maelstrom/blob/main/src/maelstrom/service.clj#L183C1-L198C1&quot;&gt;if a change of state occurs&lt;/a&gt; the client is forced to have the most up-to-date view of the data store.&lt;/p&gt;

&lt;p&gt;And with that we can come up with a solution for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;g-counter&lt;/code&gt; that uses &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SeqKV&lt;/code&gt;. If clients are forced to have the most up-to-date view when a change of state occurs, we can simply do a write of a unique value before reads:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;kv&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Background&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;rand&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Now&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;UnixMilli&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Any read that comes after the write will not be stale anymore. It is important that the value be unique. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SeqKV&lt;/code&gt; only forces clients to the latest state when the write actually changes state. If the value matches what’s already there, nothing changes and the reads stay stale.&lt;/p&gt;

&lt;p&gt;The “write forces clients to the latest state” behavior is just how Maelstrom’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SeqKV&lt;/code&gt; works internally to keep the total order well-defined. Do not take this additional write as a workaround for the anomalies of any sequentially consistent database. You’re not fixing the anomaly. There’s not really any fixing to be done. Your data store operates under a certain consistency model, and you have to deal with that. Of course, in this case, by knowing how &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SeqKV&lt;/code&gt; works internally, we could make a change to our program to behave the way we want it to behave. That’s why it can feel hacky.&lt;/p&gt;

&lt;h2 id=&quot;the-non-hacky-solutions&quot;&gt;The non-hacky solutions&lt;/h2&gt;

&lt;p&gt;If the hack bothers you, there are a couple of approaches to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;g-counter&lt;/code&gt; workload. The more appropriate solution is to implement a CRDT with the gossip mechanism as we discussed. A G-Counter is a CRDT, and the workload is made for that. But if you’re willing to explore the consistency model path, you can switch from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SeqKV&lt;/code&gt; to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LinKV&lt;/code&gt; to get a linearizable key-value store. A linearizable key-value store operates under the linearizable consistency model. Now the total order observed by clients must match real-time order, and we don’t get stale reads. And the tests pass deterministically. If you’re wondering what is required to implement linearizability, you may be able to use the solution we discussed in &lt;a href=&quot;/blog/2026/04/08/generating-unique-ids-with-raft-consensus&quot;&gt;Generating Unique IDs with Raft Consensus&lt;/a&gt; to add consensus on the nodes of your G-Counter. Exploring alternatives makes it easier to understand the trade-offs.&lt;/p&gt;

&lt;p&gt;This challenge was brutal for me. Not in terms of coding. But it took a while to understand all the nuances of what was going on. In any case, it was fun, and I learned a lot. That’s all for this one.&lt;/p&gt;

&lt;p&gt;Solution can be found at &lt;a href=&quot;https://github.com/brunocalza/gossip-glomers/tree/main/maelstrom-g-counter&quot;&gt;maelstrom-g-counter&lt;/a&gt;.&lt;/p&gt;

      </content>
    </entry>
  
    <entry>
      <title>Fly.io&apos;s Broadcast Challenges</title>
      <link href="https://bcalza.b-cdn.net/blog/2026/04/09/flyio-broadcast-challenges.html" />
      <id>https://bcalza.b-cdn.net/blog/2026/04/09/flyio-broadcast-challenges</id>
      <updated>2026-04-09T18:33:59+00:00</updated>
      <content type="html">
        &lt;blockquote class=&quot;callout&quot;&gt;
  &lt;p&gt;This post is part of a series on Fly.io’s &lt;a href=&quot;https://fly.io/dist-sys/&quot;&gt;distributed systems challenges&lt;/a&gt;:&lt;/p&gt;

  &lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;/blog/2026/04/07/implementing-snowflake-unique-id-generation&quot;&gt;Implementing Snowflake Unique ID Generation&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;/blog/2026/04/08/generating-unique-ids-with-raft-consensus&quot;&gt;Generating Unique IDs with Raft Consensus&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;Fly.io’s Broadcast Challenges (this post)&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;/blog/2026/04/13/building-a-grow-only-counter-on-a-sequentially-consistent-kv-store&quot;&gt;Building a Grow-Only Counter on a Sequentially Consistent KV Store&lt;/a&gt;&lt;/li&gt;
  &lt;/ul&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is another post about the &lt;a href=&quot;https://fly.io/dist-sys/&quot;&gt;series of distributed systems challenges&lt;/a&gt; by Fly.io. I’ve talked about using Snowflake and Raft consensus to solve &lt;a href=&quot;https://fly.io/dist-sys/2/&quot;&gt;Challenge #2: Unique ID Generation&lt;/a&gt;, in &lt;a href=&quot;/blog/2026/04/07/implementing-snowflake-unique-id-generation&quot;&gt;Implementing Snowflake Unique ID Generation&lt;/a&gt; and &lt;a href=&quot;/blog/2026/04/08/generating-unique-ids-with-raft-consensus&quot;&gt;Generating Unique IDs with Raft Consensus&lt;/a&gt;. Now we’re going to discuss a set of Broadcast challenges that start at &lt;a href=&quot;https://fly.io/dist-sys/3a/&quot;&gt;Challenge #3a: Single-Node Broadcast&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In this set, we are given the task of implementing a gossip algorithm that is fault-tolerant and efficient. We start with a simple single-node implementation, move towards a multi-node one, then implement fault tolerance, and finally try to improve its efficiency. It is really nice how the challenges are set up and how you can build on top of the previous solution.&lt;/p&gt;

&lt;h2 id=&quot;single-node&quot;&gt;Single-node&lt;/h2&gt;

&lt;p&gt;There’s not much to discuss here. The solution is straightforward: receive the message, record it, and send &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;broadcast_ok&lt;/code&gt; back. The only thing to be aware of is to make sure there are no duplicates.&lt;/p&gt;

&lt;p&gt;code: &lt;a href=&quot;https://github.com/brunocalza/gossip-glomers/tree/main/maelstrom-broadcast/single-node&quot;&gt;single-node&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;multi-node&quot;&gt;Multi-node&lt;/h2&gt;

&lt;p&gt;Keeping it simple is enough for this challenge. We can simply broadcast any new message to every neighbor we know. The interesting thing about this challenge is that only new messages should be forwarded. If that check is not done, messages will loop forever, hopping from node to node and congesting the network. This hints at the importance of keeping network congestion low.&lt;/p&gt;
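&lt;p&gt;The “only new messages” check boils down to a set with an atomic test-and-insert. Here is a minimal sketch, assuming integer messages as in the challenge. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;seenSet&lt;/code&gt; type and its &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;add&lt;/code&gt; method are hypothetical names, not taken from the linked solution:&lt;/p&gt;

```go
package main

import "sync"

// seenSet is a hypothetical concurrency-safe set of message values.
// Handlers may run concurrently, so access is guarded by a mutex.
type seenSet struct {
	mu   sync.Mutex
	seen map[int]struct{}
}

func newSeenSet() *seenSet {
	s := new(seenSet)
	s.seen = map[int]struct{}{}
	return s
}

// add records msg and reports whether it was new. Only when it
// returns true should the caller forward msg to its neighbors.
func (s *seenSet) add(msg int) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.seen[msg]; ok {
		return false
	}
	s.seen[msg] = struct{}{}
	return true
}
```

&lt;p&gt;Doing the lookup and the insert under the same lock matters: with two separate steps, two handlers receiving the same message at once could both decide it is new and both forward it.&lt;/p&gt;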

&lt;p&gt;This algorithm is enough to pass the challenge, but what happens if we introduce network partitions?&lt;/p&gt;

&lt;p&gt;code: &lt;a href=&quot;https://github.com/brunocalza/gossip-glomers/tree/main/maelstrom-broadcast/multi-node&quot;&gt;multi-node&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;fault-tolerance&quot;&gt;Fault tolerance&lt;/h2&gt;

&lt;p&gt;Network partitions are introduced. This means that, for a brief period of time, some nodes cannot reach each other. We need some kind of mechanism to make sure messages eventually make their way to all nodes after the partition is healed.&lt;/p&gt;

&lt;p&gt;Two possible approaches:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Add a retry queue for timed-out messages&lt;/strong&gt;&lt;/p&gt;

    &lt;p&gt;When you try to send a message to a neighbor and it times out, you add that message to a retry queue. Then a background job that listens to the queue periodically sends those messages. The messages only leave the queue when the neighbor acks. There are multiple flavors of this approach, actually. What is important is the strategy: make sure every new message you receive is acknowledged by your neighbors. If we can make sure every message sent is acknowledged, all nodes will eventually have all messages.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Periodic anti-entropy&lt;/strong&gt;&lt;/p&gt;

    &lt;p&gt;We don’t have to immediately forward the new messages. An alternative to that: nodes can periodically exchange messages with their neighbors, detect missing entries, and fill the gaps. Because nodes are periodically running this repair mechanism, eventually, everyone catches up.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
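&lt;p&gt;The repair step of the second approach amounts to a set difference: a node compares what it has seen with what a neighbor reports and sends only the gap. A minimal sketch, assuming each node can share its full set of integer messages. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;missing&lt;/code&gt; function is a hypothetical helper, and the exchange itself is left out:&lt;/p&gt;

```go
package main

import "sort"

// missing returns the messages present in mine but absent from theirs,
// sorted so the result is deterministic.
func missing(mine, theirs map[int]struct{}) []int {
	var gap []int
	for msg := range mine {
		if _, ok := theirs[msg]; !ok {
			gap = append(gap, msg)
		}
	}
	sort.Ints(gap)
	return gap
}
```

&lt;p&gt;Running it in both directions on every exchange lets each side fill the other’s gaps, at the cost of shipping the whole set each round.&lt;/p&gt;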

&lt;p&gt;For this challenge I decided to use the first approach. It is not a very robust retry queue, and its retry intervals are not carefully tuned, but it is good enough for us.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;retries&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;make&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;chan&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RetryMessage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;// have 5 background jobs periodically retrying messages&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;go&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;job&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;range&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;retries&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;req&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;Type&lt;/span&gt;    &lt;span class=&quot;kt&quot;&gt;string&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;`json:&quot;type&quot;`&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;Message&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;    &lt;span class=&quot;s&quot;&gt;`json:&quot;message&quot;`&lt;/span&gt;
            &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;req&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;broadcast&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Message&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;job&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;message&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

            &lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cancel&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;WithTimeout&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Background&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RPCTimeout&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SyncRPC&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;job&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;destination&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;cancel&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;job&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;job&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AfterFunc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RetryInterval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
                    &lt;span class=&quot;n&quot;&gt;retries&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;job&lt;/span&gt;
                &lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
            &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}()&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We create 5 goroutines that listen to the retry queue and resend failed messages. If a resend fails again, the job is re-enqueued after &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RetryInterval&lt;/code&gt;. The issue with this approach is that it retries forever. Infinite retry assumes that nodes can’t crash and that the topology is stable, which is not true in real life. But if we add a retry limit, we violate the broadcast guarantee that all nodes eventually see all messages. So, adding anti-entropy to the mix is probably a good idea, and something I definitely would like to explore in the future.&lt;/p&gt;

&lt;p&gt;code: &lt;a href=&quot;https://github.com/brunocalza/gossip-glomers/tree/main/maelstrom-broadcast/fault-tolerant&quot;&gt;fault-tolerant&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;efficiency&quot;&gt;Efficiency&lt;/h2&gt;

&lt;p&gt;When we’re building an efficient broadcast algorithm, we’re usually trying to optimize for low latency and a low number of messages per operation.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Latency&lt;/strong&gt; is how long it takes for a message to be seen by all nodes. An algorithm that converges quickly is desirable.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Messages per operation&lt;/strong&gt; is the average number of network messages generated per broadcast operation. The lower the &lt;strong&gt;msgs/op&lt;/strong&gt;, the less congested the network is and the fewer resources each node needs, which makes for a more scalable solution.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Essentially, two things have a major impact on both metrics and deserve a closer look: &lt;strong&gt;topology&lt;/strong&gt; and &lt;strong&gt;batching&lt;/strong&gt;.&lt;/p&gt;

&lt;h3 id=&quot;topology&quot;&gt;Topology&lt;/h3&gt;

&lt;p&gt;Let’s look at the following 7-node topology:&lt;/p&gt;

&lt;div class=&quot;graphviz-wrapper&quot;&gt;

&lt;!-- Generated by graphviz version 2.43.0 (0)
 --&gt;
&lt;!-- Title: %3 Pages: 1 --&gt;
&lt;svg role=&quot;img&quot; aria-label=&quot;graphviz-b2b5a5fd099a1f659b5ca26cffda57ac&quot; width=&quot;329pt&quot; height=&quot;318pt&quot; viewBox=&quot;0.00 0.00 328.85 317.71&quot;&gt;
&lt;title&gt;graphviz-b2b5a5fd099a1f659b5ca26cffda57ac&lt;/title&gt;
&lt;desc&gt;
graph {
  layout=circo
  node [shape=ellipse]
  n0 -- n1
  n0 -- n2
  n0 -- n3
  n0 -- n4
  n0 -- n5
  n0 -- n6
  n1 -- n2
  n1 -- n3
  n1 -- n4
  n1 -- n5
  n1 -- n6
  n2 -- n3
  n2 -- n4
  n2 -- n5
  n2 -- n6
  n3 -- n4
  n3 -- n5
  n3 -- n6
  n4 -- n5
  n4 -- n6
  n5 -- n6
}
&lt;/desc&gt;

&lt;g id=&quot;graph0&quot; class=&quot;graph&quot; transform=&quot;scale(1 1) rotate(0) translate(4 313.71)&quot;&gt;
&lt;title&gt;%3&lt;/title&gt;
&lt;polygon fill=&quot;white&quot; stroke=&quot;transparent&quot; points=&quot;-4,4 -4,-313.71 324.85,-313.71 324.85,4 -4,4&quot; /&gt;
&lt;!-- n0 --&gt;
&lt;g id=&quot;node1&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n0&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;241&quot; cy=&quot;-45.11&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;241&quot; y=&quot;-41.41&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n0&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n1 --&gt;
&lt;g id=&quot;node2&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n1&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;293.85&quot; cy=&quot;-154.86&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;293.85&quot; y=&quot;-151.16&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n1&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n0&amp;#45;&amp;#45;n1 --&gt;
&lt;g id=&quot;edge1&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n0&amp;#45;&amp;#45;n1&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M249.25,-62.25C259.12,-82.75 275.55,-116.86 285.47,-137.47&quot; /&gt;
&lt;/g&gt;
&lt;!-- n2 --&gt;
&lt;g id=&quot;node3&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n2&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;27&quot; cy=&quot;-93.95&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;27&quot; y=&quot;-90.25&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n2&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n0&amp;#45;&amp;#45;n2 --&gt;
&lt;g id=&quot;edge2&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n0&amp;#45;&amp;#45;n2&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M215.18,-51C173.94,-60.41 93.66,-78.73 52.57,-88.11&quot; /&gt;
&lt;/g&gt;
&lt;!-- n3 --&gt;
&lt;g id=&quot;node4&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n3&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;27&quot; cy=&quot;-215.76&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;27&quot; y=&quot;-212.06&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n3&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n0&amp;#45;&amp;#45;n3 --&gt;
&lt;g id=&quot;edge3&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n0&amp;#45;&amp;#45;n3&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M223.48,-59.07C183.26,-91.15 84.63,-169.81 44.45,-201.84&quot; /&gt;
&lt;/g&gt;
&lt;!-- n4 --&gt;
&lt;g id=&quot;node5&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n4&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;122.24&quot; cy=&quot;-291.71&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;122.24&quot; y=&quot;-288.01&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n4&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n0&amp;#45;&amp;#45;n4 --&gt;
&lt;g id=&quot;edge4&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n0&amp;#45;&amp;#45;n4&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M232.64,-62.45C210.84,-107.73 152.37,-229.13 130.58,-274.39&quot; /&gt;
&lt;/g&gt;
&lt;!-- n5 --&gt;
&lt;g id=&quot;node6&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n5&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;241&quot; cy=&quot;-264.6&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;241&quot; y=&quot;-260.9&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n5&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n0&amp;#45;&amp;#45;n5 --&gt;
&lt;g id=&quot;edge5&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n0&amp;#45;&amp;#45;n5&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M241,-63.44C241,-104.86 241,-205.25 241,-246.47&quot; /&gt;
&lt;/g&gt;
&lt;!-- n6 --&gt;
&lt;g id=&quot;node7&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n6&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;122.24&quot; cy=&quot;-18&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;122.24&quot; y=&quot;-14.3&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n6&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n0&amp;#45;&amp;#45;n6 --&gt;
&lt;g id=&quot;edge6&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n0&amp;#45;&amp;#45;n6&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M215.23,-39.22C195.31,-34.68 167.86,-28.41 147.96,-23.87&quot; /&gt;
&lt;/g&gt;
&lt;!-- n1&amp;#45;&amp;#45;n2 --&gt;
&lt;g id=&quot;edge7&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n1&amp;#45;&amp;#45;n2&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M268.26,-149.02C217.39,-137.4 104.03,-111.53 52.9,-99.86&quot; /&gt;
&lt;/g&gt;
&lt;!-- n1&amp;#45;&amp;#45;n3 --&gt;
&lt;g id=&quot;edge8&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n1&amp;#45;&amp;#45;n3&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M268.26,-160.7C217.39,-172.31 104.03,-198.18 52.9,-209.85&quot; /&gt;
&lt;/g&gt;
&lt;!-- n1&amp;#45;&amp;#45;n4 --&gt;
&lt;g id=&quot;edge9&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n1&amp;#45;&amp;#45;n4&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M276.45,-168.73C243.56,-194.96 172.61,-251.54 139.68,-277.8&quot; /&gt;
&lt;/g&gt;
&lt;!-- n1&amp;#45;&amp;#45;n5 --&gt;
&lt;g id=&quot;edge10&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n1&amp;#45;&amp;#45;n5&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M285.59,-172C275.72,-192.5 259.29,-226.61 249.37,-247.21&quot; /&gt;
&lt;/g&gt;
&lt;!-- n1&amp;#45;&amp;#45;n6 --&gt;
&lt;g id=&quot;edge11&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n1&amp;#45;&amp;#45;n6&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M276.45,-140.98C243.56,-114.75 172.61,-58.17 139.68,-31.91&quot; /&gt;
&lt;/g&gt;
&lt;!-- n2&amp;#45;&amp;#45;n3 --&gt;
&lt;g id=&quot;edge12&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n2&amp;#45;&amp;#45;n3&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M27,-112.19C27,-135.15 27,-174.38 27,-197.4&quot; /&gt;
&lt;/g&gt;
&lt;!-- n2&amp;#45;&amp;#45;n4 --&gt;
&lt;g id=&quot;edge13&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n2&amp;#45;&amp;#45;n4&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M35.29,-111.15C53.3,-148.55 95.85,-236.91 113.9,-274.41&quot; /&gt;
&lt;/g&gt;
&lt;!-- n2&amp;#45;&amp;#45;n5 --&gt;
&lt;g id=&quot;edge14&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n2&amp;#45;&amp;#45;n5&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M44.51,-107.91C84.74,-140 183.37,-218.65 223.54,-250.69&quot; /&gt;
&lt;/g&gt;
&lt;!-- n2&amp;#45;&amp;#45;n6 --&gt;
&lt;g id=&quot;edge15&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n2&amp;#45;&amp;#45;n6&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M44.47,-80.02C61.65,-66.32 87.73,-45.51 104.88,-31.84&quot; /&gt;
&lt;/g&gt;
&lt;!-- n3&amp;#45;&amp;#45;n4 --&gt;
&lt;g id=&quot;edge16&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n3&amp;#45;&amp;#45;n4&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M44.47,-229.69C61.65,-243.4 87.73,-264.2 104.88,-277.87&quot; /&gt;
&lt;/g&gt;
&lt;!-- n3&amp;#45;&amp;#45;n5 --&gt;
&lt;g id=&quot;edge17&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n3&amp;#45;&amp;#45;n5&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M52.81,-221.65C94.05,-231.07 174.34,-249.39 215.42,-258.77&quot; /&gt;
&lt;/g&gt;
&lt;!-- n3&amp;#45;&amp;#45;n6 --&gt;
&lt;g id=&quot;edge18&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n3&amp;#45;&amp;#45;n6&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M35.29,-198.56C53.3,-161.16 95.85,-72.8 113.9,-35.31&quot; /&gt;
&lt;/g&gt;
&lt;!-- n4&amp;#45;&amp;#45;n5 --&gt;
&lt;g id=&quot;edge19&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n4&amp;#45;&amp;#45;n5&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M148.01,-285.83C167.93,-281.28 195.37,-275.02 215.28,-270.47&quot; /&gt;
&lt;/g&gt;
&lt;!-- n4&amp;#45;&amp;#45;n6 --&gt;
&lt;g id=&quot;edge20&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n4&amp;#45;&amp;#45;n6&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M122.24,-273.33C122.24,-223.46 122.24,-86.07 122.24,-36.3&quot; /&gt;
&lt;/g&gt;
&lt;!-- n5&amp;#45;&amp;#45;n6 --&gt;
&lt;g id=&quot;edge21&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n5&amp;#45;&amp;#45;n6&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M232.64,-247.26C210.84,-201.99 152.37,-80.58 130.58,-35.32&quot; /&gt;
&lt;/g&gt;
&lt;/g&gt;
&lt;/svg&gt;
&lt;/div&gt;

&lt;p&gt;Every node is connected to every other node. What happens when a new message arrives at &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n0&lt;/code&gt;? It broadcasts it to the other 6 nodes. And when that message arrives at &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n1&lt;/code&gt;, it will broadcast the same message to all the remaining nodes (everyone except &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n0&lt;/code&gt;). But that was unnecessary, because &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n0&lt;/code&gt; already sent the message to those nodes. There’s some redundancy going on, generating more messages than required.&lt;/p&gt;

&lt;p&gt;In contrast, let’s look at another topology:&lt;/p&gt;

&lt;div class=&quot;graphviz-wrapper&quot;&gt;

&lt;!-- Generated by graphviz version 2.43.0 (0)
 --&gt;
&lt;!-- Title: %3 Pages: 1 --&gt;
&lt;svg role=&quot;img&quot; aria-label=&quot;graphviz-7539c5d04197b8ff7b95a2cc2311b5fa&quot; width=&quot;278pt&quot; height=&quot;188pt&quot; viewBox=&quot;0.00 0.00 278.00 188.00&quot;&gt;
&lt;title&gt;graphviz-7539c5d04197b8ff7b95a2cc2311b5fa&lt;/title&gt;
&lt;desc&gt;
graph {
  rankdir=TB
  node [shape=ellipse]
  n0 -- n1
  n0 -- n2
  n1 -- n3
  n1 -- n4
  n2 -- n5
  n2 -- n6
}
&lt;/desc&gt;

&lt;g id=&quot;graph0&quot; class=&quot;graph&quot; transform=&quot;scale(1 1) rotate(0) translate(4 184)&quot;&gt;
&lt;title&gt;%3&lt;/title&gt;
&lt;polygon fill=&quot;white&quot; stroke=&quot;transparent&quot; points=&quot;-4,4 -4,-184 274,-184 274,4 -4,4&quot; /&gt;
&lt;!-- n0 --&gt;
&lt;g id=&quot;node1&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n0&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;135&quot; cy=&quot;-162&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;135&quot; y=&quot;-158.3&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n0&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n1 --&gt;
&lt;g id=&quot;node2&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n1&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;99&quot; cy=&quot;-90&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;99&quot; y=&quot;-86.3&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n1&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n0&amp;#45;&amp;#45;n1 --&gt;
&lt;g id=&quot;edge1&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n0&amp;#45;&amp;#45;n1&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M126.65,-144.76C120.83,-133.46 113.11,-118.44 107.3,-107.15&quot; /&gt;
&lt;/g&gt;
&lt;!-- n2 --&gt;
&lt;g id=&quot;node3&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n2&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;171&quot; cy=&quot;-90&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;171&quot; y=&quot;-86.3&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n2&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n0&amp;#45;&amp;#45;n2 --&gt;
&lt;g id=&quot;edge2&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n0&amp;#45;&amp;#45;n2&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M143.35,-144.76C149.17,-133.46 156.89,-118.44 162.7,-107.15&quot; /&gt;
&lt;/g&gt;
&lt;!-- n3 --&gt;
&lt;g id=&quot;node4&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n3&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;27&quot; cy=&quot;-18&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;27&quot; y=&quot;-14.3&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n3&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n1&amp;#45;&amp;#45;n3 --&gt;
&lt;g id=&quot;edge3&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n1&amp;#45;&amp;#45;n3&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M84.43,-74.83C72.02,-62.77 54.27,-45.51 41.8,-33.38&quot; /&gt;
&lt;/g&gt;
&lt;!-- n4 --&gt;
&lt;g id=&quot;node5&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n4&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;99&quot; cy=&quot;-18&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;99&quot; y=&quot;-14.3&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n4&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n1&amp;#45;&amp;#45;n4 --&gt;
&lt;g id=&quot;edge4&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n1&amp;#45;&amp;#45;n4&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M99,-71.7C99,-60.85 99,-46.92 99,-36.1&quot; /&gt;
&lt;/g&gt;
&lt;!-- n5 --&gt;
&lt;g id=&quot;node6&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n5&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;171&quot; cy=&quot;-18&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;171&quot; y=&quot;-14.3&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n5&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n2&amp;#45;&amp;#45;n5 --&gt;
&lt;g id=&quot;edge5&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n2&amp;#45;&amp;#45;n5&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M171,-71.7C171,-60.85 171,-46.92 171,-36.1&quot; /&gt;
&lt;/g&gt;
&lt;!-- n6 --&gt;
&lt;g id=&quot;node7&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n6&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;243&quot; cy=&quot;-18&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;243&quot; y=&quot;-14.3&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n6&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n2&amp;#45;&amp;#45;n6 --&gt;
&lt;g id=&quot;edge6&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n2&amp;#45;&amp;#45;n6&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M185.57,-74.83C197.98,-62.77 215.73,-45.51 228.2,-33.38&quot; /&gt;
&lt;/g&gt;
&lt;/g&gt;
&lt;/svg&gt;
&lt;/div&gt;

&lt;p&gt;When a message arrives at &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n0&lt;/code&gt;, it sends to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n1&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n2&lt;/code&gt;, then &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n1&lt;/code&gt; sends to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n3&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n4&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n2&lt;/code&gt; sends to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n5&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n6&lt;/code&gt;. No redundancy.&lt;/p&gt;

&lt;p&gt;What is happening here is that in the first topology, there are multiple paths between any pair of nodes. So a message will travel through all of those paths. In the second one, there’s only one path between any pair of nodes, meaning there’s no way for a node to ever receive the same message from a different path. The second topology is a graph with no cycles, that is, a tree.&lt;/p&gt;

&lt;p&gt;From that we can conclude that a tree is the minimum needed to reach everyone, and adding extra edges will only increase &lt;strong&gt;msgs/op&lt;/strong&gt;. So, we should aim to work with a topology as close as possible to a tree.&lt;/p&gt;
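&lt;p&gt;To make the redundancy concrete, here is a minimal sketch that counts node-to-node messages for a single broadcast under naive flooding, where a node forwards a message the first time it sees it to every neighbor except the sender. The helper names (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;floodCount&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fullMesh&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;balancedTree&lt;/code&gt;) are made up for illustration and are not part of the linked repository. It ignores acks, replies, and retries, so the numbers are not the &lt;strong&gt;msgs/op&lt;/strong&gt; that Maelstrom would report.&lt;/p&gt;

```go
package main

import "fmt"

// fullMesh builds a topology where every node links to every other node.
func fullMesh(n int) map[int][]int {
	adj := map[int][]int{}
	for i := 0; i != n; i++ {
		for j := 0; j != n; j++ {
			if i != j {
				adj[i] = append(adj[i], j)
			}
		}
	}
	return adj
}

// balancedTree is the 7-node tree from the second diagram.
func balancedTree() map[int][]int {
	return map[int][]int{
		0: {1, 2}, 1: {0, 3, 4}, 2: {0, 5, 6},
		3: {1}, 4: {1}, 5: {2}, 6: {2},
	}
}

// floodCount simulates naive flooding from origin and returns the total
// number of node-to-node messages sent. Duplicate deliveries are counted
// as sent, but a node only forwards a message the first time it sees it.
func floodCount(adj map[int][]int, origin int) int {
	type hop struct{ from, to int }
	seen := map[int]bool{origin: true}
	queue := []hop{}
	sent := 0
	for _, nb := range adj[origin] {
		queue = append(queue, hop{origin, nb})
		sent++
	}
	for len(queue) != 0 {
		h := queue[0]
		queue = queue[1:]
		if seen[h.to] {
			continue // already has the message: nothing to forward
		}
		seen[h.to] = true
		for _, nb := range adj[h.to] {
			if nb != h.from {
				queue = append(queue, hop{h.to, nb})
				sent++
			}
		}
	}
	return sent
}

func main() {
	fmt.Println("full mesh:", floodCount(fullMesh(7), 0)) // 36
	fmt.Println("tree:", floodCount(balancedTree(), 0))   // 6
}
```

&lt;p&gt;Under these assumptions the full mesh sends 36 messages for one broadcast (6 from the origin plus 5 from each of the other 6 nodes), while the tree sends 6, one per edge.&lt;/p&gt;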

&lt;p&gt;Within a tree topology, there are multiple configurations. For example, at one extreme we have the line topology:&lt;/p&gt;

&lt;div class=&quot;graphviz-wrapper&quot;&gt;

&lt;!-- Generated by graphviz version 2.43.0 (0)
 --&gt;
&lt;!-- Title: %3 Pages: 1 --&gt;
&lt;svg role=&quot;img&quot; aria-label=&quot;graphviz-8f54c4d93b9b82d28b6ce989590042a2&quot; width=&quot;602pt&quot; height=&quot;44pt&quot; viewBox=&quot;0.00 0.00 602.00 44.00&quot;&gt;
&lt;title&gt;graphviz-8f54c4d93b9b82d28b6ce989590042a2&lt;/title&gt;
&lt;desc&gt;
graph {
  rankdir=LR
  node [shape=ellipse]
  n0 -- n1 -- n2 -- n3 -- n4 -- n5 -- n6
}
&lt;/desc&gt;

&lt;g id=&quot;graph0&quot; class=&quot;graph&quot; transform=&quot;scale(1 1) rotate(0) translate(4 40)&quot;&gt;
&lt;title&gt;%3&lt;/title&gt;
&lt;polygon fill=&quot;white&quot; stroke=&quot;transparent&quot; points=&quot;-4,4 -4,-40 598,-40 598,4 -4,4&quot; /&gt;
&lt;!-- n0 --&gt;
&lt;g id=&quot;node1&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n0&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;27&quot; cy=&quot;-18&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;27&quot; y=&quot;-14.3&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n0&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n1 --&gt;
&lt;g id=&quot;node2&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n1&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;117&quot; cy=&quot;-18&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;117&quot; y=&quot;-14.3&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n1&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n0&amp;#45;&amp;#45;n1 --&gt;
&lt;g id=&quot;edge1&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n0&amp;#45;&amp;#45;n1&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M54.4,-18C65.64,-18 78.72,-18 89.92,-18&quot; /&gt;
&lt;/g&gt;
&lt;!-- n2 --&gt;
&lt;g id=&quot;node3&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n2&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;207&quot; cy=&quot;-18&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;207&quot; y=&quot;-14.3&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n2&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n1&amp;#45;&amp;#45;n2 --&gt;
&lt;g id=&quot;edge2&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n1&amp;#45;&amp;#45;n2&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M144.4,-18C155.64,-18 168.72,-18 179.92,-18&quot; /&gt;
&lt;/g&gt;
&lt;!-- n3 --&gt;
&lt;g id=&quot;node4&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n3&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;297&quot; cy=&quot;-18&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;297&quot; y=&quot;-14.3&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n3&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n2&amp;#45;&amp;#45;n3 --&gt;
&lt;g id=&quot;edge3&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n2&amp;#45;&amp;#45;n3&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M234.4,-18C245.64,-18 258.72,-18 269.92,-18&quot; /&gt;
&lt;/g&gt;
&lt;!-- n4 --&gt;
&lt;g id=&quot;node5&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n4&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;387&quot; cy=&quot;-18&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;387&quot; y=&quot;-14.3&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n4&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n3&amp;#45;&amp;#45;n4 --&gt;
&lt;g id=&quot;edge4&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n3&amp;#45;&amp;#45;n4&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M324.4,-18C335.64,-18 348.72,-18 359.92,-18&quot; /&gt;
&lt;/g&gt;
&lt;!-- n5 --&gt;
&lt;g id=&quot;node6&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n5&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;477&quot; cy=&quot;-18&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;477&quot; y=&quot;-14.3&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n5&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n4&amp;#45;&amp;#45;n5 --&gt;
&lt;g id=&quot;edge5&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n4&amp;#45;&amp;#45;n5&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M414.4,-18C425.64,-18 438.72,-18 449.92,-18&quot; /&gt;
&lt;/g&gt;
&lt;!-- n6 --&gt;
&lt;g id=&quot;node7&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n6&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;567&quot; cy=&quot;-18&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;567&quot; y=&quot;-14.3&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n6&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n5&amp;#45;&amp;#45;n6 --&gt;
&lt;g id=&quot;edge6&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n5&amp;#45;&amp;#45;n6&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M504.4,-18C515.64,-18 528.72,-18 539.92,-18&quot; /&gt;
&lt;/g&gt;
&lt;/g&gt;
&lt;/svg&gt;
&lt;/div&gt;

&lt;p&gt;An operation at node &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n0&lt;/code&gt; will hop from node to node until it reaches the last one. It takes quite a while for an operation to travel through the network.&lt;/p&gt;

&lt;p&gt;At the other extreme, we have the star topology:&lt;/p&gt;

&lt;div class=&quot;graphviz-wrapper&quot;&gt;

&lt;!-- Generated by graphviz version 2.43.0 (0)
 --&gt;
&lt;!-- Title: %3 Pages: 1 --&gt;
&lt;svg role=&quot;img&quot; aria-label=&quot;graphviz-c41bdccf1b26dd42e0a181e45104d297&quot; width=&quot;206pt&quot; height=&quot;209pt&quot; viewBox=&quot;0.00 0.00 205.76 209.07&quot;&gt;
&lt;title&gt;graphviz-c41bdccf1b26dd42e0a181e45104d297&lt;/title&gt;
&lt;desc&gt;
graph {
  layout=neato
  node [shape=ellipse]
  n0 -- n1
  n0 -- n2
  n0 -- n3
  n0 -- n4
  n0 -- n5
  n0 -- n6
}
&lt;/desc&gt;

&lt;g id=&quot;graph0&quot; class=&quot;graph&quot; transform=&quot;scale(1 1) rotate(0) translate(4 205.07)&quot;&gt;
&lt;title&gt;%3&lt;/title&gt;
&lt;polygon fill=&quot;white&quot; stroke=&quot;transparent&quot; points=&quot;-4,4 -4,-205.07 201.76,-205.07 201.76,4 -4,4&quot; /&gt;
&lt;!-- n0 --&gt;
&lt;g id=&quot;node1&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n0&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;98.78&quot; cy=&quot;-100.51&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;98.78&quot; y=&quot;-96.81&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n0&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n1 --&gt;
&lt;g id=&quot;node2&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n1&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;170.18&quot; cy=&quot;-59.01&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;170.18&quot; y=&quot;-55.31&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n1&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n0&amp;#45;&amp;#45;n1 --&gt;
&lt;g id=&quot;edge1&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n0&amp;#45;&amp;#45;n1&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M119.43,-88.51C128.99,-82.95 140.3,-76.38 149.82,-70.84&quot; /&gt;
&lt;/g&gt;
&lt;!-- n2 --&gt;
&lt;g id=&quot;node3&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n2&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;98.62&quot; cy=&quot;-18&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;98.62&quot; y=&quot;-14.3&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n2&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n0&amp;#45;&amp;#45;n2 --&gt;
&lt;g id=&quot;edge2&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n0&amp;#45;&amp;#45;n2&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M98.75,-82.2C98.72,-68.53 98.68,-49.92 98.65,-36.26&quot; /&gt;
&lt;/g&gt;
&lt;!-- n3 --&gt;
&lt;g id=&quot;node4&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n3&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;28.04&quot; cy=&quot;-142.96&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;28.04&quot; y=&quot;-139.26&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n3&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n0&amp;#45;&amp;#45;n3 --&gt;
&lt;g id=&quot;edge3&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n0&amp;#45;&amp;#45;n3&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M78.7,-112.56C69.22,-118.25 57.93,-125.02 48.42,-130.73&quot; /&gt;
&lt;/g&gt;
&lt;!-- n4 --&gt;
&lt;g id=&quot;node5&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n4&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;170.76&quot; cy=&quot;-141.05&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;170.76&quot; y=&quot;-137.35&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n4&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n0&amp;#45;&amp;#45;n4 --&gt;
&lt;g id=&quot;edge4&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n0&amp;#45;&amp;#45;n4&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M119.6,-112.24C129.07,-117.57 140.25,-123.87 149.74,-129.21&quot; /&gt;
&lt;/g&gt;
&lt;!-- n5 --&gt;
&lt;g id=&quot;node6&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n5&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;100.22&quot; cy=&quot;-183.07&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;100.22&quot; y=&quot;-179.37&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n5&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n0&amp;#45;&amp;#45;n5 --&gt;
&lt;g id=&quot;edge5&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n0&amp;#45;&amp;#45;n5&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M99.1,-118.83C99.34,-132.52 99.66,-151.13 99.9,-164.8&quot; /&gt;
&lt;/g&gt;
&lt;!-- n6 --&gt;
&lt;g id=&quot;node7&quot; class=&quot;node&quot;&gt;
&lt;title&gt;n6&lt;/title&gt;
&lt;ellipse fill=&quot;none&quot; stroke=&quot;black&quot; cx=&quot;27&quot; cy=&quot;-59.9&quot; rx=&quot;27&quot; ry=&quot;18&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;27&quot; y=&quot;-56.2&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;n6&lt;/text&gt;
&lt;/g&gt;
&lt;!-- n0&amp;#45;&amp;#45;n6 --&gt;
&lt;g id=&quot;edge6&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;n0&amp;#45;&amp;#45;n6&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M78.02,-88.77C68.58,-83.42 57.43,-77.11 47.96,-71.76&quot; /&gt;
&lt;/g&gt;
&lt;/g&gt;
&lt;/svg&gt;
&lt;/div&gt;

&lt;p&gt;As expected, latency in the line topology is higher than in the star topology, and the balanced tree is a middle ground.&lt;/p&gt;
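&lt;p&gt;One rough way to compare the three shapes is to count how many hops a message needs to reach the farthest node. The sketch below does that with a breadth-first search. It assumes every hop costs the same and ignores retries and batching delays, and the function names are hypothetical, chosen only for this example.&lt;/p&gt;

```go
package main

import "fmt"

// The three 7-node topologies from the diagrams, as adjacency lists.
func lineTopology() map[int][]int {
	return map[int][]int{
		0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4},
		4: {3, 5}, 5: {4, 6}, 6: {5},
	}
}

func starTopology() map[int][]int {
	return map[int][]int{
		0: {1, 2, 3, 4, 5, 6},
		1: {0}, 2: {0}, 3: {0}, 4: {0}, 5: {0}, 6: {0},
	}
}

func treeTopology() map[int][]int {
	return map[int][]int{
		0: {1, 2}, 1: {0, 3, 4}, 2: {0, 5, 6},
		3: {1}, 4: {1}, 5: {2}, 6: {2},
	}
}

// maxHops returns the number of hops a message needs to reach the node
// farthest from origin, using breadth-first search.
func maxHops(adj map[int][]int, origin int) int {
	dist := map[int]int{origin: 0}
	queue := []int{origin}
	farthest := 0
	for len(queue) != 0 {
		cur := queue[0]
		queue = queue[1:]
		for _, nb := range adj[cur] {
			if _, visited := dist[nb]; visited {
				continue
			}
			dist[nb] = dist[cur] + 1
			// BFS discovers nodes in non-decreasing distance order,
			// so the last assignment is the maximum.
			farthest = dist[nb]
			queue = append(queue, nb)
		}
	}
	return farthest
}

func main() {
	fmt.Println("line:", maxHops(lineTopology(), 0)) // 6
	fmt.Println("tree:", maxHops(treeTopology(), 0)) // 2
	fmt.Println("star:", maxHops(starTopology(), 0)) // 1
}
```

&lt;p&gt;These numbers are for a message that starts at &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n0&lt;/code&gt;. In the star, a message that starts at any other node needs 2 hops, because it has to go through the hub first.&lt;/p&gt;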

&lt;p&gt;The branching factor of the tree also impacts &lt;strong&gt;msgs/op&lt;/strong&gt;. Imagine a network partition that isolates &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n1&lt;/code&gt;. In the line topology, a message from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n0&lt;/code&gt; will not reach any of the other nodes, and it will need to be retried until the partition heals. That does not happen in the star topology. There, a partition at &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n1&lt;/code&gt; only blocks &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n1&lt;/code&gt;; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n0&lt;/code&gt; still reaches everyone else directly, so fewer retries are needed.&lt;/p&gt;

&lt;h3 id=&quot;batching&quot;&gt;Batching&lt;/h3&gt;

&lt;p&gt;The second thing to look into is &lt;strong&gt;batching&lt;/strong&gt;. Instead of generating one message per operation, we can batch multiple broadcasts into a single message, decreasing &lt;strong&gt;msgs/op&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;But introducing a batching mechanism necessarily introduces some latency: the node has to wait a bit for a batch of broadcasts to accumulate before sending it. So there’s a trade-off between &lt;strong&gt;msgs/op&lt;/strong&gt; and &lt;strong&gt;latency&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;To implement batching, I’ve built on top of our fault-tolerant strategy. Now we keep track of all unacked messages per neighbor and periodically send all of them in a single request. That required a new internal &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;maelstrom&lt;/code&gt; message, which we called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gossip&lt;/code&gt;.&lt;/p&gt;
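
&lt;p&gt;The bookkeeping behind this could look roughly like the sketch below. It is not the code from the linked repository: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;batcher&lt;/code&gt; type and its methods are hypothetical names, values are plain integers, and both locking and the actual sending of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gossip&lt;/code&gt; message are left out.&lt;/p&gt;

```go
package main

import (
    "fmt"
    "sort"
)

// batcher tracks, per neighbor, the broadcast values not yet acknowledged.
// It is not safe for concurrent use; a real node would guard it with a mutex.
type batcher struct {
    unacked map[string]map[int]struct{}
}

func newBatcher(neighbors []string) *batcher {
    b := new(batcher)
    b.unacked = make(map[string]map[int]struct{})
    for _, n := range neighbors {
        b.unacked[n] = make(map[int]struct{})
    }
    return b
}

// add marks a value as pending for every neighbor except the sender.
func (b *batcher) add(value int, from string) {
    for n, set := range b.unacked {
        if n != from {
            set[value] = struct{}{}
        }
    }
}

// pending returns the batch for the next gossip message to a neighbor.
func (b *batcher) pending(neighbor string) []int {
    out := make([]int, 0, len(b.unacked[neighbor]))
    for v := range b.unacked[neighbor] {
        out = append(out, v)
    }
    sort.Ints(out)
    return out
}

// ack forgets the values a neighbor confirmed receiving.
func (b *batcher) ack(neighbor string, values []int) {
    for _, v := range values {
        delete(b.unacked[neighbor], v)
    }
}

func main() {
    b := newBatcher([]string{"n1", "n2"})
    b.add(10, "c1") // from a client: pending for both neighbors
    b.add(11, "n1") // from n1: pending only for n2
    fmt.Println(b.pending("n1"), b.pending("n2"))
    b.ack("n2", []int{10, 11})
    fmt.Println(b.pending("n2"))
}
```

&lt;p&gt;In this sketch, a background routine would call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pending&lt;/code&gt; for each neighbor on every tick, send the result as one &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gossip&lt;/code&gt; message, and call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ack&lt;/code&gt; once the reply arrives. Anything that was not acked simply stays in the set and goes out again with the next batch.&lt;/p&gt;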

&lt;p&gt;Another thing to consider is whether we really need an immediate broadcast to the neighbors as soon as we receive a new message. With a background routine in place, we can simply mark new incoming messages as unacked and let the background routine deal with them. Whether or not we keep this immediate call also impacts &lt;strong&gt;msgs/op&lt;/strong&gt; and &lt;strong&gt;latency&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;With these two mechanisms in hand, &lt;strong&gt;topology&lt;/strong&gt; and &lt;strong&gt;batching&lt;/strong&gt;, we can tweak a couple of parameters to figure out reasonable values of latency and &lt;strong&gt;msgs/op&lt;/strong&gt; that satisfy the requirements.&lt;/p&gt;

&lt;p&gt;code: &lt;a href=&quot;https://github.com/brunocalza/gossip-glomers/tree/main/maelstrom-broadcast/fault-tolerant-efficient-1&quot;&gt;fault-tolerant-efficient-1&lt;/a&gt; and &lt;a href=&quot;https://github.com/brunocalza/gossip-glomers/tree/main/maelstrom-broadcast/fault-tolerant-efficient-2&quot;&gt;fault-tolerant-efficient-2&lt;/a&gt;&lt;/p&gt;

&lt;h3 id=&quot;benchmark&quot;&gt;Benchmark&lt;/h3&gt;

&lt;p&gt;To understand the trade-offs discussed above and figure out a reasonable topology and batch interval, I’ve run the same test with different batch intervals and topologies. The batch interval is shown in the &lt;strong&gt;Interval (ms)&lt;/strong&gt; column, and the topology is defined by the branching factor of the tree in the &lt;strong&gt;Branch&lt;/strong&gt; column. Latencies are in milliseconds.&lt;/p&gt;

&lt;div class=&quot;table-responsive&quot;&gt;
&lt;table class=&quot;benchmark-table&quot;&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Interval (ms)&lt;/th&gt;
      &lt;th&gt;Branch&lt;/th&gt;
      &lt;th&gt;msgs/op&lt;/th&gt;
      &lt;th&gt;p50 (ms)&lt;/th&gt;
      &lt;th&gt;p95 (ms)&lt;/th&gt;
      &lt;th&gt;p99 (ms)&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;100&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;13.31&lt;/td&gt;&lt;td&gt;2281&lt;/td&gt;&lt;td&gt;3363&lt;/td&gt;&lt;td&gt;3646&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr class=&quot;highlight-row&quot;&gt;
      &lt;td&gt;100&lt;/td&gt;&lt;td&gt;2&lt;/td&gt;&lt;td&gt;10.70&lt;/td&gt;&lt;td&gt;867&lt;/td&gt;&lt;td&gt;1154&lt;/td&gt;&lt;td&gt;1214&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr class=&quot;highlight-row&quot;&gt;
      &lt;td&gt;100&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;9.95&lt;/td&gt;&lt;td&gt;512&lt;/td&gt;&lt;td&gt;704&lt;/td&gt;&lt;td&gt;795&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr class=&quot;highlight-row&quot;&gt;
      &lt;td&gt;100&lt;/td&gt;&lt;td&gt;8&lt;/td&gt;&lt;td&gt;9.33&lt;/td&gt;&lt;td&gt;416&lt;/td&gt;&lt;td&gt;587&lt;/td&gt;&lt;td&gt;599&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr class=&quot;highlight-row&quot;&gt;
      &lt;td&gt;100&lt;/td&gt;&lt;td&gt;16&lt;/td&gt;&lt;td&gt;9.28&lt;/td&gt;&lt;td&gt;348&lt;/td&gt;&lt;td&gt;438&lt;/td&gt;&lt;td&gt;457&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;200&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;8.80&lt;/td&gt;&lt;td&gt;3046&lt;/td&gt;&lt;td&gt;4397&lt;/td&gt;&lt;td&gt;4640&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;200&lt;/td&gt;&lt;td&gt;2&lt;/td&gt;&lt;td&gt;7.30&lt;/td&gt;&lt;td&gt;1155&lt;/td&gt;&lt;td&gt;1466&lt;/td&gt;&lt;td&gt;1567&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr class=&quot;highlight-row&quot;&gt;
      &lt;td&gt;200&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;7.02&lt;/td&gt;&lt;td&gt;708&lt;/td&gt;&lt;td&gt;955&lt;/td&gt;&lt;td&gt;981&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr class=&quot;highlight-row&quot;&gt;
      &lt;td&gt;200&lt;/td&gt;&lt;td&gt;8&lt;/td&gt;&lt;td&gt;6.74&lt;/td&gt;&lt;td&gt;566&lt;/td&gt;&lt;td&gt;749&lt;/td&gt;&lt;td&gt;791&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr class=&quot;highlight-row&quot;&gt;
      &lt;td&gt;200&lt;/td&gt;&lt;td&gt;16&lt;/td&gt;&lt;td&gt;6.74&lt;/td&gt;&lt;td&gt;414&lt;/td&gt;&lt;td&gt;608&lt;/td&gt;&lt;td&gt;636&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;300&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;7.26&lt;/td&gt;&lt;td&gt;4365&lt;/td&gt;&lt;td&gt;6580&lt;/td&gt;&lt;td&gt;6775&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;300&lt;/td&gt;&lt;td&gt;2&lt;/td&gt;&lt;td&gt;6.33&lt;/td&gt;&lt;td&gt;1667&lt;/td&gt;&lt;td&gt;2114&lt;/td&gt;&lt;td&gt;2222&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr class=&quot;highlight-row&quot;&gt;
      &lt;td&gt;300&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;5.91&lt;/td&gt;&lt;td&gt;958&lt;/td&gt;&lt;td&gt;1249&lt;/td&gt;&lt;td&gt;1331&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr class=&quot;highlight-row&quot;&gt;
      &lt;td&gt;300&lt;/td&gt;&lt;td&gt;8&lt;/td&gt;&lt;td&gt;5.80&lt;/td&gt;&lt;td&gt;723&lt;/td&gt;&lt;td&gt;981&lt;/td&gt;&lt;td&gt;1056&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr class=&quot;highlight-row&quot;&gt;
      &lt;td&gt;300&lt;/td&gt;&lt;td&gt;16&lt;/td&gt;&lt;td&gt;5.89&lt;/td&gt;&lt;td&gt;484&lt;/td&gt;&lt;td&gt;711&lt;/td&gt;&lt;td&gt;780&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;500&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;6.21&lt;/td&gt;&lt;td&gt;6661&lt;/td&gt;&lt;td&gt;10857&lt;/td&gt;&lt;td&gt;11287&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;500&lt;/td&gt;&lt;td&gt;2&lt;/td&gt;&lt;td&gt;5.47&lt;/td&gt;&lt;td&gt;2733&lt;/td&gt;&lt;td&gt;3467&lt;/td&gt;&lt;td&gt;3584&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;500&lt;/td&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;5.30&lt;/td&gt;&lt;td&gt;1648&lt;/td&gt;&lt;td&gt;2029&lt;/td&gt;&lt;td&gt;2143&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;500&lt;/td&gt;&lt;td&gt;8&lt;/td&gt;&lt;td&gt;5.24&lt;/td&gt;&lt;td&gt;1170&lt;/td&gt;&lt;td&gt;1532&lt;/td&gt;&lt;td&gt;1655&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr class=&quot;highlight-row&quot;&gt;
      &lt;td&gt;500&lt;/td&gt;&lt;td&gt;16&lt;/td&gt;&lt;td&gt;5.24&lt;/td&gt;&lt;td&gt;794&lt;/td&gt;&lt;td&gt;1112&lt;/td&gt;&lt;td&gt;1177&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;

&lt;p&gt;The test used was:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;./maelstrom &lt;span class=&quot;nb&quot;&gt;test&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-w&lt;/span&gt; broadcast &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--bin&lt;/span&gt; ~/go/bin/fault-tolerant-efficient-2 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--node-count&lt;/span&gt; 25 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--time-limit&lt;/span&gt; 20 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--rate&lt;/span&gt; 100 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--latency&lt;/span&gt; 100
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

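&lt;p&gt;It’s nice to see that our understanding of the trade-offs checks out: higher branching implies lower latency, and, at the same branching, a higher interval implies lower &lt;strong&gt;msgs/op&lt;/strong&gt; but higher &lt;strong&gt;latency&lt;/strong&gt;. We can also see diminishing returns on branching: at a 100 ms interval, going from a branching factor of 1 to 2 cuts p50 from 2281 ms to 867 ms, while going from 8 to 16 only cuts it from 416 ms to 348 ms.&lt;/p&gt;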

&lt;p&gt;Back to the challenge. Given the requirements of:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Messages-per-operation is below &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;20&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Median latency is below &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1 second&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Maximum latency is below &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;2 seconds&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We see that 11 out of 20 configurations pass the test. Awesome. We would not be able to achieve that without a tree topology and a batching strategy.&lt;/p&gt;
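
&lt;p&gt;The count can be double-checked by applying the thresholds to the table. The sketch below does that with one assumption: the table has no maximum-latency column, so p99 is used as a stand-in for it. The type and function names are hypothetical.&lt;/p&gt;

```go
package main

import "fmt"

// row is one line of the benchmark table (latencies in milliseconds).
type row struct {
    interval, branch int
    msgsPerOp        float64
    p50, p99         int
}

var rows = []row{
    {100, 1, 13.31, 2281, 3646}, {100, 2, 10.70, 867, 1214},
    {100, 4, 9.95, 512, 795}, {100, 8, 9.33, 416, 599},
    {100, 16, 9.28, 348, 457}, {200, 1, 8.80, 3046, 4640},
    {200, 2, 7.30, 1155, 1567}, {200, 4, 7.02, 708, 981},
    {200, 8, 6.74, 566, 791}, {200, 16, 6.74, 414, 636},
    {300, 1, 7.26, 4365, 6775}, {300, 2, 6.33, 1667, 2222},
    {300, 4, 5.91, 958, 1331}, {300, 8, 5.80, 723, 1056},
    {300, 16, 5.89, 484, 780}, {500, 1, 6.21, 6661, 11287},
    {500, 2, 5.47, 2733, 3584}, {500, 4, 5.30, 1648, 2143},
    {500, 8, 5.24, 1170, 1655}, {500, 16, 5.24, 794, 1177},
}

// passes applies the challenge thresholds to one configuration.
// Assumption: p99 stands in for the maximum latency, which is not in the table.
func passes(r row) bool {
    if r.msgsPerOp >= 20 {
        return false
    }
    if r.p50 >= 1000 {
        return false
    }
    if r.p99 >= 2000 {
        return false
    }
    return true
}

// countPassing returns how many configurations satisfy all thresholds.
func countPassing(rs []row) int {
    n := 0
    for _, r := range rs {
        if passes(r) {
            n++
        }
    }
    return n
}

func main() {
    fmt.Printf("%d of %d configurations pass\n", countPassing(rows), len(rows))
}
```

&lt;p&gt;Every configuration is well under 20 &lt;strong&gt;msgs/op&lt;/strong&gt;, so in practice the median latency is what separates the passing rows from the failing ones.&lt;/p&gt;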

&lt;p&gt;And that’s all I got for this one.&lt;/p&gt;

      </content>
    </entry>
  
    <entry>
      <title>Generating Unique IDs with Raft Consensus</title>
      <link href="https://bcalza.b-cdn.net/blog/2026/04/08/generating-unique-ids-with-raft-consensus.html" />
      <id>https://bcalza.b-cdn.net/blog/2026/04/08/generating-unique-ids-with-raft-consensus</id>
      <updated>2026-04-08T18:15:13+00:00</updated>
      <content type="html">
        &lt;blockquote class=&quot;callout&quot;&gt;
  &lt;p&gt;This post is part of a series on Fly.io’s &lt;a href=&quot;https://fly.io/dist-sys/&quot;&gt;distributed systems challenges&lt;/a&gt;:&lt;/p&gt;

  &lt;ul&gt;
    &lt;li&gt;&lt;a href=&quot;/blog/2026/04/07/implementing-snowflake-unique-id-generation&quot;&gt;Implementing Snowflake Unique ID Generation&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;Generating Unique IDs with Raft Consensus (this post)&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;/blog/2026/04/09/flyio-broadcast-challenges&quot;&gt;Fly.io’s Broadcast Challenges&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;/blog/2026/04/13/building-a-grow-only-counter-on-a-sequentially-consistent-kv-store&quot;&gt;Building a Grow-Only Counter on a Sequentially Consistent KV Store&lt;/a&gt;&lt;/li&gt;
  &lt;/ul&gt;
&lt;/blockquote&gt;

&lt;p&gt;This blog post is a follow-up on &lt;a href=&quot;/blog/2026/04/07/implementing-snowflake-unique-id-generation&quot;&gt;Implementing Snowflake Unique ID Generation&lt;/a&gt;. In that post, I explain an implementation of Snowflake IDs that can be used to solve the distributed systems challenge &lt;a href=&quot;https://fly.io/dist-sys/2/&quot;&gt;Challenge #2: Unique ID Generation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Now, we discuss an alternative to that. We’re going to use Raft to generate globally unique IDs. We assume here a general understanding of how Raft works. I recommend &lt;a href=&quot;https://thesecretlivesofdata.com/raft/&quot;&gt;The Secret Lives of Data&lt;/a&gt;, just in case. The idea is to see if we can use the &lt;a href=&quot;https://github.com/etcd-io/raft&quot;&gt;etcd-io/raft&lt;/a&gt; library to build a program that passes the challenge.&lt;/p&gt;

&lt;p&gt;Here’s the full code we’ll discuss: &lt;a href=&quot;https://github.com/brunocalza/gossip-glomers/tree/main/maelstrom-unique-ids-raft&quot;&gt;maelstrom-unique-ids-raft&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;thinking-about-the-system-architecture&quot;&gt;Thinking about the system architecture&lt;/h2&gt;

&lt;p&gt;One challenge in building a Raft application is the architecture of the code. Because you have to wait for nodes to achieve consensus, it can be a bit confusing how to architect that waiting. The architecture also depends on the API provided by the Raft library.&lt;/p&gt;

&lt;p&gt;Here’s an idea on how we can architect the code to build a distributed counter using an embedded Raft library that communicates with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;maelstrom&lt;/code&gt; clients. In essence, we can do something very similar to what we’ve done with Snowflake: we have some kind of object, and we call a method (e.g. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Next&lt;/code&gt;) on that object to get the next number in the sequence whenever a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;generate&lt;/code&gt; message from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;maelstrom&lt;/code&gt; comes in.&lt;/p&gt;

&lt;p&gt;We’ll call this object &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DistributedCounter&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Result&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DistributedCounter&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;// state under consensus&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;seq&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;// deps&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;raft&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Node&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;storage&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;raft&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MemoryStorage&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;// ...&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DistributedCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Next&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Context&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reqID&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;error&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;// request/propose a new value to raft nodes and wait for it&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This object will act as a Raft node, communicate with other Raft nodes to achieve consensus, and store the Raft log. That’s why we see a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;raft.Node&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;*raft.MemoryStorage&lt;/code&gt; as dependencies. Also, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;seq&lt;/code&gt; is the state for our globally unique ID, and it is incremented whenever consensus is achieved.&lt;/p&gt;

&lt;p&gt;And this is how everything fits together:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;// in main.go&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maelstrom&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;NewNode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DistributedCounter&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Handle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;init&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maelstrom&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Message&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;error&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;// initialize raft nodes and distributed counter&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Handle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;generate&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maelstrom&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Message&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;error&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;next&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Next&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Background&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;generateRequestID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;// ...&lt;/span&gt;

&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You can see the similarities with the Snowflake solution. There, we had a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Generator&lt;/code&gt; that needed to be initialized inside &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;init&lt;/code&gt;, with a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Next&lt;/code&gt; method. Here we have a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DistributedCounter&lt;/code&gt; that needs to be initialized in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;init&lt;/code&gt;, with a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Next&lt;/code&gt; method. In essence, we need to implement three pieces: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;init&lt;/code&gt; handler, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;generate&lt;/code&gt; handler, and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Next&lt;/code&gt; method. There’s more to it, actually, but we’ll get there.&lt;/p&gt;

&lt;h2 id=&quot;generate-implementation&quot;&gt;Generate implementation&lt;/h2&gt;

&lt;p&gt;Let’s start with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;generate&lt;/code&gt;, because it’s the simplest and there’s no Raft explicitly involved (we treat it as a black box for now). We can simply return the value we got from our counter back to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;maelstrom&lt;/code&gt; client.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Handle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;generate&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maelstrom&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Message&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;error&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;next&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Next&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Background&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;generateRequestID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maelstrom&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;NewRPCError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;maelstrom&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TemporarilyUnavailable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Error&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;response&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;Type&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;string&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;`json:&quot;type&quot;`&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;Id&lt;/span&gt;   &lt;span class=&quot;kt&quot;&gt;string&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;`json:&quot;id&quot;`&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Reply&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;response&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;Type&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;generate_ok&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;Id&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;fmt&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Sprintf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;%d&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;next&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We add a “random” request ID for every &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;generate&lt;/code&gt; request, because in our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DistributedCounter&lt;/code&gt; implementation, when a node proposes a next ID, it needs to wait for the nodes to achieve consensus before getting the next ID. And that waiting happens through a channel. So we use a request ID to map results to requests. Whenever a result is ready, we associate it with its request, and the node waiting on that request ID can respond to its client.&lt;/p&gt;
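
&lt;p&gt;One possible way to produce such an ID is sketched below. It is only an illustration: the name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;randomRequestID&lt;/code&gt; is hypothetical, and it assumes that the argument is a number of random bytes and that a hex string is an acceptable ID.&lt;/p&gt;

```go
package main

import (
    "crypto/rand"
    "encoding/hex"
    "fmt"
)

// randomRequestID returns a hex string built from nBytes random bytes.
// Hypothetical helper; the argument meaning is an assumption.
func randomRequestID(nBytes int) string {
    buf := make([]byte, nBytes)
    if _, err := rand.Read(buf); err != nil {
        panic(err) // no usable randomness source; nothing sensible to do
    }
    return hex.EncodeToString(buf)
}

func main() {
    fmt.Println(randomRequestID(16))
}
```

&lt;p&gt;The ID only has to be unique among the requests in flight, since its sole job is to be the key that connects a proposal to the goroutine waiting for its result.&lt;/p&gt;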

&lt;h2 id=&quot;getting-the-next-value&quot;&gt;Getting the next value&lt;/h2&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Next&lt;/code&gt; can be implemented simply using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Propose&lt;/code&gt; from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;raft.Node&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DistributedCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Next&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Context&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reqID&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;error&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;make&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;chan&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mu&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Lock&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;waiters&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reqID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mu&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Unlock&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;// If this node is the leader, it appends the entry and replicates it to the followers, waiting for a majority.&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;// If not, the proposal is forwarded internally to the leader.&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;// We retry until raft accepts the proposal (it can be rejected, e.g., while there is no leader yet).&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Propose&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;byte&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reqID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;break&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Sleep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;100&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Millisecond&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;// blocked waiting for a result&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;A couple of things to discuss here. We add a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;waiters map[string]chan Result&lt;/code&gt; to our object. For every request, we create a channel, and our node waits on that channel:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;// blocked waiting for a result&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;To get the next number in the sequence, we need to propose an increment to our cluster, hence &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;node.Propose&lt;/code&gt;. The Raft protocol is agnostic about the data it replicates. It is only concerned with making sure that all nodes execute the same operations in the same order. In our case, there’s only one operation, a simple increment (e.g., &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;seq++&lt;/code&gt;). In more complex applications, it can be a command, for example, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;set x = 2&lt;/code&gt;, or a SQL statement. I’m saying this because, given the simplicity of our application, there’s not much we need to propose to our nodes; we just need to signal that a new ID was requested, and the nodes will reach consensus on that signal. That’s why we just send a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;reqID&lt;/code&gt;. Once consensus is reached, the proposing node uses the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;reqID&lt;/code&gt; to look up the waiting request in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;waiters&lt;/code&gt; map and hand it the new ID.&lt;/p&gt;

&lt;p&gt;The other thing to discuss is how &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Propose&lt;/code&gt; works. In Raft, only the leader can append a new entry to the log. When a proposal arrives at a follower, the recommended approach is to reject the request and send the leader’s information back to the client. If we did that, our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;maelstrom&lt;/code&gt; test would fail, because &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;maelstrom&lt;/code&gt; clients are not expecting that kind of interaction. However, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Propose&lt;/code&gt; detects internally whether the node is the leader, and if not, it forwards the proposal to the leader. So we don’t have to worry about that.&lt;/p&gt;

&lt;p&gt;One last thing. The proposal can fail. Maybe the network is partitioned and the cluster is having a hard time electing a new leader, so no new proposals can be accepted. In this case, we naively retry every 100ms until it works. A more appropriate solution would be to add a timeout and return an error to the client. I’ll come back to this later.&lt;/p&gt;
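&lt;p&gt;As a rough illustration of the timeout idea, here is a minimal sketch of a retry loop with a time budget. It is not the code used in this post: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;proposeWithBudget&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;errProposeTimeout&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;flakyProposer&lt;/code&gt; and the demo constants are hypothetical names, and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;propose&lt;/code&gt; callback stands in for the real &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;node.Propose&lt;/code&gt; call.&lt;/p&gt;

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Hypothetical values for the demo; a real service would tune them.
const (
	demoBudget  = 200 * time.Millisecond
	demoBackoff = 5 * time.Millisecond
)

// errProposeTimeout is a hypothetical sentinel for an exhausted retry budget.
var errProposeTimeout = errors.New("proposal timed out")

// proposeWithBudget retries propose until it succeeds or until another
// backoff sleep would exceed the time budget. The propose callback is an
// assumption standing in for a call such as node.Propose(ctx, data).
func proposeWithBudget(propose func() error, budget, backoff time.Duration) error {
	deadline := time.Now().Add(budget)
	for {
		err := propose()
		if err == nil {
			return nil
		}
		if time.Now().Add(backoff).After(deadline) {
			return fmt.Errorf("%w: last error: %v", errProposeTimeout, err)
		}
		time.Sleep(backoff)
	}
}

// flakyProposer returns a propose function that fails its first n calls.
func flakyProposer(failures int) func() error {
	calls := 0
	return func() error {
		calls++
		if calls > failures {
			return nil
		}
		return errors.New("no leader")
	}
}

func main() {
	// Succeeds on the third attempt, well within the budget.
	fmt.Println(proposeWithBudget(flakyProposer(2), demoBudget, demoBackoff))

	// Never succeeds, so the budget runs out and an error comes back.
	err := proposeWithBudget(flakyProposer(1000000), demoBudget, demoBackoff)
	fmt.Println(errors.Is(err, errProposeTimeout))
}
```

&lt;p&gt;Bounding only the proposal is not enough on its own: the receive on the result channel blocks too, so it would need a similar limit before the handler could reliably return an error to the client.&lt;/p&gt;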

&lt;h2 id=&quot;initializing-our-distributed-counter&quot;&gt;Initializing our distributed counter&lt;/h2&gt;

&lt;p&gt;There are two parts to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;init&lt;/code&gt;: we need to initialize the Raft nodes, and we need to start a background routine in our counter that runs the logic the Raft library expects our application to execute.&lt;/p&gt;

&lt;h3 id=&quot;initializing-raft-nodes&quot;&gt;Initializing Raft nodes&lt;/h3&gt;

&lt;p&gt;Here, we need to figure out the node ID and all the peers and start the Raft node with the right configuration.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;nodeNum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;strconv&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ParseUint&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fmt&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Errorf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;failed to parse node id: %v&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;raftID&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nodeNum&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;peers&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;raft&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Peer&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nid&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;range&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;NodeIDs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;strconv&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ParseUint&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fmt&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Errorf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;failed to parse peer id: %v&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;peers&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;peers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;raft&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Peer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ID&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;num&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;storage&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;raft&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;NewMemoryStorage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;cfg&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;raft&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Config&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ID&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;              &lt;span class=&quot;n&quot;&gt;raftID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ElectionTick&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;    &lt;span class=&quot;m&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;HeartbeatTick&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;   &lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;Storage&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;         &lt;span class=&quot;n&quot;&gt;storage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;MaxSizePerMsg&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;   &lt;span class=&quot;m&quot;&gt;4096&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; 
    &lt;span class=&quot;n&quot;&gt;MaxInflightMsgs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;256&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;raftNode&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;raft&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;StartNode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cfg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;peers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NewDistributedCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;raftNode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;storage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Send&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;go&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We add 1 to the node number because Maelstrom’s node IDs start at &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n0&lt;/code&gt; and the Raft library reserves ID 0 to mean “no node”. For this application, we’re using in-memory storage, but we can swap that if we want. A &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;HeartbeatTick&lt;/code&gt; of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1&lt;/code&gt; means the leader sends a heartbeat on every clock tick. An &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ElectionTick&lt;/code&gt; of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;10&lt;/code&gt; means that once a node has gone at least 10 clock ticks without hearing from the leader, it starts a leader election.&lt;/p&gt;
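&lt;p&gt;The tick counts only become wall-clock durations once the application decides how often the clock ticks. A minimal sketch of that arithmetic, with tick periods picked purely for illustration (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tickDurations&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;demoTick&lt;/code&gt; are hypothetical names, not part of the Raft library):&lt;/p&gt;

```go
package main

import (
	"fmt"
	"time"
)

// demoTick is an assumed tick period; the library itself only counts ticks.
const demoTick = 100 * time.Millisecond

// tickDurations converts HeartbeatTick and ElectionTick counts into
// wall-clock durations for a given tick period.
func tickDurations(tick time.Duration, heartbeatTick, electionTick int) (heartbeat, election time.Duration) {
	heartbeat = time.Duration(heartbeatTick) * tick
	election = time.Duration(electionTick) * tick
	return heartbeat, election
}

func main() {
	// Same tick counts as the config above, two different tick periods.
	heartbeat, election := tickDurations(demoTick, 1, 10)
	fmt.Println(heartbeat, election) // 100ms 1s

	heartbeat, election = tickDurations(demoTick/2, 1, 10)
	fmt.Println(heartbeat, election) // 50ms 500ms
}
```

&lt;p&gt;Keeping the election timeout several times larger than the heartbeat interval gives followers a few chances to hear from the leader before they give up on it.&lt;/p&gt;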

&lt;h3 id=&quot;background-routine&quot;&gt;Background routine&lt;/h3&gt;

&lt;p&gt;The background routine is what runs when we call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;go counter.run()&lt;/code&gt;. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;etcd-io/raft&lt;/code&gt; &lt;a href=&quot;https://github.com/etcd-io/raft&quot;&gt;README&lt;/a&gt; says that the application using the library has certain responsibilities. Let’s start with the two high-level ones: deciding how frequently the clock ticks, and being ready to act on a batch of updates:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;ticker&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;NewTicker&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;100&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Millisecond&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;defer&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ticker&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Stop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;select&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ticker&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Tick&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rd&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Ready&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;c&quot;&gt;// ...&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

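&lt;p&gt;The ticker fires every &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;100ms&lt;/code&gt;, so one clock tick is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;100ms&lt;/code&gt;. With a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;HeartbeatTick&lt;/code&gt; of 1 and an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ElectionTick&lt;/code&gt; of 10, heartbeats go out every &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;100ms&lt;/code&gt; and the election timeout is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1s&lt;/code&gt;. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;node.Ready&lt;/code&gt; returns a channel that tells us a new batch of updates is available and that the application needs to act on it. Handling a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Ready&lt;/code&gt; batch comes with several more responsibilities. In our case, there are five:&lt;/p&gt;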

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;// 1. store updated term/vote/commit&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;raft&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IsEmptyHardState&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rd&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;HardState&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;storage&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SetHardState&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rd&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;HardState&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;log&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Printf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;failed to set hard state: %v&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;// 2. append log entries to local storage&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rd&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Entries&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;storage&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rd&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Entries&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;log&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Printf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;failed to append entries: %v&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;// 3. if we have new messages for other nodes, send them&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;range&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rd&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Messages&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;send&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;// 4. if we have log entries that have achieved consensus,&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;//    apply them to the application state&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ent&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;range&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rd&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CommittedEntries&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;applyEntry&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ent&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;// 5. signal that we&apos;re done with this batch&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Advance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This is the API the Raft library exposes so that our application can plug into its consensus engine. I wasn’t familiar with this library, so it was a bit hard to grasp at first. But if you’re familiar with Raft, you can get a sense of what is going on. Other libraries will probably offer a different API. I won’t spend too much time on the details of each of these responsibilities. The most important ones for our discussion are 3 and 4. We’ll skip 3 for now and focus on 4.&lt;/p&gt;

&lt;p&gt;In step 4, we receive the new log entries that have achieved consensus. At this point, those entries are replicated on a majority of nodes, and our application’s state needs to reflect that. The library is telling us to apply these new entries to our application so that its state is consistent across nodes. If the application were a decentralized database, the entry could be a SQL statement, for example. In our case, it is just a request ID signaling that the sequence must be incremented.&lt;/p&gt;

&lt;p&gt;Here’s the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;applyEntry&lt;/code&gt; implementation:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DistributedCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;applyEntry&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ent&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;raftpb&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Entry&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ent&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Type&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;raftpb&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;EntryConfChange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cc&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;raftpb&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ConfChange&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Unmarshal&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ent&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;log&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Printf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;failed to unmarshal conf change: %v&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ApplyConfChange&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ent&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Type&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;raftpb&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;EntryNormal&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ent&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;reqID&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ent&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mu&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Lock&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;seq&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;res&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;seq&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;waiters&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reqID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;nb&quot;&gt;delete&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;waiters&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reqID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mu&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Unlock&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;res&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It starts with some more responsibilities (handling configuration-change entries and skipping empty ones) that I’ll ignore. The important part is that this is where &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;seq&lt;/code&gt; is incremented and the new value is sent on the channel stored in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;waiters[reqID]&lt;/code&gt;, so that the goroutine waiting for that value can respond to the client. There’s a subtle thing happening here that might not be obvious:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;if &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ch != nil&lt;/code&gt;: this node proposed the entry, and one of its goroutines is blocked waiting for the resulting ID&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;if &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ch == nil&lt;/code&gt;: this entry was proposed by a different node, so we don’t have to send the value to the channel&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Note that all nodes increment the sequence to ensure the value is consistent across the cluster.&lt;/p&gt;
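
&lt;p&gt;A minimal sketch of why every node has to apply every entry, under some loud assumptions: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;replica&lt;/code&gt; type, its &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pending&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;delivered&lt;/code&gt; maps, and the request IDs are hypothetical. The real code hands the value to a channel stored in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;waiters&lt;/code&gt;; a plain map stands in for it here so the example runs on its own.&lt;/p&gt;

```go
package main

import "fmt"

// replica is a hypothetical stand-in for one node's state machine.
type replica struct {
	seq       int
	pending   map[string]bool // request IDs proposed by this replica (stand-in for waiters)
	delivered map[string]int  // values handed back to local callers
}

// newReplica builds a replica that has proposed the given request IDs.
func newReplica(proposed ...string) *replica {
	r := new(replica)
	r.pending = make(map[string]bool)
	r.delivered = make(map[string]int)
	for _, id := range proposed {
		r.pending[id] = true
	}
	return r
}

// apply runs once per committed entry on every replica.
func (r *replica) apply(reqID string) {
	r.seq++ // every replica advances, proposer or not
	if r.pending[reqID] {
		delete(r.pending, reqID)
		r.delivered[reqID] = r.seq // only the proposer answers its client
	}
}

func main() {
	committed := []string{"a", "b", "c"} // same order on every replica
	n0 := newReplica("a", "c")
	n1 := newReplica("b")
	n2 := newReplica()
	for _, r := range []*replica{n0, n1, n2} {
		for _, id := range committed {
			r.apply(id)
		}
	}
	fmt.Println(n0.seq, n1.seq, n2.seq)
	fmt.Println(n0.delivered, n1.delivered, n2.delivered)
}
```

&lt;p&gt;All three replicas end with the same &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;seq&lt;/code&gt;, yet each one only delivers the values for the requests it proposed. If a replica skipped entries proposed by others, its counter would fall behind and it would hand out an ID that another node has already used.&lt;/p&gt;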

&lt;h2 id=&quot;communication-among-the-nodes&quot;&gt;Communication among the nodes&lt;/h2&gt;

&lt;p&gt;We’ve implemented the core of our application. But there’s one supporting piece missing. Nodes need to be able to communicate with each other. We said that if a follower receives a proposal, it needs to send it to the leader. Also, the leader needs to send entries to followers. There are a bunch of messages being exchanged, and we did not set that up. That’s where that third responsibility above comes in:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;range&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rd&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Messages&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;send&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Whenever the protocol indicates that new messages need to go out to other nodes, we have to send them. Here’s how we implement &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;send&lt;/code&gt;, hooking it up with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;maelstrom&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DistributedCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;send&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;raftpb&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Message&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;destID&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fmt&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Sprintf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;n%d&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;To&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Marshal&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;log&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Printf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;failed to marshal raft message: %v&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sendFn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;destID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;any&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;&quot;type&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;raft&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;&quot;data&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;});&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;log&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Printf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;failed to send raft message to %s: %v&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;destID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
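
&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;msg.To-1&lt;/code&gt; in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;send&lt;/code&gt; implies a fixed mapping between Raft IDs and node names. A minimal sketch of that mapping in both directions, assuming node names of the form &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n0&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n1&lt;/code&gt;, … and Raft IDs that start at 1 (presumably because the Raft library does not accept 0 as a node ID). The helper names are hypothetical and not part of the original code.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// raftIDToNode maps a 1-based Raft ID to a 0-based node name (hypothetical helper).
func raftIDToNode(id uint64) string {
	return fmt.Sprintf("n%d", id-1)
}

// nodeToRaftID is the inverse mapping (hypothetical helper).
func nodeToRaftID(node string) (uint64, error) {
	idx, err := strconv.ParseUint(strings.TrimPrefix(node, "n"), 10, 64)
	if err != nil {
		return 0, fmt.Errorf("unexpected node name %q: %w", node, err)
	}
	return idx + 1, nil
}

func main() {
	for _, node := range []string{"n0", "n1", "n4"} {
		id, err := nodeToRaftID(node)
		if err != nil {
			panic(err)
		}
		fmt.Println(node, id, raftIDToNode(id))
	}
}
```

&lt;p&gt;Keeping both directions next to each other makes it harder for the off-by-one to drift between the place where peers are configured and the place where messages are sent.&lt;/p&gt;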

&lt;p&gt;And &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sendFn&lt;/code&gt; is a dependency that we add to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DistributedCounter&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DistributedCounter&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;// state under consensus&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;seq&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;// deps&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;sendFn&lt;/span&gt;  &lt;span class=&quot;k&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dest&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;body&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;any&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;error&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;raft&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Node&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;storage&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;raft&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MemoryStorage&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;// control&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;mu&lt;/span&gt;      &lt;span class=&quot;n&quot;&gt;sync&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Mutex&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;waiters&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;chan&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Result&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;// in main.go&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NewDistributedCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;raftNode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;storage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Send&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You see that we created a new kind of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;maelstrom&lt;/code&gt; message called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;raft&lt;/code&gt;. We need to handle that as well:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;// maelstrom handler&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Handle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;raft&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maelstrom&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Message&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;error&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;body&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;Data&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;byte&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;`json:&quot;data&quot;`&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;json&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Unmarshal&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Body&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;body&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;log&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Printf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;failed to unmarshal raft envelope: %v&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Sync&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Background&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;body&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;log&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Printf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;failed to sync raft: %v&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;// a new method on DistributedCounter&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DistributedCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Sync&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Context&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;byte&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;error&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;raftpb&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Message&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Unmarshal&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dc&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Step&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;msg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We can see that whenever a new internal message is received, we eventually call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;node.Step&lt;/code&gt;. That is the method the library provides for feeding incoming messages into the Raft state machine.&lt;/p&gt;

&lt;p&gt;And with that, our solution is complete.&lt;/p&gt;

&lt;h2 id=&quot;running-the-test&quot;&gt;Running the test&lt;/h2&gt;

&lt;p&gt;So I ran the same test I ran for Snowflake:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;./maelstrom &lt;span class=&quot;nb&quot;&gt;test&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-w&lt;/span&gt; unique-ids &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--bin&lt;/span&gt; ~/go/bin/maelstrom-unique-ids-raft &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--node-count&lt;/span&gt; 5 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--time-limit&lt;/span&gt; 30 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--rate&lt;/span&gt; 1000 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--availability&lt;/span&gt; total &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--concurrency&lt;/span&gt; 10 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--nemesis&lt;/span&gt; partition
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And the test failed. All that work to get a failing test!&lt;/p&gt;

&lt;p&gt;But if we think about it and look at the test we’re running, it kind of makes sense. We’re asking for total availability even during a network partition. However, Raft is not a totally available system. Nodes cut off from a majority cannot commit new entries, and depending on how the partition occurs, the cluster may also be unavailable while it elects a new leader. We can either lower our availability target or stop injecting network partitions. The following tests pass:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# availability 0.999&lt;/span&gt;
./maelstrom &lt;span class=&quot;nb&quot;&gt;test&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-w&lt;/span&gt; unique-ids &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--bin&lt;/span&gt; ~/go/bin/maelstrom-unique-ids-raft &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--node-count&lt;/span&gt; 5 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--time-limit&lt;/span&gt; 30 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--rate&lt;/span&gt; 1000 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--availability&lt;/span&gt; 0.999 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--concurrency&lt;/span&gt; 10 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--nemesis&lt;/span&gt; partition
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# no network partition&lt;/span&gt;
./maelstrom &lt;span class=&quot;nb&quot;&gt;test&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-w&lt;/span&gt; unique-ids &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--bin&lt;/span&gt; ~/go/bin/maelstrom-unique-ids-raft &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--node-count&lt;/span&gt; 5 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--time-limit&lt;/span&gt; 30 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--rate&lt;/span&gt; 1000 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--availability&lt;/span&gt; total &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--concurrency&lt;/span&gt; 10 
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;That is evidence of Raft’s &lt;strong&gt;CP&lt;/strong&gt; (from the &lt;strong&gt;CAP theorem&lt;/strong&gt;) nature: when a network partition happens, it keeps consistency and gives up some availability.&lt;/p&gt;
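
&lt;p&gt;The unavailability during a partition comes down to majority arithmetic. Here is a small sketch for the 5-node cluster used in the tests; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;quorum&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;canCommit&lt;/code&gt; are hypothetical helper names, and the model ignores leader election time.&lt;/p&gt;

```go
package main

import "fmt"

// quorum returns the smallest majority of a cluster with n voting members.
func quorum(n int) int {
	return n/2 + 1
}

// canCommit reports whether a partition side holding `side` of n members
// can still form a majority and therefore commit new entries.
func canCommit(side, n int) bool {
	return side >= quorum(n)
}

func main() {
	const nodes = 5 // matches --node-count 5
	for _, side := range []int{1, 2, 3, 4, 5} {
		fmt.Printf("side with %d of %d nodes can commit: %v\n", side, nodes, canCommit(side, nodes))
	}
}
```

&lt;p&gt;With 5 nodes the quorum is 3, so a client talking to a node on a 2-node side of a partition cannot get a new ID until the partition heals.&lt;/p&gt;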

&lt;p&gt;One thing that bugged me was that the application was implemented so that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Next&lt;/code&gt; should never fail. Remember that ugly retry loop? So why don’t we get 100% availability? Turns out &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;maelstrom&lt;/code&gt; has a 5-second timeout on the client. If the client doesn’t hear a response within 5s, it will count as a failure. By default, partitions last 10s in tests. So that makes sense. But even if you set the partitions to last 2s, you still don’t get 100% availability. My hypothesis is that there’s probably a network partition at the end of the test run, so requests made close to the end never get a response, and those failures are counted.&lt;/p&gt;

&lt;h2 id=&quot;comparing-with-snowflake&quot;&gt;Comparing with Snowflake&lt;/h2&gt;

&lt;p&gt;It is interesting to compare the consensus approach with the Snowflake approach. You clearly see the trade-offs of the &lt;strong&gt;CAP theorem&lt;/strong&gt; in action. The Snowflake approach is totally available even with network partitions, which means it must be compromising on consistency somehow. And it is: you cannot get a strictly increasing sequence with no gaps (e.g., 1, 2, 3, 4, …) with Snowflake. Snowflake IDs are only roughly sortable, not strictly ordered.&lt;/p&gt;

&lt;p&gt;Another comparison is the amount of overhead that coordination adds. The table below compares the throughput, message count, and latency of both approaches.&lt;/p&gt;

&lt;div class=&quot;table-responsive&quot;&gt;
&lt;table class=&quot;benchmark-table&quot;&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Metric&lt;/th&gt;
      &lt;th&gt;Raft&lt;/th&gt;
      &lt;th&gt;Snowflake&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Ops&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;23,269&lt;/td&gt;&lt;td&gt;27,965&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr class=&quot;highlight-row&quot;&gt;
      &lt;td&gt;&lt;strong&gt;msgs/op&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;11.2&lt;/td&gt;&lt;td&gt;0.0&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Min&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;0.3 ms&lt;/td&gt;&lt;td&gt;0.1 ms&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Mean&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;3.7 ms&lt;/td&gt;&lt;td&gt;0.4 ms&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Median&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;0.7 ms&lt;/td&gt;&lt;td&gt;0.4 ms&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;p95&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;2.4 ms&lt;/td&gt;&lt;td&gt;0.8 ms&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;p99&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;5.5 ms&lt;/td&gt;&lt;td&gt;1.1 ms&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr class=&quot;highlight-row&quot;&gt;
      &lt;td&gt;&lt;strong&gt;Max&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;4,250 ms&lt;/td&gt;&lt;td&gt;10.6 ms&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;

&lt;p&gt;Of course, this is all running locally and does not reflect real-world scenarios, but it is interesting to see the latency increase, the effect of the network partition on &lt;strong&gt;max&lt;/strong&gt;, and the internal messages showing up in &lt;strong&gt;msgs/op&lt;/strong&gt;.&lt;/p&gt;
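
&lt;p&gt;The gap between the Raft mean (3.7 ms) and median (0.7 ms) in the table is what a handful of very slow requests does to an average. The sketch below shows the effect. The sample latencies are made up for illustration and are not taken from the test run; only the idea of a single multi-second outlier is borrowed from the &lt;strong&gt;Max&lt;/strong&gt; row.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"sort"
)

// mean returns the arithmetic mean of xs.
func mean(xs []float64) float64 {
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// median returns the middle value of xs without modifying the input.
func median(xs []float64) float64 {
	s := append([]float64(nil), xs...)
	sort.Float64s(s)
	mid := len(s) / 2
	if len(s)%2 == 1 {
		return s[mid]
	}
	return (s[mid-1] + s[mid]) / 2
}

func main() {
	// Illustrative latencies in ms: mostly fast, one request stuck behind a partition.
	samples := []float64{0.5, 0.6, 0.7, 0.7, 0.8, 0.9, 1.0, 4250}
	fmt.Printf("median: %.2f ms\n", median(samples))
	fmt.Printf("mean:   %.2f ms\n", mean(samples))
}
```

&lt;p&gt;The median barely notices the outlier while the mean is dominated by it, which is why the percentile rows say more about typical behaviour than the mean does.&lt;/p&gt;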

&lt;p&gt;That was a fun challenge. I hope you have enjoyed reading as much as I have enjoyed writing. And that’s all for this blog post.&lt;/p&gt;

      </content>
    </entry>
  
    <entry>
      <title>Implementing Snowflake Unique ID Generation</title>
      <link href="https://bcalza.b-cdn.net/blog/2026/04/07/implementing-snowflake-unique-id-generation.html" />
      <id>https://bcalza.b-cdn.net/blog/2026/04/07/implementing-snowflake-unique-id-generation</id>
      <updated>2026-04-07T14:59:10+00:00</updated>
      <content type="html">
        &lt;blockquote class=&quot;callout&quot;&gt;
  &lt;p&gt;This post is part of a series on Fly.io’s &lt;a href=&quot;https://fly.io/dist-sys/&quot;&gt;distributed systems challenges&lt;/a&gt;:&lt;/p&gt;

  &lt;ul&gt;
    &lt;li&gt;Implementing Snowflake Unique ID Generation (this post)&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;/blog/2026/04/08/generating-unique-ids-with-raft-consensus&quot;&gt;Generating Unique IDs with Raft Consensus&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;/blog/2026/04/09/flyio-broadcast-challenges&quot;&gt;Fly.io’s Broadcast Challenges&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;/blog/2026/04/13/building-a-grow-only-counter-on-a-sequentially-consistent-kv-store&quot;&gt;Building a Grow-Only Counter on a Sequentially Consistent KV Store&lt;/a&gt;&lt;/li&gt;
  &lt;/ul&gt;
&lt;/blockquote&gt;

&lt;p&gt;I’ve wanted for some time to work on this &lt;a href=&quot;https://fly.io/dist-sys/&quot;&gt;series of distributed systems challenges&lt;/a&gt; by Fly.io, and I have finally found some time. This post is about &lt;a href=&quot;https://fly.io/dist-sys/2/&quot;&gt;Challenge #2: Unique ID Generation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Here’s a brief explanation of what these challenges are about, for those not familiar with them. In essence, you are given a task to build a decentralized algorithm that should pass a test. You use &lt;a href=&quot;https://github.com/jepsen-io/maelstrom&quot;&gt;maelstrom&lt;/a&gt; as the platform that provides the workload and checks whether the results generated by your algorithm are valid for that workload. To give you a sense of how that works, here’s the command I ran for Challenge #2:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;./maelstrom &lt;span class=&quot;nb&quot;&gt;test&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-w&lt;/span&gt; unique-ids &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--bin&lt;/span&gt; ~/go/bin/maelstrom-unique-ids &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--node-count&lt;/span&gt; 5 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--time-limit&lt;/span&gt; 30 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--rate&lt;/span&gt; 1000 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--availability&lt;/span&gt; total &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--concurrency&lt;/span&gt; 10 &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
    &lt;span class=&quot;nt&quot;&gt;--nemesis&lt;/span&gt; partition
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You run a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;maelstrom&lt;/code&gt; test specifying the workload, in this case &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unique-ids&lt;/code&gt;. You provide your task as a binary, e.g. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;~/go/bin/maelstrom-unique-ids&lt;/code&gt;. You can set some parameters that influence how the test is run:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;node-count&lt;/code&gt;: the number of processes of your binary that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;maelstrom&lt;/code&gt; will spawn&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;time-limit&lt;/code&gt;: for how long the test runs&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rate&lt;/code&gt;: requests/s&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;concurrency&lt;/code&gt;: clients running concurrently&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;nemesis&lt;/code&gt;: inject a fault, in this case a network partition&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;availability&lt;/code&gt;: the availability target that will be checked&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Generating unique ids is a common problem in distributed systems, and it is not particularly hard to solve. For this challenge you can get a valid result pretty easily with solutions such as:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Use the node id as the namespace, and a sequence counter per namespace&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Generate a UUID or ULID&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Use a centralized database as the provider of the sequence&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
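
&lt;p&gt;As an illustration of how little the first option needs, here is a minimal sketch. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;generator&lt;/code&gt; type and the ID format are hypothetical; it assumes node ids are unique, the counter is never reset (for example by a restart), and only one goroutine calls &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Next&lt;/code&gt; at a time.&lt;/p&gt;

```go
package main

import "fmt"

// generator is a hypothetical per-node ID source: the node id is the namespace.
type generator struct {
	nodeID string
	seq    uint64
}

// Next returns an ID that is unique as long as node ids are unique
// and the counter is never reset. It is not safe for concurrent use.
func (g *generator) Next() string {
	g.seq++
	return fmt.Sprintf("%s-%d", g.nodeID, g.seq)
}

func main() {
	a := generator{nodeID: "n0"}
	b := generator{nodeID: "n1"}
	fmt.Println(a.Next(), a.Next(), b.Next())
}
```

&lt;p&gt;No coordination is required because two nodes can never produce the same prefix. The price is that the ids are strings of unbounded length and carry no time ordering.&lt;/p&gt;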

&lt;p&gt;There are two approaches I wanted to explore for fun: implementing &lt;a href=&quot;https://blog.x.com/engineering/en_us/a/2010/announcing-snowflake&quot;&gt;Twitter Snowflake&lt;/a&gt; and using a consensus algorithm. The idea is not just to pass the challenge but to learn a bit more about the problem, alternative solutions, and trade-offs. In this post, I focus on Snowflake, and hopefully I’ll write another one about using consensus.&lt;/p&gt;

&lt;p&gt;It is kind of interesting to see their motivation in that blog post. They were using Cassandra, which had no built-in way of generating unique ids, and they needed an id that was roughly sortable, highly available, and had to fit into 64 bits. Interestingly enough, none of the solutions we listed above worked for them.&lt;/p&gt;

&lt;p&gt;The solution was the following composition:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/img/snowflake_1.png&quot; alt=&quot;Snowflake ID composition&quot; /&gt;&lt;/p&gt;

&lt;p&gt;You have a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sequence&lt;/code&gt; scoped by &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;node id&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;timestamp&lt;/code&gt;. The algorithm runs as follows:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Every time a new id is requested on a node (e.g. node &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0&lt;/code&gt;), it grabs the current timestamp, e.g. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1775566509000&lt;/code&gt;. If this timestamp differs from the one used for the previously generated id, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sequence&lt;/code&gt; is reset to 0.&lt;/p&gt;

    &lt;p&gt;&lt;img src=&quot;/assets/img/snowflake_2.png&quot; alt=&quot;Snowflake ID bit layout&quot; /&gt;&lt;/p&gt;

    &lt;p&gt;The timestamp in this algorithm is the number of milliseconds since a custom epoch. I am using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1735689600000&lt;/code&gt; (2025-01-01T00:00:00Z) as the custom epoch, which is why the id stores &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;39876909000 = 1775566509000 - 1735689600000&lt;/code&gt;.&lt;/p&gt;

    &lt;p&gt;Also, 41 bits of millisecond timestamp cover roughly 69 years. That means Twitter has about 69 years after their chosen epoch before this algorithm runs out of timestamps.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;If a new id is requested at the same timestamp, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sequence&lt;/code&gt; is incremented. So &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sequence&lt;/code&gt; only grows when several requests land in the same millisecond, which means a node can generate at most 4096 ids (12 bits) per millisecond.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;If more than 4096 requests arrive in the same millisecond and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sequence&lt;/code&gt; would overflow, the algorithm waits for the next millisecond and resets the sequence.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
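
&lt;p&gt;To make the numbers above concrete, here is a small sketch that packs and unpacks an id using the 41/10/12 bit layout and checks the 69-year figure. The names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;compose&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;decompose&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;timestampYears&lt;/code&gt; are made up for this illustration and are not part of the challenge solution. It uses multiplication and division by powers of two, which for non-negative values is equivalent to bit shifting.&lt;/p&gt;

```go
package main

import "fmt"

// Field sizes for the layout described above (hypothetical names).
const (
	seqSpace  = 4096                 // 2^12 sequence values per millisecond
	nodeSpace = 1024                 // 2^10 node ids
	timeUnit  = seqSpace * nodeSpace // 2^22: weight of one timestamp tick inside the id
)

// Same custom epoch as in the post: 2025-01-01T00:00:00Z.
const customEpoch int64 = 1735689600000

// compose packs timestamp, node id, and sequence into one 64-bit id.
func compose(unixMs, nodeID, seq int64) int64 {
	return (unixMs-customEpoch)*timeUnit + nodeID*seqSpace + seq
}

// decompose reverses compose and recovers the three fields.
func decompose(id int64) (unixMs, nodeID, seq int64) {
	seq = id % seqSpace
	nodeID = (id / seqSpace) % nodeSpace
	unixMs = id/timeUnit + customEpoch
	return
}

// timestampYears returns how many years 2^41 milliseconds cover.
func timestampYears() float64 {
	const ticks = 2199023255552.0 // 2^41
	const msPerYear = 1000 * 60 * 60 * 24 * 365.25
	return ticks / msPerYear
}

func main() {
	id := compose(1775566509000, 0, 7)
	ms, node, seq := decompose(id)
	fmt.Println(ms, node, seq)                     // prints: 1775566509000 0 7
	fmt.Printf("%.1f years\n", timestampYears()) // prints: 69.7 years
}
```

&lt;p&gt;Because the timestamp occupies the most significant bits, ids generated later compare as larger numbers, which is what makes them roughly sortable by time.&lt;/p&gt;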

&lt;p&gt;So, I implemented this algorithm for the challenge.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;package&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;main&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;&quot;sync&quot;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;nodeBits&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;10&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;seqBits&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;12&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;maxNodeID&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nodeBits&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;maxSeq&lt;/span&gt;    &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;seqBits&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;nodeShift&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;seqBits&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;timeShift&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;seqBits&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nodeBits&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;// 2025-01-01T00:00:00Z&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;customEpoch&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;1735689600000&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Generator&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;mu&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sync&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Mutex&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;lastMs&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int64&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;seq&lt;/span&gt;    &lt;span class=&quot;kt&quot;&gt;int64&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;nodeID&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int64&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;clock&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Clock&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NewSnowflakeGenerator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nodeID&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;clock&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Clock&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Generator&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Generator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;lastMs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;seq&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;    &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;nodeID&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nodeID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;clock&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;clock&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;g&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Generator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Next&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int64&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mu&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Lock&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;defer&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mu&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Unlock&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;now&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;clock&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Now&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;UnixMilli&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;now&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lastMs&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;seq&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maxSeq&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;seq&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;now&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;WaitNextMillis&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;clock&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lastMs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lastMs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;now&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;seq&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lastMs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;now&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;seq&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;// Milliseconds elapsed since the custom epoch.&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ts&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lastMs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;customEpoch&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ts&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;timeShift&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nodeID&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nodeShift&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;seq&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
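
&lt;p&gt;The listing relies on a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Clock&lt;/code&gt; type and a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WaitNextMillis&lt;/code&gt; helper that are not shown. A minimal sketch of what they could look like follows. The shapes are assumptions inferred from how the listing calls them, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SystemClock&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;stepClock&lt;/code&gt; are hypothetical names. The definitions in the linked repository may differ.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"time"
)

// Clock abstracts the time source so that tests can inject a fake one.
// Assumed shape: the listing only needs Now() to return a time.Time.
type Clock interface {
	Now() time.Time
}

// SystemClock is a hypothetical implementation backed by the wall clock.
type SystemClock struct{}

func (SystemClock) Now() time.Time { return time.Now() }

// WaitNextMillis busy-waits until the clock reports a millisecond
// strictly after lastMs, then returns that millisecond.
func WaitNextMillis(clock Clock, lastMs int64) int64 {
	now := clock.Now().UnixMilli()
	for lastMs >= now {
		now = clock.Now().UnixMilli()
	}
	return now
}

// stepClock is a hypothetical fake that advances by 1 ms on every call.
type stepClock struct {
	ms int64
}

func (c *stepClock) Now() time.Time {
	t := time.UnixMilli(c.ms)
	c.ms++
	return t
}

func main() {
	fake := new(stepClock)
	fake.ms = 1000
	// The fake reports 1000, 1001, 1002, 1003; the first value after 1002 wins.
	fmt.Println(WaitNextMillis(fake, 1002)) // prints: 1003
}
```

&lt;p&gt;Hiding the time source behind an interface keeps the generator deterministic in tests: a fake clock can hold the same millisecond to exercise the sequence overflow path.&lt;/p&gt;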

&lt;p&gt;With this generator, I was able to pass the challenge just fine. You can check the full solution to the challenge &lt;a href=&quot;https://github.com/brunocalza/gossip-glomers/tree/main/maelstrom-unique-ids&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

      </content>
    </entry>
  
    <entry>
      <title>When Does C++ Call the Move Constructor?</title>
      <link href="https://bcalza.b-cdn.net/blog/2025/09/04/when-does-c-call-the-move-constructor.html" />
      <id>https://bcalza.b-cdn.net/blog/2025/09/04/when-does-c-call-the-move-constructor</id>
      <updated>2025-09-04T22:35:40+00:00</updated>
      <content type="html">
        &lt;p&gt;This is a note to myself on some of the occasions when the &lt;strong&gt;move constructor&lt;/strong&gt; is called in C++. Let’s use a simple class that implements a &lt;strong&gt;move constructor&lt;/strong&gt; and &lt;strong&gt;move assignment&lt;/strong&gt; with some print statements for debugging/confirmation.&lt;/p&gt;

&lt;div class=&quot;language-cpp highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;A&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nl&quot;&gt;public:&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cout&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;Default constructor called&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

  &lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;noexcept&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cout&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;Move constructor called&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

  &lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;operator&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;noexcept&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;A&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;cout&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;Move assignment called&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I explore 5 cases, which I think cover most of the situations. I’d love to know if there are more.&lt;/p&gt;

&lt;h2 id=&quot;1st-case---explicit&quot;&gt;1st Case - Explicit&lt;/h2&gt;

&lt;p&gt;The move constructor is called when we explicitly pass an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rvalue&lt;/code&gt; to the class constructor:&lt;/p&gt;

&lt;div class=&quot;language-cpp highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// Default constructor called&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;a2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;move&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// Move constructor called&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;2nd-case---initializing-from-a-temporary-object&quot;&gt;2nd Case - Initializing from a temporary object&lt;/h2&gt;

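&lt;p&gt;When we initialize a variable from an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rvalue&lt;/code&gt;, the move constructor is called. Here the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rvalue&lt;/code&gt; comes from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;std::move&lt;/code&gt;. Initializing directly from a temporary such as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;A()&lt;/code&gt; is elided in C++17, so no move constructor runs in that case.&lt;/p&gt;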

&lt;div class=&quot;language-cpp highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;auto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a3&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;move&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// Move constructor called&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Note that assigning to an already initialized variable calls move assignment, not the move constructor:&lt;/p&gt;

&lt;div class=&quot;language-cpp highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;a4&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;move&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// Move assignment called&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;3rd-case---returning-a-local-object-from-a-function&quot;&gt;3rd Case - Returning a local object from a function&lt;/h2&gt;

&lt;p&gt;This is an interesting one. When we have a function that returns a local object, for example:&lt;/p&gt;

&lt;div class=&quot;language-cpp highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;make_a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;A&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The move constructor may or may not be called, depending on whether &lt;a href=&quot;https://en.wikipedia.org/wiki/Copy_elision&quot;&gt;&lt;strong&gt;Return Value Optimization&lt;/strong&gt;&lt;/a&gt; is applied. Because &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a&lt;/code&gt; is a named local, this is the named variant (NRVO), which compilers usually apply but which, unlike the elision of a returned temporary, C++17 does not guarantee. When it is applied, the move constructor is not called. With GCC or Clang, we can disable the optimization with the flag &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-fno-elide-constructors&lt;/code&gt; and see the move constructor being called.&lt;/p&gt;

&lt;h2 id=&quot;4th-case---passing-rvalue-to-a-function&quot;&gt;4th Case - Passing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rvalue&lt;/code&gt; to a function&lt;/h2&gt;

&lt;p&gt;Here we have a function that receives &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;A&lt;/code&gt; by value, meaning a new object &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;A&lt;/code&gt; must be constructed for the parameter. If you pass an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rvalue&lt;/code&gt; to that function, the parameter is built with the move constructor.&lt;/p&gt;

&lt;div class=&quot;language-cpp highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;receive_a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;receive_a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;move&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// Move constructor called&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Think of this as the same as &lt;strong&gt;2nd Case&lt;/strong&gt;:&lt;/p&gt;

&lt;div class=&quot;language-cpp highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;move&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Think about what happens in the following cases:&lt;/p&gt;

&lt;div class=&quot;language-cpp highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;receive_a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;

&lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;receive_a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;A&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;move&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;5th-case-emplacing-into-standard-containers&quot;&gt;5th Case - Emplacing into standard containers&lt;/h2&gt;

&lt;p&gt;Standard containers support move semantics: when an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rvalue&lt;/code&gt; is passed, the new element is move-constructed from it:&lt;/p&gt;

&lt;div class=&quot;language-cpp highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;vector&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;push_back&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;move&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// Move constructor called&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

      </content>
    </entry>
  
    <entry>
      <title>Scoped Threads in Rust</title>
      <link href="https://bcalza.b-cdn.net/blog/2025/09/01/scoped-threads.html" />
      <id>https://bcalza.b-cdn.net/blog/2025/09/01/scoped-threads</id>
      <updated>2025-09-01T21:17:25+00:00</updated>
      <content type="html">
        &lt;p&gt;Reading &lt;a href=&quot;https://matklad.github.io/2020/07/15/two-beautiful-programs.html&quot;&gt;Two Beautiful Rust Programs&lt;/a&gt;, I came across a construct I hadn’t seen before for working with threads in Rust: &lt;a href=&quot;https://doc.rust-lang.org/std/thread/fn.scope.html&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;std::thread::scope&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;When you create a thread in Rust, you must pass a closure of the code you want that thread to execute. Sometimes, that closure works on data from the parent scope (the code that is spawning the thread):&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;nd&quot;&gt;println!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;counter = {}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;thread&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;spawn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;That code does not compile. This is essentially because &lt;a href=&quot;https://doc.rust-lang.org/std/thread/fn.spawn.html&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;std::thread::spawn&lt;/code&gt;&lt;/a&gt; takes a closure with the bound &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&apos;static&lt;/code&gt;, which means the closure cannot borrow anything that might be dropped while the thread is still running. The spawned thread is technically able to outlive the function that created it, so a reference to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;counter&lt;/code&gt; could end up pointing at something that is no longer there (a dangling pointer). So, you must force the closure to take ownership of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;counter&lt;/code&gt;. We fix that by adding &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;move&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;move&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;nd&quot;&gt;println!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;counter = {}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;thread&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;spawn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;What happens if we access &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;counter&lt;/code&gt; at the end of the program?&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;move&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;nd&quot;&gt;println!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;counter = {}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;thread&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;spawn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;

    &lt;span class=&quot;nd&quot;&gt;println!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;counter = {}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;counter &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; 1
counter &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; 0
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Because &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;counter&lt;/code&gt; is an integer, and integers implement &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Copy&lt;/code&gt;, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;move&lt;/code&gt; closure receives its own copy, so the increment inside the thread never reaches the original. What if we don’t want that behavior? Suppose we want to &lt;strong&gt;mutably borrow&lt;/strong&gt; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;counter&lt;/code&gt;. We can use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;std::thread::scope&lt;/code&gt; to achieve this:&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;nd&quot;&gt;println!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;counter = {}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;

    &lt;span class=&quot;nn&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;thread&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;scope&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.spawn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;

    &lt;span class=&quot;nd&quot;&gt;println!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;counter = {}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And that works:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;counter &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; 1
counter &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; 1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Note that we don’t use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;move&lt;/code&gt; anymore. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;std::thread::scope&lt;/code&gt; defines a lifetime &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&apos;scope&lt;/code&gt; that is contained within the parent scope and lasts for the duration of the call to scope. It is guaranteed that all threads spawned within the scope will be joined before scope returns. Because threads cannot outlive &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&apos;scope&lt;/code&gt;, any data borrowed by their closures is guaranteed to remain valid for the lifetime of the threads. When you access &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;counter&lt;/code&gt; again, there’s no mutable reference to it anymore.&lt;/p&gt;
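&lt;p&gt;As a rough sketch of the same idea with more than one thread, the snippet below spawns two scoped threads that both read a local array while only one of them mutates a local total. The function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;scoped_total&lt;/code&gt; and the sample words are made up for illustration:&lt;/p&gt;

```rust
// Hypothetical helper, not from the original post: it adds up the lengths
// of a few sample words on a scoped thread and returns the total.
fn scoped_total() -> usize {
    let words = ["alpha", "beta", "gamma"];
    let mut total = 0;

    std::thread::scope(|s| {
        // This thread only reads `words` (shared borrow).
        s.spawn(|| println!("first word = {}", words[0]));

        // This thread reads `words` and is the only one that mutates `total`.
        s.spawn(|| {
            for w in words.iter() {
                total += w.len();
            }
        });
    });

    // Both threads were joined when `scope` returned, so the borrows have ended.
    total
}

fn main() {
    println!("total = {}", scoped_total());
}
```

&lt;p&gt;Several threads may share an immutable borrow, but only one closure may hold the mutable borrow of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;total&lt;/code&gt;. Adding a second thread that also writes to it would be rejected by the borrow checker.&lt;/p&gt;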

&lt;p&gt;Let’s use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Box&lt;/code&gt; as an example of some heap-allocated data that does not implement &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Copy&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;Box&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;move&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;nd&quot;&gt;println!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;counter = {}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;thread&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;spawn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;That works just fine. The problem is that if we want to access &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;counter&lt;/code&gt; at the end of the program, we get an error:&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nd&quot;&gt;println!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;counter = {}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;error[E0382]: borrow of moved value: &lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;counter&lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; src/main.rs:30:30
   |
20 |     &lt;span class=&quot;nb&quot;&gt;let &lt;/span&gt;counter &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; Box::new&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;0&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
   |         &lt;span class=&quot;nt&quot;&gt;-------&lt;/span&gt; move occurs because &lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;counter&lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt; has &lt;span class=&quot;nb&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;Box&amp;lt;i32&amp;gt;&lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;, which does not implement the &lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;Copy&lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt; trait
21 |
22 |     &lt;span class=&quot;nb&quot;&gt;let &lt;/span&gt;f &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; move &lt;span class=&quot;o&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
   |             &lt;span class=&quot;nt&quot;&gt;-------&lt;/span&gt; value moved into closure here
23 |         &lt;span class=&quot;nb&quot;&gt;let &lt;/span&gt;mut c &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; counter&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
   |                     &lt;span class=&quot;nt&quot;&gt;-------&lt;/span&gt; variable moved due to use &lt;span class=&quot;k&quot;&gt;in &lt;/span&gt;closure
...
30 |     println!&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;counter = {}&quot;&lt;/span&gt;, &lt;span class=&quot;k&quot;&gt;*&lt;/span&gt;counter&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
   |                              ^^^^^^^^ value borrowed here after move
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;counter&lt;/code&gt; was moved and we don’t have access to it after the thread is finished. We can use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;std::thread::scope&lt;/code&gt; and &lt;strong&gt;mutably borrow&lt;/strong&gt; the counter to fix this:&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;Box&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;i32&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;nd&quot;&gt;println!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;counter = {}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;

    &lt;span class=&quot;nn&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;thread&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;scope&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.spawn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;

    &lt;span class=&quot;nd&quot;&gt;println!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;counter = {}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;There it is, a simple explanation of how to use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;std::thread::scope&lt;/code&gt; to execute a thread on borrowed data from the parent scope.&lt;/p&gt;

      </content>
    </entry>
  
    <entry>
      <title>Writing a simple lexer in Rust</title>
      <link href="https://bcalza.b-cdn.net/2023/09/20/writing-a-simple-lexer-in-rust.html" />
      <id>https://bcalza.b-cdn.net/2023/09/20/writing-a-simple-lexer-in-rust</id>
      <updated>2023-09-20T00:00:00+00:00</updated>
      <content type="html">
        &lt;p&gt;Recently, I began to delve deeper into Rust. As a way to become more familiar with the language, I decided to write a simple &lt;em&gt;lexer&lt;/em&gt; for a mathematical expression such as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;10 - 3 + ( ( 4 / 2 ) * ( 8 * 4 ) )&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Writing a &lt;em&gt;lexer&lt;/em&gt; shouldn’t be a difficult task, especially if you’ve built one in another language. I’ve tried to explore Rust features and write idiomatic code as much as possible, without getting too attached to past implementations in other languages.&lt;/p&gt;

&lt;p&gt;If you’re not familiar with compiler theory, a &lt;em&gt;lexer&lt;/em&gt; is the phase of the compiler that transforms your text input into a list of meaningful tokens. In other words, we want to implement a function like this:&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;tokenizer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;input&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Vec&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In reality, &lt;em&gt;lexers&lt;/em&gt; are not written in a way that produces all of the tokens at once. Usually they produce one token at a time, in an iterator fashion, and that token is passed to the parser that will check grammar rules and build an Abstract Syntax Tree.&lt;/p&gt;

&lt;p&gt;We start by defining our tokens using an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;enum&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;enum&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Token&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;nf&quot;&gt;Number&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;i64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;Plus&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;Dash&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;Star&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;Slash&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;LeftParen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;RightParen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;EOF&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;For now, we will focus on the four basic operations, parentheses, and positive integers.&lt;/p&gt;

&lt;p&gt;The typical approach to implementing a &lt;em&gt;lexer&lt;/em&gt; is by iterating through the text and checking whether the current character is relevant for creating a token. This can be achieved using either a slice and indices or an iterator. In this task, I have chosen to use an iterator as it is considered the more idiomatic approach and allows me to explore the iterator-related APIs.&lt;/p&gt;

&lt;p&gt;Essentially, the core pattern we need is:&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Vec&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;Vec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;input&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.chars&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;Some&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iter&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.next&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;match&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;// pattern matching logic&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;// tokens.push(...)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We loop through each character and apply pattern matching logic to it.&lt;/p&gt;

&lt;p&gt;Since our mathematical expression is very simple, most of the characters are tokens themselves. Additionally, whitespace has no meaning in our expression, so we skip it. Therefore:&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;match&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.is_whitespace&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;continue&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;sc&quot;&gt;&apos;(&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LeftParen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;sc&quot;&gt;&apos;)&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RightParen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;sc&quot;&gt;&apos;+&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Plus&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;sc&quot;&gt;&apos;-&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Dash&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
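    &lt;span class=&quot;sc&quot;&gt;&apos;*&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Star&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;sc&quot;&gt;&apos;/&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Slash&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;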
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The only token that is slightly more complex is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Number&lt;/code&gt;, which is a sequence of digits rather than a single character. To detect where a number starts, we can use a range pattern:&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;match&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// ...&lt;/span&gt;
    &lt;span class=&quot;sc&quot;&gt;&apos;1&apos;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;..=&lt;/span&gt;&lt;span class=&quot;sc&quot;&gt;&apos;9&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;// ...&lt;/span&gt;
     &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;To implement the body, I have decided to take a functional approach and use only iterators. The idea is to consume the iterator by extracting characters as long as they are digits. We can achieve this by using &lt;a href=&quot;https://doc.rust-lang.org/std/iter/struct.TakeWhile.html&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;take_while&lt;/code&gt;&lt;/a&gt;:&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;match&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// ...&lt;/span&gt;
    &lt;span class=&quot;sc&quot;&gt;&apos;1&apos;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;..=&lt;/span&gt;&lt;span class=&quot;sc&quot;&gt;&apos;9&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iter&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.by_ref&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.take_while&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.is_ascii_digit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.collect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The issue with that approach is that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;s&lt;/code&gt; would not include the first digit, which is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ch&lt;/code&gt;. Additionally, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;take_while&lt;/code&gt; would have already consumed the first non-digit character, resulting in the loss of one character in the subsequent iteration of the loop.&lt;/p&gt;
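
&lt;p&gt;Both problems are easy to reproduce in isolation. The sketch below is not part of the tokenizer: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;scan_with_take_while&lt;/code&gt; is a hypothetical helper, written only to show what ends up in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;s&lt;/code&gt; and what is left in the iterator afterwards. It assumes the outer loop has already pulled the first digit.&lt;/p&gt;

```rust
// Hypothetical helper, only to illustrate the behaviour described above.
// Returns the digits collected by take_while and whatever is left in the iterator.
fn scan_with_take_while(input: String) -> (String, String) {
    let mut iter = input.chars();

    // The outer loop has already pulled the first digit out of the iterator.
    let _ch = iter.next();

    // take_while has to pull a character in order to test it, so the first
    // non-digit is pulled as well and then dropped.
    let s: String = iter.by_ref().take_while(|c| c.is_ascii_digit()).collect();

    // Collect the remainder so the caller can see what survived.
    let rest: String = iter.collect();
    (s, rest)
}
```

&lt;p&gt;For the input &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;42+7&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;s&lt;/code&gt; holds only &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;2&lt;/code&gt; and the remainder is only &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;7&lt;/code&gt;: the leading &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;4&lt;/code&gt; is missing from the number and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;+&lt;/code&gt; never reaches the next loop iteration.&lt;/p&gt;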

&lt;p&gt;The way around this is to make &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;iter&lt;/code&gt; &lt;a href=&quot;https://doc.rust-lang.org/stable/std/iter/struct.Peekable.html&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Peekable&lt;/code&gt;&lt;/a&gt; and use &lt;a href=&quot;https://doc.rust-lang.org/stable/std/iter/struct.Peekable.html#method.next_if&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;next_if&lt;/code&gt;&lt;/a&gt;. It lets us &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;peek&lt;/code&gt; at the next element and consume it only if it is a digit, so the first non-digit character stays in the iterator for the next loop iteration. Wrapping that call in &lt;a href=&quot;https://doc.rust-lang.org/std/iter/fn.from_fn.html&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;from_fn&lt;/code&gt;&lt;/a&gt; turns it into a new iterator, which we chain after &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ch&lt;/code&gt; to get the first digit back.&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;sc&quot;&gt;&apos;1&apos;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;..=&lt;/span&gt;&lt;span class=&quot;sc&quot;&gt;&apos;9&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;i64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;iter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;once&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;nf&quot;&gt;.chain&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;from_fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(||&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iter&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.by_ref&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.next_if&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.is_ascii_digit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())))&lt;/span&gt;
        &lt;span class=&quot;py&quot;&gt;.collect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
        &lt;span class=&quot;nf&quot;&gt;.parse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
        &lt;span class=&quot;nf&quot;&gt;.unwrap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;Number&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
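
&lt;p&gt;To check that arm in isolation, here is a minimal self-contained sketch of the same technique. The helper name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;scan_with_next_if&lt;/code&gt; and the explicit &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;use&lt;/code&gt; line are assumptions added for illustration, and the helper assumes the input is non-empty and starts with a digit.&lt;/p&gt;

```rust
use std::iter::{self, from_fn};

// Hypothetical helper: reads one number from the start of the input and
// returns it together with the part of the input that was not consumed.
fn scan_with_next_if(input: String) -> (i64, String) {
    let mut chars = input.chars().peekable();

    // Assumes the input is non-empty and starts with a digit, as in the match arm.
    let ch = chars.next().unwrap();

    // next_if only advances the iterator when the peeked character is a digit,
    // so the first non-digit character is left in place.
    let digits: String = iter::once(ch)
        .chain(from_fn(|| chars.next_if(|c| c.is_ascii_digit())))
        .collect();

    // Like the original snippet, this panics if the digits overflow i64.
    let n: i64 = digits.parse().unwrap();

    let rest: String = chars.collect();
    (n, rest)
}
```

&lt;p&gt;For the input &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;42+7&lt;/code&gt; this yields the number &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;42&lt;/code&gt; and the remainder &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;+7&lt;/code&gt;, so the operator is still there for the next loop iteration.&lt;/p&gt;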

&lt;p&gt;With that we finish our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tokenizer&lt;/code&gt; function:&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;tokenizer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;input&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Vec&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Vec&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;Vec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;input&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.chars&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.peekable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;Some&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iter&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.next&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;match&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.is_whitespace&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;continue&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
            &lt;span class=&quot;sc&quot;&gt;&apos;(&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LeftParen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
            &lt;span class=&quot;sc&quot;&gt;&apos;)&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RightParen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
            &lt;span class=&quot;sc&quot;&gt;&apos;+&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Plus&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
            &lt;span class=&quot;sc&quot;&gt;&apos;-&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Dash&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
            &lt;span class=&quot;sc&quot;&gt;&apos;*&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Star&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
            &lt;span class=&quot;sc&quot;&gt;&apos;1&apos;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;..=&lt;/span&gt;&lt;span class=&quot;sc&quot;&gt;&apos;9&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;i64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;iter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;once&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
                    &lt;span class=&quot;nf&quot;&gt;.chain&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;from_fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(||&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iter&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.by_ref&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.next_if&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.is_ascii_digit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())))&lt;/span&gt;
                    &lt;span class=&quot;py&quot;&gt;.collect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
                    &lt;span class=&quot;nf&quot;&gt;.parse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
                    &lt;span class=&quot;nf&quot;&gt;.unwrap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;

                &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;Number&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
            &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
                &lt;span class=&quot;nd&quot;&gt;panic!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;unrecognized char&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
            &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;EOF&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We can improve our error handling by defining a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SyntaxError&lt;/code&gt; struct:&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nd&quot;&gt;#[derive(Debug)]&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SyntaxError&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;message&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;impl&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SyntaxError&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;message&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;Self&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;SyntaxError&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;message&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

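&lt;p&gt;and returning &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Result&amp;lt;Vec&amp;lt;Token&amp;gt;, SyntaxError&amp;gt;&lt;/code&gt; from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tokenizer&lt;/code&gt; instead of panicking:&lt;/p&gt;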

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;tokenizer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;input&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Result&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;Vec&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SyntaxError&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Vec&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;Vec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;input&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.chars&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.peekable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;Some&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iter&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.next&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;match&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.is_whitespace&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;continue&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
            &lt;span class=&quot;sc&quot;&gt;&apos;(&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LeftParen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
            &lt;span class=&quot;sc&quot;&gt;&apos;)&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RightParen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
            &lt;span class=&quot;sc&quot;&gt;&apos;+&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Plus&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
            &lt;span class=&quot;sc&quot;&gt;&apos;-&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Dash&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
            &lt;span class=&quot;sc&quot;&gt;&apos;*&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Star&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
            &lt;span class=&quot;sc&quot;&gt;&apos;1&apos;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;..=&lt;/span&gt;&lt;span class=&quot;sc&quot;&gt;&apos;9&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;i64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;iter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;once&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
                    &lt;span class=&quot;nf&quot;&gt;.chain&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;from_fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(||&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iter&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.by_ref&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.next_if&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.is_ascii_digit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())))&lt;/span&gt;
                    &lt;span class=&quot;py&quot;&gt;.collect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
                    &lt;span class=&quot;nf&quot;&gt;.parse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
                    &lt;span class=&quot;nf&quot;&gt;.unwrap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;

                &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;Number&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
            &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;Err&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;SyntaxError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nd&quot;&gt;format!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;unrecognized character {}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))),&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;EOF&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;nf&quot;&gt;Ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tokens&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It wasn’t a very hard exercise, but it was enough for me to get familiar with Rust’s syntax, its type system, some error handling patterns, iterator-related APIs, and pattern matching. I’ve also been doing Rustlings exercises on the side. As someone used to programming in Go, I’ve often found Rust code challenging to read. However, I’m gradually beginning to appreciate its elegance.&lt;/p&gt;

&lt;p&gt;Ideas for future posts: work on an iterator-based version of the &lt;em&gt;lexer&lt;/em&gt;; build a parser to evaluate the expression; build a &lt;em&gt;lexer&lt;/em&gt; for a more complex language.&lt;/p&gt;

      </content>
    </entry>
  
    <entry>
      <title>The day I discovered vmtouch</title>
      <link href="https://bcalza.b-cdn.net/2023/01/27/the-day-i-discoverd-vmtouch.html" />
      <id>https://bcalza.b-cdn.net/2023/01/27/the-day-i-discoverd-vmtouch</id>
      <updated>2023-01-27T00:00:00+00:00</updated>
      <content type="html">
        &lt;h1 id=&quot;introduction&quot;&gt;Introduction&lt;/h1&gt;

&lt;p&gt;Last weekend I decided to take a deeper look at the famous &lt;a href=&quot;https://www.sqlite.org/fasterthanfs.html&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SQLite&lt;/code&gt;&lt;/a&gt; &lt;a href=&quot;https://www.sqlite.org/fasterthanfs.html&quot;&gt;35% Faster Than The Filesystem&lt;/a&gt; benchmark. I didn’t want to do a shallow read of the post. I wanted to compile the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kvtest&lt;/code&gt; tool, run the experiments myself, and see what was going on. I recommend doing that, especially if you want to follow along with this blog post.&lt;/p&gt;

&lt;p&gt;While running the read experiments, something caught my attention. It seems that to see good performance results using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SQLite&lt;/code&gt;, you have to run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kvtest&lt;/code&gt; twice, discarding the results of the first run. The first run loads the data into a cache, and the second and following runs take advantage of that. That makes sense, but it intrigued me, and I became curious about what was happening between the first and second runs.&lt;/p&gt;

&lt;p&gt;The first question that popped into my mind was: &lt;em&gt;what kind of caching is happening here?&lt;/em&gt; I understand that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SQLite&lt;/code&gt; has its own buffer pool implementation that caches pages in memory. But it’s clear that the performance does not come from that, since the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kvtest&lt;/code&gt; process has already exited after the first run. In the second run, the buffer pool would start empty (I suppose). So the only place left for caching is the &lt;em&gt;Operating System&lt;/em&gt;. That reasoning gained more credibility when I saw Figure 1, taken from the book &lt;a href=&quot;https://www.google.com.br/books/edition/SQLite_Database_System_Design_and_Implem/OEJ1CQAAQBAJ?hl=en&amp;amp;gbpv=0&quot;&gt;SQLite Database System: Design and Implementation&lt;/a&gt;:
&lt;img src=&quot;/assets/img/sqlite_cache.png&quot; alt=&quot;Figure 1 - SQLite caching&quot; /&gt;
To confirm that, I wanted to tinker with the disk page cache. While searching for a way to pin or evict pages from the cache, I ended up finding an awesome tool called &lt;a href=&quot;https://hoytech.com/vmtouch/&quot;&gt;vmtouch&lt;/a&gt;.&lt;/p&gt;

&lt;h1 id=&quot;confirming-that-the-os-is-caching-the-sqlite-data&quot;&gt;Confirming that the OS is caching the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SQLite&lt;/code&gt; data&lt;/h1&gt;

&lt;ul&gt;
  &lt;li&gt;Let’s first purge the OS page cache: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sudo purge&lt;/code&gt; (on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;macOS&lt;/code&gt;; on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Linux&lt;/code&gt; it seems that running &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;echo 3 &amp;gt; /proc/sys/vm/drop_caches&lt;/code&gt; as root does the trick)&lt;/li&gt;
  &lt;li&gt;Calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;vmtouch test1.db&lt;/code&gt; reports that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0%&lt;/code&gt; of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;test1.db&lt;/code&gt; is cached&lt;/li&gt;
&lt;/ul&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; vmtouch test1.db                             
Files: 1
Directories: 0
Resident Pages: 0/65579  0/1G  0%
Elapsed: 0.001622 seconds
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;ul&gt;
  &lt;li&gt;Running the test, then running &lt;strong&gt;vmtouch&lt;/strong&gt; again&lt;/li&gt;
&lt;/ul&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; kvtest % ./kvtest run test1.db &lt;span class=&quot;nt&quot;&gt;--count&lt;/span&gt; 100k &lt;span class=&quot;nt&quot;&gt;--blob-api&lt;/span&gt;
SQLite version: 3.39.4
&lt;span class=&quot;nt&quot;&gt;--count&lt;/span&gt; 100000 &lt;span class=&quot;nt&quot;&gt;--max-id&lt;/span&gt; 100000 &lt;span class=&quot;nt&quot;&gt;--asc&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;--cache-size&lt;/span&gt; 1000 &lt;span class=&quot;nt&quot;&gt;--jmode&lt;/span&gt; delete
&lt;span class=&quot;nt&quot;&gt;--mmap&lt;/span&gt; 0 &lt;span class=&quot;nt&quot;&gt;--blob-api&lt;/span&gt;
Database page size: 4096
Total elapsed &lt;span class=&quot;nb&quot;&gt;time&lt;/span&gt;: 2.140
Microseconds per BLOB &lt;span class=&quot;nb&quot;&gt;read&lt;/span&gt;: 21.400
Content &lt;span class=&quot;nb&quot;&gt;read &lt;/span&gt;rate: 467.2 MB/s


&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; vmtouch test1.db                             
           Files: 1
     Directories: 0
  Resident Pages: 65579/65579  1G/1G  100%
         Elapsed: 0.012271 seconds
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;ul&gt;
  &lt;li&gt;You can see that our entire database is cached now. And that’s why the second run is much faster.&lt;/li&gt;
  &lt;li&gt;If you run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;vmtouch test1.db&lt;/code&gt; again some time after the first run, you’ll see a number smaller than &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;100%&lt;/code&gt;, which means the OS has already evicted some pages. So the longer you wait before the second run, the worse its performance will be.&lt;/li&gt;
&lt;/ul&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; vmtouch test1.db
           Files: 1
     Directories: 0
  Resident Pages: 51595/65579  806M/1G  78.7%
         Elapsed: 0.017729 seconds
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Things you can try:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Before executing the first run, put all of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;test1.db&lt;/code&gt; into the cache by running &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;vmtouch -t test1.db&lt;/code&gt;, then execute the test. You’ll already see good performance results in the first run;&lt;/li&gt;
  &lt;li&gt;You can execute the first run, evict all pages from the cache and execute again.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This confirmed that the performance really comes from the OS page cache. And with that came the question: isn’t the OS caching the files in the FileSystem approach? If not, why?&lt;/p&gt;

&lt;h1 id=&quot;playing-directly-with-the-files&quot;&gt;Playing directly with the files&lt;/h1&gt;

&lt;p&gt;Next, I tried to understand what was happening with the cache when running the test directly on the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FileSystem&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; ./kvtest run test1.dir &lt;span class=&quot;nt&quot;&gt;--count&lt;/span&gt; 100k &lt;span class=&quot;nt&quot;&gt;--blob-api&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;--count&lt;/span&gt; 100000 &lt;span class=&quot;nt&quot;&gt;--max-id&lt;/span&gt; 1000 &lt;span class=&quot;nt&quot;&gt;--asc&lt;/span&gt;
Total elapsed &lt;span class=&quot;nb&quot;&gt;time&lt;/span&gt;: 0.776
Microseconds per BLOB &lt;span class=&quot;nb&quot;&gt;read&lt;/span&gt;: 7.760
Content &lt;span class=&quot;nb&quot;&gt;read &lt;/span&gt;rate: 1282.1 MB/s


&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; kvtest % vmtouch test1.dir                             
           Files: 100000
     Directories: 1
  Resident Pages: 1000/100000  15M/1G  1%
         Elapsed: 1.0113 seconds
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Only &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1%&lt;/code&gt; of the files were cached after the first run. That is pretty interesting. It seems that the caching behavior of the OS is very different when accessing the blobs via &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SQLite&lt;/code&gt; than via the FileSystem. What if I force the files to be in the cache? You can do that with the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-dl&lt;/code&gt; flags, which run vmtouch as a daemon that locks the pages in memory. You can see the percentage increase until it reaches 100%, where it stays.&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; vmtouch &lt;span class=&quot;nt&quot;&gt;-dl&lt;/span&gt; test1.dir
&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; vmtouch test1.dir                             
           Files: 100000
     Directories: 1
  Resident Pages: 50115/100000  783M/1G  50.1%
         Elapsed: 1.159 seconds


&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; vmtouch test1.dir
           Files: 100000
     Directories: 1
  Resident Pages: 78288/100000  1G/1G  78.3%
         Elapsed: 1.0382 seconds


&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; vmtouch test1.dir
           Files: 100000
     Directories: 1
  Resident Pages: 95351/100000  1G/1G  95.4%
         Elapsed: 1.0803 seconds


&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; vmtouch test1.dir
           Files: 100000
     Directories: 1
  Resident Pages: 100000/100000  1G/1G  100%
         Elapsed: 1.0232 seconds
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Running the test again, we get similar performance results:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; ./kvtest run test1.dir &lt;span class=&quot;nt&quot;&gt;--count&lt;/span&gt; 100k &lt;span class=&quot;nt&quot;&gt;--blob-api&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;--count&lt;/span&gt; 100000 &lt;span class=&quot;nt&quot;&gt;--max-id&lt;/span&gt; 1000 &lt;span class=&quot;nt&quot;&gt;--asc&lt;/span&gt;
Total elapsed &lt;span class=&quot;nb&quot;&gt;time&lt;/span&gt;: 0.736
Microseconds per BLOB &lt;span class=&quot;nb&quot;&gt;read&lt;/span&gt;: 7.360
Content &lt;span class=&quot;nb&quot;&gt;read &lt;/span&gt;rate: 1351.7 MB/s
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;So not only does the caching behave differently in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FileSystem&lt;/code&gt; approach, but it also does not impact the performance.&lt;/p&gt;
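&lt;p&gt;To get a feel for the two access patterns without kvtest, here is a minimal Python sketch, not the benchmark itself. It writes the same blobs as one file each and as rows in a single SQLite file, then reads them back both ways. The blob count, blob size, table name and file names are all made up for illustration, and it reads through a plain &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SELECT&lt;/code&gt; rather than the blob API. Since the data has just been written, everything is likely still in the page cache, so the timings only hint at the per-file open overhead.&lt;/p&gt;

```python
import os
import sqlite3
import tempfile
import time

BLOB_COUNT = 200      # hypothetical, far smaller than the 100k blobs used above
BLOB_SIZE = 10_000    # hypothetical blob size in bytes


def build_fixtures(root):
    """Write the same blobs as one file each and as rows in one SQLite file."""
    file_dir = os.path.join(root, "blobs.dir")
    db_path = os.path.join(root, "blobs.db")
    os.mkdir(file_dir)
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE kv (k INTEGER PRIMARY KEY, v BLOB)")
    for key in range(1, BLOB_COUNT + 1):
        payload = os.urandom(BLOB_SIZE)
        with open(os.path.join(file_dir, str(key)), "wb") as handle:
            handle.write(payload)
        conn.execute("INSERT INTO kv VALUES (?, ?)", (key, payload))
    conn.commit()
    conn.close()
    return file_dir, db_path


def read_from_files(file_dir):
    """One open/read/close per blob, as in the FileSystem approach."""
    total = 0
    for key in range(1, BLOB_COUNT + 1):
        with open(os.path.join(file_dir, str(key)), "rb") as handle:
            total += len(handle.read())
    return total


def read_from_sqlite(db_path):
    """Open the database file once, then read every blob through it."""
    conn = sqlite3.connect(db_path)
    total = 0
    for key in range(1, BLOB_COUNT + 1):
        row = conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
        total += len(row[0])
    conn.close()
    return total


def timed(func, arg):
    """Return the function result together with the elapsed seconds."""
    start = time.perf_counter()
    result = func(arg)
    return result, time.perf_counter() - start


def run_comparison():
    with tempfile.TemporaryDirectory() as root:
        file_dir, db_path = build_fixtures(root)
        file_bytes, file_secs = timed(read_from_files, file_dir)
        db_bytes, db_secs = timed(read_from_sqlite, db_path)
    return file_bytes, db_bytes, file_secs, db_secs


if __name__ == "__main__":
    file_bytes, db_bytes, file_secs, db_secs = run_comparison()
    print(f"files:  {file_bytes} bytes in {file_secs:.4f}s")
    print(f"sqlite: {db_bytes} bytes in {db_secs:.4f}s")
```

&lt;p&gt;Both paths read exactly the same number of bytes. The only difference is how many files the OS has to open along the way.&lt;/p&gt;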

&lt;h1 id=&quot;final-thoughts&quot;&gt;Final thoughts&lt;/h1&gt;

&lt;p&gt;I’m not an OS expert and have no idea how an OS’s cache replacement policy works, but it seems reasonable to expect that 100,000 syscalls to open and read files would affect the cache behavior, and that making so many syscalls would make the caching insignificant. In the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SQLite&lt;/code&gt; approach, you open the file only once. The article is pretty good, but the way the benchmark was set up may lead to wrong conclusions if you’re not alert (I’m not saying the article reached wrong conclusions). Most of the performance comes from how the data was organized and accessed, which is usually a design decision a software engineer should make when targeting a use case.&lt;/p&gt;

&lt;p&gt;Other than that, the greatest reward for reading the article was discovering &lt;strong&gt;vmtouch&lt;/strong&gt;, such a simple tool that enables awesome performance debugging. I’m also in awe of how following your curiosity leads you to find things you could never have imagined before you started. That’s one of the reasons I did not want to do a shallow reading of this article.&lt;/p&gt;

      </content>
    </entry>
  
    <entry>
      <title>Making a change to SQLite source code</title>
      <link href="https://bcalza.b-cdn.net/2022/10/18/making-a-change-to-sqlite-source-code.html" />
      <id>https://bcalza.b-cdn.net/2022/10/18/making-a-change-to-sqlite-source-code</id>
      <updated>2022-10-18T00:00:00+00:00</updated>
      <content type="html">
        &lt;h1 id=&quot;introduction&quot;&gt;Introduction&lt;/h1&gt;

&lt;p&gt;The other day, I was thinking about how I could get the bytes of a &lt;a href=&quot;https://www.sqlite.org/fileformat.html#record_format&quot;&gt;record&lt;/a&gt; of a recently inserted or updated row in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SQLite&lt;/code&gt;. The motivation for that is that I wanted to create a hash of that row, essentially, to be able to build a &lt;a href=&quot;https://en.wikipedia.org/wiki/Merkle_tree&quot;&gt;Merkle Tree&lt;/a&gt; of the corresponding table as rows get inserted or updated.&lt;/p&gt;

&lt;p&gt;The closest API that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SQLite&lt;/code&gt; offers to what I was looking for is the &lt;a href=&quot;https://www.sqlite.org/c3ref/update_hook.html&quot;&gt;sqlite3_update_hook&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sqlite3_update_hook()&lt;/code&gt; interface registers a callback function with the database connection identified by the first argument to be invoked whenever a row is updated, inserted or deleted in a rowid table.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The issue with that API is that, besides the operation and the database and table names, the callback only receives the &lt;a href=&quot;https://sqlite.org/lang_createtable.html#rowid&quot;&gt;rowid&lt;/a&gt; of the row. That means I would have to fetch all the columns of that row from the table. And even with that approach, I would still not get the raw bytes of the row record. I would just get the driver’s representation of that row.&lt;/p&gt;

&lt;p&gt;There are probably plenty of approaches to how I could build such a tree, but as far as I’m concerned, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SQLite&lt;/code&gt; API does not offer exactly what I was looking for. So, I decided to dig deeper into the source code to see if there could be a world where I would understand it, and not only that, I could actually make some changes to it to provide what I was hoping for.&lt;/p&gt;
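&lt;p&gt;To make that limitation concrete, here is a rough Python sketch of the workaround: given only a rowid, fetch the row and hash whatever the driver hands back. The table &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a&lt;/code&gt;, the helper name and the use of SHA-256 are assumptions for illustration, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cursor.lastrowid&lt;/code&gt; stands in for the rowid the hook would deliver.&lt;/p&gt;

```python
import hashlib
import sqlite3


def hash_row_via_driver(conn, table, rowid):
    """Fetch a row by rowid and hash the driver's view of it (not the raw record)."""
    # The table name is interpolated only because this sketch uses a trusted name.
    row = conn.execute(f"SELECT * FROM {table} WHERE rowid = ?", (rowid,)).fetchone()
    return hashlib.sha256(repr(row).encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (a INT, b TEXT)")
cursor = conn.execute("INSERT INTO a VALUES (1, 'Hello')")

# lastrowid plays the role of the rowid an update hook would report.
digest = hash_row_via_driver(conn, "a", cursor.lastrowid)
print(digest)
```

&lt;p&gt;The hash here depends on how Python renders the tuple, not on the bytes SQLite stores, which is exactly the gap described above.&lt;/p&gt;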

&lt;p&gt;I have always felt intimidated by reading &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;C&lt;/code&gt; language code. So I thought it would be just one of those times of opening a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;C&lt;/code&gt; language file and giving up minutes later. But it turns out this time was different.&lt;/p&gt;

&lt;h1 id=&quot;navigating-the-sqlite-source-code&quot;&gt;Navigating the SQLite source code&lt;/h1&gt;

&lt;p&gt;I cloned the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SQLite&lt;/code&gt; &lt;a href=&quot;https://sqlite.org/src/doc/trunk/README.md&quot;&gt;source code&lt;/a&gt; using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fossil&lt;/code&gt; and started navigating the file structure.
&lt;img src=&quot;/assets/img/sqlite_file_structure.png&quot; alt=&quot;&quot; /&gt;Figure 1 - SQLite src directory
If you’re somewhat familiar with how databases work, you can probably imagine what some files are responsible for. Not bad. I decided to jump straight to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;insert.c&lt;/code&gt; file to see if I could find something interesting there.&lt;/p&gt;

&lt;p&gt;If you skim through all the function implementations, you’ll probably hit &lt;a href=&quot;https://github.com/sqlite/sqlite/blob/version-3.39.4/src/insert.c#L671&quot;&gt;sqlite3Insert&lt;/a&gt;. Above the function signature, we see:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;o&quot;&gt;**&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;This&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;routine&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;called&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;handle&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SQL&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;of&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;the&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;following&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;forms&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;**&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;**&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;insert&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;into&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IDLIST&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;values&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;EXPRLIST&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;EXPRLIST&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),...&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;**&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;insert&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;into&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IDLIST&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;select&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;**&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;insert&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;into&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IDLIST&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;default&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;values&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;**&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Maybe inside this function, there was something that I could tweak. I was able to make some guesses about what’s happening in there, but what caught my eye was the number of calls to functions with names like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sqlite3VdbeXXX&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That reminded me that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SQLite&lt;/code&gt; uses a virtual machine called &lt;a href=&quot;https://www.sqlite.org/opcode.html&quot;&gt;VDBE&lt;/a&gt;. That means all SQL statements are translated to the language of this virtual machine first. Then, the execution engine executes the virtual machine code. Let’s look at an example of how a simple &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;INSERT&lt;/code&gt; is translated into bytecode:&lt;/p&gt;

&lt;div class=&quot;language-sql highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;sqlite&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;create&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;table&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;sqlite&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;explain&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;INSERT&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;INTO&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;VALUES&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;Hello&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;addr&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;opcode&lt;/span&gt;         &lt;span class=&quot;n&quot;&gt;p1&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;p2&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;p3&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;p4&lt;/span&gt;             &lt;span class=&quot;n&quot;&gt;p5&lt;/span&gt;  &lt;span class=&quot;k&quot;&gt;comment&lt;/span&gt;      
&lt;span class=&quot;c1&quot;&gt;----  -------------  ----  ----  ----  -------------  --  -------------&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;     &lt;span class=&quot;n&quot;&gt;Init&lt;/span&gt;           &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;     &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;     &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;                    &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;   &lt;span class=&quot;k&quot;&gt;Start&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;at&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;     &lt;span class=&quot;n&quot;&gt;OpenWrite&lt;/span&gt;      &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;     &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;     &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;     &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;              &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;root&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iDb&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;     &lt;span class=&quot;nb&quot;&gt;Integer&lt;/span&gt;        &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;     &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;     &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;                    &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;     &lt;span class=&quot;n&quot;&gt;String8&lt;/span&gt;        &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;     &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;     &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;     &lt;span class=&quot;n&quot;&gt;Hello&lt;/span&gt;          &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;Hello&apos;&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;     &lt;span class=&quot;n&quot;&gt;NewRowid&lt;/span&gt;       &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;     &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;     &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;                    &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rowid&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;     &lt;span class=&quot;n&quot;&gt;MakeRecord&lt;/span&gt;     &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;     &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;     &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;     &lt;span class=&quot;n&quot;&gt;DB&lt;/span&gt;             &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mkrec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;..&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;     &lt;span class=&quot;k&quot;&gt;Insert&lt;/span&gt;         &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;     &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;     &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;     &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;              &lt;span class=&quot;mi&quot;&gt;57&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;intkey&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt;     &lt;span class=&quot;n&quot;&gt;Halt&lt;/span&gt;           &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;     &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;     &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;                    &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;   
&lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;     &lt;span class=&quot;n&quot;&gt;Transaction&lt;/span&gt;    &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;     &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;     &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;     &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;              &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;usesStmtJournal&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;9&lt;/span&gt;     &lt;span class=&quot;k&quot;&gt;Goto&lt;/span&gt;           &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;     &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;     &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;                    &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;   
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I concluded that all &lt;a href=&quot;https://github.com/sqlite/sqlite/blob/version-3.39.4/src/insert.c#L671&quot;&gt;sqlite3Insert&lt;/a&gt; is really doing is translating a parsed &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;INSERT&lt;/code&gt; statement to a bunch of virtual machine operations, according to all rules of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SQLite&lt;/code&gt; insertion.&lt;/p&gt;
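&lt;p&gt;You don’t need the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sqlite3&lt;/code&gt; shell to see this translation. As a small sketch, the same &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EXPLAIN&lt;/code&gt; can be run from Python’s built-in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sqlite3&lt;/code&gt; module against an in-memory database. The exact instruction list may differ between SQLite versions, so treat the output as illustrative.&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (a INT, b TEXT)")

# EXPLAIN returns one row per VDBE instruction: addr, opcode, p1, p2, p3, p4, p5, comment.
# The statement is only compiled and listed here; no row is inserted.
program = conn.execute("EXPLAIN INSERT INTO a VALUES (1, 'Hello')").fetchall()
opcodes = [instruction[1] for instruction in program]

for addr, opcode, p1, p2, p3, *_ in program:
    print(addr, opcode, p1, p2, p3)
```

&lt;p&gt;The part that matters for this post is that a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MakeRecord&lt;/code&gt; instruction shows up before the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Insert&lt;/code&gt; instruction.&lt;/p&gt;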

&lt;p&gt;This was not really the place I was looking for. What I really needed was the place where the record is created before insertion. That could only be the place that executes the virtual machine code, probably the place that executes the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Insert&lt;/code&gt; (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;OP_Insert&lt;/code&gt;) opcode, I thought.&lt;/p&gt;

&lt;p&gt;Looking at &lt;em&gt;Figure 1&lt;/em&gt;, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;vdbe.c&lt;/code&gt; file felt like a reasonable place to look for that. I went straight there.&lt;/p&gt;

&lt;p&gt;What I found there was an &lt;strong&gt;8000&lt;/strong&gt;-line &lt;a href=&quot;https://github.com/sqlite/sqlite/blob/master/src/vdbe.c#L875&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;switch( pOp-&amp;gt;opcode )&lt;/code&gt;&lt;/a&gt; statement, and with a simple &lt;em&gt;CMD+F&lt;/em&gt; for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;OP_Insert&lt;/code&gt; I found the &lt;a href=&quot;https://github.com/sqlite/sqlite/blob/version-3.39.4/src/vdbe.c#L5393&quot;&gt;place&lt;/a&gt; that handles the execution of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Insert&lt;/code&gt; operation.&lt;/p&gt;

&lt;p&gt;In the first line of the case, I found a hint of what I was looking for:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;Mem&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pData&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;       &lt;span class=&quot;cm&quot;&gt;/* MEM cell holding data for the record to be inserted */&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;So &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pData&lt;/code&gt; points to the record data to be inserted. And you can see at &lt;a href=&quot;https://github.com/sqlite/sqlite/blob/version-3.39.4/src/vdbe.c#L5402&quot;&gt;L5402&lt;/a&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pData = &amp;amp;aMem[pOp-&amp;gt;p2];&lt;/code&gt;, how it sets &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pData&lt;/code&gt; to the address of the cell in the virtual machine memory &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;aMem&lt;/code&gt; at the register index given by the instruction’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;p2&lt;/code&gt; operand.&lt;/p&gt;

&lt;p&gt;Quick recap: in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;insert.c&lt;/code&gt; we’ve learned that the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;INSERT&lt;/code&gt; statement was translated into a bunch of virtual machine code. The data from the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;INSERT&lt;/code&gt; went to the virtual machine through those &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sqlite3VdbeXXX&lt;/code&gt; calls. I assume the call that registered the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;OP_Insert&lt;/code&gt; opcode and the data into the virtual machine is the one at line &lt;a href=&quot;https://github.com/sqlite/sqlite/blob/version-3.39.4/src/insert.c#L2593&quot;&gt;2593&lt;/a&gt;:&lt;/p&gt;
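&lt;p&gt;The register indirection can be pictured with a toy interpreter. This is a heavily simplified Python sketch of my own, not how &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;vdbe.c&lt;/code&gt; is written: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Op&lt;/code&gt; tuple, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run&lt;/code&gt; function and the dict standing in for the table are all hypothetical, and a &amp;quot;record&amp;quot; is just a Python tuple. It replays the instructions from the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EXPLAIN&lt;/code&gt; output above, with each operand used as an index into the register array.&lt;/p&gt;

```python
from collections import namedtuple

# Hypothetical, heavily simplified instruction: an opcode plus operands p1..p4.
Op = namedtuple("Op", "opcode p1 p2 p3 p4", defaults=(0, 0, 0, None))


def run(program, table):
    """Execute a toy program; `mem` plays the role of the register array aMem."""
    mem = [None] * 8
    for op in program:
        if op.opcode == "Integer":        # r[p2] = p1
            mem[op.p2] = op.p1
        elif op.opcode == "String8":      # r[p2] = p4
            mem[op.p2] = op.p4
        elif op.opcode == "NewRowid":     # r[p2] = next free rowid
            mem[op.p2] = max(table, default=0) + 1
        elif op.opcode == "MakeRecord":   # r[p3] = record built from p2 registers starting at p1
            mem[op.p3] = tuple(mem[op.p1:op.p1 + op.p2])
        elif op.opcode == "Insert":
            data = mem[op.p2]             # toy analogue of pData pointing at aMem[pOp->p2]
            key = mem[op.p3]              # register holding the rowid
            table[key] = data
        elif op.opcode == "Halt":
            break
    return table


# Operands copied from the EXPLAIN listing (Init, OpenWrite, Transaction and Goto omitted).
program = [
    Op("Integer", 1, 2),
    Op("String8", 0, 3, 0, "Hello"),
    Op("NewRowid", 0, 1),
    Op("MakeRecord", 2, 2, 4),
    Op("Insert", 0, 4, 1),
    Op("Halt"),
]

table = run(program, {})
print(table)  # prints {1: (1, 'Hello')}
```

&lt;p&gt;By the time &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Insert&lt;/code&gt; runs, the record already sits in a register, put there by &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MakeRecord&lt;/code&gt;.&lt;/p&gt;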

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;sqlite3VdbeAddOp3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;OP_Insert&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iDataCur&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;aRegIdx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;regNewData&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And &lt;a href=&quot;https://github.com/sqlite/sqlite/blob/version-3.39.4/src/insert.c#L1560&quot;&gt;here&lt;/a&gt; is a nice description of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;regNewData&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;o&quot;&gt;**&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;The&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;regNewData&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;parameter&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;the&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;first&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;register&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;range&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;that&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;contains&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;**&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;the&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;be&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;inserted&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;or&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;the&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;after&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;the&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;update&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;There&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;will&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;be&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;**&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pTab&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nCol&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;registers&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;this&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;The&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;first&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;register&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;the&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;one&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;**&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;that&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;regNewData&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;points&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;will&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;contain&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;the&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rowid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;or&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;NULL&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;the&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;**&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;of&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;WITHOUT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ROWID&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;The&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;second&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;register&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;the&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;range&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;will&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;**&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;contain&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;the&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;content&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;of&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;the&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;first&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;table&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;column&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;The&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;third&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;register&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;will&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;**&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;contain&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;the&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;content&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;of&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;the&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;second&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;table&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;column&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;And&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;so&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;forth&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;**&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;**&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;The&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;regOldData&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;parameter&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;similar&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;regNewData&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;except&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;that&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;contains&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;**&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;the&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;prior&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;an&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;UPDATE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rather&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;than&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;afterwards&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;regOldData&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;zero&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;**&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;an&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;INSERT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;This&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;routine&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;can&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;distinguish&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;between&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;UPDATE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;and&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;INSERT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;by&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;**&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;checking&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;regOldData&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;zero&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;So, at this point we are executing that machine code with that data. Scrolling a bit down, let’s see how &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pData&lt;/code&gt; is used. At &lt;a href=&quot;https://github.com/sqlite/sqlite/blob/version-3.39.4/src/vdbe.c#L5448-L5449&quot;&gt;L5448-L5449&lt;/a&gt;, we see:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pData&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nData&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x&lt;/code&gt; &lt;a href=&quot;https://github.com/sqlite/sqlite/blob/version-3.39.4/src/vdbe.c#L5400&quot;&gt;is&lt;/a&gt;:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;BtreePayload&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;   &lt;span class=&quot;cm&quot;&gt;/* Payload to be inserted */&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Awesome. Scrolling a bit more, we &lt;a href=&quot;https://github.com/sqlite/sqlite/blob/version-3.39.4/src/vdbe.c#L5457&quot;&gt;see&lt;/a&gt;:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;rc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sqlite3BtreeInsert&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pC&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;uc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pCursor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pOp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p5&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;OPFLAG_APPEND&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;OPFLAG_SAVEPOSITION&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;OPFLAG_PREFORMAT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)),&lt;/span&gt; 
    &lt;span class=&quot;n&quot;&gt;seekResult&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Cool, we found the place where the raw record bytes are inserted. But how do we know that they are formatted as documented &lt;a href=&quot;https://www.sqlite.org/fileformat.html#record_format&quot;&gt;here&lt;/a&gt;? If you look closely at the virtual machine code from our example &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;INSERT&lt;/code&gt;, there is a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MakeRecord&lt;/code&gt; opcode right before the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Insert&lt;/code&gt; opcode, and it is responsible for building the record.&lt;/p&gt;

&lt;p&gt;You can check the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;OP_MakeRecord&lt;/code&gt; implementation in the &lt;a href=&quot;https://github.com/sqlite/sqlite/blob/version-3.39.4/src/vdbe.c#L3153&quot;&gt;vdbe.c&lt;/a&gt; file, which carries the following comment:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Convert P2 registers beginning with P1 into the [record format] use as a data record in a database table or as a key in an index.&lt;/p&gt;
&lt;/blockquote&gt;

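&lt;p&gt;Back in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Insert&lt;/code&gt; opcode, the &lt;a href=&quot;https://github.com/sqlite/sqlite/blob/version-3.39.4/src/vdbe.c#L5464-L5473&quot;&gt;last lines&lt;/a&gt; of its case statement contain this:&lt;/p&gt;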

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;cm&quot;&gt;/* Invoke the update-hook if required. */&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rc&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;goto&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;abort_due_to_error&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pTab&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;){&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;assert&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xUpdateCallback&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;assert&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pTab&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aCol&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xUpdateCallback&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pUpdateArg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
         &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pOp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p5&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;OPFLAG_ISUPDATE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;?&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SQLITE_UPDATE&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SQLITE_INSERT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
         &lt;span class=&quot;n&quot;&gt;zDb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pTab&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;zName&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nKey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;break&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Yes! The update hook!&lt;/p&gt;

&lt;p&gt;At this point I have everything I need: the update hook and the record bytes. I just need to extend the callback so that it also receives the record, and &lt;em&gt;voilà.&lt;/em&gt;&lt;/p&gt;

&lt;h1 id=&quot;making-changes-to-sqlite&quot;&gt;Making changes to SQLite&lt;/h1&gt;

&lt;p&gt;That’s exactly what I did:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xUpdateCallback&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pUpdateArg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
	&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pOp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p5&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;OPFLAG_ISUPDATE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;?&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SQLITE_UPDATE&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SQLITE_INSERT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
	&lt;span class=&quot;n&quot;&gt;zDb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pTab&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;zName&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nKey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pData&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The call now also passes the payload (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pData-&amp;gt;z&lt;/code&gt;) and its size (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pData-&amp;gt;n&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;Of course, a bunch of other changes in multiple places were necessary to account for the new function signature.&lt;/p&gt;

&lt;p&gt;Here’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fossil status&lt;/code&gt; after the changes:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;EDITED     src/main.c
EDITED     src/sqlite.h.in
EDITED     src/sqlite3ext.h
EDITED     src/sqliteInt.h
EDITED     src/tclsqlite.c
EDITED     src/vdbe.c
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Here’s the &lt;a href=&quot;https://gist.github.com/brunocalza/7fccffae20878694cc4cf8237af61de0&quot;&gt;diff&lt;/a&gt; of the changes, in case you’re following along. I compiled my changes following the &lt;a href=&quot;https://sqlite.org/src/doc/trunk/README.md&quot;&gt;instructions&lt;/a&gt;.&lt;/p&gt;

&lt;h1 id=&quot;forking-gos-sqlite-driver&quot;&gt;Forking Go’s SQLite driver&lt;/h1&gt;

&lt;p&gt;All right! Now it’s time to create a simple test of my change in a Go program. I’m most familiar with the &lt;a href=&quot;https://github.com/mattn/go-sqlite3&quot;&gt;mattn/go-sqlite3&lt;/a&gt; driver for interacting with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SQLite&lt;/code&gt;. This project offers &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SQLite&lt;/code&gt; API access from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Go&lt;/code&gt; by embedding the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SQLite&lt;/code&gt; &lt;a href=&quot;https://github.com/mattn/go-sqlite3/blob/master/sqlite3-binding.c&quot;&gt;amalgamation file&lt;/a&gt; and calling it through &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;C&lt;/code&gt; bindings.&lt;/p&gt;

&lt;p&gt;So I &lt;a href=&quot;https://github.com/brunocalza/go-sqlite3&quot;&gt;forked that repo&lt;/a&gt; and imported &lt;a href=&quot;https://github.com/mattn/go-sqlite3/compare/master...brunocalza:go-sqlite3:bcalza/updatehook#diff-6fecad8f0a67dd53104130218011b8fcd7c1180ebd083b2e0a2840a487cb49d8&quot;&gt;my own compiled amalgamation file&lt;/a&gt;. Then I made the &lt;a href=&quot;https://github.com/mattn/go-sqlite3/compare/master...brunocalza:go-sqlite3:bcalza/updatehook&quot;&gt;necessary updates&lt;/a&gt; in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Go&lt;/code&gt; API to get access to that new value. My linter messed up the diff, but only a few changes were needed:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;A change to &lt;a href=&quot;https://github.com/mattn/go-sqlite3/compare/master...brunocalza:go-sqlite3:bcalza/updatehook#diff-6c33163da8a75fb344db8bf4261df0fe2f6fc8a8e867221fe0bb81cb8de77643R74&quot;&gt;updateHookTrampoline&lt;/a&gt;, which now receives the record as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;*C.char&lt;/code&gt; and its size as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;int&lt;/code&gt;, copies it into a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[]byte&lt;/code&gt;, and passes it to the callback:&lt;/li&gt;
&lt;/ul&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;updateHookTrampoline&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;handle&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;unsafe&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pointer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;db&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;table&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rowid&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;size&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
	&lt;span class=&quot;n&quot;&gt;callback&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lookupHandle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;handle&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;byte&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
	&lt;span class=&quot;n&quot;&gt;callback&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GoString&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GoString&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rowid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GoBytes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unsafe&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pointer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;ul&gt;
  &lt;li&gt;And a change to &lt;a href=&quot;https://github.com/mattn/go-sqlite3/compare/master...brunocalza:go-sqlite3:bcalza/updatehook#diff-2b10fcf999dec2c93bce3364fdeefd688eda6ad78934ac1a2d8b359b57ee7687R575&quot;&gt;RegisterUpdateHook&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
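&lt;p&gt;These two changes have to stay in sync: the trampoline type-asserts whatever was registered to the five-argument function type, so a callback with the old four-argument signature no longer matches. The sketch below illustrates that with a hypothetical in-memory handle table. The names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;handles&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;register&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trampoline&lt;/code&gt; are stand-ins and not the driver’s actual code, and the assertion is checked here instead of unchecked as in the snippet above.&lt;/p&gt;

```go
package main

import "fmt"

// handles is a hypothetical stand-in for the driver's handle table: callbacks
// are stored as untyped values and looked up again when the hook fires.
var handles = map[int]any{}

// register stores a callback and returns its handle (hypothetical helper).
func register(cb any) int {
	h := len(handles) + 1
	handles[h] = cb
	return h
}

// trampoline mimics the Go side of the hook: it looks the callback up and
// asserts it to the extended signature that also carries the record bytes.
func trampoline(handle, op int, db, table string, rowid int64, data []byte) error {
	cb, ok := handles[handle].(func(int, string, string, int64, []byte))
	if !ok {
		return fmt.Errorf("handle %d does not hold a five-argument update callback", handle)
	}
	cb(op, db, table, rowid, data)
	return nil
}

func main() {
	const sqliteInsert = 18 // value of SQLITE_INSERT in sqlite3.h

	oldStyle := register(func(op int, db, table string, rowid int64) {})
	newStyle := register(func(op int, db, table string, rowid int64, data []byte) {
		fmt.Printf("op=%d table=%s rowid=%d record=%x\n", op, table, rowid, data)
	})

	// The old-style callback fails the assertion, the new-style one is invoked.
	fmt.Println(trampoline(oldStyle, sqliteInsert, "main", "sandwiches", 1, nil))
	fmt.Println(trampoline(newStyle, sqliteInsert, "main", "sandwiches", 1, []byte{0x05, 0x00}))
}
```

&lt;p&gt;With an unchecked assertion, the same mismatch would panic instead of returning an error, which is presumably why &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RegisterUpdateHook&lt;/code&gt; had to accept the new function type as well.&lt;/p&gt;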

&lt;h1 id=&quot;putting-everything-together&quot;&gt;Putting everything together&lt;/h1&gt;

&lt;p&gt;So, now we have everything in place to test this out. Let’s run a simple example inspired by the &lt;a href=&quot;https://fly.io/blog/sqlite-internals-btree/&quot;&gt;SQLite Internals: Pages &amp;amp; B-trees&lt;/a&gt; blog post.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;package&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;main&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
	&lt;span class=&quot;s&quot;&gt;&quot;database/sql&quot;&lt;/span&gt;
	&lt;span class=&quot;s&quot;&gt;&quot;fmt&quot;&lt;/span&gt;
	&lt;span class=&quot;s&quot;&gt;&quot;log&quot;&lt;/span&gt;
	&lt;span class=&quot;s&quot;&gt;&quot;os&quot;&lt;/span&gt;

	&lt;span class=&quot;s&quot;&gt;&quot;github.com/mattn/go-sqlite3&quot;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
	&lt;span class=&quot;n&quot;&gt;sqlite3conn&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sqlite3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SQLiteConn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
	&lt;span class=&quot;n&quot;&gt;sql&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Register&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;sqlite3_with_hook_example&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
		&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sqlite3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SQLiteDriver&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
			&lt;span class=&quot;n&quot;&gt;ConnectHook&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sqlite3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SQLiteConn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;error&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
				&lt;span class=&quot;n&quot;&gt;sqlite3conn&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sqlite3conn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
				&lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RegisterUpdateHook&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;db&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;table&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rowid&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;byte&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
					&lt;span class=&quot;k&quot;&gt;switch&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
					&lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sqlite3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SQLITE_INSERT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;
						&lt;span class=&quot;n&quot;&gt;fmt&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Printf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;%x&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
					&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
				&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
				&lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt;
			&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
		&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
	&lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Remove&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;./foo.db&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

	&lt;span class=&quot;n&quot;&gt;srcDb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sql&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Open&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;sqlite3_with_hook_example&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;./foo.db&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
	&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
		&lt;span class=&quot;n&quot;&gt;log&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Fatal&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
	&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
	&lt;span class=&quot;k&quot;&gt;defer&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;srcDb&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Close&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
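	&lt;span class=&quot;c&quot;&gt;// Ping opens the first connection, which runs ConnectHook and registers the update hook.&lt;/span&gt;
	&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;srcDb&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Ping&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
		&lt;span class=&quot;n&quot;&gt;log&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Fatal&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
	&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;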

	&lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;srcDb&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Exec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;`CREATE TABLE sandwiches (
		id INTEGER PRIMARY KEY,
		name TEXT,
		length REAL,
		count INTEGER
	);`&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
	&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
		&lt;span class=&quot;n&quot;&gt;log&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Fatal&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
	&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
	&lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;srcDb&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Exec&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;INSERT INTO sandwiches (name, length, count) VALUES (&apos;Italian&apos;, 7.5, 2);&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
	&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
		&lt;span class=&quot;n&quot;&gt;log&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Fatal&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
	&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Don’t forget to add &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;replace github.com/mattn/go-sqlite3 =&amp;gt; github.com/brunocalza/go-sqlite3 v0.0.0-20220926005737-36475033d841&lt;/code&gt; to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;go.mod&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If you run that, you’ll get:&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;05001b07014974616c69616e401e00000000000002&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;That is exactly the raw record of the row &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(&apos;Italian&apos;, 7.5, 2)&lt;/code&gt; as described in &lt;a href=&quot;https://fly.io/blog/sqlite-internals-btree/#efficient-sandwich-encoding&quot;&gt;Efficient Sandwich Encoding&lt;/a&gt;, minus the first two bytes of the cell: the length of the record and the primary key.&lt;/p&gt;

&lt;p&gt;And here the journey ends. It was really fun to find that I could understand some parts of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SQLite&lt;/code&gt; source code (even though I did not understand most of it), make some changes, and see those changes reflected through &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Go&lt;/code&gt;’s driver.&lt;/p&gt;

&lt;p&gt;To be honest, this approach of changing the source code of a database is too risky. Keeping up with new versions is too much of a burden, especially since I’d also have to keep an up-to-date fork of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Go&lt;/code&gt; driver.&lt;/p&gt;

&lt;p&gt;I work on a web3 protocol called &lt;a href=&quot;https://tableland.xyz/&quot;&gt;Tableland&lt;/a&gt;. Problems like this one come up frequently in our engineering team. If you enjoy this kind of stuff, get in touch through our &lt;a href=&quot;https://github.com/tablelandnetwork/&quot;&gt;GitHub&lt;/a&gt; or &lt;a href=&quot;https://discord.gg/dc8EBEhGbg&quot;&gt;Discord&lt;/a&gt;, or reach me at &lt;a href=&quot;https://twitter.com/brunocalza&quot;&gt;@brunocalza&lt;/a&gt;.&lt;/p&gt;

      </content>
    </entry>
  
    <entry>
      <title>There Are Many Ways To Safely Count</title>
      <link href="https://bcalza.b-cdn.net/2021/07/08/there-are-many-ways-to-safely-count.html" />
      <id>https://bcalza.b-cdn.net/2021/07/08/there-are-many-ways-to-safely-count</id>
      <updated>2021-07-08T00:00:00+00:00</updated>
      <content type="html">
        &lt;p&gt;The other day I was looking at a simple, classic implementation of a shared counter in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;C++&lt;/code&gt; using a mutex, and I wondered what other thread-safe implementations existed. I usually use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Go&lt;/code&gt; to explore my curiosity. The result of this exploration is a compilation of ways to implement a &lt;em&gt;goroutine-safe&lt;/em&gt; counter.&lt;/p&gt;

&lt;h2 id=&quot;dont-do-this&quot;&gt;Don’t Do This&lt;/h2&gt;

&lt;p&gt;Let’s start with the non-safe implementation.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NotSafeCounter&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;number&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;uint64&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NewNotSafeCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Counter&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;NotSafeCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;NotSafeCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;num&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;uint64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;number&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;number&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;num&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;NotSafeCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;uint64&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;number&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Nothing magical. Let’s test its correctness by running 100 &lt;em&gt;goroutines&lt;/em&gt;, 66 of which &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Add&lt;/code&gt; 1 to the shared counter while the rest only &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Read&lt;/code&gt; it. The final value should therefore be 66.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;testCorrectness&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;testing&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;wg&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sync&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;WaitGroup&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;wg&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;3&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;go&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;wg&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Done&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
            &lt;span class=&quot;p&quot;&gt;}(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;3&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;go&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;wg&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Done&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
            &lt;span class=&quot;p&quot;&gt;}(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;go&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;wg&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Done&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
            &lt;span class=&quot;p&quot;&gt;}(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;wg&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Wait&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;66&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Errorf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;counter should be %d and was %d&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;66&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

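&lt;p&gt;The result of the test is not deterministic. Sometimes it passes, but sometimes you get a message like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;counter_test.go:34: counter should be 66 and was 65&lt;/code&gt;. The statement &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;c.number = c.number + num&lt;/code&gt; is a separate read and write, so two &lt;em&gt;goroutines&lt;/em&gt; can read the same value and one of the updates gets lost.&lt;/p&gt;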

&lt;h2 id=&quot;the-classic&quot;&gt;The Classic&lt;/h2&gt;

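&lt;p&gt;The traditional way to implement a correct counter is to use a mutex, which guarantees that only one &lt;em&gt;goroutine&lt;/em&gt; modifies the counter at a time. In &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Go&lt;/code&gt;, we simply use the &lt;a href=&quot;https://golang.org/pkg/sync/&quot;&gt;sync&lt;/a&gt; package. The implementation here uses a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sync.RWMutex&lt;/code&gt;, so concurrent reads do not block each other.&lt;/p&gt;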

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MutexCounter&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;mu&lt;/span&gt;     &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sync&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RWMutex&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;number&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;uint64&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NewMutexCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Counter&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MutexCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sync&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RWMutex&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{},&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MutexCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;num&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;uint64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mu&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Lock&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;defer&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mu&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Unlock&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;number&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;number&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;num&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MutexCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;uint64&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mu&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RLock&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;defer&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mu&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RUnlock&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;number&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now the tests run deterministically and always pass.&lt;/p&gt;

&lt;h2 id=&quot;using-channels&quot;&gt;Using Channels&lt;/h2&gt;

&lt;p&gt;Locks are low-level primitives that let you achieve synchronization. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Go&lt;/code&gt; offers a higher-level primitive called channels. There are a lot of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mutexes-versus-channels&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;which-one-is-better&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;which-one-should-I-use&lt;/code&gt; kinds of discussions. Some of them are valid and very interesting, but that is not the point of this blog post.&lt;/p&gt;

&lt;p&gt;To implement a &lt;em&gt;goroutine-safe&lt;/em&gt; counter using channels, we queue every operation (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Add&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Read&lt;/code&gt;) called on the counter in a channel. Each operation is represented as a function &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;func()&lt;/code&gt;. When created, the counter spawns a &lt;em&gt;goroutine&lt;/em&gt; that executes the queued operations one at a time, in the order they were queued.&lt;/p&gt;

&lt;p&gt;Here is the counter definition:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ChannelCounter&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt;     &lt;span class=&quot;k&quot;&gt;chan&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;number&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;uint64&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NewChannelCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Counter&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ChannelCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;make&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;chan&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;go&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ChannelCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;range&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;See how the counter’s &lt;em&gt;goroutine&lt;/em&gt; only reads the operations from the channel and executes them.&lt;/p&gt;

&lt;p&gt;When a &lt;em&gt;goroutine&lt;/em&gt; calls &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Add&lt;/code&gt;, we queue a write operation:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ChannelCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;num&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;uint64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;number&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;number&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;num&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;When a &lt;em&gt;goroutine&lt;/em&gt; calls &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Read&lt;/code&gt;, we queue a read operation:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ChannelCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;uint64&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;make&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;chan&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;uint64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ch&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;number&lt;/span&gt;
        &lt;span class=&quot;nb&quot;&gt;close&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ret&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;What I really like about this implementation is how easy it is to visualize the operations being executed in serial order.&lt;/p&gt;

&lt;h2 id=&quot;the-atomic-way&quot;&gt;The Atomic Way&lt;/h2&gt;

&lt;p&gt;We can use even lower-level primitives and execute atomic instructions provided by the &lt;a href=&quot;https://golang.org/pkg/sync/atomic/&quot;&gt;sync/atomic&lt;/a&gt; package.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AtomicCounter&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;number&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;uint64&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NewAtomicCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Counter&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AtomicCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AtomicCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;num&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;uint64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;atomic&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AddUint64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;number&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AtomicCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;uint64&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;atomic&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LoadUint64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;number&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;compare-and-swap&quot;&gt;Compare And Swap&lt;/h2&gt;

&lt;p&gt;Alternatively, we can use the classic atomic primitive &lt;a href=&quot;https://en.wikipedia.org/wiki/Compare-and-swap&quot;&gt;Compare And Swap&lt;/a&gt; to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Add&lt;/code&gt; a number to the counter.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CASCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;num&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;uint64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;atomic&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LoadUint64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;number&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;atomic&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CompareAndSwapUint64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;number&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CASCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;uint64&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;atomic&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LoadUint64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;number&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The loop retries until the compare-and-swap succeeds, that is, until no other goroutine has changed the counter between the load and the swap.&lt;/p&gt;

&lt;h2 id=&quot;what-about-float-types&quot;&gt;What About Float Types?&lt;/h2&gt;

&lt;p&gt;In my exploration, I came across an awesome talk, &lt;a href=&quot;https://www.youtube.com/watch?v=1V7eJ0jN8-E&quot;&gt;Prometheus: Designing and Implementing a Modern Monitoring Solution in Go&lt;/a&gt;, that discusses these techniques and benchmarks them. At the end, it covers how to implement a counter of floats. All the techniques presented so far work for floats, except the ones that use &lt;a href=&quot;https://golang.org/pkg/sync/atomic/&quot;&gt;sync/atomic&lt;/a&gt;, which does not provide atomic operations on floats. In the video, Björn Rabenstein shows how to solve this by storing the float as a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;uint64&lt;/code&gt; and using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;math.Float64bits&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;math.Float64frombits&lt;/code&gt; to convert between &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;float64&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;uint64&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CASFloatCounter&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;number&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;uint64&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NewCASFloatCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CASFloatCounter&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CASFloatCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CASFloatCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;num&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;float64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;atomic&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LoadUint64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;number&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;newValue&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;math&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Float64bits&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;math&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Float64frombits&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;atomic&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CompareAndSwapUint64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;number&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;newValue&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CASFloatCounter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;float64&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;math&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Float64frombits&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;atomic&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LoadUint64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;number&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;final-words&quot;&gt;Final Words&lt;/h2&gt;

&lt;p&gt;This is a simple collection of implementations of a shared counter. It is the result of my curiosity and of trying to achieve a fundamental understanding of concurrency. If you know other ways to do this, I’d love to hear about them.&lt;/p&gt;

&lt;p&gt;You can check the implementations and run the tests and benchmarks at &lt;a href=&quot;https://github.com/brunocalza/sharedcounter&quot;&gt;brunocalza/sharedcounter&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I’m always trying to share what I am learning about database internals.&lt;/p&gt;

&lt;p&gt;I’m on &lt;a href=&quot;https://twitter.com/brunocalza&quot;&gt;Twitter&lt;/a&gt;.&lt;/p&gt;

      </content>
    </entry>
  
    <entry>
      <title>Getting To Know Logical Clocks By Implementing Them</title>
      <link href="https://bcalza.b-cdn.net/2021/07/02/getting-to-know-logical-clocks-by-implementing-them.html" />
      <id>https://bcalza.b-cdn.net/2021/07/02/getting-to-know-logical-clocks-by-implementing-them</id>
      <updated>2021-07-02T00:00:00+00:00</updated>
      <content type="html">
        &lt;h1 id=&quot;the-clock-synchronization-and-ordering-problems&quot;&gt;The Clock Synchronization and Ordering Problems&lt;/h1&gt;

&lt;p&gt;A single-node system has no problem deciding what time it is and in which order the events inside the system happened. The node has a timer, called a clock, and any process that needs the time makes a call to the operating system. If process (a) reads the time before a second process (b), the time read by (a) will be smaller than the time read by (b). More formally,&lt;/p&gt;

\[a \to b \implies T(a) &amp;lt; T(b)\]

&lt;p&gt;This is a very natural condition: if one thing happens before another, we expect the time at which the first occurred to be smaller. The operator ( $\to$ ) is called &lt;strong&gt;happened-before&lt;/strong&gt;, and it is defined as:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;If (a) and (b) are events in the same process, and (a) occurs before (b), then (a $\to$ b)&lt;/li&gt;
  &lt;li&gt;If event (a) is the sending of a message from a process, and event (b) is the receiving of the same message by another process, then (a $\to$ b )&lt;/li&gt;
  &lt;li&gt;If (a $\to$ b) and (b $\to$ c), then (a $\to$ c)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In a distributed system, however, the above condition is not so easy to maintain. We have multiple nodes, each with its own clock. A clock is an electronic circuit in the hardware, built around a crystal that oscillates at a constant frequency. We call this kind of clock, which measures real time, a &lt;strong&gt;physical clock&lt;/strong&gt;. The problem in a distributed system is that there is no guarantee that all the clocks oscillate at the same rate, a phenomenon known as &lt;a href=&quot;https://en.wikipedia.org/wiki/Clock_drift&quot;&gt;clock drift&lt;/a&gt;. This is the &lt;a href=&quot;https://en.wikipedia.org/wiki/Clock_synchronization&quot;&gt;clock synchronization&lt;/a&gt; problem: &lt;strong&gt;nodes can’t agree on what time it is&lt;/strong&gt;. Of course, measuring real time is important to many applications, so solutions for coordinating clocks, such as &lt;a href=&quot;https://en.wikipedia.org/wiki/Network_Time_Protocol&quot;&gt;NTP&lt;/a&gt;, have long been in use. However, even when something like NTP keeps the nodes’ physical clocks within a bounded range of each other, it is still possible for an event that happened before another to have a larger timestamp. For many applications it is important to agree on the order of events, and these applications cannot rely on physical clocks.&lt;/p&gt;

&lt;h1 id=&quot;lamport-clocks&quot;&gt;Lamport Clocks&lt;/h1&gt;

&lt;p&gt;In 1978, Leslie Lamport tackled this problem in the paper &lt;a href=&quot;http://lamport.azurewebsites.net/pubs/time-clocks.pdf&quot;&gt;Time, Clocks, and the Ordering of Events in a Distributed System&lt;/a&gt;. He presented a logical clock that satisfies the above condition and lets nodes agree on the order in which events occur.&lt;/p&gt;

&lt;p&gt;A logical clock is just a counter that holds a number, called a &lt;em&gt;timestamp&lt;/em&gt;. This number has no relationship to physical time. Its only purpose is to capture the ordering of events. To have this property, the clock needs to satisfy  $a \to b \implies T(a) &amp;lt; T(b) $ for any events $a$ and $b$. Lamport calls it the &lt;strong&gt;Clock Condition&lt;/strong&gt;. In other words, each node of a distributed system has a (logical) clock that is just a counter, and the counters agree that if one event happened before another, the timestamp of the first is smaller than the timestamp of the second.&lt;/p&gt;

&lt;p&gt;It is not hard to come up with a counter that captures the condition. Let’s break things into two parts: events happening inside a process and events happening between processes (message exchanges). For events that happen locally, a counter that increments its value on every event satisfies the Clock Condition. For the second part, we need $C_i(a) &amp;lt; C_j(b)$, where $a$ is the sending of a message at process $i$ and $b$ is the receipt of the same message at process $j$; $C_x$ is the value of the clock at process $x$. Lamport proposes sending the timestamp together with the message and having the receiver, upon receiving it, advance its counter past the greater of its own timestamp and the received one.&lt;/p&gt;

&lt;h2 id=&quot;the-algorithm&quot;&gt;The Algorithm&lt;/h2&gt;

&lt;p&gt;Each node in a distributed system has a clock, which is a counter that stores the time &lt;strong&gt;t&lt;/strong&gt; that the last event occurred. The counter starts at &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;t = 0&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;When an event occurs at the node, &lt;strong&gt;t&lt;/strong&gt; is incremented, that is, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;t = t + 1&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;When a node wants to send a message &lt;strong&gt;m&lt;/strong&gt; to another node, &lt;strong&gt;t&lt;/strong&gt; is incremented and sent together with the message: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;t = t + 1&lt;/code&gt;, then &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;send(m, t)&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;When a node receives a message, it updates its own clock with the maximum of its current time &lt;strong&gt;t&lt;/strong&gt; and the time &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;t_received&lt;/code&gt; that came with the message and adds one. That is, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;t = max(t, t_received) + 1&lt;/code&gt;.&lt;/p&gt;
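
&lt;p&gt;The four rules above can be sketched in Go as follows. This is only a minimal illustration under stated assumptions, not the code from the repository linked in this post: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LamportClock&lt;/code&gt; type and its method names are hypothetical, and the clock is assumed to be used from a single goroutine.&lt;/p&gt;

```go
package main

// LamportClock is a hypothetical name for one node's logical clock.
// It is not safe for concurrent use; a real node would guard it with a mutex.
type LamportClock struct {
	t uint64 // timestamp of the last event seen by this node, starts at 0
}

// Tick records a local event: t = t + 1.
func (c *LamportClock) Tick() uint64 {
	c.t++
	return c.t
}

// Send records the send event and returns the timestamp to attach to the message.
func (c *LamportClock) Send() uint64 {
	return c.Tick()
}

// Receive merges the timestamp carried by a message: t = max(t, received) + 1.
func (c *LamportClock) Receive(received uint64) uint64 {
	if received > c.t {
		c.t = received
	}
	c.t++
	return c.t
}
```

&lt;p&gt;A sender would call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Send&lt;/code&gt; and ship the returned number with the message; the receiver passes that number to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Receive&lt;/code&gt;. The zero value of the struct is a clock at &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;t = 0&lt;/code&gt;, so no constructor is needed.&lt;/p&gt;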

&lt;h2 id=&quot;visualizing&quot;&gt;Visualizing&lt;/h2&gt;

&lt;p&gt;I built a simple &lt;a href=&quot;https://github.com/brunocalza/logical-clocks&quot;&gt;tool&lt;/a&gt; in &lt;em&gt;Go&lt;/em&gt; that helps with defining the events in concurrent &lt;em&gt;goroutines&lt;/em&gt; and visualizing the message flow and clock values. It is heavily based on an &lt;a href=&quot;https://github.com/mwhittaker/clocks&quot;&gt;implementation by Michael Whittaker&lt;/a&gt;. You can plug in your own example (like &lt;a href=&quot;https://github.com/brunocalza/logical-clocks/blob/main/examples/example3.go&quot;&gt;example3&lt;/a&gt;) and execute it by running &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;go run cmd/* Example3 | ./plot.py&lt;/code&gt;.
&lt;img src=&quot;/assets/img/clocks_1.png&quot; alt=&quot;&quot; /&gt;Execution of Example 3.
Every event has a mark on the timeline. Each line in the figure connects the sending of a message to its receipt, and the timestamp increases from the sender to the receiver.&lt;/p&gt;

&lt;p&gt;It is very clear from the figure that the Clock Condition holds. Any pair of events that has a &lt;strong&gt;happened-before&lt;/strong&gt; relation can be compared. For example,  $ (1, A) \to (3, A)$; or $ (1, C) \to (8, B)$.&lt;/p&gt;

&lt;p&gt;Does $ (1, A) \to (6, C)$? Yes. According to the definition, $ (1, A) $ &lt;strong&gt;happened-before&lt;/strong&gt; $ (6, C) $ by transitivity. There is a sequence of events from $ (1, A)$ to $ (6, C)$: $ (1, A), (3, B)$, $(4, B)$, $(5, B)$ and $(6, C)$. Be aware that for some pairs of events we can’t establish a &lt;strong&gt;happened-before&lt;/strong&gt; relation in either direction, for example, the pair $(3, A)$ and $(4, B)$. We say that these events are &lt;strong&gt;concurrent&lt;/strong&gt;.&lt;/p&gt;

&lt;h2 id=&quot;total-ordering&quot;&gt;Total ordering&lt;/h2&gt;

&lt;p&gt;The usefulness of Lamport Clocks comes from the fact that they can be used to define a total order among all events in the system. In Lamport’s words,&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;We can use a system of clocks satisfying the Clock Condition to place a total ordering on the set of all system events. We simply order the events by the times at which they occur. To break ties, we use any arbitrary total ordering of the processes.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;blockquote&gt;
  &lt;p&gt;Being able to totally order the events can be very useful in implementing a distributed system. In fact, the reason for implementing a correct system of logical clocks is to obtain such a total ordering.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Combining Lamport Clocks with this total ordering is what lets us coordinate or synchronize the events of multiple nodes in a way that every node agrees on. It is a simple way to reach agreement on ordering, although it is not by itself a fault-tolerant consensus algorithm.&lt;/p&gt;
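
&lt;p&gt;The tie-breaking rule from the quote can be sketched as a comparison function. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Stamp&lt;/code&gt; type and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Before&lt;/code&gt; function are hypothetical names, and ordering processes by comparing their IDs as strings is just one arbitrary choice; any fixed total ordering of the processes would do.&lt;/p&gt;

```go
package main

// Stamp is a hypothetical pairing of a Lamport timestamp with the ID of the
// process where the event occurred.
type Stamp struct {
	Time    uint64
	Process string
}

// Before reports whether a comes before b in the total order: compare the
// timestamps first and fall back to the process ID only to break ties.
func Before(a, b Stamp) bool {
	if a.Time != b.Time {
		return b.Time > a.Time
	}
	return b.Process > a.Process
}
```

&lt;p&gt;With this rule, two events that share a timestamp on different processes are still placed in a fixed order, so every node that sorts the same set of events ends up with the same sequence.&lt;/p&gt;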

&lt;h2 id=&quot;limitations&quot;&gt;Limitations&lt;/h2&gt;

&lt;p&gt;One limitation of Lamport Clocks is that the converse of the Clock Condition does not hold. That is, the following implication is not guaranteed:&lt;/p&gt;

\[T(a) &amp;lt; T(b) \implies a \to b\]

&lt;p&gt;So the timestamps alone can’t tell us whether one event happened before the other. Nor can they tell us whether two events are concurrent.&lt;/p&gt;

&lt;p&gt;These limitations are overcome by Vector Clocks, the second kind of logical clock.&lt;/p&gt;

&lt;h1 id=&quot;vector-clocks&quot;&gt;Vector Clocks&lt;/h1&gt;

&lt;p&gt;Vector Clocks improve on Lamport Clocks: they not only capture the &lt;strong&gt;happened-before&lt;/strong&gt; ($ \to $) relationship, which physical clocks do not, but can also tell whether two events are concurrent, which Lamport Clocks cannot. Instead of holding only the timestamp of its own process, each clock holds a vector with one timestamp per process. Apparently, the idea was first mentioned in 1986 in a paper called &lt;a href=&quot;https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.569.3601&amp;amp;rep=rep1&amp;amp;type=pdf&quot;&gt;Highly-Available Distributed Services and Fault-Tolerant Distributed Garbage Collection&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;the-algorithm-1&quot;&gt;The Algorithm&lt;/h2&gt;

&lt;p&gt;Each node stores a vector of timestamps $t = \langle t_1, t_2, \ldots, t_n \rangle $ of size &lt;strong&gt;n&lt;/strong&gt;, where &lt;strong&gt;n&lt;/strong&gt; is the number of nodes in the distributed system. It is initialized to $ \langle 0, 0, \ldots, 0 \rangle $.&lt;/p&gt;

&lt;p&gt;When an event occurs at a node &lt;strong&gt;i&lt;/strong&gt;, it increments the &lt;em&gt;ith&lt;/em&gt; entry of the &lt;strong&gt;t&lt;/strong&gt; vector &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;t[i] = t[i] + 1&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;When a node &lt;strong&gt;i&lt;/strong&gt; sends a message to another node, it increments the &lt;em&gt;ith&lt;/em&gt; entry of the &lt;strong&gt;t&lt;/strong&gt; vector &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;t[i] = t[i] + 1&lt;/code&gt; and attaches the vector with the message.&lt;/p&gt;

&lt;p&gt;When a node &lt;strong&gt;i&lt;/strong&gt; receives a message, it updates its vector &lt;strong&gt;t&lt;/strong&gt; by taking the element-wise maximum of its own vector and the received vector &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;t_received&lt;/code&gt;. Then it increments the &lt;em&gt;ith&lt;/em&gt; entry of the &lt;strong&gt;t&lt;/strong&gt; vector: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;t[i] = t[i] + 1&lt;/code&gt;.&lt;/p&gt;
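
&lt;p&gt;As with Lamport Clocks, these rules translate almost directly into code. The sketch below is an illustration only: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;VectorClock&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NewVectorClock&lt;/code&gt; and the method names are hypothetical, the number of nodes is assumed to be fixed and known up front, every vector is assumed to have the same length, and the clock is assumed to be used from a single goroutine.&lt;/p&gt;

```go
package main

// VectorClock is a hypothetical name for the clock owned by node i.
// It is not safe for concurrent use.
type VectorClock struct {
	i int      // index of the node that owns this clock
	t []uint64 // one entry per node, all zero initially
}

// NewVectorClock creates the clock of node i in a system of n nodes.
func NewVectorClock(i, n int) *VectorClock {
	c := new(VectorClock)
	c.i = i
	c.t = make([]uint64, n)
	return c
}

// Tick records a local event: t[i] = t[i] + 1.
func (c *VectorClock) Tick() {
	c.t[c.i]++
}

// Send records the send event and returns a copy of the vector to attach to
// the message, so later local events do not change what was sent.
func (c *VectorClock) Send() []uint64 {
	c.Tick()
	return append([]uint64(nil), c.t...)
}

// Receive takes the element-wise maximum with the received vector and then
// counts the receive event itself. It assumes received has the same length
// as the local vector.
func (c *VectorClock) Receive(received []uint64) {
	for k, v := range received {
		if v > c.t[k] {
			c.t[k] = v
		}
	}
	c.Tick()
}
```

&lt;p&gt;Returning a copy from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Send&lt;/code&gt; is a deliberate choice: the message carries the state of the clock at the moment of sending, not a slice that keeps changing as the sender processes more events.&lt;/p&gt;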

&lt;h3 id=&quot;properties&quot;&gt;Properties&lt;/h3&gt;

&lt;p&gt;Let’s use the same tool to visualize the same sequence of events of the example before (this example can be found at &lt;a href=&quot;https://github.com/brunocalza/logical-clocks/blob/main/examples/example8.go&quot;&gt;example8.go&lt;/a&gt;).
&lt;img src=&quot;/assets/img/clocks_2.png&quot; alt=&quot;&quot; /&gt;
The first thing to observe is that we don’t need to add the label of the process to uniquely identify a timestamp. A vector timestamp uniquely identifies an event. Second, we can see that the vector clock of an event is a counter of the number of events that have a &lt;strong&gt;happened-before&lt;/strong&gt; relationship to the event. For example, the event $(3,1,0)$ informs that 2 events at A and 1 event at B &lt;strong&gt;happened-before&lt;/strong&gt; it. And that is true, since the events $(1,0,0)$, $(0,1,0)$ and $(2,1,0)$ &lt;strong&gt;happened-before&lt;/strong&gt; $(3,1,0)$. Another way of putting it: a vector timestamp of an event $e$ represents a set of events, $e$ and its causal dependencies: $ \{e\} \cup \{a \mid a \to e\} $.&lt;/p&gt;

&lt;p&gt;Vector Clocks improve on Lamport Clocks in that you can now compare event timestamps and decide which one happened before the other, or whether they are concurrent. We already said a vector timestamp uniquely identifies an event, that is, $ V(a) = V(b) \iff a = b $ for events $a$ and $b$. We can also check whether one vector timestamp is less ($&amp;lt;$) than another: every element of the first vector must be less than or equal to the corresponding element of the second vector, and the vectors can’t be the same. The interesting fact is that $ V(a) &amp;lt; V(b) \iff a \to b $ for events $a$ and $b$. This means that given two vector timestamps we can decide whether one event &lt;strong&gt;happened-before&lt;/strong&gt; the other. If two timestamps are related neither by the equals ($=$) operator nor by the less ($&amp;lt;$) operator in either direction, we say that the events are concurrent. For example, we know that $(1,0, 0) \to (1, 5, 4)$ because $(1,0, 0) &amp;lt; (1, 5, 4)$. And we know that $(4,6,0)$ and $(0,0,1)$ are concurrent because they are not comparable.&lt;/p&gt;
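
&lt;p&gt;The comparison described above can be sketched as a small helper. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LessEq&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Relation&lt;/code&gt; are hypothetical names, the string results are only there to keep the example short, and both vectors are assumed to have the same length.&lt;/p&gt;

```go
package main

// LessEq reports whether every entry of a is less than or equal to the
// corresponding entry of b. Both vectors are assumed to have the same length.
func LessEq(a, b []uint64) bool {
	for k := range a {
		if a[k] > b[k] {
			return false
		}
	}
	return true
}

// Relation classifies two vector timestamps as "equal", "before" (a
// happened-before b), "after" (b happened-before a) or "concurrent".
func Relation(a, b []uint64) string {
	ab, ba := LessEq(a, b), LessEq(b, a)
	if ab {
		if ba {
			return "equal" // less-or-equal in both directions: same vector
		}
		return "before"
	}
	if ba {
		return "after"
	}
	return "concurrent" // not comparable in either direction
}
```

&lt;p&gt;Applied to the examples from the text, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Relation&lt;/code&gt; classifies $(1,0,0)$ and $(1,5,4)$ as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;before&lt;/code&gt;, and $(4,6,0)$ and $(0,0,1)$ as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;concurrent&lt;/code&gt;.&lt;/p&gt;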

&lt;h1 id=&quot;wrapping-up&quot;&gt;Wrapping Up&lt;/h1&gt;

&lt;p&gt;This was a discussion on two popular kinds of logical clocks, their properties, and their implementations. Logical Clocks (not exactly the implementations provided above) are very important in distributed systems and very popular among databases for detecting conflicts of data versioning. Take a look at an excerpt from the &lt;a href=&quot;https://www.researchgate.net/publication/220910159_Dynamo_Amazon&apos;s_highly_available_key-value_store&quot;&gt;Dynamo paper&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;One can determine whether two versions of an object are on parallel branches or have a causal ordering, by examining their vector clocks. If the counters on the first object’s clock are less-than-or-equal to all of the nodes in the second clock, then the first is an ancestor of the second and can be forgotten. Otherwise, the two changes are considered to be in conflict and require reconciliation.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Examples include &lt;a href=&quot;https://riak.com/posts/technical/vector-clocks-revisited/index.html?p=9545.html&quot;&gt;Riak KV, which uses logical clocks to track the history of updates to values and detect &lt;em&gt;conflicting&lt;/em&gt; writes&lt;/a&gt;, and &lt;a href=&quot;https://jepsen.io/analyses/cockroachdb-beta-20160829&quot;&gt;CockroachDB, which relies on hybrid logical clocks to provide serializability&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I hope you enjoyed the content. Consider following me on &lt;a href=&quot;https://twitter.com/brunocalza&quot;&gt;Twitter&lt;/a&gt; if that’s the case. I write about database internals as I learn more about them.&lt;/p&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;http://lamport.azurewebsites.net/pubs/time-clocks.pdf&quot;&gt;Time, Clocks, and the Ordering of Events in a Distributed System&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/playlist?list=PLeKd45zvjcDFUEv_ohr_HdUFe97RItdiB&quot;&gt;Concurrent and Distributed Systems Course&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.distributed-systems.net/index.php/books/ds3/&quot;&gt;Distributed Systems 3rd edition (2017)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

      </content>
    </entry>
  
    <entry>
      <title>What Zero-copy Serialization Means?</title>
      <link href="https://bcalza.b-cdn.net/2021/06/01/what-zero-copy-serialization-means.html" />
      <id>https://bcalza.b-cdn.net/2021/06/01/what-zero-copy-serialization-means</id>
      <updated>2021-06-01T00:00:00+00:00</updated>
      <content type="html">
        &lt;p&gt;I was reading about &lt;a href=&quot;https://en.wikipedia.org/wiki/Comparison_of_data-serialization_formats&quot;&gt;serialization formats&lt;/a&gt; the other day and came across the last column, “Supports Zero-copy operations”. I had no idea what it meant. Moments before I landed on this Wikipedia page, I was looking into how to serialize a struct in &lt;em&gt;Go&lt;/em&gt; without using any specific format, just raw serialization (I don’t even know if the term raw serialization means anything). While searching for a way, I stumbled upon this &lt;a href=&quot;https://stackoverflow.com/a/56272984/822023&quot;&gt;Stack Overflow answer&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;… if you consent to unsafety and actually need to read struct as bytes, then relying on byte array memory representation might be a bit better than relying on byte slice internal structure.&lt;/p&gt;

  &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;type Struct struct {
   Src int32
   Dst int32
   SrcPort uint16
   DstPort uint16
}

const sz = int(unsafe.SizeOf(Struct{}))
var asByteSlice []byte = (*(*[sz]byte)(unsafe.Pointer(&amp;amp;struct_value)))[:]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;  &lt;/div&gt;

  &lt;p&gt;Works and provides read-write view into struct, &lt;strong&gt;&lt;em&gt;zero-copy&lt;/em&gt;&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I became intrigued by the term and decided to research it a little. The first results were about the &lt;a href=&quot;https://en.wikipedia.org/wiki/Zero-copy&quot;&gt;&lt;strong&gt;zero-copy&lt;/strong&gt;&lt;/a&gt; strategy of transferring data while minimizing context switches between kernel space and user space. I had no clue this existed and I was very excited to learn about it. I learned about the &lt;a href=&quot;https://man7.org/linux/man-pages/man2/sendfile.2.html&quot;&gt;sendfile&lt;/a&gt; system call and found an awesome &lt;a href=&quot;https://jvns.ca/blog/2016/01/23/sendfile-a-new-to-me-system-call/&quot;&gt;blog post&lt;/a&gt; about it by &lt;a href=&quot;https://twitter.com/b0rk&quot;&gt;@b0rk&lt;/a&gt;. But it seemed to me that this had nothing to do with the &lt;strong&gt;zero-copy&lt;/strong&gt; meant in the serialization context. So I bookmarked a bunch of web pages about operating system zero-copy and resumed my research on serialization zero-copy.&lt;/p&gt;

&lt;p&gt;For some reason, I could only find discussions about zero-copy tied to the &lt;strong&gt;&lt;a href=&quot;https://capnproto.org/&quot;&gt;Cap’n Proto&lt;/a&gt;&lt;/strong&gt; serialization protocol. &lt;strong&gt;Cap’n Proto&lt;/strong&gt; is a serialization protocol (after &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;v0.4&lt;/code&gt; it also became an RPC protocol) created by &lt;a href=&quot;https://github.com/kentonv&quot;&gt;&lt;strong&gt;Kenton Varda&lt;/strong&gt;&lt;/a&gt;, who worked on &lt;a href=&quot;https://developers.google.com/protocol-buffers&quot;&gt;Protocol Buffers&lt;/a&gt; version 2. According to him, &lt;a href=&quot;https://capnproto.org/index.html&quot;&gt;Cap’n Proto is the result of years of experience working on Protobufs, listening to user feedback, and thinking about how things could be done better&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The marvel of &lt;strong&gt;Cap’n Proto&lt;/strong&gt; is that it has no serialization/deserialization cost. This is because of its &lt;strong&gt;zero-copy&lt;/strong&gt; implementation. But what does &lt;strong&gt;zero-copy&lt;/strong&gt; mean? The first good clarification for me came when reading the comparison of zero-copy protocols to Protocol Buffers at &lt;a href=&quot;https://capnproto.org/news/2014-06-17-capnproto-flatbuffers-sbe.html&quot;&gt;&lt;strong&gt;Cap’n Proto, FlatBuffers, and SBE&lt;/strong&gt;&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;Zero-copy&lt;/strong&gt;&lt;/p&gt;

  &lt;p&gt;The central thesis of all three competitors is that data should be  structured the same way in-memory and on the wire, thus avoiding costly encode/decode steps.&lt;/p&gt;

  &lt;p&gt;Protobufs represents the old way of thinking.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The &lt;strong&gt;Cap’n Proto&lt;/strong&gt; home page says:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;The Cap’n Proto encoding is appropriate both as a data interchange format and an in-memory representation, so once your structure is built, you can simply write the bytes straight out to disk!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Things became a lot clearer to me. It is as if the “serialization” implicitly happened when the object was built, and you already have the bytes in your hands. I found an interesting question on the &lt;a href=&quot;https://groups.google.com/g/capnproto&quot;&gt;Cap’n Proto forum&lt;/a&gt;: &lt;a href=&quot;https://groups.google.com/g/capnproto/c/kKw89THwoEY/m/mvqOOYaPztwJ&quot;&gt;&lt;strong&gt;What does zero-copy mean?&lt;/strong&gt;&lt;/a&gt;. The discussion revolves around the fact that zero-copy protocols are not well suited to objects that mutate state, as pointed out in &lt;a href=&quot;https://capnproto.org/news/2014-06-17-capnproto-flatbuffers-sbe.html&quot;&gt;&lt;strong&gt;Cap’n Proto, FlatBuffers, and SBE&lt;/strong&gt;&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;Usable as mutable state&lt;/strong&gt;&lt;/p&gt;

  &lt;p&gt;Protobuf generated classes have often been (ab)used as a convenient way to store an application’s mutable internal state. There’s mostly no problem with modifying a message gradually over time and then serializing it when needed.&lt;/p&gt;

  &lt;p&gt;This usage pattern does not work well with any zero-copy serialization format because these formats must use arena-style allocation to make sure the message is built in contiguous memory. Arena allocation has the property that you cannot free any object unless you free the entire arena. Therefore, when objects are discarded, the memory ends up leaked until the message as a whole is destroyed. A long-lived message that is modified many times will thus leak memory.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So, if it is not recommended to mutate the state of a &lt;strong&gt;Cap’n Proto&lt;/strong&gt; message, doesn’t that imply that a copy is needed to perform the mutation in another structure? You can check Kenton’s answer in the thread.&lt;/p&gt;

&lt;p&gt;I have also come across an interesting and intense &lt;a href=&quot;https://news.ycombinator.com/item?id=23589037&quot;&gt;&lt;strong&gt;discussion on Hacker News&lt;/strong&gt;&lt;/a&gt; about the definition of zero-copy. Most of the discussion was around the usefulness and the trade-offs of the definition. I am just going to highlight Kenton’s definition:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Some people use the term “zero-copy” to mean only that when the message contains a string or byte array, the parsed representation of those specific fields will point back into the original message buffer, rather than having to allocate a copy of the bytes at parse time.&lt;/p&gt;

  &lt;p&gt;Cap’n Proto and FlatBuffers implement a much stronger form of zero-copy. With them, it’s not just strings and byte buffers that are zero-copy, it’s the entire data structure. With these systems, once you have the bytes of a message mapped into memory, you do not need to do any “parse” step at all before you start using the message.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;All this made things much clearer to me. But I still wondered about the claim made in that Stack Overflow post. Isn’t that assignment a copy of data? I found the answer in &lt;a href=&quot;https://blog.golang.org/slices-intro&quot;&gt;&lt;strong&gt;Go Slices: usage and internals&lt;/strong&gt;&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Slicing does not copy the slice’s data. It creates a new slice value that points to the original array. This makes slice operations as efficient as manipulating array indices.&lt;/p&gt;
&lt;/blockquote&gt;
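
&lt;p&gt;To convince myself, here is a minimal sketch of that behavior. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sliceView&lt;/code&gt; helper and the sample bytes are made up for illustration; the point is only that a write through the sub-slice is visible in the original buffer, which would not happen if slicing had copied the data.&lt;/p&gt;

```go
package main

import "fmt"

// sliceView (a hypothetical helper) returns a sub-slice of buf.
// Slicing creates a new slice header but shares the backing array.
func sliceView(buf []byte, from, to int) []byte {
	return buf[from:to]
}

func main() {
	buf := []byte("hello, zero-copy")
	view := sliceView(buf, 7, 11) // covers the bytes "zero"

	// Writing through the view also changes buf, because no bytes were copied.
	view[0] = 'Z'

	fmt.Println(string(view))
	fmt.Println(string(buf))
}
```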

&lt;p&gt;I stumbled upon a lot of new things that I will surely explore further, and I learned a lot from this quest. I was recently working on an implementation of database storage using the concept of slotted pages. I was basically doing raw manipulation of bytes and pointers in order to insert a tuple into a table. I wonder if one can make use of a protocol such as &lt;strong&gt;Cap’n Proto&lt;/strong&gt; to make database storage more reliable and faster. That’s it for this post.&lt;/p&gt;

      </content>
    </entry>
  
    <entry>
      <title>The Cache is Full</title>
      <link href="https://bcalza.b-cdn.net/2021/05/23/the-cache-is-full.html" />
      <id>https://bcalza.b-cdn.net/2021/05/23/the-cache-is-full</id>
      <updated>2021-05-23T00:00:00+00:00</updated>
      <content type="html">
        &lt;p&gt;In computing, caching is all over the place. It is found in hardware (&lt;em&gt;CPU&lt;/em&gt; and &lt;em&gt;GPU&lt;/em&gt;), in the operating system’s virtual memory, in buffer pool managers inside databases, on the Web (&lt;em&gt;Content Delivery Networks&lt;/em&gt;, browsers, &lt;em&gt;DNS servers&lt;/em&gt; …), and also inside the applications we build. In this post, we take an abstract look at what a cache is, the replacement problem that arises from the cache’s own nature, and some algorithms for implementing cache replacement policies.&lt;/p&gt;

&lt;h2 id=&quot;defining-cache&quot;&gt;Defining Cache&lt;/h2&gt;

&lt;p&gt;Regardless of the context in which caching is applied, the sole reason for its existence is to make a system perform better than it would without the cache. A system works “fine” without caching. It is just that by applying this technique the system performs better. And, in the context of caching, performing better means accessing the needed data faster.&lt;/p&gt;

&lt;p&gt;In order to achieve the performance gain, &lt;strong&gt;part of the data is replicated to a secondary storage&lt;/strong&gt;. Instead of accessing the data through the primary storage, the system accesses it through the cache, and access through the cache is supposed to be faster. The performance gain of accessing this different storage may arise for multiple reasons. It can be technological. For example, the data in the cache is in memory and the primary storage uses a disk. It can be to avoid overheads. For example, the primary storage can be a relational database, and you add a cache to avoid the overhead of connecting to the database, parsing the SQL query, accessing the disk, and transforming the results to the format that suits the client application. It can be for geographical reasons, as in the case of &lt;em&gt;Content Delivery Networks&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/img/Caching-as-a-subset-6.png&quot; alt=&quot;&quot; /&gt;Cache as a copy of a subset of the primary storage&lt;/p&gt;

&lt;p&gt;One of the main reasons the cache contains only a subset of the primary storage is cost. Maintaining a copy of the entire primary storage may be too costly, so we have to content ourselves with a cache of limited capacity. More importantly, by design, a cache is supposed to be small. That’s because the whole purpose of caching is to optimize data access. Most of the data is rarely accessed, so keeping in the cache the subset of data that is frequently accessed is what allows faster look-ups. With all this in mind, let’s define cache, for the purposes of this post, as:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;A secondary storage with limited capacity that contains a copy of a subset of the data that is stored in the primary storage.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;By copy here, I don’t necessarily mean an exact copy. Data stored in the cache can be the result of a computation, as in the database example above.&lt;/p&gt;

&lt;h2 id=&quot;the-cache-is-full-problem&quot;&gt;The “Cache is Full” Problem&lt;/h2&gt;

&lt;p&gt;What would be the optimal performance of a cache design? Every time that data needs to be accessed, it can be found in the cache (&lt;strong&gt;cache hit&lt;/strong&gt;). But since the cache is only a subset, it means that sometimes data won’t be found in the cache (&lt;strong&gt;cache miss&lt;/strong&gt;). If the data that is not in the cache is never requested, it doesn’t really matter that it is not in the cache. So, as cache designers, we have to decide &lt;strong&gt;what data we should keep in the cache and what data we should keep out of it in order to maximize cache hits (or minimize cache misses)&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;To make things clear and for the sake of the discussion let’s assume the following while interacting with the cache:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The cache starts empty;&lt;/li&gt;
  &lt;li&gt;The application always interacts with the cache first. If it doesn’t find the data it is looking for (&lt;em&gt;Get operation&lt;/em&gt;), it consults the primary storage and then adds the data to the cache (&lt;em&gt;Put operation&lt;/em&gt;), as in the figure below;&lt;/li&gt;
  &lt;li&gt;The entries inside the cache never expire (there is no &lt;em&gt;TTL&lt;/em&gt;);&lt;/li&gt;
  &lt;li&gt;The entries in the cache are always up-to-date.&lt;/li&gt;
&lt;/ul&gt;
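
&lt;p&gt;The interaction assumed above can be sketched roughly as follows. This is only an illustration: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Reader&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Store&lt;/code&gt; names are hypothetical, the primary storage is faked with a map, and the cache here is unbounded, so it deliberately ignores the question of what to do when it is full.&lt;/p&gt;

```go
package main

import "fmt"

// Store is a hypothetical stand-in for the primary storage.
type Store map[string]string

// Reader looks in the cache first and falls back to the primary storage.
type Reader struct {
	cache  map[string]string
	store  Store
	hits   int
	misses int
}

// NewReader starts with an empty cache, as assumed in the list above.
func NewReader(store Store) *Reader {
	r := new(Reader)
	r.cache = make(map[string]string)
	r.store = store
	return r
}

// Read performs the Get operation; on a miss it consults the primary
// storage and then performs the Put operation.
func (r *Reader) Read(key string) (string, bool) {
	if value, ok := r.cache[key]; ok {
		r.hits++
		return value, true
	}
	r.misses++
	value, ok := r.store[key]
	if ok {
		r.cache[key] = value // Put: no capacity limit in this sketch
	}
	return value, ok
}

func main() {
	r := NewReader(Store{"a": "1", "b": "2"})
	for _, key := range []string{"a", "b", "a", "a"} {
		r.Read(key)
	}
	fmt.Println("hits:", r.hits, "misses:", r.misses)
}
```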

&lt;p&gt;&lt;img src=&quot;/assets/img/Cache-miss-3.png&quot; alt=&quot;&quot; /&gt;On a cache miss, the primary storage has to be consulted. A decision on what data lies in the cache has to be made&lt;/p&gt;

&lt;p&gt;There are multiple architectural options for how to interact with the cache. That is not the point to be discussed here. The point is what data we should keep in the cache. From the above figure, there are two points where we have to decide what data lies in the cache. The first point is the &lt;em&gt;Put operation&lt;/em&gt;. The client doesn’t need to add every missed entry to the cache; it can wisely choose those that make more sense. The second decision point is when the client decides to add an entry to the cache but &lt;strong&gt;the cache is full&lt;/strong&gt;. Full may mean that the number of entries has reached the maximum capacity, that the cache has reached its maximum pre-defined storage size, or any other criterion that is more relevant to your scenario. When that happens, a decision on &lt;strong&gt;which existing entry needs to be evicted from the cache to make room for a new entry&lt;/strong&gt; has to be made. This decision-making process, which is the topic of interest in this post, is called the &lt;strong&gt;cache replacement policy&lt;/strong&gt;.&lt;/p&gt;

&lt;h2 id=&quot;cache-replacement-policies&quot;&gt;Cache Replacement Policies&lt;/h2&gt;

&lt;p&gt;The problem of choosing an entry to discard from cache can be re-framed as discarding the entry that is least likely to be used in the future. This is not an easy problem, because we can’t predict which entries are not going to be needed in the future. If we could predict, we’d have an optimal algorithm.&lt;/p&gt;

&lt;p&gt;Cache replacement policies rely on heuristics that depend on the context. A policy for a &lt;em&gt;CPU cache&lt;/em&gt; will probably differ from a policy for web caching. Hardware caching may impose architectural restrictions that are not found on the web. Besides that, the parameters being optimized heavily influence the choice of algorithm. The most common parameter to maximize is the &lt;strong&gt;hit ratio&lt;/strong&gt;. Some other parameters that can be considered for optimization are &lt;strong&gt;byte hit ratio&lt;/strong&gt;, &lt;strong&gt;cost of access&lt;/strong&gt;, and &lt;strong&gt;latency&lt;/strong&gt;.&lt;/p&gt;
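
&lt;p&gt;To make the difference between the first two parameters concrete, here is a small sketch. I am assuming the usual definitions: the hit ratio is the fraction of requests served from the cache, and the byte hit ratio is the fraction of requested bytes served from the cache. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Request&lt;/code&gt; type and the sample numbers are made up for illustration.&lt;/p&gt;

```go
package main

import "fmt"

// Request is a hypothetical record of one cache lookup.
type Request struct {
	Hit   bool // whether the lookup was served from the cache
	Bytes int  // size of the requested entry
}

// hitRatio returns the fraction of requests that were cache hits.
func hitRatio(reqs []Request) float64 {
	if len(reqs) == 0 {
		return 0
	}
	hits := 0
	for _, r := range reqs {
		if r.Hit {
			hits++
		}
	}
	return float64(hits) / float64(len(reqs))
}

// byteHitRatio returns the fraction of requested bytes served from the cache.
func byteHitRatio(reqs []Request) float64 {
	hitBytes, total := 0, 0
	for _, r := range reqs {
		total += r.Bytes
		if r.Hit {
			hitBytes += r.Bytes
		}
	}
	if total == 0 {
		return 0
	}
	return float64(hitBytes) / float64(total)
}

func main() {
	// Three small hits and one large miss.
	reqs := []Request{
		{Hit: true, Bytes: 100},
		{Hit: true, Bytes: 100},
		{Hit: true, Bytes: 50},
		{Hit: false, Bytes: 750},
	}
	fmt.Printf("hit ratio: %.2f\n", hitRatio(reqs))
	fmt.Printf("byte hit ratio: %.2f\n", byteHitRatio(reqs))
}
```

&lt;p&gt;With these sample numbers, three out of four requests hit the cache, but only 250 of the 1000 requested bytes come from it, so a policy tuned for one metric can look poor on the other.&lt;/p&gt;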

&lt;p&gt;In addition, these policies can make use of characteristics of the cache entry to base a decision. Some of the most popular characteristics are:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;recency&lt;/strong&gt;: time of (since) the last reference to the entry;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;frequency&lt;/strong&gt;: number of requests to an entry;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;size&lt;/strong&gt;: the size of the entry;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;cost of fetching the object&lt;/strong&gt;: cost to fetch an object from its origin;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because of the variety of contexts, parameters to optimize, and information used in the heuristics, an enormous number of replacement algorithms have arisen. We are going to focus on &lt;strong&gt;FIFO&lt;/strong&gt;, &lt;strong&gt;LRU&lt;/strong&gt;, &lt;strong&gt;CLOCK&lt;/strong&gt;, and &lt;strong&gt;LFU&lt;/strong&gt; to get a full understanding of how a replacement policy typically works.&lt;/p&gt;

&lt;h2 id=&quot;implementing-some-policies-in-go&quot;&gt;Implementing Some Policies in Go&lt;/h2&gt;

&lt;h3 id=&quot;the-cache&quot;&gt;The cache&lt;/h3&gt;

&lt;p&gt;We are going to clearly separate the core logic of the cache from the logic of the replacement policy. Not only that, we are going to have multiple policies that implement the same policy interface. This makes it easier to swap implementations to see how the policies differ from one another. In practice, such a decoupled implementation adds some memory overhead (you need an additional hash table), and it is quite hard to have an interface that abstracts all possible caching policies. In our case, the abstraction works because the chosen algorithms are somewhat related.&lt;/p&gt;

&lt;p&gt;Our cache is going to be represented by&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;string&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CacheData&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;string&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Cache&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;maxSize&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;    &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;policy&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;CachePolicy&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;CacheData&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;For simplicity, we will have both the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;key&lt;/code&gt; and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;value&lt;/code&gt; as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;string&lt;/code&gt;. There is one attribute, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;size&lt;/code&gt;, for keeping track of the cache’s current number of entries and another, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;maxSize&lt;/code&gt;, indicating its capacity. The cache has a policy that is responsible for choosing which entries to evict when the cache is full. Lastly, it uses a hash table for storing the cache entries.&lt;/p&gt;

&lt;p&gt;A cache policy is defined as&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CachePolicy&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;interface&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;Victim&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;Add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;Remove&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;Access&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Victim&lt;/code&gt; runs the policy algorithm and elects a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CacheKey&lt;/code&gt;, called the victim, for removal;&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Add&lt;/code&gt; makes a cache key eligible for eviction;&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Remove&lt;/code&gt; makes a cache key no longer eligible for eviction;&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Access&lt;/code&gt; indicates to the cache policy that a cache key was accessed. This provides additional information to the cache replacement algorithm to make its decision.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By decoupling the core logic of the cache from the replacement logic, it becomes very easy to understand how a cache works. Here is the &lt;em&gt;Put operation&lt;/em&gt;:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Cache&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Put&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;maxSize&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;victimKey&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;policy&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Victim&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
        &lt;span class=&quot;nb&quot;&gt;delete&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;victimKey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;--&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;policy&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If the cache is full, it calls the replacement policy’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Victim&lt;/code&gt; method, which runs the algorithm and returns a cache key for eviction. The cache then removes that entry, making room for the new key. The &lt;em&gt;Get operation&lt;/em&gt; is straightforward. If it finds the key in the cache, it indicates to the policy that the key was accessed and returns the cached value.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Cache&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;error&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;];&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;policy&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Access&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;errors&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;New&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;key not found&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We have a cache implementation. Now let’s focus on the replacement policies.&lt;/p&gt;

&lt;h3 id=&quot;fifo&quot;&gt;FIFO&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;FIFO&lt;/em&gt; (First In First Out) is one of the simplest algorithms. It simply removes the entries in the order that they were added. So, the decision that the algorithm makes is: choose the oldest entry in the cache for removal.&lt;/p&gt;

&lt;p&gt;It can be easily implemented using a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;queue&lt;/code&gt;. An &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;O(1) queue&lt;/code&gt; is usually implemented using a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;doubly linked list&lt;/code&gt;. We are going to rely on Go’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;container/list&lt;/code&gt; implementation of a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;doubly linked list&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Here is the FIFO policy definition&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;FIFOPolicy&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;list&lt;/span&gt;    &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;List&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;keyNode&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Element&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We are going to use an additional hash table that maps each key to its list node, which makes the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Remove&lt;/code&gt; method &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;O(1)&lt;/code&gt;. If you add 5 elements to the cache, the &lt;em&gt;FIFO&lt;/em&gt; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;queue&lt;/code&gt; will look like this:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/img/FIFO.png&quot; alt=&quot;&quot; /&gt;The state of the FIFO &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;queue&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Choosing a key for eviction is as simple as choosing the element that lies at the tail of the list. The methods’ implementations are just simple list operations.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FIFOPolicy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Victim&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;element&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Back&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Remove&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;nb&quot;&gt;delete&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;keyNode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;element&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;element&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FIFOPolicy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;PushFront&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;keyNode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FIFOPolicy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Remove&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;keyNode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Remove&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;nb&quot;&gt;delete&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;keyNode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;One interesting thing is the implementation of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Access&lt;/code&gt; method, which is empty. Since the &lt;em&gt;FIFO&lt;/em&gt; policy does not rely on access patterns to make its decision, the &lt;em&gt;Get operation&lt;/em&gt; has no effect on the algorithm.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FIFOPolicy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Access&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;lru&quot;&gt;LRU&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;LRU&lt;/em&gt; (&lt;em&gt;Least Recently Used&lt;/em&gt;) is one of the most popular cache replacement policies. If an item has been in the cache for a long time but was used recently, it will probably not be chosen as the victim. The items that have not been used for a while are the ones considered for eviction.
If you think about it, this algorithm is very similar to the &lt;em&gt;FIFO&lt;/em&gt; algorithm. We just have to take into account the fact that an entry was used recently. In fact, this algorithm can be implemented exactly like &lt;em&gt;FIFO&lt;/em&gt;, except for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Access&lt;/code&gt; method.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LRUPolicy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Access&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Remove&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The &lt;em&gt;LRU&lt;/em&gt; relies on access patterns. Every time a cache entry is accessed, we update the recency of the cache key in the policy by removing the key from the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;queue&lt;/code&gt; and pushing it to the front, where the most recent item lies.&lt;/p&gt;

&lt;p&gt;Let’s make some calls and see how the internal &lt;em&gt;LRU&lt;/em&gt; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;queue&lt;/code&gt; behaves.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;cache&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NewCache&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;LRU&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;cache&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Put&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;1&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;1&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;cache&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Put&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;2&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;2&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;cache&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Put&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;3&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;3&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;cache&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Put&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;4&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;4&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;cache&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Put&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;5&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;5&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;cache&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;3&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;cache&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;1&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;img src=&quot;/assets/img/LRU-2.png&quot; alt=&quot;&quot; /&gt;How the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;queue&lt;/code&gt; changes after the &lt;em&gt;Get operations&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;If we call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cache.Victim()&lt;/code&gt; now, it will choose &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;2&lt;/code&gt;, which is the least recently used entry.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/img/Put-6.png&quot; alt=&quot;&quot; /&gt;2 is evicted to make room for 6&lt;/p&gt;
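
&lt;p&gt;To make the walkthrough above concrete, here is a minimal, self-contained sketch of the recency queue on its own, built on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;container/list&lt;/code&gt;. The names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;recencyQueue&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;touch&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;victim&lt;/code&gt; are made up for this illustration and are not part of the cache code in this post. It only mirrors the remove-then-add idea and assumes string keys.&lt;/p&gt;

```go
package main

import (
	"container/list"
	"fmt"
)

// recencyQueue is a hypothetical stand-in for the policy's internal queue:
// the front holds the most recently used key, the back the least recent.
type recencyQueue struct {
	order *list.List
	nodes map[string]*list.Element
}

func newRecencyQueue() recencyQueue {
	return recencyQueue{order: list.New(), nodes: map[string]*list.Element{}}
}

// touch records a use of key. A new key is pushed to the front; an existing
// key is moved to the front, which has the same effect as Remove then Add.
func (q recencyQueue) touch(key string) {
	if node, ok := q.nodes[key]; ok {
		q.order.MoveToFront(node)
		return
	}
	q.nodes[key] = q.order.PushFront(key)
}

// victim returns the least recently used key without removing it.
// It assumes the queue holds at least one key.
func (q recencyQueue) victim() string {
	return q.order.Back().Value.(string)
}

// keys lists the queue from most to least recently used.
func (q recencyQueue) keys() []string {
	var out []string
	for e := q.order.Front(); e != nil; e = e.Next() {
		out = append(out, e.Value.(string))
	}
	return out
}

func main() {
	q := newRecencyQueue()
	for _, k := range []string{"1", "2", "3", "4", "5"} { // the five Put calls
		q.touch(k)
	}
	q.touch("3") // Get("3")
	q.touch("1") // Get("1")

	fmt.Println(q.keys())   // [1 3 5 4 2]
	fmt.Println(q.victim()) // 2
}
```

&lt;p&gt;Using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MoveToFront&lt;/code&gt; avoids allocating a new list element on every access, but the observable order is the same as removing the key and adding it again.&lt;/p&gt;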
&lt;h3 id=&quot;clock&quot;&gt;CLOCK&lt;/h3&gt;

&lt;p&gt;In some situations, the &lt;em&gt;LRU&lt;/em&gt; policy is not used because every access to a cache entry involves unlinking the entry from its current position in the list and reinserting it at the front. For an operating system, this operation is very costly. Since &lt;em&gt;LRU&lt;/em&gt; is generally considered a good policy, alternatives that approximate it are used instead. One of them is called &lt;em&gt;CLOCK&lt;/em&gt;. Like &lt;em&gt;LRU&lt;/em&gt;, it is an improvement over &lt;em&gt;FIFO&lt;/em&gt;. However, it avoids the costly operation of manipulating the list on every access by using a circular list (hence the name &lt;em&gt;CLOCK&lt;/em&gt;) and a reference bit that indicates whether the entry was recently accessed. Here is a representation of its data structure.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/img/Clock.png&quot; alt=&quot;&quot; /&gt;Representation of the circular list&lt;/p&gt;

&lt;p&gt;The eviction algorithm uses a clock pointer to traverse the circular list in search of an entry whose reference bit is set to 0 (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ref = 0&lt;/code&gt;). Every time it finds a reference bit set to 1, it sets it to 0 and moves on to the next entry. To illustrate this, let’s look at the state of the clock after adding entry 6 to the clock above.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/img/Clock2.png&quot; alt=&quot;&quot; /&gt;After &lt;em&gt;Put 6&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;It will traverse the circular list, setting every reference bit to 0, until it reaches the first entry whose reference bit is 0, which will be entry 1. It will evict entry 1 to make room for entry 6. Now let’s do two Get operations: Get 2 and Get 3. They set the reference bits of those entries to 1.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/img/Clock3.png&quot; alt=&quot;&quot; /&gt;After the &lt;em&gt;Get&lt;/em&gt; operations&lt;/p&gt;

&lt;p&gt;When we look at the implementation of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Access&lt;/code&gt; method, it is clear that there is no need to manipulate the list as the &lt;em&gt;LRU&lt;/em&gt; did. It directly accesses the node and changes its reference bit.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ClockPolicy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Access&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;keyNode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ClockItem&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If we try to add a new entry 7 now, it will set the reference bits of 6, 2, and 3 to 0, and then evict entry 4.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/img/Clock4.png&quot; alt=&quot;&quot; /&gt;After &lt;em&gt;Put 7&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Here is how we represent the policy in code:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ClockItem&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;key&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bit&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;bool&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ClockPolicy&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;list&lt;/span&gt;      &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CircularList&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;keyNode&lt;/span&gt;   &lt;span class=&quot;k&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ring&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Ring&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;clockHand&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ring&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Ring&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CircularList&lt;/code&gt; is an implementation that makes use of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;container/ring&lt;/code&gt;, &lt;em&gt;Go’s&lt;/em&gt; implementation of a circular list.&lt;/p&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Victim&lt;/code&gt; method described above is implemented with a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;for&lt;/code&gt; loop that advances the clock hand, checking the reference bit of each entry.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ClockPolicy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Victim&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;victimKey&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nodeItem&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ClockItem&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;currentNode&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;clockHand&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;nodeItem&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;currentNode&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ClockItem&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nodeItem&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bit&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;nodeItem&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bit&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;false&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;currentNode&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nodeItem&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;clockHand&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;currentNode&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Next&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;victimKey&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nodeItem&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Move&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;clockHand&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Prev&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;clockHand&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Remove&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;currentNode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;nb&quot;&gt;delete&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;keyNode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;victimKey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;victimKey&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
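
&lt;p&gt;To replay the Put 6, Get 2, Get 3, Put 7 walkthrough end to end, here is a simplified sketch that swaps the circular list for a fixed-size slice and an index used as the clock hand. This is not the implementation from this post: the type &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;clock&lt;/code&gt; and its methods are hypothetical, new entries are assumed to start with their reference bit set, and the hand is assumed to stay on the replaced slot, which is what makes the evictions match the figures described above.&lt;/p&gt;

```go
package main

import "fmt"

// slot is one position on the clock face.
type slot struct {
	key string
	ref bool // reference bit: true means "recently accessed"
}

// clock is a hypothetical fixed-size CLOCK policy backed by a slice.
// hand is the index of the next slot to inspect.
type clock struct {
	slots []slot
	index map[string]int
	hand  int
	size  int
}

// newClock expects a capacity of at least 1.
func newClock(capacity int) *clock {
	c := new(clock)
	c.slots = make([]slot, capacity)
	c.index = make(map[string]int)
	return c
}

// access only sets the reference bit; the order of the slots is untouched.
func (c *clock) access(key string) {
	if i, ok := c.index[key]; ok {
		c.slots[i].ref = true
	}
}

// victim sweeps the hand, clearing set bits, and stops at the first slot
// whose bit is already 0. It returns that slot's position and key.
func (c *clock) victim() (int, string) {
	for c.slots[c.hand].ref {
		c.slots[c.hand].ref = false
		c.hand = (c.hand + 1) % len(c.slots)
	}
	return c.hand, c.slots[c.hand].key
}

// put inserts a key that is not yet present, evicting a victim when the
// clock is full. It returns the evicted key, or "" if nothing was evicted.
func (c *clock) put(key string) string {
	evicted := ""
	pos := c.size
	if c.size == len(c.slots) {
		pos, evicted = c.victim()
		delete(c.index, evicted)
	} else {
		c.size++
	}
	// Assumption: a new entry starts with its bit set and the hand stays put.
	c.slots[pos] = slot{key: key, ref: true}
	c.index[key] = pos
	return evicted
}

func main() {
	c := newClock(5)
	for _, k := range []string{"1", "2", "3", "4", "5"} {
		c.put(k)
	}
	fmt.Println(c.put("6")) // 1
	c.access("2")
	c.access("3")
	fmt.Println(c.put("7")) // 4
}
```

&lt;p&gt;Other descriptions of &lt;em&gt;CLOCK&lt;/em&gt; advance the hand past the slot that was just replaced. That variant would pick a different victim in the second eviction, so treat the choice here as one option among several.&lt;/p&gt;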

&lt;p&gt;You can check how the clock replacement policy can be used in a buffer pool manager in &lt;a href=&quot;/how-buffer-pool-works-an-implementation-in-go/&quot;&gt;How Buffer Pool Works: An Implementation In Go&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;lfu&quot;&gt;LFU&lt;/h3&gt;

&lt;p&gt;The &lt;em&gt;LFU&lt;/em&gt; (&lt;em&gt;Least Frequently Used&lt;/em&gt;) policy cares more about frequency than recency. It relies on how many accesses an entry has had, which means that the algorithm needs to keep track of a frequency count. Historically, the most popular implementation uses a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;min-heap&lt;/code&gt; to keep track of the least frequently used entries. A heap is a good data structure choice, but it offers &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;O(log n)&lt;/code&gt; operations. We are going to implement an algorithm with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;O(1)&lt;/code&gt; operations by keeping a hash table that maps each &lt;em&gt;frequency&lt;/em&gt; to a &lt;em&gt;doubly-linked list&lt;/em&gt;. Each frequency has its own &lt;em&gt;queue&lt;/em&gt;. Just because this implementation is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;O(1)&lt;/code&gt; doesn’t mean it is better. It has additional memory costs that may not be appropriate depending on the context.&lt;/p&gt;

&lt;p&gt;Here is the &lt;em&gt;LFU&lt;/em&gt; policy definition:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Frequency&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;LFUItem&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;frequency&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Frequency&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;       &lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;LFUPolicy&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;freqList&lt;/span&gt;     &lt;span class=&quot;k&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Frequency&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;List&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;keyNode&lt;/span&gt;      &lt;span class=&quot;k&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Element&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;minFrequency&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Frequency&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We are going to store an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LFUItem&lt;/code&gt; as the value of the list element. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LFUItem&lt;/code&gt; contains the frequency, which is helpful when updates to the frequency list need to be made. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;minFrequency&lt;/code&gt; field keeps track of the minimum frequency that can be found. That is helpful for locating the list from which an element will be chosen for eviction.&lt;/p&gt;
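
&lt;p&gt;As a rough illustration of why &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;minFrequency&lt;/code&gt; makes eviction an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;O(1)&lt;/code&gt; lookup, here is a self-contained sketch with the same three fields, using plain &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;int&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;string&lt;/code&gt; types. The type &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lfuSketch&lt;/code&gt; and its methods are hypothetical. In particular, the rule for bumping &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;minFrequency&lt;/code&gt; and the choice of the oldest element in a bucket as the victim are assumptions of this sketch, not necessarily what the full implementation does.&lt;/p&gt;

```go
package main

import (
	"container/list"
	"fmt"
)

type lfuItem struct {
	frequency int
	key       string
}

// lfuSketch mirrors the three fields described above using plain types.
type lfuSketch struct {
	freqList     map[int]*list.List
	keyNode      map[string]*list.Element
	minFrequency int
}

func newLFUSketch() *lfuSketch {
	s := new(lfuSketch)
	s.freqList = make(map[int]*list.List)
	s.keyNode = make(map[string]*list.Element)
	return s
}

// insert places key at the front of the list for the given frequency.
func (s *lfuSketch) insert(key string, frequency int) {
	if _, ok := s.freqList[frequency]; !ok {
		s.freqList[frequency] = list.New()
	}
	s.keyNode[key] = s.freqList[frequency].PushFront(lfuItem{frequency, key})
}

// add registers a brand-new key, which always starts at frequency 1.
func (s *lfuSketch) add(key string) {
	s.insert(key, 1)
	s.minFrequency = 1
}

// access moves key from its current list to the list one frequency higher.
func (s *lfuSketch) access(key string) {
	node, ok := s.keyNode[key]
	if !ok {
		return
	}
	item := node.Value.(lfuItem)
	s.freqList[item.frequency].Remove(node)
	// Assumption: if the lowest bucket just became empty, the minimum
	// moves up by one, because the key is about to land in that bucket.
	if item.frequency == s.minFrequency {
		if s.freqList[item.frequency].Len() == 0 {
			s.minFrequency++
		}
	}
	s.insert(key, item.frequency+1)
}

// victim reads the oldest key in the lowest-frequency list without scanning.
// It assumes the policy holds at least one key.
func (s *lfuSketch) victim() string {
	return s.freqList[s.minFrequency].Back().Value.(lfuItem).key
}

func main() {
	s := newLFUSketch()
	s.add("a")
	s.add("b")
	s.add("c")
	s.access("a") // a now has frequency 2
	s.access("a") // a now has frequency 3
	s.access("b") // b now has frequency 2

	fmt.Println(s.victim()) // c

	s.access("c") // the frequency-1 list is now empty
	fmt.Println(s.minFrequency, s.victim()) // 2 b
}
```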

&lt;p&gt;Here is a representation of how the algorithm works (from left to right), showing the state of the frequency list after each operation.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/img/image-6.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Adding a new entry to the cache means pushing the cache key of that entry to the front of the queue at &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;freqList[1]&lt;/code&gt;. Frames 1, 2, 4, 6, and 7 show that. Here is the implementation.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LFUPolicy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;freqList&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;freqList&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;New&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;freqList&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;PushFront&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LFUItem&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;keyNode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;minFrequency&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;When a &lt;em&gt;Get operation&lt;/em&gt; occurs, we need to remove the element from the list where it currently lies and push it to the front of the list stored at &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;freqList[frequency + 1]&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LFUPolicy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Access&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;remove&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;frequency&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LFUItem&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;frequency&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;freqList&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;frequency&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;freqList&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;frequency&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;New&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;freqList&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;frequency&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;PushFront&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LFUItem&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;frequency&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;keyNode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Lastly, the element that is chosen for eviction is the one that lies at the tail of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;minFrequency&lt;/code&gt; list. In our example, that would be entry 4. Here is how to implement it:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LFUPolicy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Victim&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CacheKey&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;fList&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;freqList&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;minFrequency&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;element&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fList&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Back&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;fList&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Remove&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;nb&quot;&gt;delete&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;keyNode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;element&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LFUItem&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;element&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LFUItem&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;final-considerations&quot;&gt;Final Considerations&lt;/h2&gt;

&lt;p&gt;In this post, we took an abstract look at the cache and at what happens when the cache is full. We discussed and showed how to implement 4 popular cache replacement policies, which can be found at &lt;a href=&quot;https://github.com/brunocalza/cache-replacement-policies&quot;&gt;brunocalza/cache-replacement-policies&lt;/a&gt;. The aim of this post was not to discuss the trade-offs or the circumstances in which one approach is better than another. That is a huge topic, full of nuances. Having said that, if you know how these techniques are applied in real systems, feel free to share. Consider following me on &lt;a href=&quot;https://twitter.com/brunocalza&quot;&gt;Twitter&lt;/a&gt; if you enjoy the content.&lt;/p&gt;

&lt;h2 id=&quot;further-materials-and-references&quot;&gt;Further Materials and References&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Meizhen, Weng, Shang Yanlei, and Tian Yue. “The design and implementation of LRU-based web cache.” 2013 8th International Conference on Communications and Networking in China (CHINACOM). IEEE, 2013.&lt;/li&gt;
  &lt;li&gt;Podlipnig, Stefan, and Laszlo Böszörmenyi. “A survey of web cache replacement strategies.” ACM Computing Surveys (CSUR) 35.4 (2003): 374-398.&lt;/li&gt;
  &lt;li&gt;Balamash, Abdullah, and Marwan Krunz. “An overview of web caching replacement algorithms.” IEEE Communications Surveys &amp;amp; Tutorials 6.2 (2004): 44-56.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Cache_(computing)&quot;&gt;Wikipedia Cache (computing)&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Page_replacement_algorithm&quot;&gt;Wikipedia Page Replacement Algorithm&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=b-dRK8B8dQk&quot;&gt;Clock Page Replacement&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;HashiCorp LRU’s library: &lt;a href=&quot;https://github.com/hashicorp/golang-lru&quot;&gt;hashicorp/golang-lru&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://dropbox.tech/infrastructure/caching-in-theory-and-practice&quot;&gt;Caching in theory and practice&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.geeksforgeeks.org/least-frequently-used-lfu-cache-implementation/&quot;&gt;Least Frequently Used (LFU) Cache Implementation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://arpitbhayani.me/blogs/lfu&quot;&gt;Constant Time LFU&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://ieftimov.com/post/when-why-least-frequently-used-cache-implementation-golang/&quot;&gt;When and Why to use a Least Frequently Used (LFU) cache with an implementation in Golang&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://redis.io/topics/lru-cache&quot;&gt;Using Redis as an LRU cache&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

      </content>
    </entry>
  
    <entry>
      <title>How Buffer Pool Works: An Implementation In Go</title>
      <link href="https://bcalza.b-cdn.net/2021/02/11/how-buffer-pool-works-an-implementation-in-go.html" />
      <id>https://bcalza.b-cdn.net/2021/02/11/how-buffer-pool-works-an-implementation-in-go</id>
      <updated>2021-02-11T00:00:00+00:00</updated>
      <content type="html">
        &lt;p&gt;I have been exploring how disk-oriented databases efficiently move data in and out of disk. One way, which I explored in &lt;a href=&quot;/discovering-and-exploring-mmap-using-go/&quot;&gt;Discovering and exploring &lt;em&gt;mmap&lt;/em&gt; using Go&lt;/a&gt; and &lt;a href=&quot;/but-how-exactly-databases-use-mmap/&quot;&gt;But how, exactly, databases use &lt;em&gt;mmap&lt;/em&gt;?&lt;/a&gt;, is through &lt;strong&gt;memory-mapped files&lt;/strong&gt;. Although &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap&lt;/code&gt; is a really neat solution, it has some problems. Most of them come from the fact that the database has no control over how pages are flushed to disk, since that job is carried out by the OS. Because of that, most databases avoid using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap&lt;/code&gt;. However, they still need to read and write data from disk in an efficient manner. And the answer to that is the &lt;strong&gt;buffer pool&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In this post, we’ll explain how a &lt;strong&gt;buffer pool manager&lt;/strong&gt; works and how to implement one in &lt;strong&gt;Go&lt;/strong&gt;. But first, let’s do a quick recap of how data is structured on disk.&lt;/p&gt;

&lt;h2 id=&quot;how-a-database-file-is-structured&quot;&gt;How is a database file structured?&lt;/h2&gt;

&lt;p&gt;The file on disk that stores the data of a database is, in most databases, just an array of bytes organized into fixed-length blocks called &lt;strong&gt;pages&lt;/strong&gt;. A database file is a linear array of contiguous &lt;strong&gt;pages&lt;/strong&gt;.
&lt;img src=&quot;/assets/img/buffer_pool_pages.png&quot; alt=&quot;&quot; /&gt;A database file is organized into fixed-length blocks called &lt;strong&gt;pages&lt;/strong&gt;
The page length is usually a multiple of the filesystem block length, ranging from &lt;strong&gt;512 bytes&lt;/strong&gt; to &lt;strong&gt;16KB&lt;/strong&gt;. &lt;em&gt;PostgreSQL&lt;/em&gt;, for example, uses &lt;strong&gt;8KB&lt;/strong&gt;, although a different page size can be selected when compiling the server&lt;sup id=&quot;fnref:1&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;. &lt;em&gt;InnoDB&lt;/em&gt;, a storage engine used in MySQL, uses a &lt;strong&gt;16KB&lt;/strong&gt; page size by default&lt;sup id=&quot;fnref:2&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;
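
&lt;p&gt;Because pages have a fixed length, locating a page inside the file is simple arithmetic: page &lt;em&gt;n&lt;/em&gt; starts at byte &lt;em&gt;n × page size&lt;/em&gt;. The sketch below illustrates that idea only. The file is modelled as an in-memory byte slice, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pageSize&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;readPage&lt;/code&gt; are hypothetical names chosen for this example, not part of the implementation discussed in this post.&lt;/p&gt;

```go
package main

import "fmt"

// pageSize is a hypothetical page length in bytes, kept tiny for the example.
const pageSize = 8

// readPage returns a copy of the page with the given id. The database file is
// modelled as an in-memory byte slice (an assumption made for this sketch).
func readPage(file []byte, pageID int) ([]byte, error) {
	// Page n starts at byte n * pageSize because all pages have the same length.
	offset := pageID * pageSize
	if 0 > pageID || offset+pageSize > len(file) {
		return nil, fmt.Errorf("page %d is out of range", pageID)
	}
	page := make([]byte, pageSize)
	copy(page, file[offset:offset+pageSize])
	return page, nil
}

func main() {
	// A "file" with three pages, filled with 'a', 'b' and 'c' respectively.
	file := []byte("aaaaaaaabbbbbbbbcccccccc")
	page, err := readPage(file, 1)
	fmt.Println(string(page), err)
}
```

&lt;p&gt;A real disk manager would seek to the same offset in an actual file instead of slicing a byte slice, but the offset computation would be the same.&lt;/p&gt;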

&lt;p&gt;The layout of these pages is outside the scope of this post. However, just to give a glimpse of what happens when writes are executed: in row-oriented databases, tuples are appended inside the page until it fills up. When the page is full, another page needs to be allocated to support incoming writes.&lt;/p&gt;

&lt;h2 id=&quot;the-buffer-pool-manager&quot;&gt;The buffer pool manager&lt;/h2&gt;

&lt;p&gt;So we have our data on disk and the database wants to access it, let’s say, through a &lt;strong&gt;SELECT&lt;/strong&gt; command. Or we have new incoming data that we want to store. In order to manage reads and writes efficiently, the common practice is to avoid touching the disk directly. The &lt;strong&gt;execution engine&lt;/strong&gt; that is processing the commands has no access to disk. It has access to an &lt;strong&gt;in-memory intermediate layer&lt;/strong&gt; called the &lt;strong&gt;buffer pool&lt;/strong&gt;.
&lt;img src=&quot;/assets/img/buffer_pool_architecture.png&quot; alt=&quot;&quot; /&gt;High level architecture of how data is accessed by the execution engine
The &lt;strong&gt;execution engine&lt;/strong&gt; requests pages from the &lt;strong&gt;buffer pool&lt;/strong&gt; by providing a &lt;strong&gt;page identifier&lt;/strong&gt;. It can read the content of the page and send the data to the client. Or it can modify the content of the page and send it back to the &lt;strong&gt;buffer pool&lt;/strong&gt; so that the change can be persisted. The execution engine accesses the buffer pool as if it were accessing the disk.&lt;/p&gt;

&lt;p&gt;I really think the term buffer pool is kind of misleading here, and it confused me a lot. That’s because the purpose of this structure is not only to manage a pool of empty buffers and their reuse; it is also responsible for &lt;strong&gt;caching the disk pages into memory for future reuse&lt;/strong&gt; (not only by the requesting thread but also by concurrent threads). And, as with any cache, it is also responsible for the &lt;strong&gt;cache replacement policy&lt;/strong&gt;. For this reason, a more appropriate name for the buffer pool is &lt;strong&gt;page cache&lt;/strong&gt;. With this in mind, you can imagine the importance of the buffer pool for speeding up the database’s access to disk.&lt;/p&gt;

&lt;p&gt;The buffer pool can be implemented with a statically allocated array of &lt;strong&gt;frames&lt;/strong&gt;. Frames start empty and fill up as pages are requested. There is an associative structure called the &lt;strong&gt;page table&lt;/strong&gt; that indicates which page is being held by each frame.
&lt;img src=&quot;/assets/img/buffer_pool_full.png&quot; alt=&quot;&quot; /&gt;A representation of a full buffer pool
The above image indicates the two data structures used for implementing a buffer pool: a &lt;strong&gt;frame array&lt;/strong&gt; and a &lt;strong&gt;page table&lt;/strong&gt;. The image represents a buffer pool that can cache a maximum of 4 pages. The pages being cached are 4, 5, 6, and 7. The arrows indicate where the page can be found in memory (in the array of frames). Let’s look into another buffer pool:
&lt;img src=&quot;/assets/img/buffer_pool_empty_frame.png&quot; alt=&quot;&quot; /&gt;A representation of a buffer pool with one empty frame
In this one, we can see that there are 3 pages in cache and one frame left. The frames that are free and ready to receive a new page are tracked in an additional structure called the &lt;strong&gt;free list&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;pages in red&lt;/strong&gt; are the pages that are being accessed by some thread. That indication is very important because it means that the page cannot leave the buffer pool until all accesses are finished. We will come back to this.&lt;/p&gt;

&lt;p&gt;Another important concept is &lt;strong&gt;page replacement&lt;/strong&gt;. &lt;strong&gt;What happens when the cache is full, and a page that is not in the cache is requested?&lt;/strong&gt; We need to evict a page that is not being accessed and give its frame to the requested page. There are many algorithms (or policies) to decide which page to evict. We’ll talk about this later.&lt;/p&gt;

&lt;p&gt;With these concepts in mind, here is how we can represent our buffer pool in &lt;strong&gt;Go&lt;/strong&gt; code:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;// BufferPoolManager represents the buffer pool manager&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BufferPoolManager&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;diskManager&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DiskManager&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;pages&lt;/span&gt;       &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MaxPoolSize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Page&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;replacer&lt;/span&gt;    &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ClockReplacer&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;freeList&lt;/span&gt;    &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FrameID&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;pageTable&lt;/span&gt;   &lt;span class=&quot;k&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;PageID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FrameID&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;// NewBufferPoolManager returns an empty buffer pool manager&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NewBufferPoolManager&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DiskManager&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DiskManager&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;clockReplacer&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ClockReplacer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BufferPoolManager&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;freeList&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;make&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([]&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FrameID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;pages&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MaxPoolSize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Page&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MaxPoolSize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;freeList&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;freeList&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;FrameID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;pages&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FrameID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BufferPoolManager&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DiskManager&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pages&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;clockReplacer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;freeList&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;make&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;PageID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FrameID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The buffer pool receives a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DiskManager&lt;/code&gt; implementation and a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;replacer&lt;/code&gt;. In our case, we are going to use the &lt;strong&gt;clock replacement policy&lt;/strong&gt;, implemented by &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ClockReplacer&lt;/code&gt;, for page eviction. Internally, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pages&lt;/code&gt; represents the frame array of fixed size (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MaxPoolSize&lt;/code&gt;); &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pageTable&lt;/code&gt; is a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;map&lt;/code&gt; that indicates which frame holds which page; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;freeList&lt;/code&gt; tracks the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;free&lt;/code&gt; frames. Initially, all frame ids are inside the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;freeList&lt;/code&gt;, because all frames are free. That also means all frames point to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;nil&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;And here is the &lt;strong&gt;API&lt;/strong&gt; provided by our buffer pool implementation:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;NewPage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Page&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;FetchPage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pageID&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;PageID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Page&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;FlushPage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pageID&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;PageID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;bool&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;FlushAllPages&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;DeletePage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pageID&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;PageID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;error&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;UnpinPage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pageID&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;PageID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;isDirty&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;bool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;error&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NewPage() *Page&lt;/code&gt;&lt;/strong&gt;: a clean new page can be requested. The buffer pool will ask the disk manager for a new page id;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FetchPage(pageID PageID) *Page&lt;/code&gt;&lt;/strong&gt;: fetches a known page. If the page is in cache, it is immediately returned. If not, the buffer pool will find it on disk and cache it;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FlushPage(pageID PageID) bool&lt;/code&gt;&lt;/strong&gt;: writes the page that is in cache to disk;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FlushAllPages()&lt;/code&gt;&lt;/strong&gt;: writes all pages in cache to disk;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DeletePage(pageID PageID) error&lt;/code&gt;&lt;/strong&gt;: deletes page from cache and disk;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UnpinPage(pageID PageID, isDirty bool) error&lt;/code&gt;&lt;/strong&gt;: indicates that the page is no longer used by the current requesting thread. If no more threads are using this page, the page is considered for eviction.&lt;/li&gt;
&lt;/ul&gt;
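
&lt;p&gt;To make the interplay between the frame array, the page table, and the free list more concrete, here is a rough, self-contained sketch of what a fetch could look like. It is not the implementation from this post: the names (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pool&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fetch&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;frameFor&lt;/code&gt;) are hypothetical, the disk is assumed to be a plain in-memory map, and eviction naively picks the first unpinned page instead of using a real replacement policy.&lt;/p&gt;

```go
package main

import (
	"errors"
	"fmt"
)

type pageID int
type frameID int

// poolSize is deliberately tiny so that eviction is easy to trigger.
const poolSize = 2

type page struct {
	id       pageID
	pinCount int
	isDirty  bool
	data     string
}

// pool is a simplified, hypothetical stand-in for a buffer pool manager.
type pool struct {
	disk      map[pageID]string // assumed in-memory "disk"
	frames    [poolSize]*page
	freeList  []frameID
	pageTable map[pageID]frameID
}

func newPool(disk map[pageID]string) *pool {
	p := new(pool)
	p.disk = disk
	p.pageTable = make(map[pageID]frameID)
	for i := range p.frames {
		p.freeList = append(p.freeList, frameID(i))
	}
	return p
}

// frameFor returns a usable frame: a free one if available, otherwise the
// frame of the first unpinned page (a naive policy, not CLOCK).
func (p *pool) frameFor() (frameID, error) {
	if len(p.freeList) > 0 {
		f := p.freeList[0]
		p.freeList = p.freeList[1:]
		return f, nil
	}
	for i, pg := range p.frames {
		if pg.pinCount == 0 {
			if pg.isDirty {
				p.disk[pg.id] = pg.data // write the change back before evicting
			}
			delete(p.pageTable, pg.id)
			return frameID(i), nil
		}
	}
	return 0, errors.New("all pages are pinned")
}

// fetch returns the page pinned, loading it from disk on a cache miss.
func (p *pool) fetch(id pageID) (*page, error) {
	if f, ok := p.pageTable[id]; ok {
		p.frames[f].pinCount++
		return p.frames[f], nil
	}
	data, ok := p.disk[id]
	if !ok {
		return nil, errors.New("page not found on disk")
	}
	f, err := p.frameFor()
	if err != nil {
		return nil, err
	}
	pg := new(page)
	*pg = page{id: id, pinCount: 1, data: data}
	p.frames[f] = pg
	p.pageTable[id] = f
	return pg, nil
}

// unpin releases one access and remembers whether the page was modified.
func (p *pool) unpin(id pageID, dirty bool) {
	if f, ok := p.pageTable[id]; ok {
		pg := p.frames[f]
		if pg.pinCount > 0 {
			pg.pinCount--
		}
		pg.isDirty = pg.isDirty || dirty
	}
}

func main() {
	p := newPool(map[pageID]string{1: "one", 2: "two", 3: "three"})
	p.fetch(1)
	p.fetch(2)
	_, err := p.fetch(3)
	fmt.Println("fetch 3 while every page is pinned:", err)
	p.unpin(1, false)
	pg, err := p.fetch(3)
	fmt.Println("fetch 3 after unpinning page 1:", pg.data, err)
}
```

&lt;p&gt;With two frames and both pages pinned, the third fetch fails; once page 1 is unpinned, its frame can be reused for page 3. The sketch is single-threaded, so it ignores the locking a real buffer pool would need.&lt;/p&gt;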

&lt;h2 id=&quot;how-to-represent-a-page&quot;&gt;How to represent a page&lt;/h2&gt;

&lt;p&gt;We are going to represent our page in a very simple manner. Our focus here is to learn how the buffer pool works. Our page is just a fixed-size byte array together with an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;id&lt;/code&gt;. There is some metadata that needs to be tracked. One is the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pinCount&lt;/code&gt;, which tracks the number of concurrent accesses. Each time the page is accessed, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pinCount&lt;/code&gt; is increased. When an access finishes, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pinCount&lt;/code&gt; is decreased. So, if the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pinCount&lt;/code&gt; is greater than 0, we say the page is &lt;em&gt;pinned&lt;/em&gt;, which is indicated in our figures by the red mark. The other metadata variable is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;isDirty&lt;/code&gt;, which indicates that the page has been modified after it was read from disk.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;// PageID is the type of the page identifier&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;PageID&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pageSize&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;5&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;// Page represents a page on disk&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Page&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;id&lt;/span&gt;       &lt;span class=&quot;n&quot;&gt;PageID&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;pinCount&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;isDirty&lt;/span&gt;  &lt;span class=&quot;kt&quot;&gt;bool&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;     &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pageSize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;byte&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
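
&lt;p&gt;As a small illustration of the pin-count bookkeeping described above, the sketch below shows one way it could behave. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pin&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unpin&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;evictable&lt;/code&gt; methods are hypothetical helpers introduced for this example, and keeping the dirty flag sticky across unpins is an assumption, not something taken from the implementation in this post.&lt;/p&gt;

```go
package main

import (
	"errors"
	"fmt"
)

// page mirrors only the metadata fields discussed above.
type page struct {
	id       int
	pinCount int
	isDirty  bool
}

// pin records one more concurrent access to the page.
func (p *page) pin() { p.pinCount++ }

// unpin records that one access finished. Once a page has been marked dirty
// it stays dirty (an assumption of this sketch).
func (p *page) unpin(dirty bool) error {
	if p.pinCount == 0 {
		return errors.New("page is not pinned")
	}
	p.pinCount--
	p.isDirty = p.isDirty || dirty
	return nil
}

// evictable reports whether no thread is using the page any more.
func (p *page) evictable() bool { return p.pinCount == 0 }

func main() {
	p := page{id: 7}
	p.pin() // first reader
	p.pin() // second reader
	p.unpin(false)
	fmt.Println(p.evictable(), p.isDirty) // false false: one access is still open
	p.unpin(true)
	fmt.Println(p.evictable(), p.isDirty) // true true: unused and modified
}
```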

&lt;h2 id=&quot;the-clock-replacement-algorithm&quot;&gt;The Clock Replacement algorithm&lt;/h2&gt;

&lt;p&gt;When the buffer pool is full and a new page needs to enter the cache, some page needs to be chosen to leave the cache. This is a classic &lt;strong&gt;cache eviction&lt;/strong&gt; problem and there is no perfect solution, because predicting which pages are going to be requested is a complicated problem. If you can’t predict which pages are going to be requested, you don’t know which pages must remain in cache and which pages must be evicted. The best you can do is to reduce the number of wrong page evictions. There are many strategies, like &lt;strong&gt;FIFO&lt;/strong&gt; (&lt;em&gt;first-in first-out&lt;/em&gt;), &lt;strong&gt;LRU&lt;/strong&gt; (&lt;em&gt;least recently used&lt;/em&gt;), and &lt;strong&gt;LFU&lt;/strong&gt; (&lt;em&gt;least frequently used&lt;/em&gt;). We are not going to explore all of them and their trade-offs. For our buffer pool manager implementation, we have chosen one policy called &lt;strong&gt;CLOCK&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;When a thread from the execution engine no longer needs a page, it calls the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UnpinPage&lt;/code&gt; method of the buffer pool. This method decrements the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pinCount&lt;/code&gt; of the page, and if the pin count reaches zero, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;frame id&lt;/code&gt; where that page is allocated is stored in the &lt;strong&gt;Clock Replacer&lt;/strong&gt; for future analysis. If, at any time, the same page is accessed again (its pin count increases), it needs to be removed from the &lt;strong&gt;Clock Replacer&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;Clock Replacer&lt;/strong&gt; keeps track of all pages that are not being accessed in a &lt;strong&gt;circular buffer&lt;/strong&gt;. For each page, it also keeps a &lt;strong&gt;reference bit&lt;/strong&gt; to indicate if the page was recently accessed. There is a &lt;strong&gt;Clock Hand&lt;/strong&gt; or &lt;strong&gt;Clock Pointer&lt;/strong&gt; (CP) that iterates through the circular buffer looking for a &lt;em&gt;victim&lt;/em&gt;.
&lt;img src=&quot;/assets/img/clock_replacer.png&quot; alt=&quot;&quot; /&gt;Clock Replacer representation
The &lt;strong&gt;Clock Replacer&lt;/strong&gt; provides the following &lt;strong&gt;API&lt;/strong&gt;:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Victim&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FrameID&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Unpin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;id&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;FrameID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Pin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;id&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;FrameID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Unpin&lt;/code&gt;&lt;/strong&gt; is called when the buffer pool wants to make a page eligible for eviction because it is no longer accessed;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Pin&lt;/code&gt;&lt;/strong&gt; is called when the buffer pool wants to remove a page from cache eviction election because it was accessed and it cannot be evicted;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Victim&lt;/code&gt;&lt;/strong&gt; elects a victim through the clock replacement policy.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s see how the clock replacement policy works using an example.&lt;/p&gt;

&lt;p&gt;The state of the Clock Replacer after the calls &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Unpin(1)&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Unpin(2)&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Unpin(3)&lt;/code&gt;.&lt;/p&gt;

&lt;table class=&quot;buffer-pool-table&quot;&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;num&lt;/th&gt;
      &lt;th&gt;ref&lt;/th&gt;
      &lt;th&gt;clock hand&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;&amp;lt;-&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;2&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;3&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;We have &lt;strong&gt;3 frames&lt;/strong&gt; eligible for eviction. When &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Victim()&lt;/code&gt; is called, the clock replacement algorithm runs. The clock hand iterates through the circular buffer checking the reference bit. If the bit is set to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1&lt;/code&gt; (meaning the frame was recently added to the clock replacer), it changes the bit to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0&lt;/code&gt; and moves on. The first frame it finds with the bit already at &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0&lt;/code&gt; is chosen as the victim and removed from the clock replacer. Below is how the clock replacer state changes while &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Victim()&lt;/code&gt; runs until it finds a victim for eviction.&lt;/p&gt;

&lt;table class=&quot;buffer-pool-table&quot;&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;num&lt;/th&gt;
      &lt;th&gt;ref&lt;/th&gt;
      &lt;th&gt;clock hand&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;2&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;&amp;lt;-&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;3&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;table class=&quot;buffer-pool-table&quot;&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;num&lt;/th&gt;
      &lt;th&gt;ref&lt;/th&gt;
      &lt;th&gt;clock hand&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;2&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;3&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;&amp;lt;-&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;table class=&quot;buffer-pool-table&quot;&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;num&lt;/th&gt;
      &lt;th&gt;ref&lt;/th&gt;
      &lt;th&gt;clock hand&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;&amp;lt;-&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;2&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;3&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;table class=&quot;buffer-pool-table&quot;&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;num&lt;/th&gt;
      &lt;th&gt;ref&lt;/th&gt;
      &lt;th&gt;clock hand&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;2&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;&amp;lt;-&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;3&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;If we call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Unpin(4)&lt;/code&gt;, we have:&lt;/p&gt;

&lt;table class=&quot;buffer-pool-table&quot;&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;num&lt;/th&gt;
      &lt;th&gt;ref&lt;/th&gt;
      &lt;th&gt;clock hand&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;2&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;&amp;lt;-&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;3&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;4&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;And after a call to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Victim()&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;2&lt;/code&gt; is evicted.&lt;/p&gt;

&lt;table class=&quot;buffer-pool-table&quot;&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;num&lt;/th&gt;
      &lt;th&gt;ref&lt;/th&gt;
      &lt;th&gt;clock hand&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;3&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;&amp;lt;-&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;4&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
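
&lt;p&gt;The walkthrough above can be condensed into a short Go sketch. It is not the code from the repository: the slice-backed circular buffer, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;entry&lt;/code&gt; type, and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;removeAt&lt;/code&gt; helper are assumptions made for illustration. Only the three method names and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FrameID&lt;/code&gt; type come from the API listed earlier. New frames are simply appended to the slice, which is enough to reproduce the tables in this example.&lt;/p&gt;

```go
package main

import "fmt"

// FrameID identifies a frame in the buffer pool.
type FrameID int

// entry is one slot of the circular buffer: a frame and its reference bit.
type entry struct {
	id  FrameID
	ref bool
}

// ClockReplacer is a hypothetical, slice-backed clock replacer.
type ClockReplacer struct {
	entries []entry
	hand    int // index the clock hand currently points at
}

// Unpin makes a frame eligible for eviction, with its reference bit set.
func (c *ClockReplacer) Unpin(id FrameID) {
	for _, e := range c.entries {
		if e.id == id {
			return // already tracked
		}
	}
	c.entries = append(c.entries, entry{id: id, ref: true})
}

// Pin removes a frame from the replacer because it is in use again.
func (c *ClockReplacer) Pin(id FrameID) {
	for i, e := range c.entries {
		if e.id == id {
			c.removeAt(i)
			return
		}
	}
}

// Victim sweeps the hand around the buffer, clearing reference bits, and
// evicts the first frame whose bit is already cleared.
func (c *ClockReplacer) Victim() *FrameID {
	if len(c.entries) == 0 {
		return nil
	}
	for c.entries[c.hand].ref {
		c.entries[c.hand].ref = false // give the frame a second chance
		c.hand = (c.hand + 1) % len(c.entries)
	}
	victim := new(FrameID)
	*victim = c.entries[c.hand].id
	c.removeAt(c.hand)
	return victim
}

// removeAt deletes slot i and keeps the hand on the same logical position.
func (c *ClockReplacer) removeAt(i int) {
	c.entries = append(c.entries[:i], c.entries[i+1:]...)
	if c.hand > i {
		c.hand--
	}
	if c.hand == len(c.entries) {
		c.hand = 0 // wrap around the end of the circular buffer
	}
}

func main() {
	c := new(ClockReplacer)
	c.Unpin(1)
	c.Unpin(2)
	c.Unpin(3)
	fmt.Println(*c.Victim()) // 1
	c.Unpin(4)
	fmt.Println(*c.Victim()) // 2
}
```

&lt;p&gt;Run as is, the sketch evicts frame 1 and then frame 2, the same victims as in the tables above.&lt;/p&gt;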

&lt;h3 id=&quot;fetching-a-page&quot;&gt;Fetching a page&lt;/h3&gt;

&lt;p&gt;First, we check if the page is in the buffer pool. If we find the page in the cache, we increase the pin count (to indicate there is one more thread accessing the page) and make sure it is not in the clock replacer by calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Pin&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;frameID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pageTable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pageID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;];&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;page&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pages&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;frameID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;page&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pinCount&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;replacer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;frameID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;page&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If it is not in the buffer pool, we need to find a frame to hold this page. The buffer pool can have free frames or it can be full. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;getFrameID&lt;/code&gt; method handles both cases. We check our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;freeList&lt;/code&gt; to see if there are any frames available. If we find an empty frame, we return its &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;id&lt;/code&gt; and remove it from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;freeList&lt;/code&gt;. If we can’t find an empty frame, we have to elect a frame from the buffer pool to free up space by running the clock replacement algorithm. The returned &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bool&lt;/code&gt; indicates whether the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;frame id&lt;/code&gt; came from the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;free list&lt;/code&gt; (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;true&lt;/code&gt;) or from the eviction algorithm (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;false&lt;/code&gt;).&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BufferPoolManager&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;getFrameID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FrameID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;bool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;freeList&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;frameID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;newFreeList&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;freeList&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;freeList&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;freeList&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;newFreeList&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;frameID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;true&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;replacer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Victim&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;false&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now we have a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;frame id&lt;/code&gt;. If it didn’t come from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;freeList&lt;/code&gt;, we have to remove the page that is currently allocated in that frame. If that page was modified (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;isDirty&lt;/code&gt; set to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;true&lt;/code&gt;), we first have to write it to disk.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;isFromFreeList&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;// remove page from current frame&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;currentPage&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pages&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;frameID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;currentPage&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;currentPage&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;isDirty&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;diskManager&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;WritePage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;currentPage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

        &lt;span class=&quot;nb&quot;&gt;delete&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pageTable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;currentPage&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Finally, we read the page from disk and allocate it to the chosen &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;frame id&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;page&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;diskManager&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ReadPage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pageID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;page&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pinCount&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pageTable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pageID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;frameID&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pages&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;frameID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;page&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;requesting-a-new-page&quot;&gt;Requesting a new page&lt;/h3&gt;

&lt;p&gt;Requesting a new page follows the same logic as fetching a page. We need to make sure there is space for a new page in the buffer pool by checking the free list or by replacing a page. After that, the buffer pool asks the disk manager for a new page and puts it in the cache.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;// allocates new page&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;pageID&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;diskManager&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AllocatePage&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pageID&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;page&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Page&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pageID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pageSize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;byte&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{}}&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pageTable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pageID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;frameID&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pages&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;frameID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;page&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;indicating-that-a-page-is-no-longer-accessed&quot;&gt;Indicating that a page is no longer accessed&lt;/h3&gt;

&lt;p&gt;After a thread is done working on a page, it needs to call the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UnpinPage&lt;/code&gt; method to tell the buffer pool that it is no longer using that page. This call is important because it lets the buffer pool decide if that page can be considered for eviction.&lt;/p&gt;

&lt;p&gt;The method decreases the pin count of the page. If the pin count reaches &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0&lt;/code&gt;, it means that there is no thread accessing that page. So, that page is added to the clock replacer to be considered for eviction.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;page&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pages&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;frameID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;page&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DecPinCount&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;page&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pinCount&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;replacer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Unpin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;frameID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
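
&lt;p&gt;To see how fetching and unpinning fit together, here is a minimal, self-contained sketch. It is a simplification and not the repository code: a FIFO queue of evictable frames stands in for the clock replacer, no real disk is involved, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NewBufferPool&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;evictable&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;diskWrites&lt;/code&gt;, and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pin&lt;/code&gt; helper are hypothetical names. The point is only the pin count life cycle: a page fetched twice has to be unpinned twice before its frame can be reused.&lt;/p&gt;

```go
package main

import "fmt"

type PageID int
type FrameID int

type Page struct {
	id       PageID
	pinCount int
	isDirty  bool
}

// BufferPool is a hypothetical, stripped-down pool. A FIFO queue of evictable
// frames stands in for the clock replacer, and there is no real disk.
type BufferPool struct {
	pages      []*Page // indexed by FrameID
	freeList   []FrameID
	pageTable  map[PageID]FrameID
	evictable  []FrameID // replacer stand-in
	diskWrites int       // counts dirty pages written back
}

func NewBufferPool(size int) *BufferPool {
	b := new(BufferPool)
	b.pages = make([]*Page, size)
	b.pageTable = make(map[PageID]FrameID)
	for i := range b.pages {
		b.freeList = append(b.freeList, FrameID(i))
	}
	return b
}

// pin removes a frame from the evictable queue, like Pin on the replacer.
func (b *BufferPool) pin(id FrameID) {
	for i, f := range b.evictable {
		if f == id {
			b.evictable = append(b.evictable[:i], b.evictable[i+1:]...)
			return
		}
	}
}

func (b *BufferPool) FetchPage(id PageID) (*Page, error) {
	if frameID, ok := b.pageTable[id]; ok { // cache hit
		page := b.pages[frameID]
		page.pinCount++
		b.pin(frameID)
		return page, nil
	}
	var frameID FrameID
	switch {
	case len(b.freeList) > 0: // a free frame is available
		frameID, b.freeList = b.freeList[0], b.freeList[1:]
	case len(b.evictable) > 0: // full pool: evict the oldest unpinned frame
		frameID, b.evictable = b.evictable[0], b.evictable[1:]
		old := b.pages[frameID]
		if old.isDirty {
			b.diskWrites++ // a real pool would write the page to disk here
		}
		delete(b.pageTable, old.id)
	default:
		return nil, fmt.Errorf("all pages are pinned")
	}
	page := new(Page) // stands in for reading the page from disk
	page.id, page.pinCount = id, 1
	b.pageTable[id] = frameID
	b.pages[frameID] = page
	return page, nil
}

func (b *BufferPool) UnpinPage(id PageID) {
	frameID, ok := b.pageTable[id]
	if !ok || b.pages[frameID].pinCount == 0 {
		return // unknown page, or nothing left to unpin
	}
	page := b.pages[frameID]
	page.pinCount--
	if page.pinCount == 0 {
		b.evictable = append(b.evictable, frameID)
	}
}

func main() {
	b := NewBufferPool(1)
	p, _ := b.FetchPage(7)
	b.FetchPage(7) // second pin on the same page
	fmt.Println(p.pinCount)
	_, err := b.FetchPage(8) // no free frame and nothing evictable
	fmt.Println(err)
	b.UnpinPage(7)
	b.UnpinPage(7) // pin count reaches 0, the frame becomes evictable
	q, _ := b.FetchPage(8)
	fmt.Println(q.id, q.pinCount)
}
```

&lt;p&gt;With a pool of a single frame, the sketch prints a pin count of 2, then the error for the fully pinned pool, and finally page 8 with a pin count of 1 once page 7 has been unpinned twice.&lt;/p&gt;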

&lt;h2 id=&quot;a-visualization-tool&quot;&gt;A visualization tool&lt;/h2&gt;

&lt;p&gt;I was not satisfied with only implementing the buffer pool. I wanted to be able to visualize it working. So I went on a mission to build a small JavaScript application that could interact with the buffer pool. I built a web server that instantiates a buffer pool, receives HTTP requests, and sends commands to the buffer pool.&lt;/p&gt;

&lt;p&gt;Here is a small demo:
&lt;img src=&quot;/assets/img/buffer_pool_demo.gif&quot; alt=&quot;&quot; /&gt;Buffer pool demonstration
You can see four initial calls of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;New Page&lt;/code&gt;, loading all frames of our buffer pool. We flush all the pages to disk. Flushing the pages to disk reduces the pin count of all pages by one (that’s why they go from red to black). After that, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Unpin&lt;/code&gt; is called on pages &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;2&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;3&lt;/code&gt;, adding their frames to the clock replacer. Delete is called for page 2, removing it from the clock replacer, the buffer pool, and the disk, causing &lt;strong&gt;disk fragmentation&lt;/strong&gt;. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;New page&lt;/code&gt; is called, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;page 4&lt;/code&gt; enters the buffer pool in the free &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;frame 2&lt;/code&gt;. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;New page&lt;/code&gt; is called again, but the buffer is full. The clock replacement algorithm runs and frees &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;frame 3&lt;/code&gt; for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;page 5&lt;/code&gt;, evicting &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;page 3&lt;/code&gt;. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Flush all&lt;/code&gt; is called, writing pages &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;4&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;5&lt;/code&gt; to disk.
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Fetch Page&lt;/code&gt; is called on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;page 0&lt;/code&gt; twice, increasing the pin count by two. That means &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Unpin&lt;/code&gt; needs to be called twice to make it eligible for eviction. Last, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;New Page&lt;/code&gt; is called, but the buffer is full again. After the clock replacement algorithm runs, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;page 6&lt;/code&gt; goes to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;frame 0&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;You can play with it yourself by following the instructions in the &lt;a href=&quot;https://github.com/brunocalza/buffer-pool-manager&quot;&gt;repo&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;final-thoughts&quot;&gt;Final thoughts&lt;/h2&gt;

&lt;p&gt;I have been studying databases for a while, mainly through the course &lt;a href=&quot;https://15445.courses.cs.cmu.edu/fall2019/&quot;&gt;Intro to Database Systems&lt;/a&gt; by &lt;a href=&quot;https://twitter.com/andy_pavlo&quot;&gt;Andy Pavlo&lt;/a&gt;. In the course, he talks about how &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap&lt;/code&gt; is most of the time a bad idea for solving the &lt;strong&gt;“data on disk larger than the available memory”&lt;/strong&gt; problem and introduces the concept of a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;buffer pool&lt;/code&gt;. One of the assignments is to implement a buffer pool manager in &lt;strong&gt;C++&lt;/strong&gt;. That’s where the idea of implementing one in &lt;strong&gt;Go&lt;/strong&gt; came from.&lt;/p&gt;

&lt;p&gt;Even though &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap&lt;/code&gt; and a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;buffer pool&lt;/code&gt; are two completely different ways to solve the same problem, I have come to find them very similar. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;buffer pool&lt;/code&gt; acts as if it has all pages in memory, and if a page is not in the cache, it removes one from the cache and fetches the requested page from disk, putting it in the cache. That is pretty much the same logic as the virtual memory behind the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap&lt;/code&gt; call. The difference in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;buffer pool&lt;/code&gt; approach is that everything is under your control. The way pages are flushed, locked, and evicted from the cache is decided by the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;buffer pool&lt;/code&gt;. You completely lose this kind of control with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap&lt;/code&gt;, which is why it is not recommended unless you really know what you are doing.&lt;/p&gt;

&lt;p&gt;A buffer pool is supposed to be a shared resource, meant to be used by multiple threads running queries at the same time. In this implementation, it was not our focus to build a thread-safe buffer pool that protects its data structures from race conditions. A thread-safe implementation is planned.&lt;/p&gt;

&lt;p&gt;I am on a journey to learn more about databases by implementing parts of one. If you like this kind of content, follow me on &lt;a href=&quot;https://twitter.com/brunocalza&quot;&gt;Twitter&lt;/a&gt;. I will be sharing more related stuff.&lt;/p&gt;

&lt;p&gt;Full code at &lt;a href=&quot;https://github.com/brunocalza/buffer-pool-manager&quot;&gt;brunocalza/buffer-pool-manager&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;learn-more&quot;&gt;Learn more&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=qOO0HGVToYA&amp;amp;list=PLzzVuDSjP25Q0YDDDpAgfK_da5Ba357Tg&amp;amp;index=1&quot;&gt;CS186Berkeley Playlist on Buffer Management&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://15445.courses.cs.cmu.edu/fall2019/&quot;&gt;Intro to Database Systems&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://dsf.berkeley.edu/papers/fntdb07-architecture.pdf&quot;&gt;Architecture of a Database System&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://www.postgresql.org/docs/13/storage-page-layout.html&quot;&gt;68.6. Database Page Layout&lt;/a&gt; - PostgreSQL documentation on how database pages are laid out in storage. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://dev.mysql.com/doc/refman/5.7/en/innodb-init-startup-configuration.html&quot;&gt;14.8.1 InnoDB Startup Configuration&lt;/a&gt; - MySQL documentation on InnoDB’s page size configuration. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

      </content>
    </entry>
  
    <entry>
      <title>But how, exactly, databases use mmap?</title>
      <link href="https://bcalza.b-cdn.net/2021/01/18/but-how-exactly-databases-use-mmap.html" />
      <id>https://bcalza.b-cdn.net/2021/01/18/but-how-exactly-databases-use-mmap</id>
      <updated>2021-01-18T00:00:00+00:00</updated>
      <content type="html">
        &lt;p&gt;In a previous post, &lt;a href=&quot;/discovering-and-exploring-mmap-using-go/&quot;&gt;Discovering and exploring mmap using Go&lt;/a&gt;, we talked about how databases have a major problem to solve: &lt;strong&gt;how to deal with data stored on disk that is bigger than the available memory&lt;/strong&gt;. We talked about how many databases solve this problem using &lt;strong&gt;memory-mapped files&lt;/strong&gt; and explored the capabilities of &lt;strong&gt;mmap&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Knowing that databases use &lt;strong&gt;memory-mapped files&lt;/strong&gt; to solve the problem was not enough for me. It solved part of the mystery, but a question remained: &lt;strong&gt;how, exactly, do databases use &lt;em&gt;mmap&lt;/em&gt; to read and write data from disk?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I decided to dig through a database’s source code to answer that question. There are plenty of databases that use mmap, and some that have stopped using it. Some examples: &lt;a href=&quot;https://www.sqlite.org/index.html&quot;&gt;SQLite&lt;/a&gt; has an option of accessing disk content directly using memory-mapped I/O&lt;sup id=&quot;fnref:1&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;; it seems &lt;a href=&quot;https://github.com/google/leveldb&quot;&gt;LevelDB&lt;/a&gt; used to use it but moved away from it&lt;sup id=&quot;fnref:2&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;; &lt;a href=&quot;https://lucene.apache.org/&quot;&gt;Lucene&lt;/a&gt; has an option with &lt;em&gt;MMapDirectory&lt;/em&gt;&lt;sup id=&quot;fnref:3&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;; &lt;a href=&quot;https://lmdb.readthedocs.io/&quot;&gt;LMDB&lt;/a&gt; uses mmap&lt;sup id=&quot;fnref:4&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;; &lt;a href=&quot;https://github.com/couchbase/moss&quot;&gt;moss&lt;/a&gt;, a simple key/value in-memory database from Couchbase, uses mmap for durability of in-memory data&lt;sup id=&quot;fnref:5&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;; and &lt;a href=&quot;https://www.mongodb.com/&quot;&gt;MongoDB&lt;/a&gt; removed its &lt;strong&gt;mmap&lt;/strong&gt; storage engine in favor of &lt;em&gt;WiredTiger&lt;/em&gt;&lt;sup id=&quot;fnref:6&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;6&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;p&gt;I chose &lt;a href=&quot;https://github.com/boltdb/bolt&quot;&gt;bolt&lt;/a&gt;, a simple &lt;strong&gt;key/value store&lt;/strong&gt; implemented in &lt;strong&gt;Go&lt;/strong&gt; by &lt;a href=&quot;https://twitter.com/benbjohnson&quot;&gt;Ben Johnson&lt;/a&gt; and inspired by the &lt;strong&gt;LMDB&lt;/strong&gt; project, for this endeavor, mostly because of the simplicity of its source code and my familiarity with Go. I know a simple key/value store might not be the most complete source code for learning all the details of reading and writing data on disk, but, as I found out, it was more than enough to get a grasp of it.&lt;/p&gt;

&lt;p&gt;The original bolt repository is no longer maintained. A fork of &lt;strong&gt;bolt&lt;/strong&gt; called &lt;strong&gt;bbolt&lt;/strong&gt; is maintained and used by &lt;a href=&quot;https://github.com/etcd-io/bbolt&quot;&gt;etcd&lt;/a&gt;&lt;sup id=&quot;fnref:7&quot;&gt;&lt;a href=&quot;#fn:7&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;7&lt;/a&gt;&lt;/sup&gt;. If you are not familiar with &lt;strong&gt;bolt&lt;/strong&gt;, I recommend the articles &lt;a href=&quot;https://npf.io/2014/07/intro-to-boltdb-painless-performant-persistence/&quot;&gt;Intro to BoltDB: Painless Performant Persistence&lt;/a&gt;&lt;sup id=&quot;fnref:8&quot;&gt;&lt;a href=&quot;#fn:8&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;8&lt;/a&gt;&lt;/sup&gt; and &lt;a href=&quot;https://www.progville.com/go/bolt-embedded-db-golang/&quot;&gt;Bolt — an embedded key/value database for Go&lt;/a&gt;&lt;sup id=&quot;fnref:9&quot;&gt;&lt;a href=&quot;#fn:9&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;9&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;h2 id=&quot;the-start&quot;&gt;The start&lt;/h2&gt;

&lt;p&gt;I downloaded the code to my machine and opened it in my editor. I thought a good place to start digging was to find out where the database is initialized and look for any references to &lt;strong&gt;mmap&lt;/strong&gt; there. Like most embedded databases, &lt;strong&gt;bolt&lt;/strong&gt; has an &lt;a href=&quot;https://github.com/boltdb/bolt/blob/fd01fc79c553a8e99d512a07e8e0c63d4a3ccfc5/db.go#L150&quot;&gt;Open&lt;/a&gt; method for opening the database or creating a new one if it does not exist. Inside it, I found a reference to a private &lt;strong&gt;mmap&lt;/strong&gt; function. That’s a good start.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;// Memory map the data file.&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mmap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;InitialMmapSize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;close&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;how-much-memory-should-i-allocate&quot;&gt;How much memory should I allocate?&lt;/h2&gt;

&lt;p&gt;The private &lt;a href=&quot;/p/b43be99e-c260-446b-a1b1-40261aaffcf3/func%20(db%20*DB)%20mmap(minsz%20int)%20error%20%7B&quot;&gt;mmap&lt;/a&gt; is responsible for opening the memory-mapped file. In order to do this, it needs to figure out how much memory it is going to allocate. This task is accomplished by another method called &lt;a href=&quot;https://github.com/boltdb/bolt/blob/fd01fc79c553a8e99d512a07e8e0c63d4a3ccfc5/db.go#L308&quot;&gt;mmapSize&lt;/a&gt;. Given the size of the database, this method figures out how many bytes of memory should be allocated.&lt;/p&gt;

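&lt;p&gt;For small databases it picks the smallest power of two, from &lt;strong&gt;32KB&lt;/strong&gt; up to &lt;strong&gt;1GB&lt;/strong&gt;, that fits the database. If the database is larger than &lt;strong&gt;1GB&lt;/strong&gt;, the size is rounded up to the next multiple of &lt;strong&gt;1GB&lt;/strong&gt;, so it grows &lt;strong&gt;1GB&lt;/strong&gt; at a time.&lt;/p&gt;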

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;// Double the size from 32KB until 1GB.&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;uint&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;15&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;30&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;size&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;// If larger than 1GB then grow by 1GB at a time.&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;remainder&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;maxMmapStep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;remainder&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;maxMmapStep&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;remainder&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
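
&lt;p&gt;To make the policy concrete, here is a minimal standalone sketch of the same sizing rule. The names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;growthSize&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;minMmapSize&lt;/code&gt; are mine, not bolt’s, and the sketch leaves out everything else the real method does.&lt;/p&gt;

```go
package main

import "fmt"

const (
	minMmapSize = 32 * 1024          // 32KB, the smallest mapping considered (hypothetical name)
	maxMmapStep = 1024 * 1024 * 1024 // 1GB, the growth step for large files
)

// growthSize is a simplified, hypothetical stand-in for the sizing policy:
// pick the smallest power of two between 32KB and 1GB that fits size,
// otherwise round size up to the next multiple of 1GB.
func growthSize(size int64) int64 {
	for step := int64(minMmapSize); maxMmapStep >= step; step *= 2 {
		if step >= size {
			return step
		}
	}
	if remainder := size % maxMmapStep; remainder > 0 {
		size += maxMmapStep - remainder
	}
	return size
}

func main() {
	sizes := []int64{1, 40000, 1024 * 1024 * 1024, 1024*1024*1024 + 1}
	for _, size := range sizes {
		fmt.Printf("a file of %d bytes maps %d bytes\n", size, growthSize(size))
	}
}
```

&lt;p&gt;A file one byte over 1GB already asks for a 2GB mapping, which shows how coarse the steps become for large databases.&lt;/p&gt;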

&lt;p&gt;So far, so good. That’s how &lt;strong&gt;bolt&lt;/strong&gt; figures out how much to allocate. But there is one more piece of the puzzle that will be very important when we talk about database storage layout. After figuring out how much to allocate, it needs to ensure that the allocated size is a multiple of the &lt;strong&gt;page size&lt;/strong&gt;. If you are not familiar with database storage and don’t know what a &lt;strong&gt;page&lt;/strong&gt; is, don’t worry, we’ll come back to this.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;// Ensure that the mmap size is a multiple of the page size.&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;// This should always be true since we&apos;re incrementing in MBs.&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;pageSize&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pageSize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pageSize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pageSize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pageSize&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
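
&lt;p&gt;As a quick worked example of that rounding, the sketch below applies the same arithmetic to a few sizes. The helper name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;alignToPage&lt;/code&gt; and the 4096-byte page size are assumptions for illustration only.&lt;/p&gt;

```go
package main

import "fmt"

// alignToPage rounds sz up to the next multiple of pageSize,
// using the same arithmetic as the snippet above.
// The function name is hypothetical, not part of bolt.
func alignToPage(sz, pageSize int64) int64 {
	if (sz % pageSize) != 0 {
		sz = ((sz / pageSize) + 1) * pageSize
	}
	return sz
}

func main() {
	const pageSize = 4096 // assumed page size for the example
	for _, sz := range []int64{4096, 10000, 32768} {
		fmt.Println(sz, "rounds to", alignToPage(sz, pageSize))
	}
}
```

&lt;p&gt;Sizes that are already a multiple of the page size pass through unchanged, which is why the comment in bolt’s code expects the check to be a no-op in practice.&lt;/p&gt;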

&lt;p&gt;Shouldn’t it check if we are allocating more than we have available? That’s the last piece of the &lt;a href=&quot;https://github.com/boltdb/bolt/blob/fd01fc79c553a8e99d512a07e8e0c63d4a3ccfc5/db.go#L308&quot;&gt;mmapSize&lt;/a&gt; method.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;// If we&apos;ve exceeded the max size then only grow up to the max size.&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maxMapSize&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maxMapSize&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The &lt;a href=&quot;https://github.com/boltdb/bolt/blob/fd01fc79c553a8e99d512a07e8e0c63d4a3ccfc5/bolt_amd64.go#L4&quot;&gt;maxMapSize&lt;/a&gt; constant is set to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0xFFFFFFFFFFFF&lt;/code&gt; on &lt;strong&gt;AMD64&lt;/strong&gt; architectures.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;The &lt;strong&gt;AMD64&lt;/strong&gt; architecture defines a 64-bit virtual address format, of which the low-order 48 bits are used in current implementations. This allows up to 256 TiB (2&lt;sup&gt;48&lt;/sup&gt; bytes) of virtual address space. - &lt;a href=&quot;https://en.wikipedia.org/wiki/X86-64&quot;&gt;x86-64 Wiki&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is the upper limit on the size of a &lt;strong&gt;bolt&lt;/strong&gt; database file on that architecture.&lt;/p&gt;
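
&lt;p&gt;The arithmetic behind that number is easy to check. The small sketch below divides the constant, plus one, by the size of a tebibyte. The names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tib&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;limitInTiB&lt;/code&gt; are hypothetical helpers of mine.&lt;/p&gt;

```go
package main

import "fmt"

// maxMapSize is the value quoted above for AMD64.
const maxMapSize int64 = 0xFFFFFFFFFFFF

// tib is one tebibyte: 1024 * 1024 * 1024 * 1024 bytes.
const tib int64 = 1024 * 1024 * 1024 * 1024

// limitInTiB returns how many whole TiB maxMapSize+1 bytes cover.
func limitInTiB() int64 {
	return (maxMapSize + 1) / tib
}

func main() {
	fmt.Println(limitInTiB(), "TiB")
}
```

&lt;p&gt;The result matches the 256 TiB of virtual address space mentioned in the quote.&lt;/p&gt;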

&lt;h2 id=&quot;calling-the-system-call&quot;&gt;Calling the system call&lt;/h2&gt;

&lt;p&gt;Now that &lt;strong&gt;bolt&lt;/strong&gt; knows how much it should allocate, it calls the system call &lt;a href=&quot;/p/b43be99e-c260-446b-a1b1-40261aaffcf3/func%20mmap(db%20*DB,%20sz%20int)%20error%20%7B&quot;&gt;mmap&lt;/a&gt;. Here is the full code for &lt;em&gt;Unix&lt;/em&gt;-like environments:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;// mmap memory maps a DB&apos;s data file.&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mmap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;db&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DB&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;error&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;// Map the data file to memory.&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;syscall&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Mmap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Fd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()),&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;syscall&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;PROT_READ&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;syscall&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MAP_SHARED&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MmapFlags&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;// Advise the kernel that the mmap is accessed randomly.&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;madvise&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;syscall&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MADV_RANDOM&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fmt&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Errorf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;madvise: %s&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;// Save the original byte slice and convert to a byte array pointer.&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dataref&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;maxMapSize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;byte&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unsafe&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pointer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]))&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;datasz&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The code is pretty straightforward. I have some observations to make about &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;syscall.PROT_READ&lt;/code&gt;, but I’ll leave them for the next section. It is nice to see the call to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;madvise&lt;/code&gt; there, although I don’t know its benefits&lt;sup id=&quot;fnref:10&quot;&gt;&lt;a href=&quot;#fn:10&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;10&lt;/a&gt;&lt;/sup&gt;. I would love to know how important that call is, whether it improves performance, and whether another flag could be set to get different behavior from the OS for different use cases.&lt;/p&gt;

&lt;p&gt;The mapped memory is set to the variables &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;db.dataref&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;db.data&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dataref&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;maxMapSize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;byte&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unsafe&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pointer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I would like to know why it is important to keep track of both variables. I could not grasp what is going on in the conversion to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;db.data&lt;/code&gt;&lt;sup id=&quot;fnref:11&quot;&gt;&lt;a href=&quot;#fn:11&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;11&lt;/a&gt;&lt;/sup&gt;. But anyway, what we have to keep in mind is that it is through these variables that &lt;strong&gt;bolt&lt;/strong&gt; will read data from disk.&lt;/p&gt;

&lt;h2 id=&quot;what-about-writes&quot;&gt;What about writes?&lt;/h2&gt;

&lt;p&gt;While skimming through the source code, I looked for evidence of how &lt;strong&gt;mmap&lt;/strong&gt; was used for both reads and writes. I dug into both the &lt;a href=&quot;https://github.com/boltdb/bolt/blob/fd01fc79c553a8e99d512a07e8e0c63d4a3ccfc5/bucket.go#L266&quot;&gt;Get&lt;/a&gt; and &lt;a href=&quot;https://github.com/boltdb/bolt/blob/fd01fc79c553a8e99d512a07e8e0c63d4a3ccfc5/bucket.go#L285&quot;&gt;Put&lt;/a&gt; methods. I could not find any place where &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;db.dataref&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;db.data&lt;/code&gt; were being updated. I discovered that the writes to disk happen when &lt;a href=&quot;https://github.com/boltdb/bolt/blob/fd01fc79c553a8e99d512a07e8e0c63d4a3ccfc5/tx.go#L144&quot;&gt;Commit&lt;/a&gt; is called on a transaction. But there I could only find calls to &lt;a href=&quot;https://golang.org/pkg/os/#File.WriteAt&quot;&gt;WriteAt&lt;/a&gt;&lt;sup id=&quot;fnref:12&quot;&gt;&lt;a href=&quot;#fn:12&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;12&lt;/a&gt;&lt;/sup&gt;. So I gave up trying to understand how &lt;strong&gt;mmap&lt;/strong&gt; was used for writes.&lt;/p&gt;

&lt;p&gt;Then, suddenly, while looking back at the call to &lt;strong&gt;mmap&lt;/strong&gt;, I noticed the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;syscall.PROT_READ&lt;/code&gt; flag that I had not noticed the first time I looked at the code. So &lt;strong&gt;mmap&lt;/strong&gt; is only used for reads in &lt;strong&gt;bolt&lt;/strong&gt;. Another hint is in the definition of the &lt;a href=&quot;https://github.com/boltdb/bolt/blob/fd01fc79c553a8e99d512a07e8e0c63d4a3ccfc5/db.go#L100&quot;&gt;DB struct&lt;/a&gt;:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;dataref  []byte   // mmap&apos;ed readonly, write throws SEGV
data     *[maxMapSize]byte
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;That made perfect sense to me. Since flushes to disk are very hard to control when using &lt;strong&gt;mmap&lt;/strong&gt;&lt;sup id=&quot;fnref:13&quot;&gt;&lt;a href=&quot;#fn:13&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;13&lt;/a&gt;&lt;/sup&gt;, it is probably the safest approach. How &lt;strong&gt;bolt&lt;/strong&gt; handles writes is a topic for another post.&lt;/p&gt;
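
&lt;p&gt;The split between writing through the file and reading through the mapping can be tried out in a few lines. This is only a sketch for Unix-like systems, not bolt’s code: the function &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;readThroughMap&lt;/code&gt; and the temporary file are made up for the example. It also assumes a system such as Linux, where file writes and shared mappings go through the same page cache; other platforms may behave differently.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// readThroughMap writes to a temporary file with WriteAt and reads the
// same bytes back through a read-only shared mapping of that file.
// Hypothetical helper for illustration; Unix-like systems only.
func readThroughMap() (string, error) {
	f, err := os.CreateTemp("", "mmap-demo")
	if err != nil {
		return "", err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	// Give the file one page of content so there is something to map.
	size := os.Getpagesize()
	if err := f.Truncate(int64(size)); err != nil {
		return "", err
	}

	// Map the file read-only. Storing through this slice would fault.
	data, err := syscall.Mmap(int(f.Fd()), 0, size, syscall.PROT_READ, syscall.MAP_SHARED)
	if err != nil {
		return "", err
	}
	defer syscall.Munmap(data)

	// Write through the file descriptor, not through the mapping.
	if _, err := f.WriteAt([]byte("hello"), 0); err != nil {
		return "", err
	}

	// Converting to string copies the bytes out before the unmap runs.
	return string(data[:5]), nil
}

func main() {
	got, err := readThroughMap()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("read through the mapping:", got)
}
```

&lt;p&gt;The mapping is created before the write, yet the read still sees the new bytes, which is the property a read-only mapping combined with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WriteAt&lt;/code&gt; relies on.&lt;/p&gt;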

&lt;h2 id=&quot;how-the-database-file-is-structured&quot;&gt;How is the database file structured?&lt;/h2&gt;

&lt;p&gt;We know how and when &lt;strong&gt;bolt&lt;/strong&gt; allocates memory and that &lt;strong&gt;mmap&lt;/strong&gt; is not used for writes. But how, exactly, can &lt;strong&gt;bolt&lt;/strong&gt; find the value of a key? To understand that, we have to understand how databases typically structure their files. I am not going to go deep here, mostly because I don’t understand enough to go deep. I am just going to try to give a glimpse of what is going on.&lt;/p&gt;

&lt;p&gt;A file is just an array of bytes. We have to impose some structure on this array of bytes to work with it effectively. Databases structure their files on disk into blocks (chunks of bytes) called &lt;strong&gt;pages&lt;/strong&gt;. &lt;strong&gt;bolt&lt;/strong&gt; is no different. The database file can be seen as:
&lt;img src=&quot;/assets/img/image--17-.png&quot; alt=&quot;&quot; /&gt;
Each page has a &lt;strong&gt;fixed length in bytes&lt;/strong&gt;, typically the same size as the OS page (usually &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;4096 bytes&lt;/code&gt;)&lt;sup id=&quot;fnref:14&quot;&gt;&lt;a href=&quot;#fn:14&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;14&lt;/a&gt;&lt;/sup&gt;. Here is the &lt;a href=&quot;https://github.com/boltdb/bolt/blob/fd01fc79c553a8e99d512a07e8e0c63d4a3ccfc5/db.go#L40&quot;&gt;part&lt;/a&gt; of &lt;strong&gt;bolt&lt;/strong&gt; that sets the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pageSize&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;// default page size for db is set to the OS page size.&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;defaultPageSize&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Getpagesize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Every database has its own page layout. The page layout of &lt;strong&gt;bolt&lt;/strong&gt; is defined at &lt;a href=&quot;https://github.com/boltdb/bolt/blob/fd01fc79c553a8e99d512a07e8e0c63d4a3ccfc5/page.go#L28&quot;&gt;page.go&lt;/a&gt; as&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;branchPageFlag&lt;/span&gt;   &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0x01&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;leafPageFlag&lt;/span&gt;     &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0x02&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;metaPageFlag&lt;/span&gt;     &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0x04&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;freelistPageFlag&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0x10&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bucketLeafFlag&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0x01&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pgid&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;uint64&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;page&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;id&lt;/span&gt;       &lt;span class=&quot;n&quot;&gt;pgid&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;flags&lt;/span&gt;    &lt;span class=&quot;kt&quot;&gt;uint16&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;count&lt;/span&gt;    &lt;span class=&quot;kt&quot;&gt;uint16&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;overflow&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;uint32&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ptr&lt;/span&gt;      &lt;span class=&quot;kt&quot;&gt;uintptr&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;id&lt;/code&gt;: the page identifier, used to index the page. Given a page id, we can locate the page on disk through &lt;strong&gt;mmap&lt;/strong&gt;, since the file is just a sequence of contiguous fixed-length pages;&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;flags&lt;/code&gt;: the page type. There are four types of page: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;meta&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;freeList&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;leaf&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;branch&lt;/code&gt;;&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;count&lt;/code&gt;: the number of elements stored in the page;&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;overflow&lt;/code&gt;: the number of subsequent contiguous pages this page overflows into when its data does not fit in a single page;&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ptr&lt;/code&gt;: marks the end of the page header and the start of the page data. This is where the keys and values are stored.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A visual representation of a page:
&lt;img src=&quot;/assets/img/image--16-.png&quot; alt=&quot;&quot; /&gt;Layout of a page
With this in mind, we can look at the code that retrieves a &lt;a href=&quot;https://github.com/boltdb/bolt/blob/fd01fc79c553a8e99d512a07e8e0c63d4a3ccfc5/db.go#L792&quot;&gt;page&lt;/a&gt; given its id.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;// page retrieves a page reference from the mmap based on the current page size.&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;db&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DB&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;page&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;id&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pgid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;page&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;pos&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;id&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pgid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pageSize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;page&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unsafe&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pointer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]))&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
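&lt;p&gt;To make the offset arithmetic concrete, here is a minimal sketch that does the same lookup on a plain byte slice instead of a real mmap. It is not bolt’s code: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;header&lt;/code&gt; type, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;readHeader&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fakeFile&lt;/code&gt; helpers, the fixed 4096-byte page size, and the little-endian encoding are all assumptions made for illustration (bolt casts the bytes directly with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unsafe.Pointer&lt;/code&gt;, so it uses the machine’s native byte order).&lt;/p&gt;

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// pageSize is an assumed fixed page size for this sketch (bolt asks the OS).
const pageSize = 4096

// header is a hypothetical decoded view of the first 16 bytes of a page:
// id (8 bytes), flags (2), count (2) and overflow (4).
type header struct {
	id       uint64
	flags    uint16
	count    uint16
	overflow uint32
}

// readHeader locates page id inside data and decodes its header fields.
func readHeader(data []byte, id uint64) header {
	pos := id * pageSize // same offset arithmetic as db.page
	b := data[pos : pos+16]
	return header{
		id:       binary.LittleEndian.Uint64(b[0:8]),
		flags:    binary.LittleEndian.Uint16(b[8:10]),
		count:    binary.LittleEndian.Uint16(b[10:12]),
		overflow: binary.LittleEndian.Uint32(b[12:16]),
	}
}

// fakeFile builds an in-memory "file" of n pages, each stamped with its own id.
func fakeFile(n uint64) []byte {
	data := make([]byte, n*pageSize)
	for id := uint64(0); id != n; id++ {
		b := data[id*pageSize:]
		binary.LittleEndian.PutUint64(b[0:8], id)
		binary.LittleEndian.PutUint16(b[8:10], 0x02) // leafPageFlag
		// Made-up element count, so each page looks different.
		binary.LittleEndian.PutUint16(b[10:12], uint16(id))
	}
	return data
}

func main() {
	data := fakeFile(4)
	h := readHeader(data, 2)
	fmt.Printf("id=%d flags=%#x count=%d overflow=%d\n", h.id, h.flags, h.count, h.overflow)
}
```

&lt;p&gt;Because every page has the same size, finding page 2 is a single multiplication. No index or scan is needed.&lt;/p&gt;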

&lt;h2 id=&quot;how-to-perform-an-efficient-search&quot;&gt;How to perform an efficient search?&lt;/h2&gt;

&lt;p&gt;We know how the database file is structured on disk, and we know how to retrieve a page from it. But how does a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bucket.Get([]byte(&quot;key&quot;))&lt;/code&gt; search work? We are not going to go into too much detail here. I hope the abstraction I created is enough to give an idea of what is going on.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What if the pages themselves, in the data part, contained references to other pages? And what if these references build up to form a B+Tree?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That is exactly what &lt;strong&gt;bolt&lt;/strong&gt; does. Think of each page as a node of a &lt;strong&gt;B+Tree&lt;/strong&gt;&lt;sup id=&quot;fnref:15&quot;&gt;&lt;a href=&quot;#fn:15&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;15&lt;/a&gt;&lt;/sup&gt;. In a &lt;strong&gt;B+Tree&lt;/strong&gt; we have internal nodes and leaves. That’s the reason for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;flags&lt;/code&gt; attribute: it indicates what kind of node the page is.
&lt;img src=&quot;/assets/img/image--18-.png&quot; alt=&quot;&quot; /&gt;Abstract representation of a B+Tree in &lt;strong&gt;bolt&lt;/strong&gt;
So to search for a key, you start at the root node and do a B+Tree traversal over the mmapped disk pages. In other words, &lt;strong&gt;bolt&lt;/strong&gt; is a memory-mapped B+Tree file. The more memory you have, the more it behaves like an in-memory key/value store.&lt;/p&gt;
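&lt;p&gt;The traversal can be sketched in a few lines of Go. This is a toy model, not bolt’s implementation: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;node&lt;/code&gt; type, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;get&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sampleTree&lt;/code&gt; functions, and the map standing in for the mmapped pages are all hypothetical. It assumes each branch key is the smallest key stored under the matching child.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"sort"
)

// node is a hypothetical in-memory stand-in for a page.
// Branch nodes use keys and children; leaf nodes use keys and values.
type node struct {
	leaf     bool
	keys     []string // sorted
	children []int    // branch only: keys[i] is the smallest key under children[i]
	values   []string // leaf only: parallel to keys
}

// get walks from the root page down to a leaf and looks the key up there.
func get(pages map[int]node, root int, key string) (string, bool) {
	n := pages[root]
	for !n.leaf {
		// Index of the first branch key that is >= key.
		i := sort.SearchStrings(n.keys, key)
		// Without an exact match, the key belongs to the previous child.
		if i == len(n.keys) || n.keys[i] != key {
			if i > 0 {
				i--
			}
		}
		n = pages[n.children[i]]
	}
	i := sort.SearchStrings(n.keys, key)
	if i == len(n.keys) || n.keys[i] != key {
		return "", false
	}
	return n.values[i], true
}

// sampleTree builds a two-level tree: page 0 is the root branch,
// pages 1 and 2 are leaves.
func sampleTree() map[int]node {
	return map[int]node{
		0: {keys: []string{"a", "m"}, children: []int{1, 2}},
		1: {leaf: true, keys: []string{"a", "c"}, values: []string{"1", "3"}},
		2: {leaf: true, keys: []string{"m", "z"}, values: []string{"13", "26"}},
	}
}

func main() {
	pages := sampleTree()
	fmt.Println(get(pages, 0, "c")) // found in leaf page 1
	fmt.Println(get(pages, 0, "q")) // routed to leaf page 2, not found
}
```

&lt;p&gt;Each step of the loop touches exactly one page, so the number of pages read per lookup is the height of the tree.&lt;/p&gt;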

&lt;p&gt;There are many more details to this process, most of which I don’t understand myself. So let’s keep it simple and stay at this abstract level.&lt;/p&gt;

&lt;h2 id=&quot;resizing-mmap&quot;&gt;Resizing mmap&lt;/h2&gt;

&lt;p&gt;When I started looking at the source code, I searched for all calls to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap&lt;/code&gt;. The first is in the Open method, as explained at the beginning. The second is in the &lt;a href=&quot;https://github.com/boltdb/bolt/blob/fd01fc79c553a8e99d512a07e8e0c63d4a3ccfc5/db.go#L827&quot;&gt;allocate&lt;/a&gt; method.&lt;/p&gt;

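&lt;p&gt;When &lt;strong&gt;bolt&lt;/strong&gt; allocates new pages during a write, it needs to make sure they still fit inside the mapped region. If the minimum size required would reach or exceed the current mapping size (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;db.datasz&lt;/code&gt;), it remaps the file with a larger size.&lt;/p&gt;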

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;// Resize mmap() if we&apos;re at the end.&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;id&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rwtx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;meta&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pgid&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;minsz&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pgid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pageSize&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;minsz&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;datasz&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;db&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mmap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;minsz&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fmt&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Errorf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;mmap allocate error: %s&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
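&lt;p&gt;Here is the same check as a small worked example with concrete numbers. The function names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;minMapSize&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;needsRemap&lt;/code&gt; are hypothetical, and the 4096-byte page size and the 8-page mapping are assumptions chosen to keep the arithmetic easy to follow.&lt;/p&gt;

```go
package main

import "fmt"

// minMapSize mirrors the arithmetic in the snippet above: the byte size
// needed to hold count new pages starting at nextPgid, plus one extra page.
func minMapSize(nextPgid, count uint64, pageSize int) int {
	return int(nextPgid+count+1) * pageSize
}

// needsRemap reports whether that minimum size reaches or passes the
// current mapping size datasz.
func needsRemap(nextPgid, count uint64, pageSize, datasz int) bool {
	return minMapSize(nextPgid, count, pageSize) >= datasz
}

func main() {
	const pageSize = 4096
	datasz := 8 * pageSize // assume 8 pages (32768 bytes) are currently mapped

	// (4+2+1)*4096 = 28672, which still fits in the mapping.
	fmt.Println(minMapSize(4, 2, pageSize), needsRemap(4, 2, pageSize, datasz))
	// (6+2+1)*4096 = 36864, which does not fit, so a remap is needed.
	fmt.Println(minMapSize(6, 2, pageSize), needsRemap(6, 2, pageSize, datasz))
}
```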

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Reading the &lt;strong&gt;bolt&lt;/strong&gt; source code is a very nice way of understanding database internals. It was not as intimidating as I thought it would be. We ignored most of the source code, focusing only on how &lt;strong&gt;mmap&lt;/strong&gt; is used to retrieve data from disk efficiently. There are many more concepts we can learn from &lt;strong&gt;bolt&lt;/strong&gt;, like transactions, atomicity, isolation, and concurrency control, but I’ll leave those for other posts.&lt;/p&gt;

&lt;p&gt;It is important to remember that the strategy used by &lt;strong&gt;bolt&lt;/strong&gt; is just one of several. Other databases use different page layouts and different data structures. However, at a higher level the logic of mmapped databases should be the same, I guess.&lt;/p&gt;

&lt;p&gt;I am on a journey to learn more about databases. If you are interested you can follow me on &lt;a href=&quot;https://twitter.com/brunocalza&quot;&gt;twitter&lt;/a&gt;, where I share more related content.&lt;/p&gt;

&lt;h3 id=&quot;if-you-want-to-learn-more-about-bolt&quot;&gt;If you want to learn more about bolt&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://youjiali1995.github.io/storage/boltdb/&quot;&gt;Boltdb source code analysis&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.jianshu.com/p/b86a69892990&quot;&gt;BoltDB for block persistence (1)&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=bouIpFd9VGM&quot;&gt;Go-nuts and Bolts: An Introduction to BoltDB&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot;&gt;
      &lt;p&gt;SQLite Documentation. &lt;a href=&quot;https://sqlite.org/mmap.html&quot;&gt;Memory-Mapped I/O&lt;/a&gt;. SQLite.org. Accessed January 2021. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot;&gt;
      &lt;p&gt;Daly, S. (2011). &lt;a href=&quot;https://groups.google.com/g/leveldb/c/C5Hh__JfdrQ&quot;&gt;mmap based writing vs. stdio based writing&lt;/a&gt;. LevelDB Google Groups discussion thread. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot;&gt;
      &lt;p&gt;Prante, J. (2012, July 26). &lt;a href=&quot;https://jprante.github.io/lessons/2012/07/26/Mmap-with-Lucene.html&quot;&gt;Memory-mapped files with Lucene: some more aspects&lt;/a&gt;. Jörg Prante’s Blog. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot;&gt;
      &lt;p&gt;Symas Corporation. &lt;a href=&quot;https://symas.com/performance-tradeoffs-in-lmdb/&quot;&gt;Performance Tradeoffs in LMDB&lt;/a&gt;. Lightning Memory-Mapped Database Documentation. &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot;&gt;
      &lt;p&gt;Schoch, M. (2017). &lt;a href=&quot;https://www.youtube.com/watch?v=ttebJcN5bgQ&quot;&gt;Building a High-Performance Key/Value Store in Go&lt;/a&gt;. GopherCon 2017 presentation. YouTube. &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot;&gt;
      &lt;p&gt;MongoDB Documentation. &lt;a href=&quot;https://docs.mongodb.com/manual/core/storage-engines/&quot;&gt;Storage Engines&lt;/a&gt;. MongoDB Manual v4.4. &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:7&quot;&gt;
      &lt;p&gt;etcd-io. &lt;a href=&quot;https://github.com/etcd-io/bbolt&quot;&gt;bbolt: An embedded key/value database for Go&lt;/a&gt;. GitHub repository. Fork of Ben Johnson’s Bolt. &lt;a href=&quot;#fnref:7&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:8&quot;&gt;
      &lt;p&gt;NPF. (2014, July 7). &lt;a href=&quot;https://npf.io/2014/07/intro-to-boltdb-painless-performant-persistence/&quot;&gt;Intro to BoltDB: Painless Performant Persistence&lt;/a&gt;. npf.io. &lt;a href=&quot;#fnref:8&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:9&quot;&gt;
      &lt;p&gt;ProgVille. &lt;a href=&quot;https://www.progville.com/go/bolt-embedded-db-golang/&quot;&gt;Bolt — an embedded key/value database for Go&lt;/a&gt;. ProgVille Blog. &lt;a href=&quot;#fnref:9&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:10&quot;&gt;
      &lt;p&gt;Stevens, W. R., &amp;amp; Rago, S. A. (2013). &lt;em&gt;Advanced Programming in the UNIX Environment&lt;/em&gt; (3rd ed.). Addison-Wesley Professional. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;madvise()&lt;/code&gt; system call provides advice to the kernel about the expected usage pattern of the memory region. &lt;a href=&quot;#fnref:10&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:11&quot;&gt;
      &lt;p&gt;Go Documentation. &lt;a href=&quot;https://golang.org/pkg/unsafe/#Pointer&quot;&gt;unsafe.Pointer&lt;/a&gt;. The conversion creates a typed pointer to the memory-mapped region for direct memory access. &lt;a href=&quot;#fnref:11&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:12&quot;&gt;
      &lt;p&gt;Go Documentation. &lt;a href=&quot;https://golang.org/pkg/os/#File.WriteAt&quot;&gt;os.File.WriteAt&lt;/a&gt;. Package os documentation for file operations. &lt;a href=&quot;#fnref:12&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:13&quot;&gt;
      &lt;p&gt;Love, R. (2013). &lt;em&gt;Linux System Programming&lt;/em&gt; (2nd ed.). O’Reilly Media. Chapter on memory mapping discusses the challenges of controlling when mmap’d data is flushed to disk. &lt;a href=&quot;#fnref:13&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:14&quot;&gt;
      &lt;p&gt;Silberschatz, A., Galvin, P. B., &amp;amp; Gagne, G. (2018). &lt;em&gt;Operating System Concepts&lt;/em&gt; (10th ed.). Wiley. The standard page size on most systems is 4KB (4096 bytes). &lt;a href=&quot;#fnref:14&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:15&quot;&gt;
      &lt;p&gt;Cormen, T. H., Leiserson, C. E., Rivest, R. L., &amp;amp; Stein, C. (2009). &lt;em&gt;Introduction to Algorithms&lt;/em&gt; (3rd ed.). MIT Press. Chapter on B-Trees and B+Trees. &lt;a href=&quot;#fnref:15&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

      </content>
    </entry>
  
    <entry>
      <title>Discovering and exploring mmap using Go</title>
      <link href="https://bcalza.b-cdn.net/2021/01/10/discovering-and-exploring-mmap-using-go.html" />
      <id>https://bcalza.b-cdn.net/2021/01/10/discovering-and-exploring-mmap-using-go</id>
      <updated>2021-01-10T00:00:00+00:00</updated>
      <content type="html">
        &lt;p&gt;I recently came across the concept of &lt;strong&gt;memory-mapped files&lt;/strong&gt; while watching a lecture on database storage from &lt;a href=&quot;https://twitter.com/andy_pavlo&quot;&gt;Andy Pavlo&lt;/a&gt;’s course &lt;a href=&quot;https://15445.courses.cs.cmu.edu/fall2019/&quot;&gt;Intro to Database Systems&lt;/a&gt;. One of the main problems a database storage engine has to solve is &lt;strong&gt;how to deal with data on disk that is bigger than the available memory&lt;/strong&gt;. At a higher level, the main purpose of a disk-oriented storage engine is to manipulate data files on disk. But if we assume that the data on disk will eventually get bigger than the available memory, we cannot simply load the whole data file into memory, make the change, and write it back to disk.&lt;/p&gt;

&lt;p&gt;This is not a new problem in Computer Science. When operating systems were being developed in the early 1960s, a similar problem came up: &lt;strong&gt;how can we run programs stored on disk that are larger than the available memory?&lt;/strong&gt; A solution was devised by a group in Manchester and implemented on the &lt;a href=&quot;https://en.wikipedia.org/wiki/Atlas_(computer)&quot;&gt;Atlas Computer&lt;/a&gt; in 1961. It was called &lt;em&gt;virtual memory&lt;/em&gt;. &lt;em&gt;Virtual memory&lt;/em&gt; gives a running program the illusion that it has enough memory, even when the computer does not.&lt;/p&gt;

&lt;p&gt;We are not going to go deep into how &lt;em&gt;virtual memory&lt;/em&gt; works. Just keep in mind that when a program accesses memory, it is accessing &lt;em&gt;virtual memory&lt;/em&gt;. The data the program is trying to access may not actually be in memory, but that does not matter. The operating system pretends that it is: it goes to disk, loads the data into memory, and evicts an old chunk of memory that is not being used.&lt;/p&gt;

&lt;p&gt;So, one of the ways a database storage engine can solve the larger than memory problem is to make use of &lt;em&gt;virtual memory&lt;/em&gt; and the concept of &lt;strong&gt;memory-mapped files&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;On Linux, we can do this with the &lt;a href=&quot;https://man7.org/linux/man-pages/man2/mmap.2.html&quot;&gt;mmap&lt;/a&gt; system call, which lets you map a file, no matter how big, directly into memory. If your program needs to manipulate the file, all it has to do is manipulate the memory. The operating system handles the writes to disk for you.&lt;/p&gt;

&lt;p&gt;On some occasions, programmers find this method more convenient than the usual system calls: &lt;a href=&quot;https://man7.org/linux/man-pages/man2/open.2.html&quot;&gt;open&lt;/a&gt;, &lt;a href=&quot;https://man7.org/linux/man-pages/man2/read.2.html&quot;&gt;read&lt;/a&gt;, &lt;a href=&quot;https://man7.org/linux/man-pages/man2/write.2.html&quot;&gt;write&lt;/a&gt;, &lt;a href=&quot;https://man7.org/linux/man-pages/man2/lseek.2.html&quot;&gt;lseek&lt;/a&gt; and &lt;a href=&quot;https://man7.org/linux/man-pages/man2/close.2.html&quot;&gt;close&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;a-simple-demonstration&quot;&gt;A simple demonstration&lt;/h3&gt;

&lt;p&gt;Here is a small example of how you can take advantage of this in Go using the package &lt;a href=&quot;https://github.com/edsrzf/mmap-go&quot;&gt;mmap-go&lt;/a&gt;:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;package&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;main&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;&quot;os&quot;&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;&quot;fmt&quot;&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;&quot;github.com/edsrzf/mmap-go&quot;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;OpenFile&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;./file&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;O_RDWR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0644&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;defer&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Close&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    
    &lt;span class=&quot;n&quot;&gt;mmap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mmap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mmap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RDWR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;defer&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mmap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Unmap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;fmt&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Println&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mmap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    
    &lt;span class=&quot;n&quot;&gt;mmap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;sc&quot;&gt;&apos;X&apos;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;mmap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Flush&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;img src=&quot;/assets/img/mmap_demo.gif&quot; alt=&quot;asciicast&quot; /&gt;
The beauty is that we could have a much bigger file, and the solution would still work. We would not have to worry about managing memory in order to avoid it filling up.&lt;/p&gt;

&lt;h3 id=&quot;detailing-mmap-capabilities&quot;&gt;Detailing &lt;em&gt;mmap&lt;/em&gt; capabilities&lt;/h3&gt;

&lt;p&gt;We’re going to explore more &lt;em&gt;mmap&lt;/em&gt; functionalities from the point of view of the API provided by &lt;a href=&quot;https://github.com/edsrzf/mmap-go&quot;&gt;mmap-go&lt;/a&gt;. There are probably more features that the &lt;a href=&quot;https://godoc.org/golang.org/x/sys/unix#Mmap&quot;&gt;native syscall&lt;/a&gt; provides that this library does not implement.&lt;/p&gt;

&lt;h4 id=&quot;the-prot-argument&quot;&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;prot&lt;/code&gt; argument&lt;/h4&gt;

&lt;p&gt;Here is the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap.Map&lt;/code&gt; signature&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;File&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;prot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;flags&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MMap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;error&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Let’s look at &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;prot&lt;/code&gt; first. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;prot&lt;/code&gt; argument lets you specify the protection level of your mapping: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RDONLY&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RDWR&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EXEC&lt;/code&gt; are the options provided by &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap-go&lt;/code&gt;. These levels are pretty straightforward: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RDONLY&lt;/code&gt; means you can only read from the mapping, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RDWR&lt;/code&gt; means you can also write to it, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EXEC&lt;/code&gt; means you can execute code in that mapping. Here is the description of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;prot&lt;/code&gt; from the Linux &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;man&lt;/code&gt; page:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;The prot argument describes the desired memory protection of the
mapping (and must not conflict with the open mode of the file).
It is either PROT_NONE or the bitwise OR of one or more of the
following flags:

PROT_EXEC
    Pages may be executed.

PROT_READ
    Pages may be read.

PROT_WRITE
    Pages may be written.

PROT_NONE
    Pages may not be accessed.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In the &lt;a href=&quot;https://godoc.org/golang.org/x/sys/unix&quot;&gt;unix package&lt;/a&gt;, those flags are: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unix.PROT_EXEC&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unix.PROT_READ&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unix.PROT_WRITE&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unix.PROT_NONE&lt;/code&gt;.&lt;/p&gt;
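&lt;p&gt;As a rough sketch of how these flags are combined, the example below maps a temporary file with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PROT_READ|PROT_WRITE&lt;/code&gt; and changes its first byte through the mapping. It assumes Linux and uses the standard library &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;syscall&lt;/code&gt; package, which exposes constants with the same names, instead of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap-go&lt;/code&gt; or the unix package. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;flipFirstByte&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;demo&lt;/code&gt; helpers are hypothetical names.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// flipFirstByte maps the file at path read-write and overwrites its first
// byte through the mapping. The file must not be empty.
func flipFirstByte(path string, b byte) error {
	f, err := os.OpenFile(path, os.O_RDWR, 0644)
	if err != nil {
		return err
	}
	defer f.Close()

	info, err := f.Stat()
	if err != nil {
		return err
	}

	// The protection flags are OR-ed together and must not conflict with
	// the open mode of the file (O_RDWR here).
	prot := syscall.PROT_READ | syscall.PROT_WRITE
	data, err := syscall.Mmap(int(f.Fd()), 0, int(info.Size()), prot, syscall.MAP_SHARED)
	if err != nil {
		return err
	}
	data[0] = b
	return syscall.Munmap(data)
}

// demo writes "hello" to a temporary file, flips its first byte through a
// mapping, and returns the resulting file contents.
func demo() (string, error) {
	f, err := os.CreateTemp("", "mmap-prot-*")
	if err != nil {
		return "", err
	}
	defer os.Remove(f.Name())

	if _, err := f.WriteString("hello"); err != nil {
		f.Close()
		return "", err
	}
	if err := f.Close(); err != nil {
		return "", err
	}
	if err := flipFirstByte(f.Name(), 'X'); err != nil {
		return "", err
	}
	out, err := os.ReadFile(f.Name())
	return string(out), err
}

func main() {
	out, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

&lt;p&gt;Mapping the same file with only &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PROT_READ&lt;/code&gt; and then writing to the slice would crash the program with a memory fault instead of returning an error, which is why the protection level is worth choosing deliberately.&lt;/p&gt;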

&lt;h4 id=&quot;experimenting-with-prot_exec-flag&quot;&gt;Experimenting with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PROT_EXEC&lt;/code&gt; flag&lt;/h4&gt;

&lt;p&gt;I became intrigued by the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EXEC&lt;/code&gt; flag and wanted to see an example of how it works. I googled it and could not find any example, so I searched GitHub for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PROT_EXEC&lt;/code&gt; and found a good example in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;C&lt;/code&gt;: &lt;a href=&quot;https://github.com/onesmash/MMapExecDemo&quot;&gt;MMapExecDemo&lt;/a&gt;. I replicated this example in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Go&lt;/code&gt; using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap-go&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The first step was to write the function that I wanted to place in memory allocated by &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap&lt;/code&gt;, compile it, and get its opcodes.&lt;/p&gt;

&lt;p&gt;I created the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;inc&lt;/code&gt; function in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;inc.go&lt;/code&gt; file:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;package&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;inc&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;inc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I compiled it with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;go tool compile -S -N inc.go&lt;/code&gt;, then got its assembly by calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;go tool objdump -S inc.o&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;func inc(n int) int {
  0x22b                 48c744241000000000      MOVQ $0x0, 0x10(SP)
        return n + 1
  0x234                 488b442408              MOVQ 0x8(SP), AX
  0x239                 48ffc0                  INCQ AX
  0x23c                 4889442410              MOVQ AX, 0x10(SP)
  0x241                 c3                      RET
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With this, we can represent our function as bytes in our code:&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;code&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;byte&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;m&quot;&gt;0x48&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0xc7&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0x44&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0x24&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0x10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0x00&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0x00&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0x00&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0x00&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;m&quot;&gt;0x48&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0x8b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0x44&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0x24&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0x08&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;m&quot;&gt;0x48&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0xff&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0xc0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;m&quot;&gt;0x48&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0x89&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0x44&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0x24&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0x10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;m&quot;&gt;0xc3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We allocate our memory with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;memory&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mmap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MapRegion&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;code&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mmap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;EXEC&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mmap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RDWR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mmap&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ANON&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;err&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;nb&quot;&gt;panic&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;err&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In this call, we’re using a more complete function called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MapRegion&lt;/code&gt; that lets you specify how much memory you are allocating (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Map&lt;/code&gt; allocates the size of the underlying file) and the offset into the file.&lt;/p&gt;

&lt;p&gt;In the beginning, we said that the main purpose of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap&lt;/code&gt; was to create a mapping between a file and memory. But in this call we are not indicating any file. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap&lt;/code&gt; can be used just as a regular memory allocator by passing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;nil&lt;/code&gt; as the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;*os.File&lt;/code&gt; argument and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap.ANON&lt;/code&gt; as the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;flags&lt;/code&gt; argument. We will talk more about &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap.ANON&lt;/code&gt; later. Since we are not mapping any file, the offset is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;So we have memory allocated with the same size as our code, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;len(code)&lt;/code&gt;. Since we set the flag &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap.RDWR&lt;/code&gt;, we can copy our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;code&lt;/code&gt; to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;memory&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nb&quot;&gt;copy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;memory&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;code&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We have the code of our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;inc&lt;/code&gt; function in memory. In order to execute it, we have to cast that memory address to a function with a signature that matches the signature of our compiled &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;inc&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;memory_ptr&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;memory&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ptr&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;unsafe&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pointer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;memory_ptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;inc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;:=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;When we call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;inc&lt;/code&gt;, we are executing the code we put in memory. That only works because of the flag &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap.EXEC&lt;/code&gt;. If that flag were not set, a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;segmentation violation&lt;/code&gt; would occur.&lt;/p&gt;

&lt;div class=&quot;language-go highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;fmt&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Println&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;inc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;// Prints 11&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I don’t know if this is a real use case. I just wanted to see what it meant to execute code that you put in memory. And there are probably other ways of achieving the same with regular memory allocation and calls to &lt;a href=&quot;https://man7.org/linux/man-pages/man2/mprotect.2.html&quot;&gt;mprotect&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;One question that may come up is: but the code is already in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;code&lt;/code&gt; variable, can’t we just execute it? No, because the memory allocated for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;code&lt;/code&gt; is not executable. Can we make it executable? I tried to use &lt;a href=&quot;https://man7.org/linux/man-pages/man2/mprotect.2.html&quot;&gt;mprotect&lt;/a&gt; on it but still got a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;segmentation violation&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Here is the full working &lt;a href=&quot;https://gist.github.com/brunoac/b9ff4ad46c27926e5e4f078133d0de79&quot;&gt;gist&lt;/a&gt;.&lt;/p&gt;

&lt;h4 id=&quot;the-flags-argument&quot;&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;flags&lt;/code&gt; argument&lt;/h4&gt;

&lt;p&gt;We can have many processes mapping the same memory region. This argument lets us decide about the visibility of the updates happening in the mapping. There are many flags, and you can check them out at &lt;a href=&quot;https://man7.org/linux/man-pages/man2/mmap.2.html&quot;&gt;mmap&lt;/a&gt;. The important ones are &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unix.MAP_SHARED&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unix.MAP_PRIVATE&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unix.MAP_ANON&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MAP_SHARED&lt;/code&gt; means that changes to the mapping are visible to all processes and are also carried through to the underlying mapped file, although we cannot control when.&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MAP_PRIVATE&lt;/code&gt; means the changes are private and other processes will not see them. They are also not carried through to the underlying file.&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MAP_ANON&lt;/code&gt; means that there is not going to be a mapped file. It is useful for communication between a process and its sub-processes through shared memory.&lt;/p&gt;

&lt;p&gt;I got confused by the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap-go&lt;/code&gt; library implementation. It only provides the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap.ANON&lt;/code&gt; flag, which we used in the above example. If you want your mapping to be private, you can set the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap.COPY&lt;/code&gt; flag in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;prot&lt;/code&gt; argument. In any case, you can always use the flags provided by the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unix&lt;/code&gt; package.&lt;/p&gt;

&lt;h4 id=&quot;locking-and-flushing&quot;&gt;Locking and flushing&lt;/h4&gt;

&lt;p&gt;Two other nice methods, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Lock&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Flush&lt;/code&gt;, are provided by the API of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap-go&lt;/code&gt;. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Lock&lt;/code&gt; method calls the &lt;a href=&quot;https://man7.org/linux/man-pages/man2/mlock.2.html&quot;&gt;mlock&lt;/a&gt; system call, which prevents the mapping from being paged out to disk. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Flush&lt;/code&gt; method calls the &lt;a href=&quot;https://man7.org/linux/man-pages/man2/msync.2.html&quot;&gt;msync&lt;/a&gt; system call, which forces the data in memory to be written to disk. Together they are a good way to get more control over how and when data is flushed to disk.&lt;/p&gt;

&lt;h3 id=&quot;wrapping-up&quot;&gt;Wrapping up&lt;/h3&gt;

&lt;p&gt;I felt kind of stupid for only learning about &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap&lt;/code&gt; after so long. I don’t remember it being brought up in my college classes. For some reason, I felt amazed by it and its capabilities and decided to dig deeper. I like databases and I’m aiming to get a better grasp of them. This means that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap&lt;/code&gt; cannot be left out of my learning. In future posts, I’ll try to cover the benefits and drawbacks of using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap&lt;/code&gt;, which projects use it, and what kind of problems it is suited for.&lt;/p&gt;

&lt;p&gt;Even though &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap&lt;/code&gt; can be used to solve the database problem we stated in the beginning, and many modern databases use it, &lt;a href=&quot;https://twitter.com/andy_pavlo&quot;&gt;Andy Pavlo&lt;/a&gt; advocates against it and has three lectures on how databases that don’t use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mmap&lt;/code&gt; manage data.&lt;/p&gt;

&lt;p&gt;If you like this kind of content, follow me on &lt;a href=&quot;https://twitter.com/brunocalza&quot;&gt;twitter&lt;/a&gt;. You may find more related stuff there.&lt;/p&gt;

      </content>
    </entry>
  
    <entry>
      <title>My Thoughts on A Plea for Lean Software</title>
      <link href="https://bcalza.b-cdn.net/2020/03/22/my-thoughts-on-a-plea-for-lean-software.html" />
      <id>https://bcalza.b-cdn.net/2020/03/22/my-thoughts-on-a-plea-for-lean-software</id>
      <updated>2020-03-22T00:00:00+00:00</updated>
      <content type="html">
        &lt;p&gt;&lt;strong&gt;A Plea for Lean Software&lt;/strong&gt;&lt;sup id=&quot;fnref:1&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; is a classic paper that presents us with some hints of why software increases in complexity and gives us some advice on how to avoid or minimize complexity. Here I present my interpretation of Niklaus Wirth’s ideas in a non-linear way, adding some personal reflections on top of it.&lt;/p&gt;

&lt;p&gt;The amazing thing about this paper is that it was written in &lt;strong&gt;1995&lt;/strong&gt;, when programs were only a few KB or MB big. And &lt;strong&gt;Niklaus Wirth&lt;/strong&gt;&lt;sup id=&quot;fnref:2&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; was already claiming that software growth was getting out of control. It amazes me how accurate his observations still are even 25 years later. As an Engineering Manager, I see a lot of what he’s talking about in my mundane activities and I also see myself acting in a lot of ways similar to what he points out as crucial for software complexity growth.&lt;/p&gt;

&lt;p&gt;This paper is famous for the following quote: “&lt;em&gt;Software is getting slower more rapidly than hardware becomes faster&lt;/em&gt;”. Wirth attributes it to &lt;strong&gt;Martin Reiser&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Wirth starts by pointing out two factors behind ever-growing software:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Rapidly growing hardware performance&lt;/li&gt;
  &lt;li&gt;Customers’ ignorance of features that are essential-versus-nice to have&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The first factor is very easy to grasp and makes total sense. But I had a little trouble understanding the second one. After some reflection, what I took from this claim is that the customers’ lack of understanding of what brings value and what does not gives software vendors the wrong kind of information on what to build. That happens because the userbase is always making requests about the product that they claim are useful but are not necessarily so (some of them may be). If the software vendors don’t think critically about what the user is requesting, it will lead to software complexity. This behavior incentivizes a feature factory scheme that doesn’t necessarily add value to the customer. I’d like to add that not only the customers’ ignorance leads to complexity, but also the software vendors’ failure to recognize that ignorance and to have a clear understanding of what adds value to customers’ lives. As a software vendor myself, just being aware of my own ignorance and limitations may be a great start for reducing complexity or keeping it from growing. But that requires a lot of humility.&lt;/p&gt;

&lt;p&gt;Wirth talks about what he calls &lt;strong&gt;monolithic design&lt;/strong&gt;. You build a single system to serve a diverse userbase that pays for all features but only uses a few or, if I may add, each user segment uses a small set of features that is different from the other segments’. That leads to his recommendation that we should be building a thoughtful, well-designed system core that is extensible, which I think is indeed a good thing. However, we all know that software development comes with some constraints, and the greatest of all constraints is time. Wirth puts a lot of criticism on time pressure, declaring that it discourages careful planning and improving acceptable solutions and, on the other hand, encourages quick software additions and corrections.
But time may in fact be a real constraint, and the pressure sometimes makes sense. The problem is that we don’t know how to identify when it is a real constraint versus when we’re just making it up. I know that there are cases where the constraint is real, like developing a &lt;em&gt;website&lt;/em&gt; for the next &lt;em&gt;World Cup&lt;/em&gt;. But on most occasions, the time constraint comes from managers’ anxiety (as a manager I recognize this), poorly thought-out sprint timeboxes, or simply because we have to. Since it may be very hard to distinguish a real time constraint from a fantasy one, a good way to stop fooling ourselves is to use time as an indicator of when we should be ready to ship the next increment of improvement. And that’s agile. Unfortunately, it’s very hard to understand what the next increment of improvement should be, and people don’t really understand the increment part. Wirth also elaborates on the importance of iterative improvements, not as an escape from the time pressure as I have elaborated, but as an escape from the hardship that is building software:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Initial designs for sophisticated software applications are invariably complicated, even when developed by competent engineers. Truly good solutions emerge after iterative improvements or after redesigns that exploit new insights, and the most rewarding iterations are those that result in program simplifications&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Another point made by Wirth is that complexity promotes customer dependence on the vendor. So there is an incentive to make things complex in order to make the customer dependent, generating a more stable stream of income. In a way it makes a lot of sense. A good simple product in which customers can solve all their problems by themselves, without any sales or customer support, might not lead to great engagement. But I don’t really think that this practice happens consciously. Most of the time, I would guess it happens accidentally, not least because customer support and a sales team may not be very cheap or scalable. But I do recognize that some vendors may (consciously) consider complexity as a strategy for customer lock-in.&lt;/p&gt;

&lt;p&gt;The paper goes on to detail more practical technical advice on how to keep complexity low by presenting a practical case called Project Oberon, and it finishes with 9 lessons learned from this project. Those are very useful and worth checking out. I’d love to know more about software complexity and its root causes and I will probably research more about the topic. For now I’ll finish with what I think is the best phrase of the paper:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Good engineering is characterized by a gradual, step-wise refinement of products that yields increased performance under given constraints and with given resources.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot;&gt;
      &lt;p&gt;Wirth, Niklaus. “A Plea for Lean Software.” &lt;em&gt;Computer&lt;/em&gt;, vol. 28, no. 2, IEEE Computer Society, Feb. 1995, pp. 64-68. DOI: &lt;a href=&quot;https://doi.org/10.1109/2.348001&quot;&gt;10.1109/2.348001&lt;/a&gt;. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot;&gt;
      &lt;p&gt;“Niklaus Wirth.” &lt;em&gt;Wikipedia&lt;/em&gt;, The Free Encyclopedia, Wikimedia Foundation. Available at: &lt;a href=&quot;https://en.wikipedia.org/wiki/Niklaus_Wirth&quot;&gt;https://en.wikipedia.org/wiki/Niklaus_Wirth&lt;/a&gt;. Swiss computer scientist (1934-2024), designer of Pascal, Modula-2, and Oberon programming languages, and 1984 Turing Award recipient. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

      </content>
    </entry>
  
</feed>
