<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Peter’s Technical Blog]]></title><description><![CDATA[I've done a lot of things with computers, with many programming languages and operating systems, in several different industries. Here are some random thoughts about things I've learned over the years.]]></description><link>https://www.petersmith.net</link><image><url>https://substackcdn.com/image/fetch/$s_!nRSX!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e97714e-1cf3-45e3-8e8c-76398ed581db_798x798.png</url><title>Peter’s Technical Blog</title><link>https://www.petersmith.net</link></image><generator>Substack</generator><lastBuildDate>Sat, 02 May 2026 11:47:28 GMT</lastBuildDate><atom:link href="https://www.petersmith.net/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Peter Smith]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[petersmithphd@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[petersmithphd@substack.com]]></itunes:email><itunes:name><![CDATA[Peter Smith]]></itunes:name></itunes:owner><itunes:author><![CDATA[Peter Smith]]></itunes:author><googleplay:owner><![CDATA[petersmithphd@substack.com]]></googleplay:owner><googleplay:email><![CDATA[petersmithphd@substack.com]]></googleplay:email><googleplay:author><![CDATA[Peter Smith]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Serverless is becoming mainstream, but what exactly does "Serverless" mean?]]></title><description><![CDATA[It means different things to different people, but there are always 
servers...]]></description><link>https://www.petersmith.net/p/serverless-is-becoming-mainstream</link><guid isPermaLink="false">https://www.petersmith.net/p/serverless-is-becoming-mainstream</guid><dc:creator><![CDATA[Peter Smith]]></dc:creator><pubDate>Sat, 13 Jul 2024 13:25:58 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!yt5e!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d65cbd3-84c3-4748-ab79-a7b7ff7fce94_2266x1124.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>There&#8217;s been a lot of discussion recently on the topic of <em><strong>Serverless is becoming mainstream</strong></em> (or variations on that wording) and I didn&#8217;t want to miss my chance to contribute. I&#8217;ve been curious what earns a cloud-based feature the <em>Serverless</em> badge of honour, versus simply being <em>Cloud-based</em> or <em>Cloud-native</em>. It seems that lots of really smart people have asked this same question, and there&#8217;s no concise answer.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yt5e!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d65cbd3-84c3-4748-ab79-a7b7ff7fce94_2266x1124.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yt5e!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d65cbd3-84c3-4748-ab79-a7b7ff7fce94_2266x1124.png 424w, https://substackcdn.com/image/fetch/$s_!yt5e!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d65cbd3-84c3-4748-ab79-a7b7ff7fce94_2266x1124.png 848w, 
https://substackcdn.com/image/fetch/$s_!yt5e!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d65cbd3-84c3-4748-ab79-a7b7ff7fce94_2266x1124.png 1272w, https://substackcdn.com/image/fetch/$s_!yt5e!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d65cbd3-84c3-4748-ab79-a7b7ff7fce94_2266x1124.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yt5e!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d65cbd3-84c3-4748-ab79-a7b7ff7fce94_2266x1124.png" width="728" height="361" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7d65cbd3-84c3-4748-ab79-a7b7ff7fce94_2266x1124.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:722,&quot;width&quot;:1456,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:1976826,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yt5e!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d65cbd3-84c3-4748-ab79-a7b7ff7fce94_2266x1124.png 424w, https://substackcdn.com/image/fetch/$s_!yt5e!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d65cbd3-84c3-4748-ab79-a7b7ff7fce94_2266x1124.png 848w, 
https://substackcdn.com/image/fetch/$s_!yt5e!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d65cbd3-84c3-4748-ab79-a7b7ff7fce94_2266x1124.png 1272w, https://substackcdn.com/image/fetch/$s_!yt5e!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d65cbd3-84c3-4748-ab79-a7b7ff7fce94_2266x1124.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Over the last month, I&#8217;ve read and listened to people&#8217;s description of what Serverless means to them. 
My conclusion: There&#8217;s no single definition of whether something is Serverless or not. Instead, it&#8217;s a set of characteristics allowing you to focus on the business logic of your application, rather than the heavy-lifting of managing infrastructure. Any particular service falls somewhere on the Serverless spectrum, based on how well it exhibits these characteristics.</p><h2>Wait? Serverless is Mainstream?</h2><p>Yes, it&#8217;s definitely moving in that direction. It&#8217;s no longer viewed as an exclusive cutting-edge way of developing applications, with fresh-off-the-press videos showing how little effort is required to bootstrap and execute code, all for the low-low price of $0.00001. It still does those things, but many developers now see it as a &#8220;tool in their tool belt&#8221; to save time and money as they focus on building their application.</p><p>To illustrate the maturity path, let&#8217;s draw a comparison to Object-Oriented Programming (OOP).</p><p>When I was an undergraduate student in the 1990s, we learned Pascal and C as our two main programming languages. 
We also heard about Smalltalk (from 1972), with objects and messages, but never had a chance to write a program in it. It all seemed very abstract and I honestly didn&#8217;t understand the point of objects. It wasn&#8217;t until I wrote a bank teller simulation using the Simula language (from 1962) that it felt natural, and started to make sense.</p><p>As a graduate student (late 1990s), I was a regular at the <a href="https://en.wikipedia.org/wiki/OOPSLA">OOPSLA conference</a>, surrounded by academics and industry practitioners who were ahead of the game in Object-Oriented Programming and Design. The exhibit halls were flooded with vendors selling their latest tool or technique. I rubbed shoulders with people who are now legends, and saw the birth of Java and UML. It was all ground-breaking at the time, but not yet mainstream.</p><p>Fast forward to 2024: the university students I interact with use TypeScript, Python, or Java, each with objects baked into the standard language. The students learn OO programming as the &#8220;normal&#8221; way of doing things, as if there was never any other way! They take these expectations into industry, and enthusiastically work in an OO style. These days, the OOPSLA conference has rebranded into a much smaller format, mostly attended by academics. OOP has become mainstream: the normal way of doing things, rather than something special and buzz-worthy.</p><p>Serverless is not as far along the maturity curve as OOP, but it&#8217;s heading in the correct direction. It&#8217;s not a passing fad that&#8217;ll be gone in a few years. There are still plenty of Serverless conferences generating buzz, but students are already learning to deploy containers or functions-as-a-service to the cloud, without the cognitive burden of SSHing to a Linux host. They&#8217;re saving money by paying per request, rather than by instance-hour. It&#8217;s becoming a normal and desirable way to operate.</p><p>But, back to the topic: what exactly is Serverless? 
Or, more to the point, what are the characteristics of the services that are referred to as Serverless?</p><h2>What Do the Experts Say?</h2><p>The obvious place to start is by looking at what the industry or academic experts are saying. Here are some quotes:</p><p>From Wikipedia&#8217;s <a href="https://en.wikipedia.org/wiki/Serverless_computing">Serverless Computing</a> topic&#8230;</p><blockquote><p>Serverless computing is a cloud computing execution model in which the cloud provider allocates machine resources on demand, taking care of the servers on behalf of their customers. "Serverless" is a misnomer in the sense that servers are still used by cloud service providers to execute code for developers. However, developers of serverless applications are not concerned with capacity planning, configuration, management, maintenance, fault tolerance, or scaling of containers, virtual machines, or physical servers. When an app is not in use, there are no computing resources allocated to the app. Pricing is based on the actual amount of resources consumed by an application.</p></blockquote><p>From the <a href="https://aws.amazon.com/serverless/">AWS Serverless</a> page&#8230;</p><blockquote><p>AWS offers technologies for running code, managing data, and integrating applications, all without managing servers. Serverless technologies feature automatic scaling, built-in high availability, and a pay-for-use billing model to increase agility and optimize costs. These technologies also eliminate infrastructure management tasks like capacity provisioning and patching, so you can focus on writing code that serves your customers.</p></blockquote><p>From <a href="https://www.pluralsight.com/resources/blog/cloud/what-is-serverless">Pluralsight</a>&#8230;</p><blockquote><p>Serverless is one of those terms that&#8217;s used for all sorts of stuff &#8212; some legitimate and some marketing fluff. 
But, in essence, serverless is a design approach that lets you build and run entire applications without having to directly manage servers.</p></blockquote><p>From a <a href="https://arxiv.org/pdf/1902.03383">UC Berkeley research paper</a>&#8230;</p><blockquote><p>Serverless cloud computing handles virtually all the system administration operations needed to make it easier for programmers to use the cloud. It provides an interface that greatly simplifies cloud programming, and represents an evolution that parallels the transition from assembly language to high-level programming languages.</p></blockquote><p>In summary, the goal of Serverless is to allow developers to focus on their application&#8217;s business logic, rather than on undifferentiated heavy lifting. Although each service does this differently, I&#8217;ve noticed some common characteristics:</p><ul><li><p>Avoiding the need for command-shell access (the administration is done for you).</p></li><li><p>Avoiding the need to size/scale servers, or to be aware of them at all.</p></li><li><p>Paying only when constructive work is being performed.</p></li><li><p>Avoiding the need to think about software upgrades.</p></li></ul><p>Let&#8217;s discuss each in more detail.</p><h2>Command Shell Access</h2><p>With traditional infrastructure (cloud-based or on-premises), administrators have command-shell access to their servers, typically via SSH or RDP. This is a basic need when you&#8217;re responsible for security, upgrades, backups, or other server-side tasks. In contrast, the promise of Serverless is for cloud providers to handle this mundane work for you.</p><p>There&#8217;s clearly a benefit to having somebody else do the admin tasks, and for many developers this also removes the need to learn how it&#8217;s done in the first place. Most application developers I&#8217;ve met prefer to focus their time and skills on developing business logic, without the distraction of skilling-up on IT responsibilities. 
</p><p>Which AWS services fit in this category? Clearly <a href="https://aws.amazon.com/ec2/">Amazon EC2</a> is not Serverless because the first thing you do is log in to the command line or console. Container services, such as <a href="https://aws.amazon.com/ecs/">Amazon ECS</a>, are closer to being Serverless if you use <a href="https://aws.amazon.com/fargate/">ECS on AWS Fargate</a>, but if you run ECS containers on EC2 you still have access to the underlying servers. Even with Fargate, it&#8217;s possible to <a href="https://aws.amazon.com/blogs/containers/new-using-amazon-ecs-exec-access-your-containers-fargate-ec2/">run a command shell inside a container</a>, similar to running <code>docker exec</code> on your local machine.</p><p>Perhaps the most famous Serverless offering from Amazon is <a href="https://aws.amazon.com/lambda/">AWS Lambda</a>, providing time-limited general-purpose compute (running Python, JavaScript, Java, etc.), without requiring access to the underlying servers. Instead, the developer selects a run-time environment (such as Node.js), then uploads their code to the service. The service manages the complexity.</p><p>Looking further into the list of Serverless offerings, there are many services where command-shell access is not even possible, such as <a href="https://aws.amazon.com/sqs/">Amazon SQS</a>, <a href="https://aws.amazon.com/sns/">Amazon SNS</a>, <a href="https://aws.amazon.com/step-functions/">AWS Step Functions</a>, <a href="https://aws.amazon.com/eventbridge/">Amazon EventBridge</a>, and <a href="https://aws.amazon.com/dynamodb/">Amazon DynamoDB</a>. All of these services have very specific purposes, and none of them provides access to general-purpose compute or a command shell.</p><p>Interestingly, we might question whether <a href="https://aws.amazon.com/rds/postgresql/">Amazon RDS Postgres</a> meets this requirement, given that users are not permitted command-shell access to the underlying servers. 
This is true, but there are other reasons RDS doesn&#8217;t fit the Serverless model.</p><h2>Awareness of Servers and Their Size</h2><p>Another expectation of Serverless is that you don&#8217;t <em>think</em> about servers. That is, you don&#8217;t need to worry about how large the servers are (RAM and CPU), how much disk space is attached, or how many servers you&#8217;ve been allocated. All of those details are managed by the cloud provider.</p><p>This is where RDS Postgres immediately fails the Serverless test, since the first thing you do is specify server size. You then worry whether you have large enough servers, or whether you&#8217;ve over-provisioned and are paying too much. An ideal Serverless environment takes away these concerns, with Amazon DynamoDB going a long way in this direction (you just create a table, and don&#8217;t worry about how it&#8217;s stored/accessed).</p><p>Services such as AWS Lambda or AWS Fargate eliminate the need to worry about server sizing, as they automatically find a suitable server for your workload, allocating new servers whenever necessary to scale. However, at the function/container level, you still specify resources: Lambda asks for a memory size (CPU is allocated in proportion to it), while Fargate asks for both RAM and CPU. Tweaking these parameters can be time-consuming, given that the RAM vs CPU trade-off significantly impacts software performance.</p><p>The other Serverless offerings mentioned earlier (SQS, SNS, Step Functions, EventBridge, DynamoDB) fit nicely into this model, since you never think about the quantity or size of servers, nor about how much memory is allocated to your software. Of course, these services don&#8217;t provide general-purpose compute, which is how they differ from Lambda.</p><p>In terms of developer burden, a couple of edge cases are worth mentioning. First, a Lambda function can be impacted by <em>cold-start latency</em>, where an invocation takes longer than normal while the service initializes a new execution environment on your behalf. 
Second, all services have limits/quotas to stop customers scaling up more than anticipated. Both of these important scenarios place a burden on developers to be aware that servers are a real thing.</p><h2>Elasticity and Pay-Per-Use</h2><p>A significant benefit of Serverless technology, and of cloud computing in general, is paying for what you use. Capacity can be allocated and deallocated almost instantly, and you only pay while those resources are active. This is in stark contrast to the weeks (or months) required to install new equipment in your own data centre, paying the bulk of the cost before the hardware is even plugged in.</p><p>The interesting difference with Serverless is that you pay per request (or message/event, or state transition), rather than pre-allocating the resources and paying for them, even if they&#8217;re idle. For example, allocating an EC2 instance involves paying for every second the instance is dedicated to your use, even if the server isn&#8217;t doing meaningful work. In contrast, Lambda functions only execute when there&#8217;s meaningful work to be done, then immediately terminate rather than being idle. There are no charges for Lambda when there&#8217;s no work to do.</p><p>Anything in the Serverless world should therefore &#8220;scale to zero&#8221; and not charge you anything when idle. With this in mind, AWS Lambda, Step Functions, EventBridge, SNS, and SQS all fit into this model.</p><p>However, it&#8217;s debatable whether Amazon DynamoDB or AWS Fargate meets this requirement. For DynamoDB, you&#8217;ll always be charged for data storage, even in the absence of read/write requests. For Fargate, you can quickly scale up/down the number of running tasks, but you&#8217;ll still need to pay for idle tasks that aren&#8217;t currently processing requests.</p><h2>Upgrades</h2><p>Although it&#8217;s not often discussed in the literature, there&#8217;s a cost to thinking about software upgrades. 
Serverless removes the need to upgrade or patch the underlying operating system, but that&#8217;s not always true for the run-time environment.</p><p>For example, with AWS Fargate, you&#8217;re still responsible for managing the Dockerfile, which describes the libraries, frameworks, or applications that run inside the container. If third-party software is used, there&#8217;s ongoing effort to upgrade packages to adopt new features, avoid end-of-life versions, or fix security vulnerabilities.</p><p>For AWS Lambda, there&#8217;s less effort involved, but with new versions of the runtime environment (e.g. Node.js) being released every year, there&#8217;s ongoing effort to perform software upgrades.</p><p>On the other hand, services such as Step Functions, EventBridge, SQS and SNS maintain backward compatibility, and don&#8217;t require the code or configuration to be updated on a periodic basis.</p><h2>Event-Driven Architectures</h2><p>One perspective I&#8217;ve seen is that Serverless is closely tied to Event-Driven Architectures (EDA). That is, your software consists of multiple collaborating services, each generating events of their own, and/or reacting to events from other services. In this model, there&#8217;s a clear benefit from only paying when there are events to process, or scaling to zero when the service is idle.</p><p>But it wouldn&#8217;t be fair to say that EDA implies Serverless, or that Serverless implies EDA. In fact, if you have a predictable workload passing through your system, and you rarely scale to zero, it can be more efficient to use pre-allocated containers or VMs. 
In this approach, each service processes a high number of concurrent requests within the same container/VM, rather than starting up a new Function-as-a-Service environment for each request.</p><h2>Serverless Compute</h2><p>Another interesting perspective is that Serverless implies general-purpose compute, such as AWS Lambda, where you can run any type of code written in Java, JavaScript, Python, Go, Rust, etc. In contrast, services such as DynamoDB, EventBridge, SNS or SQS don&#8217;t fit this definition of Serverless, as they have very specific modes of operation (fetching/storing data records, or propagating events).</p><p>I haven&#8217;t discovered the source of this thinking, but it&#8217;s likely because databases and message brokers were traditionally managed by dedicated teams (DBAs, or IT), rather than by application developers. From that perspective, we never thought much about servers in the first place.</p><h2>And My Point Was?</h2><p>After reading a lot of opinions, I learned that Serverless is not black and white&#8230; instead, it&#8217;s a colourful spectrum of characteristics making a service easier to use. By eliminating the undifferentiated heavy lifting, developers can focus more on building out their applications.</p><p>I&#8217;m not the first person to have these thoughts, and many have done a great job of documenting this topic. I encourage you to read/view:</p><ul><li><p><a href="https://martinfowler.com/articles/serverless.html">Mike Roberts / Thoughtworks</a> - A detailed perspective on all the characteristics of Serverless, including the history, the pros, the cons, and forward-looking trends. 
A long document to read, but it&#8217;s worth it.</p></li><li><p><a href="https://www.youtube.com/watch?v=vuWiB3vNiHc">Jeremy Daly</a> - A fun presentation to watch, referring to a bunch of other experts who have equally interesting opinions.</p></li><li><p><a href="https://ben11kehoe.medium.com/the-serverless-spectrum-147b02cb2292">Ben Kehoe</a> - Describes many of the edge cases of the Serverless definition, and includes the idea of it being a spectrum.</p></li></ul><p>Regardless of what Serverless really means, and whether we&#8217;re using the word correctly, the characteristics we&#8217;ve discussed are super useful, especially if you&#8217;d rather focus on business logic. For this reason, I feel the characteristics of Serverless have a bright future, no matter what they&#8217;re called.</p>]]></content:encoded></item><item><title><![CDATA[The Incompatible Mindsets of Software Development Communities]]></title><description><![CDATA[Why you should understand software development communities, not just your own.]]></description><link>https://www.petersmith.net/p/the-incompatible-mindsets-of-software</link><guid isPermaLink="false">https://www.petersmith.net/p/the-incompatible-mindsets-of-software</guid><dc:creator><![CDATA[Peter Smith]]></dc:creator><pubDate>Sat, 08 Jun 2024 02:05:52 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!MUzs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6060f0d-bb16-453c-b7d8-7eec5806fc2a_728x408.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div><hr></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MUzs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6060f0d-bb16-453c-b7d8-7eec5806fc2a_728x408.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MUzs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6060f0d-bb16-453c-b7d8-7eec5806fc2a_728x408.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!MUzs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6060f0d-bb16-453c-b7d8-7eec5806fc2a_728x408.jpeg 848w, https://substackcdn.com/image/fetch/$s_!MUzs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6060f0d-bb16-453c-b7d8-7eec5806fc2a_728x408.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!MUzs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6060f0d-bb16-453c-b7d8-7eec5806fc2a_728x408.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MUzs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6060f0d-bb16-453c-b7d8-7eec5806fc2a_728x408.jpeg" width="728" height="408" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c6060f0d-bb16-453c-b7d8-7eec5806fc2a_728x408.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:408,&quot;width&quot;:728,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;In this dynamic office setting, a team of focused professionals gather around a glass wall plastered with vibrant sticky notes. This interactive brainstorming session is a testament to their collaborative spirit, as they engage in discussion, share ideas, and meticulously organize their project strategies. 
Amidst the casual work environment, the concentration and commitment of the individuals are palpable, all contributing to the innovative atmosphere where creativity and cooperation thrive.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="In this dynamic office setting, a team of focused professionals gather around a glass wall plastered with vibrant sticky notes. This interactive brainstorming session is a testament to their collaborative spirit, as they engage in discussion, share ideas, and meticulously organize their project strategies. Amidst the casual work environment, the concentration and commitment of the individuals are palpable, all contributing to the innovative atmosphere where creativity and cooperation thrive." title="In this dynamic office setting, a team of focused professionals gather around a glass wall plastered with vibrant sticky notes. This interactive brainstorming session is a testament to their collaborative spirit, as they engage in discussion, share ideas, and meticulously organize their project strategies. Amidst the casual work environment, the concentration and commitment of the individuals are palpable, all contributing to the innovative atmosphere where creativity and cooperation thrive." 
srcset="https://substackcdn.com/image/fetch/$s_!MUzs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6060f0d-bb16-453c-b7d8-7eec5806fc2a_728x408.jpeg 424w, https://substackcdn.com/image/fetch/$s_!MUzs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6060f0d-bb16-453c-b7d8-7eec5806fc2a_728x408.jpeg 848w, https://substackcdn.com/image/fetch/$s_!MUzs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6060f0d-bb16-453c-b7d8-7eec5806fc2a_728x408.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!MUzs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6060f0d-bb16-453c-b7d8-7eec5806fc2a_728x408.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>I&#8217;ve recently been intrigued by the dynamics <em>within</em> and <em>between</em> software development communities, and particularly the <em>mindset</em> (mental models and attitudes) presented by each. If you find yourself needing to collaborate with a software development community different from your own, then understanding their culture is important.</p><p>This could mean working with a different team inside your own company, contributing code to an open source project, or perhaps selling a tool, library, or service for the community to use. You shouldn&#8217;t assume your approach to developing software will be looked upon favourably by other communities.</p><p>In this blog post, I&#8217;ll provide my definition of &#8220;Software Development Community&#8221;, explain why it&#8217;s important to understand communities (not just your own) and will describe four of the communities I&#8217;ve been exposed to in recent years.</p><div><hr></div><h3>What is a Software Development Community?</h3><p>Back in the good-old-days (sometime before the 2000s) it was common to develop applications from scratch. You either wrote the source code yourself, or searched various <a href="https://en.wikipedia.org/wiki/File_Transfer_Protocol">FTP</a> sites or <a href="https://en.wikipedia.org/wiki/Usenet">Usenet</a> groups for &#8220;free&#8221; software to incorporate. It was painful to construct a full application, causing you to implement the same code over and over again for each new project.</p><p>The 1990s saw the introduction of downloadable language runtimes and compilers, package repositories providing reusable code libraries, and frameworks with pre-architected skeletons for a new application. 
All of these worked together to make application development far more efficient than ever before.</p><p>To give examples, we now have:</p><ul><li><p>The Java language, with the <a href="https://mvnrepository.com/">Maven package repository</a>, and frameworks such as <a href="https://spring.io/projects/spring-framework">Spring</a> and <a href="https://akka.io/">Akka</a>.</p></li><li><p>The JavaScript language, with the <a href="http://npmjs.com">NPM package repository</a>, backend runtime environments such as <a href="https://nodejs.org/">NodeJS</a> (with the <a href="https://expressjs.com/">Express framework</a>), and front-end frameworks such as <a href="https://react.dev/">React</a> or <a href="https://angular.dev/">Angular</a> running in modern web browsers.</p></li><li><p>The Python language, with the <a href="https://pypi.org/">PyPI</a> package repository, using the <a href="https://pandas.pydata.org/">pandas</a> library to perform data science operations, or <a href="https://pytorch.org/">PyTorch</a> for machine learning, or <a href="https://www.djangoproject.com/">Django</a> for web applications, or many other libraries for many other purposes (Python is very popular).</p></li><li><p>The Ruby language, using the <a href="https://rubygems.org/">RubyGems</a> repository, and the <a href="https://rubyonrails.org/">Rails</a> framework for developing web-based applications.</p></li><li><p>Using a range of languages, the <a href="https://sebgoa.medium.com/a-new-definition-of-modern-apps-177e7474ee09">Serviceful</a> community creates applications using a collection of pre-built services, typically running on a cloud provider such as <a href="https://aws.amazon.com/">Amazon Web Services</a> (AWS).</p></li><li><p>And the list goes on&#8230;</p></li></ul><p>Each of these development environments has been successful enough that communities formed around them. 
Senior members of the community contributed content (libraries, documentation, training videos), attracting more consumers and resulting in an ever-growing community. Rather than building applications from scratch, new community members could build upon the frameworks and libraries that were already proven effective.</p><div><hr></div><h3>The Community Mindset</h3><p>What&#8217;s interesting about communities is they share a common mindset, a set of attitudes or beliefs about how software should be developed. This is partly because communities shape their members toward a common set of values, and partly because they attract similar people in the first place.</p><p>For example, a common agreement in the Ruby-on-Rails community is that developing applications should be fun and easy. Therefore, the Rails framework and the corresponding libraries (known as Ruby Gems) are designed to need minimal configuration and to &#8220;just work&#8221;. As a result, it&#8217;s possible to construct very impressive Ruby-on-Rails applications with very little code, in a short amount of time.</p><p>In the JavaScript community, using NodeJS, all I/O calls are expected to be asynchronous in nature, not allowing the main application thread to block. Any library function using blocking code is frowned upon by the community as it negatively impacts system performance. 
In contrast, the Ruby-on-Rails community avoids this asynchronous approach, finding that it makes program flow harder to reason about.</p><p>Finally, the Serviceful community is comfortable using Infrastructure as Code (such as <a href="https://aws.amazon.com/cloudformation/">AWS CloudFormation</a>) to deploy and stitch together services, which is something the Ruby on Rails and NodeJS communities find unnecessarily low-level and therefore shy away from.</p><p>Despite their benefits, the downside of communities is that members are at risk of developing a single-minded approach, believing there&#8217;s one preferred way to solve a problem. They may insist that a particular library be used, that a specific coding style be followed, or that &#8220;everybody does it this way&#8221;. This may be true within their community, using their agreed-upon set of values to make a decision, but it won&#8217;t always make sense across community lines.</p><p>I find it&#8217;s rare for developers to be deeply involved in more than one community. It can take years to become an expert in a programming language, a set of libraries, a collection of tools, or a framework, so there&#8217;s usually no time to stay actively involved with multiple communities. Not to mention, monitoring the community&#8217;s blog posts and YouTube videos requires significant effort. Focusing on one language or framework is therefore common.</p><p>Most developers have a community they feel very comfortable with, but will dabble lightly in other languages or frameworks when necessary to solve a problem. While branching out, it&#8217;ll take them longer to learn the language syntax, which tool to use, or the specific functionality of the framework they&#8217;re now working with. 
Overall development will be slower, but if motivated enough they&#8217;ll eventually finish the task.</p><div><hr></div><h3>Why You Should Care</h3><p>Simply put, it&#8217;s easier to work alongside a community if you take time to understand their values, and the historical context around why they prefer certain approaches.</p><p>This can take many forms: you may be collaborating with an arm&#8217;s-length team within your own company, dabbling in a code base written in a different language, or perhaps interviewing a job candidate from another community. You might even be a software vendor selling a tool or service into a new market.</p><p>Here are some differences to keep in mind:</p><ul><li><p>Different communities use different words for the same concepts, so being an outsider can be confusing. That is, until you take five minutes to read the documentation, only to realize you&#8217;re familiar with the concept but by a different name! Of course, you must also start using those words when interacting with the other community. Is the correct word <em>stream</em>, or <em>channel</em>, or <em>iterator</em>, or &#8230;?</p></li><li><p>Along those same lines, the frameworks and libraries each have their specific set of APIs, which are similar to those of other frameworks. For example, every framework has a &#8220;read&#8221; function, but I can never remember the exact name of the function, or the exact parameters. I find my IDE&#8217;s auto-suggest functionality to be super helpful for prompting me, but bookmarking the online API documentation is also a must.</p></li><li><p>Every development community has an opinion about &#8220;speed of execution&#8221; versus &#8220;speed of writing code&#8221;, based on the application they&#8217;re building. A C/C++ or Rust programmer spends effort on efficient use of CPU and RAM, with the goal of faster execution. 
In contrast, Ruby-on-Rails developers want to minimize the time to develop a new feature, being less concerned about execution performance. Both approaches are perfectly valid, and depend on their end customer&#8217;s needs.</p></li><li><p>Similarly, the style of coding will differ between communities, depending on the domain of the application being constructed. Systems-level programming (C/C++, Rust, or sometimes Java) can involve the implementation of complex algorithms using advanced computer science concepts designed to achieve performance and reliability. In contrast, a data scientist using Python, or a business application developer using Ruby-on-Rails, will use a more domain-focused programming style to meet their customer&#8217;s needs.</p></li><li><p>Types or no types? This has been an ongoing argument for many years, with different communities preferring different styles. It can be faster to author code if you don&#8217;t define types for your variables and don&#8217;t need to worry about type mismatch errors. In contrast, many argue that type annotations make their program more solid and less likely to fail, which in the end speeds up development. With type annotations now available in Python, and in JavaScript through TypeScript, the communities are gradually converging.</p></li><li><p>Distributed systems versus monoliths? It can be argued that distributed systems (multiple programs communicating via a network) are important when scaling your software. But it&#8217;s equally argued that developing in a monolithic style (all code in a single program image) is far easier to comprehend and to debug. Both groups are correct, but expecting the monolithic community to use a message queue or a distributed workflow service will be challenging.</p></li><li><p>Finally, each community has their preferred blogs and video channels where they share knowledge. 
Unfortunately, there&#8217;s a risk of becoming insular because recommendation engines are good at showing you similar content to what you&#8217;ve already viewed. It takes a deliberate effort to search for content produced by a different community.</p></li></ul><p>To close off the discussion, let&#8217;s finish with a quick survey of some of the communities I&#8217;ve been exposed to in recent years. I won&#8217;t claim to be an expert in any one of these areas, but I&#8217;ve done plenty of dabbling, and have encountered more than my fair share of friction in doing so.</p><div><hr></div><h3>Example: The NodeJS Community</h3><p><a href="https://nodejs.org/">NodeJS</a> is a server-side runtime environment for executing JavaScript code, working uniformly across Linux, MacOS, and Windows. Besides sharing a common language with web browsers, NodeJS provides additional functionality for file system access, network access, and many other traditional server-side features. Although general purpose in nature, NodeJS is commonly used to build light-weight web application servers, using frameworks such as <a href="https://expressjs.com/">Express</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PBtr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb00e3b38-29eb-44a4-b094-9eb734e9184c_1200x734.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PBtr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb00e3b38-29eb-44a4-b094-9eb734e9184c_1200x734.png 424w, 
https://substackcdn.com/image/fetch/$s_!PBtr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb00e3b38-29eb-44a4-b094-9eb734e9184c_1200x734.png 848w, https://substackcdn.com/image/fetch/$s_!PBtr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb00e3b38-29eb-44a4-b094-9eb734e9184c_1200x734.png 1272w, https://substackcdn.com/image/fetch/$s_!PBtr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb00e3b38-29eb-44a4-b094-9eb734e9184c_1200x734.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PBtr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb00e3b38-29eb-44a4-b094-9eb734e9184c_1200x734.png" width="246" height="150.47" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b00e3b38-29eb-44a4-b094-9eb734e9184c_1200x734.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:734,&quot;width&quot;:1200,&quot;resizeWidth&quot;:246,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Node.js - Wikipedia&quot;,&quot;title&quot;:&quot;Node.js - Wikipedia&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Node.js - Wikipedia" title="Node.js - Wikipedia" srcset="https://substackcdn.com/image/fetch/$s_!PBtr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb00e3b38-29eb-44a4-b094-9eb734e9184c_1200x734.png 424w, 
https://substackcdn.com/image/fetch/$s_!PBtr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb00e3b38-29eb-44a4-b094-9eb734e9184c_1200x734.png 848w, https://substackcdn.com/image/fetch/$s_!PBtr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb00e3b38-29eb-44a4-b094-9eb734e9184c_1200x734.png 1272w, https://substackcdn.com/image/fetch/$s_!PBtr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb00e3b38-29eb-44a4-b094-9eb734e9184c_1200x734.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>NodeJS is the most popular framework, according to the <a href="https://survey.stackoverflow.co/2023/#technology">2023 Stack Overflow survey</a>. It can be used for long-running web application backends, text-based command-line tools, or short-lived functions that execute in a Function-as-a-Service environment, such as <a href="https://aws.amazon.com/lambda/">AWS Lambda</a>. Due to this flexibility, and the prominence of JavaScript as a web-development language, it&#8217;s no surprise NodeJS has a large community following.</p><p>NodeJS provides a single-thread programming model, which seems counter-intuitive for performance-critical applications. However, due to the <a href="https://en.wikipedia.org/wiki/Futures_and_promises">Promise-based programming model</a>, NodeJS does an effective job of time-slicing a large amount of work onto that single thread, keeping the CPU busy. This is important to the community members who care about performance.</p><p>A &#8220;Promise&#8221; is a chunk of code to be executed some time in the future, whenever the CPU is available to do the work. 
They are typically chained together in a sequence, and can be dependent on the completion of other Promises, or on the completion of <em>asynchronous</em> I/O. In contrast, many other programming frameworks block the thread while a <em>synchronous</em> I/O operation is in progress. For NodeJS, we instead schedule a Promise to be executed once the asynchronous I/O is complete, without blocking the overall thread of execution. For I/O-heavy workloads, experience shows this is a high-performance and low-latency approach to performing work.</p><p>Like other communities, NodeJS users share pre-written software libraries, tools, or frameworks in a repository. The <a href="https://www.npmjs.com/">NPM Repository</a> contains a large number of packages that are easily downloaded and imported into a user&#8217;s application. As you might expect, packages that gain a positive reputation within the community become more popular and are typically held to a higher standard of quality than less popular packages.</p><p>In addition to using NPM packages, NodeJS developers are comfortable writing large amounts of custom code. However, due to the somewhat unstructured history of JavaScript (at least until 2015), there&#8217;s a wide range of programming styles in effect. A newer object-oriented syntax (with class and method definitions) is available for fans of object-oriented programming, but it&#8217;s also possible to write code in a functional style (data is passed as arguments and returned as a result). Even though it&#8217;s harder to debug, use of global variables is still commonplace in a lot of software.</p><p>There&#8217;s no disputing that the NodeJS community is very strong, for a variety of reasons. Sharing the JavaScript language and package repository with frontend (web browser) code is a key advantage. From an efficiency point of view, the use of Promises and the light-weight runtime environment are key benefits for faster development cycles. 
This community will likely be active for a long time.</p><div><hr></div><h3>Example: The Ruby on Rails Community</h3><p><a href="https://rubyonrails.org/">Ruby on Rails</a> (2004) is the web application framework which made the Ruby programming language (1995) famous. You can create a complete web application in minutes by generating a templated application, then incrementally adding new features with just a few commands. The whole premise is that development should be fun and easy, allowing you to focus on your business logic rather than the underlying plumbing (which tends to be the same for every web application).</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-cKM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F560f3107-cf54-44c3-8081-4668c7af2f0c_1200x453.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-cKM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F560f3107-cf54-44c3-8081-4668c7af2f0c_1200x453.png 424w, https://substackcdn.com/image/fetch/$s_!-cKM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F560f3107-cf54-44c3-8081-4668c7af2f0c_1200x453.png 848w, https://substackcdn.com/image/fetch/$s_!-cKM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F560f3107-cf54-44c3-8081-4668c7af2f0c_1200x453.png 1272w, https://substackcdn.com/image/fetch/$s_!-cKM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F560f3107-cf54-44c3-8081-4668c7af2f0c_1200x453.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!-cKM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F560f3107-cf54-44c3-8081-4668c7af2f0c_1200x453.png" width="282" height="106.455" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/560f3107-cf54-44c3-8081-4668c7af2f0c_1200x453.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:453,&quot;width&quot;:1200,&quot;resizeWidth&quot;:282,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Ruby on Rails - Wikipedia&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Ruby on Rails - Wikipedia" title="Ruby on Rails - Wikipedia" srcset="https://substackcdn.com/image/fetch/$s_!-cKM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F560f3107-cf54-44c3-8081-4668c7af2f0c_1200x453.png 424w, https://substackcdn.com/image/fetch/$s_!-cKM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F560f3107-cf54-44c3-8081-4668c7af2f0c_1200x453.png 848w, https://substackcdn.com/image/fetch/$s_!-cKM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F560f3107-cf54-44c3-8081-4668c7af2f0c_1200x453.png 1272w, https://substackcdn.com/image/fetch/$s_!-cKM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F560f3107-cf54-44c3-8081-4668c7af2f0c_1200x453.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Rails follows 
a &#8220;convention over configuration&#8221; approach where a pre-defined set of standards (such as naming conventions) allows the parts of your application to work together seamlessly. You are welcome to go against these conventions, but doing so makes your coding effort more challenging and could break parts of the framework in unexpected ways. It&#8217;s therefore common wisdom to always follow the standards.</p><p>For example, if you create a <code>User</code> class as a subclass of the <code>ApplicationRecord</code> class, Rails maps it by convention to an underlying SQL database table named <code>users</code> (typically created by a generated migration), with each column available as an attribute. Querying that table is as simple as writing <code>User.where(name: 'David')</code> which invokes the relevant SQL command under the hood.</p><p>Overall, the Rails architecture is prescribed for you. Using the Model-View-Controller pattern, developers add their code into the correct classes (again, with naming conventions), allowing the framework to know exactly where to find the relevant code. Once a Rails developer has learned these conventions, navigating the code base is highly efficient (although it can be very challenging for outsiders!)</p><p>The Rails philosophy is to write as little code as possible. Developers prefer to solve problems by downloading Ruby Gems (packages), which ideally should just work without much configuration. A one-line solution to a problem (which invokes a Gem) is preferable to a 10-line custom-built solution. Writing a long Ruby method (more than 10 lines) is often frowned upon.</p><p>The Ruby language is very dynamic, both with its variable type system and with how it allows dynamic interpretation of code. This flexibility allows very powerful programming constructs, such as DSLs (Domain Specific Languages) to be added into a Ruby-based application. On the downside, it&#8217;s nearly impossible to know that a Ruby program is free of errors, such as a misspelled method name or a type mismatch, until the code is actually executed. 
Unit testing has become a vital part of any Ruby on Rails application, to find errors at development time.</p><p>Overall, Ruby on Rails is one of the less common frameworks (see the <a href="https://survey.stackoverflow.co/2023/#technology">2023 Stack Overflow survey</a>), but does have a devoted user base who enjoy the flexibility of the Ruby language and the conciseness of adding new functionality to an application. The community&#8217;s mindset has therefore developed around these qualities.</p><div><hr></div><h3>Example: The Serviceful Community</h3><p>The concept of <a href="https://sebgoa.medium.com/a-new-definition-of-modern-apps-177e7474ee09">Serviceful</a> is relatively new, but is loosely defined as a programming model that connects together an array of existing services, often running in the cloud (such as AWS). A Serviceful program invokes service APIs, orchestrates the flow of data from one service to another, and handles error conditions that may arise. The community has developed around the tools and patterns used to build Serviceful applications.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!d-rU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c02a66b-549a-4fb4-b833-a92aec7dee61_2560x1533.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!d-rU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c02a66b-549a-4fb4-b833-a92aec7dee61_2560x1533.png 424w, https://substackcdn.com/image/fetch/$s_!d-rU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c02a66b-549a-4fb4-b833-a92aec7dee61_2560x1533.png 848w, 
https://substackcdn.com/image/fetch/$s_!d-rU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c02a66b-549a-4fb4-b833-a92aec7dee61_2560x1533.png 1272w, https://substackcdn.com/image/fetch/$s_!d-rU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c02a66b-549a-4fb4-b833-a92aec7dee61_2560x1533.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!d-rU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c02a66b-549a-4fb4-b833-a92aec7dee61_2560x1533.png" width="234" height="140.14285714285714" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5c02a66b-549a-4fb4-b833-a92aec7dee61_2560x1533.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:872,&quot;width&quot;:1456,&quot;resizeWidth&quot;:234,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;File:Amazon Web Services Logo.svg - Wikipedia&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="File:Amazon Web Services Logo.svg - Wikipedia" title="File:Amazon Web Services Logo.svg - Wikipedia" srcset="https://substackcdn.com/image/fetch/$s_!d-rU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c02a66b-549a-4fb4-b833-a92aec7dee61_2560x1533.png 424w, https://substackcdn.com/image/fetch/$s_!d-rU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c02a66b-549a-4fb4-b833-a92aec7dee61_2560x1533.png 848w, 
https://substackcdn.com/image/fetch/$s_!d-rU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c02a66b-549a-4fb4-b833-a92aec7dee61_2560x1533.png 1272w, https://substackcdn.com/image/fetch/$s_!d-rU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c02a66b-549a-4fb4-b833-a92aec7dee61_2560x1533.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>There are many characteristics of Serviceful development, including:</p><ol><li><p>Existing services provide functionality that we don&#8217;t need to implement ourselves. These services are easy to use when managed by a cloud provider.</p></li><li><p>Cloud-based services can scale in an elastic way. That is, you only pay for what you use, and can scale rapidly if demand for that service suddenly increases.</p></li><li><p>You&#8217;ll still need to write some custom code, but it&#8217;s often quite short. The paradigm of <a href="https://en.wikipedia.org/wiki/Event-driven_programming">Event Driven Development</a> allows small functions (such as those running on AWS Lambda) to execute in response to events taking place (a file is uploaded, an HTTP request is received, a timer expires).</p></li><li><p>Communication between services is done by creating and dispatching network messages (e.g. HTTP), rather than calling functions within the same (monolithic) program. This introduces the chance of partial failure if the remote service is unresponsive.</p></li><li><p>Existing application frameworks (such as NodeJS or Ruby on Rails) still have a place in this model, but are considered to just be one of the many services in the overall application (aka &#8220;microservices&#8221;). 
Communication with these services is usually done via HTTP.</p></li></ol><p>Much of the time, the <em>Serviceful</em> community is referred to as the <em>Serverless</em> community. Although there are similarities, we shouldn&#8217;t exclude services that require knowledge of the underlying servers. Instead, a better comparison would be the <em>Distributed Systems</em> community, which focuses on communication between services over a computer network. The key observation is that many of the community&#8217;s challenges (distributed programming, network protocols, resilience) have existed for many decades.</p><p>In the Serviceful community, it&#8217;s normal to include Infrastructure as Code, such as <a href="https://aws.amazon.com/cloudformation/">CloudFormation</a> or <a href="https://aws.amazon.com/cdk/">CDK</a>, as part of the application. For example, creating and deploying code to AWS Lambda functions, provisioning databases, defining message queues, and specifying event routing rules are all important parts of application development. Members of the Serviceful community must possess a strong understanding of cloud concepts, such as services, resource types, and permission rules.</p><p>In many companies, the Serviceful community is a partner to other communities, such as NodeJS or Ruby on Rails. That is, the Serviceful community are the members of the &#8220;Platform Team&#8221; who focus on the cloud infrastructure. Meanwhile, the NodeJS or Ruby on Rails community focus on the business applications that execute on the cloud-based servers, or within the Docker containers or Lambda functions. This separation of responsibility is quite common, especially in larger organizations.</p><div><hr></div><h3>Example: The Data Science Community</h3><p>The final group I&#8217;ll mention is the Data Science community. These developers focus on the hidden meaning buried in large amounts of data. 
With the internet filling up with more raw data every second, there&#8217;s a lot to analyze! </p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_vwx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a1d08cb-3efc-43df-a8e3-e46a4c63a791_849x548.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_vwx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a1d08cb-3efc-43df-a8e3-e46a4c63a791_849x548.png 424w, https://substackcdn.com/image/fetch/$s_!_vwx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a1d08cb-3efc-43df-a8e3-e46a4c63a791_849x548.png 848w, https://substackcdn.com/image/fetch/$s_!_vwx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a1d08cb-3efc-43df-a8e3-e46a4c63a791_849x548.png 1272w, https://substackcdn.com/image/fetch/$s_!_vwx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a1d08cb-3efc-43df-a8e3-e46a4c63a791_849x548.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_vwx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a1d08cb-3efc-43df-a8e3-e46a4c63a791_849x548.png" width="264" height="170.40282685512366" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3a1d08cb-3efc-43df-a8e3-e46a4c63a791_849x548.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:548,&quot;width&quot;:849,&quot;resizeWidth&quot;:264,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Pandas Python Library&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Pandas Python Library" title="Pandas Python Library" srcset="https://substackcdn.com/image/fetch/$s_!_vwx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a1d08cb-3efc-43df-a8e3-e46a4c63a791_849x548.png 424w, https://substackcdn.com/image/fetch/$s_!_vwx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a1d08cb-3efc-43df-a8e3-e46a4c63a791_849x548.png 848w, https://substackcdn.com/image/fetch/$s_!_vwx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a1d08cb-3efc-43df-a8e3-e46a4c63a791_849x548.png 1272w, https://substackcdn.com/image/fetch/$s_!_vwx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a1d08cb-3efc-43df-a8e3-e46a4c63a791_849x548.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Data Scientists have a statistics or mathematics background, rather than a computer science or engineering education. 
Their community is focused on languages (such as <a href="https://www.python.org/">Python</a>, <a href="https://www.r-project.org/">R</a>, or <a href="https://en.wikipedia.org/wiki/SQL">SQL</a>) that support querying and summarization of data. Their code reads data into tables (rows and columns), performs analysis of the data, then outputs a summary of the key findings.</p><p>Common operations for data scientists include translating data from one format to another (such as JSON to tabular format), cleaning the data by removing outliers and invalid data cells, querying tabular data to find trends, and joining data from different sources. Many of these operations are performed interactively using command line scripts, or <a href="https://jupyter.org/">Jupyter Notebooks</a> that allow visual representation of the results.</p><p>Members of the Data Science community focus their skills on writing scripts to perform data transformation, often running the scripts on small datasets on their local machine (desktop or laptop). The community is large, and a variety of data manipulation packages have been written and shared widely. One of the best-known Python libraries is <a href="https://pandas.pydata.org/">pandas</a>, which provides functionality for manipulating tabular data.</p><p>Given their statistics background, Data Scientists often rely on Data Engineers to put their code &#8220;in production&#8221;, using their computer science skill set. That is, rather than the script running on a local desktop, a Data Engineer has the skills to automate running it on large amounts of data, often using a distributed compute platform (such as <a href="https://spark.apache.org/">Spark</a>, or <a href="https://docs.aws.amazon.com/step-functions/latest/dg/use-dist-map-orchestrate-large-scale-parallel-workloads.html">AWS Step Functions Distributed Map</a>). 
</p><p>As you can see, the approach and mindset of a data scientist are quite different from those in the other communities we&#8217;ve discussed.</p><div><hr></div><h3>Conclusion</h3><p>The takeaway message is that software development communities exist as a mechanism for sharing tools, libraries, frameworks, and documentation, to help build software applications. They form a common mindset by which the community members can make choices, all with the goal of building software more efficiently.</p><p>If we wish to interact with another community (other than our own), it&#8217;s important to understand the day-to-day life of a developer in that community, and the values by which they operate.</p>]]></content:encoded></item><item><title><![CDATA[Are We Correctly Thanking Software Engineers?]]></title><description><![CDATA[We all need to feel appreciated, but sometimes we forget to thank people...]]></description><link>https://www.petersmith.net/p/are-we-correctly-thanking-software</link><guid isPermaLink="false">https://www.petersmith.net/p/are-we-correctly-thanking-software</guid><dc:creator><![CDATA[Peter Smith]]></dc:creator><pubDate>Thu, 01 Feb 2024 04:45:22 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!MAii!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf5e69cf-1dd1-4461-a2c6-225d6bc56f5c_612x408.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MAii!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf5e69cf-1dd1-4461-a2c6-225d6bc56f5c_612x408.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MAii!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf5e69cf-1dd1-4461-a2c6-225d6bc56f5c_612x408.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!MAii!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf5e69cf-1dd1-4461-a2c6-225d6bc56f5c_612x408.jpeg 848w, https://substackcdn.com/image/fetch/$s_!MAii!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf5e69cf-1dd1-4461-a2c6-225d6bc56f5c_612x408.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!MAii!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf5e69cf-1dd1-4461-a2c6-225d6bc56f5c_612x408.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MAii!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf5e69cf-1dd1-4461-a2c6-225d6bc56f5c_612x408.jpeg" width="612" height="408" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bf5e69cf-1dd1-4461-a2c6-225d6bc56f5c_612x408.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:408,&quot;width&quot;:612,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Thank you alphabet letter Thank you alphabet letter thank you stock pictures, royalty-free photos &amp; images&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Thank you alphabet letter Thank you alphabet letter thank you stock pictures, royalty-free photos &amp; images" title="Thank you alphabet letter Thank you alphabet letter thank you stock pictures, royalty-free photos &amp; images" 
srcset="https://substackcdn.com/image/fetch/$s_!MAii!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf5e69cf-1dd1-4461-a2c6-225d6bc56f5c_612x408.jpeg 424w, https://substackcdn.com/image/fetch/$s_!MAii!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf5e69cf-1dd1-4461-a2c6-225d6bc56f5c_612x408.jpeg 848w, https://substackcdn.com/image/fetch/$s_!MAii!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf5e69cf-1dd1-4461-a2c6-225d6bc56f5c_612x408.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!MAii!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf5e69cf-1dd1-4461-a2c6-225d6bc56f5c_612x408.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Receiving appreciation is a basic human need, with software engineers being no exception. Whether it be a promotion, a pay increase, an award, or just a simple &#8220;Thank you!&#8221;, our self-esteem needs validation. When properly thanked, we&#8217;re capable of pushing ourselves harder, achieving more, and enjoying work for longer. In contrast, lack of appreciation leads to frustration and resentment, and is one of the main reasons <a href="https://www.indeed.com/career-advice/career-development/reasons-employees-leave">people leave their jobs</a>.</p><p>Over the years, I&#8217;ve noticed situations where the &#8220;Thank you!&#8221; felt missing, or was attributed to the wrong person. I want to share some scenarios to explain why this happens, identify who received the praise, then consider who else should have been thanked.</p><h3>Customer-Facing versus Behind the Scenes</h3><p>In the software world, there&#8217;s work that&#8217;s visible to customers, and there&#8217;s work that&#8217;s &#8220;behind the scenes&#8221;. The customer-facing work includes frontend UI construction, as well as any backend feature development visible via the UI. In contrast, the behind-the-scenes work involves platform development, security, performance, latency, and system internals, things that are either hidden from the customer, or are less visible.</p><p>Both types of development are important, but the customer-facing work is easier to see, easier to understand, and easier to praise. Your colleagues in sales, marketing, developer advocacy, corporate leadership, and customer support tend to have outgoing personalities and show excitement for each new feature. It&#8217;s easy to feel appreciated when so many are discussing your work.</p><p>As an example, I recall a UI developer demoing how they moved a button to a more prominent location on the screen. A simple code change, but the impact on the customer was significant, and the cries of &#8220;Ship It!&#8221; were free-flowing. On that same day, another developer describing their optimization of Docker images was met with polite applause, but an obvious lack of understanding or appreciation from many in the room. The developer clearly noticed.</p><p>If you see this in your organization, consider raising the profile of the behind-the-scenes work, describing it in terms important to customers. 
For example:</p><ul><li><p>When significantly refactoring the software, publicize how the work increases the velocity of new feature development.</p></li><li><p>When an improvement is made to the software&#8217;s reliability, advertise the expected decrease in system outages (e.g. we&#8217;ll achieve 99.99% uptime).</p></li><li><p>Likewise, an improvement in performance provides a more responsive application.</p></li><li><p>Finally, an improvement in security directly impacts the software&#8217;s compliance report, as well as the customer&#8217;s peace of mind.</p></li></ul><p>In other words, identify why people should be thankful for the work that was done. All developers deserve thanks for their work, not just those building highly-visible features.</p><h3>Being Reactive versus Being Proactive</h3><p>Escalations are common in the software industry, ranging from individual customer complaints through to complete system outages. No matter what, it&#8217;s important to identify the severity of the problem and resolve the issue accordingly. When a developer fixes a problem, they&#8217;ll surely be hoping for praise.</p><p>Be careful though - praise for reactive behaviour can give the wrong message. For example, John and Mary stay after work to resolve a system failure, working long hours into the night to address the problem. In the morning, customers are happy again and management is appreciative of John and Mary&#8217;s work. Should they receive thanks? Of course they should!</p><p>But what about a different scenario, where Frank works long hours to test his code and address review feedback from his peers? He believes in releasing quality code, putting in effort to ensure a positive customer experience. For six whole months, he goes home at the end of the day with no outages, and no customer complaints. Should Frank receive thanks? 
He should, but often he doesn&#8217;t hear a thing!</p><p>To make sure praise is given where it&#8217;s due, you should always track metrics on the number of problems encountered. Although you should praise those who fight fires, much more appreciation is due to those avoiding fires in the first place (assuming they&#8217;re still able to deliver customer value!).</p><h3>Individual Contributor versus Leader</h3><p>As a software engineering leader, an important part of your role is to thank the members of your team. They look to you for validation of their effort, often going out of their way to please you, with praise being heavily sought after. It&#8217;s important for you to recognize good work, from individuals as well as from the team.</p><p>But what if you&#8217;re the leader? Who thanks you? You&#8217;ve put in countless hours of hard work, running meetings, scheduling the team&#8217;s activities, dealing with escalations, supporting low-performing team members, and even reviewing and writing code. Unfortunately, praise for your work can be hard to find.</p><p>Should the members of the team be thanking you? I&#8217;d like to think so, but the power dynamic makes it hard for individual contributors to feel comfortable praising their leader. After a while, they become accustomed to you performing those tasks, and don&#8217;t see them as worthy of thanks. Ironically, you&#8217;re more likely to receive thanks if you do their work for them, such as finding the source of a complex bug.</p><p>Alternatively, should your own manager make an effort to recognize your work? I&#8217;d like to think so, but they&#8217;re just a single person who&#8217;s involved in numerous projects. They might forget to thank you, or in order to avoid bias they&#8217;ll thank the whole team, rather than thanking you personally. 
Unfortunately, that&#8217;s not the same feeling.</p><h3>Small Tasks versus Large Actions</h3><p>In my experience, it&#8217;s more common to receive a &#8220;Thank You&#8221; if you interact with somebody face-to-face to help with an immediate problem. For example, it&#8217;s easy to get praise if you spend time explaining a difficult concept, or if you pair with somebody to fix an annoying bug. It&#8217;s minimal effort to thank somebody, so it happens more often.</p><p>The challenge arises when your work is longer-term in nature, and people are at arm&#8217;s length. You might be a leader, or might achieve great things for your company or community, but you rarely hear words of appreciation.</p><p>In one example, I once received company-wide praise for helping arrange chairs for an evening meet-up. The task required 15 minutes of my time, but was apparently worthy of a call-out to the whole company. This seemed odd to me, as somebody who had spearheaded meet-ups for the past 10 years, spending hundreds of hours organizing events. I had noticed the distinct lack of praise for my ongoing leadership, so why was that 15 minutes of chair moving worthy of mention?</p><p>In reality, I had received praise for my longer-term actions, but it was &#8220;behind my back&#8221;. From time to time, I&#8217;d overhear mention that I was a well-known leader in the community. I also came across email threads stating &#8220;Peter is the organizer of&#8230;&#8221;. Even though people were not giving thanks to my face, I&#8217;d instead been developing a reputation in the community. In the end (after 10 years), I did receive a significant community award, so the thanks eventually arrived.</p><h3>Demo Presenter versus Innovator</h3><p>If you&#8217;re somebody who likes to innovate, you may have seen the following scenario play out. Imagine you think of a great idea to solve a challenging customer problem, then you explain the solution to your colleague, William, to get his feedback. 
You draw pictures on a whiteboard, and spend time convincing William it&#8217;s worth implementing. William agrees to build a proof of concept, demonstrating the value of the idea.</p><p>A few months later, you start hearing about William&#8217;s great new idea! Apparently he gave a demo to senior staff members who loved William&#8217;s proposal. William is now the hero for solving a challenging problem, and is receiving lots of recognition for doing so. How does this make you feel? It was your idea!</p><p>At this point, you might be tempted to say &#8220;But it was my idea in the first place!&#8221;. In reality, William likely modified your idea and improved upon it, so now it&#8217;s unclear who actually deserves the credit. At the very least, you should share the praise.</p><p>The solution comes in the form of a document. Rather than white-boarding the idea, with somebody else building the prototype, you should personally write a two-page document that explains the vision. This document should be quick to write, and can be refined over time, but there&#8217;ll be no doubt you had the original idea.</p><h3>Short-Term Delivery versus Long-term Strategy</h3><p>Along similar lines to the previous problem, you may be the innovator of a project that ends up being a long-term venture for your company. Imagine you spent six months implementing a prototype, gaining buy-in from leadership, campaigning for budget, and putting together the project plan. Your great idea is finally in motion!</p><p>The project ends up taking 18 months to build, with a team of ten developers. They work very hard to productize the idea, building a solid solution to delight customers. You trust the team to do a good job, so you switch your focus to the next big project on the roadmap. You don&#8217;t need to be watching over their work.</p><p>Of course, when the project is eventually delivered, there&#8217;ll be plenty of praise for the whole team. But wait! 
The first six months of work was entirely your effort, so shouldn&#8217;t you receive 25% of the praise? The reality of being an innovator is that it takes a whole team to turn your idea into a viable product, and to support the software in the longer term. That&#8217;s not something you could do by yourself. As an innovator, you must share the praise, and also be comfortable with delayed gratification - possibly years later.</p><h3>At the Project End versus at Each Milestone</h3><p>Finding reasons to celebrate gives a large boost to a software team&#8217;s morale. This is easier when you release software on a regular cadence, providing that all-important praise from customers. However, many projects take multiple years to complete, leaving long periods of time with no visible reward.</p><p>As any good project manager will tell you, it&#8217;s important to break a large project into achievable milestones. This helps with resource allocation and budgeting, but also provides a reason to thank people, and to celebrate. Even if you&#8217;ve only got half the work done, that&#8217;s a milestone to celebrate.</p><p>One of my favourite memories of being a TPM in the IP networking industry was the &#8220;ping party&#8221; we held when our newly-built network devices first sent a &#8220;ping&#8221; message across the network. We would always celebrate with ice cream (in California it&#8217;s always time for ice cream!). There would still be several months of development before the product was fully tested and ready for release, but reaching the intermediate milestone was a good excuse to stop, celebrate success, and thank people!</p><h3>Conclusion</h3><p>If you&#8217;re working on a software project, and you don&#8217;t feel appreciated, I encourage you to think about why. What&#8217;s missing? Do you feel ignored or resentful? Is somebody else being praised for the hard work you&#8217;re doing, or perhaps their work is less complex but more visible? 
All are common reasons for feeling unappreciated in the workplace.</p><p>I won&#8217;t claim to have a perfect solution, but I do find that identifying the root cause is the first step to feeling better about the situation. You might be able to change the way you work - to increase the visibility of your contribution, or to raise awareness of how important your work has been to the organization.</p><p>Thanks for reading!</p>]]></content:encoded></item><item><title><![CDATA[Some Indications You Might be a Principal Software Engineer]]></title><description><![CDATA[Or at least, ready to think about that promotion path...]]></description><link>https://www.petersmith.net/p/some-indications-you-might-be-a-principal</link><guid isPermaLink="false">https://www.petersmith.net/p/some-indications-you-might-be-a-principal</guid><dc:creator><![CDATA[Peter Smith]]></dc:creator><pubDate>Tue, 09 Jan 2024 18:11:58 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!kXuQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f0675e5-cdbb-426c-b6e2-1dbb3d3efaaa_1132x750.jpeg" length="0" 
type="image/jpeg"/><content:encoded><![CDATA[<p>I want to share some recent thoughts on when a <em>Senior Software Engineer</em> is ready for promotion to the <em>Principal Software Engineer</em> (PE) rank. These are observations I&#8217;ve made in past years, in the various organizations I&#8217;ve worked.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kXuQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f0675e5-cdbb-426c-b6e2-1dbb3d3efaaa_1132x750.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kXuQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f0675e5-cdbb-426c-b6e2-1dbb3d3efaaa_1132x750.jpeg 424w, https://substackcdn.com/image/fetch/$s_!kXuQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f0675e5-cdbb-426c-b6e2-1dbb3d3efaaa_1132x750.jpeg 848w, https://substackcdn.com/image/fetch/$s_!kXuQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f0675e5-cdbb-426c-b6e2-1dbb3d3efaaa_1132x750.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!kXuQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f0675e5-cdbb-426c-b6e2-1dbb3d3efaaa_1132x750.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kXuQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f0675e5-cdbb-426c-b6e2-1dbb3d3efaaa_1132x750.jpeg" width="572" height="378.9752650176678" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0f0675e5-cdbb-426c-b6e2-1dbb3d3efaaa_1132x750.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:750,&quot;width&quot;:1132,&quot;resizeWidth&quot;:572,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Free Person Encoding in Laptop Stock Photo&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Free Person Encoding in Laptop Stock Photo" title="Free Person Encoding in Laptop Stock Photo" srcset="https://substackcdn.com/image/fetch/$s_!kXuQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f0675e5-cdbb-426c-b6e2-1dbb3d3efaaa_1132x750.jpeg 424w, https://substackcdn.com/image/fetch/$s_!kXuQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f0675e5-cdbb-426c-b6e2-1dbb3d3efaaa_1132x750.jpeg 848w, https://substackcdn.com/image/fetch/$s_!kXuQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f0675e5-cdbb-426c-b6e2-1dbb3d3efaaa_1132x750.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!kXuQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f0675e5-cdbb-426c-b6e2-1dbb3d3efaaa_1132x750.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>The PE role isn&#8217;t well defined across the industry, with each company having its own definition of when somebody is ready for promotion. In smaller companies, you likely gained the role by being the most productive developer on the team. In larger companies (such as the FAANG companies) there are strict guidelines on promotion readiness, including leadership qualities, coding skill, breadth and depth of experience, scope of your impact, and the ability to think strategically.</p><p>I doubt my observations will appear on a formal job description, but they are early signs of readiness for the role. Specifically:</p><ol><li><p><strong>Does the engineer say &#8220;we&#8221; to mean their team, or the whole organization?</strong></p></li><li><p><strong>Does the engineer raise problems, or do they solve problems?</strong></p></li></ol><p>Let&#8217;s see some examples of each&#8230;</p><h3>1. Saying &#8220;we&#8221; to mean the team, or the organization</h3><p>As a Principal Engineer, you don&#8217;t belong to a single team (of 6-8 developers), but will instead oversee many teams - perhaps an entire organization (100+ people) with multiple managers, directors, and a VP. You must always think about the organization as a whole, without excluding any of the teams, and without &#8220;taking sides&#8221;.</p><p>Therefore&#8230; when I hear a Senior Engineer say the following things, I know they&#8217;re not ready for the PE role.</p><ul><li><p><em><strong>We</strong> (on my team) have adopted this approach. 
<strong>You</strong> (on your team) might want to try it.</em></p></li><li><p><em><strong>We</strong> made a code change on the backend side, and <strong>you&#8217;ll</strong> need to make a similar change on the frontend.</em></p></li><li><p><em><strong>We</strong> did the work on the JavaScript side, but <strong>your</strong> team can port it to Kotlin if they want to.</em></p></li></ul><p>These innocent statements tell me the engineer thinks in terms of <strong>we</strong> and <strong>you</strong>, rather than thinking on behalf of the organization as a whole. There&#8217;s nothing wrong with team pride, and ownership over what the team builds, but it&#8217;s not a good approach for the Principal Engineer to only think from one perspective.</p><p>Instead, they should redefine &#8220;we&#8221; to be the organization as a whole, and to think on behalf of all teams, not just one. For example:</p><ul><li><p>The platform team has seen good results with this approach, and <strong>we</strong> should have other teams use it too.</p></li><li><p><strong>We</strong> made the code change on the backend, and <strong>we&#8217;ll</strong> also need to make a similar change on the frontend.</p></li><li><p><strong>We</strong> did the necessary work in JavaScript, but <strong>we</strong> could also make use of a Kotlin implementation.</p></li></ul><p>It&#8217;s a subtle difference in wording, but is a good indication the engineer is thinking on behalf of the whole organization, not just a single team.</p><p>Well&#8230; that&#8217;s easy to say, and I&#8217;ve encountered problems with achieving this goal. For example, if you&#8217;re a newly-minted PE who was once a member of a single team (you were promoted into the role), you&#8217;ll likely struggle with perceptions.</p><ol><li><p>Other engineers who knew you before your promotion will likely play the &#8220;we&#8221; and &#8220;you&#8221; game, and they&#8217;ll pigeon-hole you into your previous role. 
This is not intentional, but they were so comfortable with you in your previous role that it&#8217;s hard for them to adjust. Try working with different teams on different projects to break that perception.</p></li><li><p>If you&#8217;re an expert in a specific programming language, and you continue to work in that area, you&#8217;ll be closely associated with the teams using that language. Try working in different programming languages to break the perception. Be very public about the code you&#8217;re writing so people can see your range of skills.</p></li><li><p>Be careful where you sit in the office - if you sit too close to a single team, others will think you&#8217;re part of that team. Move around from time to time, or sit near the senior managers, who are seen as team-agnostic.</p></li><li><p>Be sure to work on projects that are cross-team, and spend equal amounts of time with each team. Bond with the team members to be accepted as an honorary member, rather than as an outsider.</p></li><li><p>Give presentations to the whole organization on a range of different topics, breaking the perception you&#8217;re associated with a single team, or one technology.</p></li></ol><p>But most importantly, when you say &#8220;we&#8221;, you should be talking about the whole organization, not just a single team.</p><p></p><h3>2. Raising problems versus solving problems</h3><p>A typical indication an engineer has reached <em>Senior Software Engineer</em> status is that they feel comfortable complaining about things. They know the software well enough to understand the weaknesses, and have been a team member long enough to know the organizational problems. They&#8217;ve seen the same issues occurring over and over again, and feel confident in voicing their concerns. They&#8217;re also able to anticipate problems before they actually occur.
A great skill to have!</p><p>For example, I&#8217;ve heard these complaints many times, including from my own mouth:</p><ul><li><p>We never spend enough time refactoring our code to remove technical debt.</p></li><li><p>Management doesn&#8217;t give us enough time to test our code properly.</p></li><li><p>Why don&#8217;t these new engineers know what they&#8217;re doing?!?</p></li></ul><p>Unfortunately, complaints of this nature occur when the engineer starts to be Senior, not when they start to be Principal. Identifying problems is a great skill, but to become a Principal-level thinker, it&#8217;s vital to understand the root cause of the issue and to phrase the complaints as solutions.</p><p>Let&#8217;s rephrase:</p><ul><li><p>We need to collect metrics to show our feature delivery velocity is slowing down due to poorly-written code. By identifying key areas of code rot, then spending 15% of our time refactoring those areas, we expect to see a greater than 15% increase in feature development velocity.</p></li><li><p>Based on the customer feedback we&#8217;ve collected, we have evidence that spending more time writing automated tests before releasing the software will noticeably increase customer satisfaction. We suggest delaying software releases by 2-3 weeks to write more tests, without being tempted to squeeze in new features.</p></li><li><p>Our new engineers are encountering friction in their work because they don&#8217;t yet have the necessary domain knowledge. I&#8217;ll set up some Lunch and Learn sessions to ensure they receive the training they need. I feel that 2-3 of our best engineers can share the training effort, with each delivering two L&amp;L sessions.</p></li></ul><p>Now, those statements are what you expect to hear from a Principal Engineer! Not complaining, but identifying solutions and putting those solutions into action.
Even using fancy words (that sound &#8220;progressive&#8221;) encourages your colleagues to trust your opinion more, and to be more willing to buy in to your ideas.<br></p><h3>Conclusion</h3><p>Hopefully these two ideas have been thought-provoking. They&#8217;re certainly not the only things indicating you&#8217;re ready to be a Principal Engineer, but if you don&#8217;t exhibit these qualities, I encourage you to think twice about asking for that promotion. Managers - you should insist on seeing these qualities on a regular basis before allowing a promotion. If you don&#8217;t see these skills, then promoting the engineer &#8220;because they&#8217;re the best we have&#8221; or &#8220;because they&#8217;ll quit if I don&#8217;t&#8221; is just misleading and not at all helpful for your organization.</p>
]]></content:encoded></item><item><title><![CDATA[Deploying state machines incrementally with versions and aliases in AWS Step Functions]]></title><description><![CDATA[Please refer to the blog post I wrote for the AWS Compute Blog, announcing the launch of the versions and aliases feature for AWS Step Functions.]]></description><link>https://www.petersmith.net/p/deploying-state-machines-incrementally</link><guid isPermaLink="false">https://www.petersmith.net/p/deploying-state-machines-incrementally</guid><dc:creator><![CDATA[Peter Smith]]></dc:creator><pubDate>Sun, 08 Oct 2023 03:02:27 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ulHH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ae30bda-4751-439e-80e5-96dfa0e233da_874x520.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Please refer to the blog post I wrote for the <a href="https://aws.amazon.com/blogs/compute/">AWS Compute Blog</a>, announcing the <a href="https://aws.amazon.com/blogs/compute/deploying-state-machines-incrementally-with-versions-and-aliases-in-aws-step-functions/">launch of the versions and aliases feature for AWS Step Functions.</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ulHH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ae30bda-4751-439e-80e5-96dfa0e233da_874x520.png"
data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ulHH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ae30bda-4751-439e-80e5-96dfa0e233da_874x520.png 424w, https://substackcdn.com/image/fetch/$s_!ulHH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ae30bda-4751-439e-80e5-96dfa0e233da_874x520.png 848w, https://substackcdn.com/image/fetch/$s_!ulHH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ae30bda-4751-439e-80e5-96dfa0e233da_874x520.png 1272w, https://substackcdn.com/image/fetch/$s_!ulHH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ae30bda-4751-439e-80e5-96dfa0e233da_874x520.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ulHH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ae30bda-4751-439e-80e5-96dfa0e233da_874x520.png" width="874" height="520" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6ae30bda-4751-439e-80e5-96dfa0e233da_874x520.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:520,&quot;width&quot;:874,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ulHH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ae30bda-4751-439e-80e5-96dfa0e233da_874x520.png 424w, https://substackcdn.com/image/fetch/$s_!ulHH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ae30bda-4751-439e-80e5-96dfa0e233da_874x520.png 848w, https://substackcdn.com/image/fetch/$s_!ulHH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ae30bda-4751-439e-80e5-96dfa0e233da_874x520.png 1272w, https://substackcdn.com/image/fetch/$s_!ulHH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ae30bda-4751-439e-80e5-96dfa0e233da_874x520.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
]]></content:encoded></item><item><title><![CDATA[What Truly Amazed Me in 40 Years of Computing]]></title><description><![CDATA[Experiences with computers over the last 40 years that made me say "Wow".
Most changes in tech are incremental, but certain things just surprised me!]]></description><link>https://www.petersmith.net/p/what-truly-amazed-me-in-40-years</link><guid isPermaLink="false">https://www.petersmith.net/p/what-truly-amazed-me-in-40-years</guid><dc:creator><![CDATA[Peter Smith]]></dc:creator><pubDate>Thu, 10 Aug 2023 20:30:02 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!gg2k!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76259d97-9d4d-4177-bf46-626d280bc5fc_1280x720.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>It&#8217;s been exactly 30 years since I packed my bags and relocated to Canada. I completed my undergrad in Computer Science in New Zealand, but was encouraged to attend graduate school in another country. This three-decade milestone got me thinking about the technical advances in my lifetime, dating back to 1982 when my family purchased a home computer.</p><p>Computer technology has dramatically changed, but there&#8217;s only a small list of things that really blew my mind. Most changes were incremental, building on top of existing ideas I&#8217;d already seen. Sure, the numbers keep on getting larger - CPU speeds and RAM sizes increased by a factor of millions - but that wasn&#8217;t too surprising, nor exciting.</p>
<p>But, what was exciting? What were the things that made my jaw drop when I saw them? Let&#8217;s go from 1982 to 2023, looking at those milestone events and thinking about why they were so meaningful. Of course, in 2023 many of these technologies seem old and boring, but at the time they were amazing!</p><div><hr></div><h4>My First Computer - Sinclair ZX81 (1982)</h4><p>The first computer my family owned was a <a href="https://en.wikipedia.org/wiki/ZX81">Sinclair ZX81</a>. It had 1KB of RAM (expandable to 16KB), ran at a 3.25MHz clock speed, had 64x44 black-and-white pixel resolution, plugged into the family TV, and used cassette tapes for data storage (roughly 300bps of audio screeches as data loaded).
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gg2k!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76259d97-9d4d-4177-bf46-626d280bc5fc_1280x720.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gg2k!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76259d97-9d4d-4177-bf46-626d280bc5fc_1280x720.jpeg 424w, https://substackcdn.com/image/fetch/$s_!gg2k!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76259d97-9d4d-4177-bf46-626d280bc5fc_1280x720.jpeg 848w, https://substackcdn.com/image/fetch/$s_!gg2k!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76259d97-9d4d-4177-bf46-626d280bc5fc_1280x720.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!gg2k!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76259d97-9d4d-4177-bf46-626d280bc5fc_1280x720.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gg2k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76259d97-9d4d-4177-bf46-626d280bc5fc_1280x720.jpeg" width="488" height="274.5" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/76259d97-9d4d-4177-bf46-626d280bc5fc_1280x720.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:488,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;ZX81 Classic PC - YouTube&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="ZX81 Classic PC - YouTube" title="ZX81 Classic PC - YouTube" srcset="https://substackcdn.com/image/fetch/$s_!gg2k!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76259d97-9d4d-4177-bf46-626d280bc5fc_1280x720.jpeg 424w, https://substackcdn.com/image/fetch/$s_!gg2k!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76259d97-9d4d-4177-bf46-626d280bc5fc_1280x720.jpeg 848w, https://substackcdn.com/image/fetch/$s_!gg2k!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76259d97-9d4d-4177-bf46-626d280bc5fc_1280x720.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!gg2k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76259d97-9d4d-4177-bf46-626d280bc5fc_1280x720.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">The ZX81 was about the size of a modern tablet computer, but plugged into the family TV to display images, and into the family cassette player to load software.</figcaption></figure></div><p>My first program was to print &#8220;Peter&#8221; in a continuous loop on the screen.</p><pre><code>10 PRINT "PETER"
20 GOTO 10</code></pre><p>I&#8217;d seen computers on TV shows, and my Dad&#8217;s office had a mainframe of some kind, but this was the first time I could program a computer for myself. I wrote software in <a href="https://en.wikipedia.org/wiki/Sinclair_BASIC">BASIC</a>, but also taught myself <a href="https://en.wikipedia.org/wiki/Zilog_Z80">Z80 machine code</a> to write &#8220;high-performance&#8221; games. Here&#8217;s an example of a chess game that we purchased for the ZX81.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2mlo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffddc1f27-67e9-4299-a558-8f15f075dbfd_256x192.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2mlo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffddc1f27-67e9-4299-a558-8f15f075dbfd_256x192.jpeg 424w, https://substackcdn.com/image/fetch/$s_!2mlo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffddc1f27-67e9-4299-a558-8f15f075dbfd_256x192.jpeg 848w, https://substackcdn.com/image/fetch/$s_!2mlo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffddc1f27-67e9-4299-a558-8f15f075dbfd_256x192.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!2mlo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffddc1f27-67e9-4299-a558-8f15f075dbfd_256x192.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!2mlo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffddc1f27-67e9-4299-a558-8f15f075dbfd_256x192.jpeg" width="384" height="288" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fddc1f27-67e9-4299-a558-8f15f075dbfd_256x192.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:192,&quot;width&quot;:256,&quot;resizeWidth&quot;:384,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;ZX Chess - Chessprogramming wiki&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="ZX Chess - Chessprogramming wiki" title="ZX Chess - Chessprogramming wiki" srcset="https://substackcdn.com/image/fetch/$s_!2mlo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffddc1f27-67e9-4299-a558-8f15f075dbfd_256x192.jpeg 424w, https://substackcdn.com/image/fetch/$s_!2mlo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffddc1f27-67e9-4299-a558-8f15f075dbfd_256x192.jpeg 848w, https://substackcdn.com/image/fetch/$s_!2mlo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffddc1f27-67e9-4299-a558-8f15f075dbfd_256x192.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!2mlo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffddc1f27-67e9-4299-a558-8f15f075dbfd_256x192.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption 
class="image-caption">ZX81 Chess - I still have the original audio cassette for this game!</figcaption></figure></div><p>Why was this jaw-dropping? Suddenly the world was a different place for me, and I  spent hours each day sitting in front of the keyboard, rather than playing outdoors in the yard. Most families didn&#8217;t have a computer until 10 years later, and there was no (public) internet yet. The world was still based on knowledge-sharing via paper (magazines, newspapers), with broadcast radio and TV being our main source of news and entertainment.</p><p>Life had changed for me, and was about to change for the rest of the world.</p><div><hr></div><h4>Marketing for my Second Computer (1982-83)</h4><p>Growing up in New Zealand had only one downside that I cared about - we were so far away from the UK or US that our exposure to new technology was via printed magazines (remember, no public internet yet). These magazines were shipped by sea and often arrived in our shops about 2-3 months after being published. Despite the delay, these were our lifeline for learning about new technology.</p><p>My next jaw-dropping experience was seeing the following dual-page advertisement for the <a href="https://en.wikipedia.org/wiki/ZX_Spectrum">ZX Spectrum</a> computer (successor to the ZX81 we already owned). 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4APL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18908661-59e3-4f3a-acd5-b1cb1c7c8c07_800x559.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4APL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18908661-59e3-4f3a-acd5-b1cb1c7c8c07_800x559.jpeg 424w, https://substackcdn.com/image/fetch/$s_!4APL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18908661-59e3-4f3a-acd5-b1cb1c7c8c07_800x559.jpeg 848w, https://substackcdn.com/image/fetch/$s_!4APL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18908661-59e3-4f3a-acd5-b1cb1c7c8c07_800x559.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!4APL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18908661-59e3-4f3a-acd5-b1cb1c7c8c07_800x559.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4APL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18908661-59e3-4f3a-acd5-b1cb1c7c8c07_800x559.jpeg" width="800" height="559" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/18908661-59e3-4f3a-acd5-b1cb1c7c8c07_800x559.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:559,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;This dual-page advert for the 
Spectrum could be found across all manner of magazines in 1982.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="This dual-page advert for the Spectrum could be found across all manner of magazines in 1982." title="This dual-page advert for the Spectrum could be found across all manner of magazines in 1982." srcset="https://substackcdn.com/image/fetch/$s_!4APL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18908661-59e3-4f3a-acd5-b1cb1c7c8c07_800x559.jpeg 424w, https://substackcdn.com/image/fetch/$s_!4APL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18908661-59e3-4f3a-acd5-b1cb1c7c8c07_800x559.jpeg 848w, https://substackcdn.com/image/fetch/$s_!4APL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18908661-59e3-4f3a-acd5-b1cb1c7c8c07_800x559.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!4APL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F18908661-59e3-4f3a-acd5-b1cb1c7c8c07_800x559.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 
17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Yes, that&#8217;s correct - it was the glossy advertising that changed my life, not the computer itself. It was the promise of colour graphics, higher resolution (256 x 192 pixels), and sound/music! For at least six months (until early 1983), I would excitedly read the advertisements, study the magazine&#8217;s program listings to learn the new variant of BASIC, and dream about the new capabilities of the ZX Spectrum.</p><p>Sure, it was great to eventually own a ZX Spectrum, but the jaw-dropping part came from the magazines themselves.</p><div><hr></div><h4>Knight Lore (1984)</h4><p>The ZX Spectrum was the first computer for many teenagers in the 1980s (at least, in much of Europe, Australia, and New Zealand). Many of us are now 50+ years old, and fondly recall the five minutes of screeching audio as our games loaded off cassette tape. 
There are even <a href="https://www.facebook.com/groups/WorldOfSpectrum.org/">Facebook groups</a> devoted to reminiscing about the good old days.</p><p><a href="https://en.wikipedia.org/wiki/Knight_Lore">Knight Lore</a> was a game produced by <a href="https://en.wikipedia.org/wiki/Ultimate_Play_the_Game">Ultimate Play the Game</a>, a small UK-based software company. The goal of Knight Lore was to navigate a 3D world, find the ingredients for a magic potion, then place them in the cauldron, all without losing your lives (<a href="https://www.youtube.com/watch?v=7n7qtErhF-A">see walkthrough video</a>).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2O1T!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56f59598-784e-4b4d-8269-9404f65bfb84_616x531.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2O1T!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56f59598-784e-4b4d-8269-9404f65bfb84_616x531.jpeg 424w, https://substackcdn.com/image/fetch/$s_!2O1T!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56f59598-784e-4b4d-8269-9404f65bfb84_616x531.jpeg 848w, https://substackcdn.com/image/fetch/$s_!2O1T!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56f59598-784e-4b4d-8269-9404f65bfb84_616x531.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!2O1T!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56f59598-784e-4b4d-8269-9404f65bfb84_616x531.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!2O1T!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56f59598-784e-4b4d-8269-9404f65bfb84_616x531.jpeg" width="482" height="415.4902597402597" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/56f59598-784e-4b4d-8269-9404f65bfb84_616x531.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:531,&quot;width&quot;:616,&quot;resizeWidth&quot;:482,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Knight Lore | Eurogamer.net&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Knight Lore | Eurogamer.net" title="Knight Lore | Eurogamer.net" srcset="https://substackcdn.com/image/fetch/$s_!2O1T!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56f59598-784e-4b4d-8269-9404f65bfb84_616x531.jpeg 424w, https://substackcdn.com/image/fetch/$s_!2O1T!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56f59598-784e-4b4d-8269-9404f65bfb84_616x531.jpeg 848w, https://substackcdn.com/image/fetch/$s_!2O1T!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56f59598-784e-4b4d-8269-9404f65bfb84_616x531.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!2O1T!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F56f59598-784e-4b4d-8269-9404f65bfb84_616x531.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft 
pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>There were plenty of awesome games that pre-dated Knight Lore, but there was something inspiring about this game. It was a combination of the amazing 3D graphics, the challenging game play, and the marketing. Ultimate Play the Game was known for its full-page magazine advertisements that provided no detail other than the game&#8217;s title and the company logo. 
This is what made Knight Lore stand out as jaw-dropping.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HknC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F396107fa-71b7-4171-8766-45ef460d988c_561x800.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HknC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F396107fa-71b7-4171-8766-45ef460d988c_561x800.jpeg 424w, https://substackcdn.com/image/fetch/$s_!HknC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F396107fa-71b7-4171-8766-45ef460d988c_561x800.jpeg 848w, https://substackcdn.com/image/fetch/$s_!HknC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F396107fa-71b7-4171-8766-45ef460d988c_561x800.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!HknC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F396107fa-71b7-4171-8766-45ef460d988c_561x800.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HknC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F396107fa-71b7-4171-8766-45ef460d988c_561x800.jpeg" width="301" height="429.2335115864528" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/396107fa-71b7-4171-8766-45ef460d988c_561x800.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:800,&quot;width&quot;:561,&quot;resizeWidth&quot;:301,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Knight Lore Other (World of Spectrum > Additional material): Official company shipped poster&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Knight Lore Other (World of Spectrum > Additional material): Official company shipped poster" title="Knight Lore Other (World of Spectrum > Additional material): Official company shipped poster" srcset="https://substackcdn.com/image/fetch/$s_!HknC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F396107fa-71b7-4171-8766-45ef460d988c_561x800.jpeg 424w, https://substackcdn.com/image/fetch/$s_!HknC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F396107fa-71b7-4171-8766-45ef460d988c_561x800.jpeg 848w, https://substackcdn.com/image/fetch/$s_!HknC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F396107fa-71b7-4171-8766-45ef460d988c_561x800.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!HknC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F396107fa-71b7-4171-8766-45ef460d988c_561x800.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" 
class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h4>Desktop-Switching on the Commodore Amiga (1985)</h4><p>The <a href="https://en.wikipedia.org/wiki/Amiga">Commodore Amiga</a> was an amazing computer when it jumped onto the market. We were impressed by graphics that produced near-realistic images (4096 colours!), and the sound was approaching the quality of music on a CD (fairly new at the time). 
We now had the power of full-sized arcade games in our living room, without inserting 50 cents for each game.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CiCJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F125270ab-b63b-40b4-b9c7-f8cb193d190c_1200x932.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CiCJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F125270ab-b63b-40b4-b9c7-f8cb193d190c_1200x932.jpeg 424w, https://substackcdn.com/image/fetch/$s_!CiCJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F125270ab-b63b-40b4-b9c7-f8cb193d190c_1200x932.jpeg 848w, https://substackcdn.com/image/fetch/$s_!CiCJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F125270ab-b63b-40b4-b9c7-f8cb193d190c_1200x932.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!CiCJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F125270ab-b63b-40b4-b9c7-f8cb193d190c_1200x932.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CiCJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F125270ab-b63b-40b4-b9c7-f8cb193d190c_1200x932.jpeg" width="556" height="431.82666666666665" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/125270ab-b63b-40b4-b9c7-f8cb193d190c_1200x932.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:932,&quot;width&quot;:1200,&quot;resizeWidth&quot;:556,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Amiga - Wikipedia&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Amiga - Wikipedia" title="Amiga - Wikipedia" srcset="https://substackcdn.com/image/fetch/$s_!CiCJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F125270ab-b63b-40b4-b9c7-f8cb193d190c_1200x932.jpeg 424w, https://substackcdn.com/image/fetch/$s_!CiCJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F125270ab-b63b-40b4-b9c7-f8cb193d190c_1200x932.jpeg 848w, https://substackcdn.com/image/fetch/$s_!CiCJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F125270ab-b63b-40b4-b9c7-f8cb193d190c_1200x932.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!CiCJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F125270ab-b63b-40b4-b9c7-f8cb193d190c_1200x932.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" 
stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>For me, the jaw-dropping experience was quite specific. The Amiga Workbench (desktops, icons, files, windows) allowed you to have multiple desktops open, each with different content and graphics resolution. Amazingly, you could grab the top menu bar of the desktop with your mouse pointer, quickly and smoothly dragging it downward to reveal the desktop behind it, or even a portion of the other desktop. These days, this &#8220;multiple desktop switching&#8221; feature is a standard part of the macOS or Windows experience.</p><p>When developing games on the ZX Spectrum, I was accustomed to the main CPU writing to display memory. At a rate of 50 times/second (TV refresh rate), you&#8217;d erase the previous graphics character (e.g. space invader, monster, game piece), then redraw it at the new location in the display memory, providing the illusion of animation. 
To implement the smooth scrolling of this &#8220;desktop switching&#8221; feature the same way, the main CPU would have needed to copy the entire display memory 50 times/second.</p><p>With the Amiga, this was the first time I&#8217;d experienced <a href="https://en.wikipedia.org/wiki/Original_Chip_Set">dedicated graphics hardware</a>, and the smooth animation was done via instructions to the Graphics Processing Unit (GPU), rather than the main CPU copying bits of data around. This jaw-dropping experience was my introduction to <a href="https://en.wikipedia.org/wiki/Application-specific_integrated_circuit">ASICs</a> (Application-Specific Integrated Circuits).</p><div><hr></div><h4>Seeing Europeans Posting on Usenet (1990)</h4><p>When I first attended university (1989-1992), we had access to both <a href="https://en.wikipedia.org/wiki/SunOS">SunOS</a> and <a href="https://en.wikipedia.org/wiki/OpenVMS">VMS</a>-based computer systems. I had no idea how large they were physically, but with 128MB of RAM, they were impressive. We always logged in remotely from dumb terminals, and knew these servers were somehow connected to the internet - mostly a university-based network at that time.</p><p>The jaw-dropping moment was my discovery of the <a href="https://en.wikipedia.org/wiki/Usenet">Usenet</a> news service, the primary mechanism for sharing news and discussions on the internet (there was no world-wide web at this time). Being in far-off New Zealand, where magazines arrived by sea, I was amazed to log in and instantly read postings from somebody in Europe, or somebody in the USA. Suddenly the world became a much smaller place, and for the low-low price of $5/MB, I too could communicate with the rest of the world.</p><div><hr></div><h4>The Amoeba Operating System (1992)</h4><p>While I was still an undergraduate student, I had the good fortune of beta testing the <a href="https://en.wikipedia.org/wiki/Amoeba_(operating_system)">Amoeba Operating System</a>. 
This research OS was built from the ground up to be fully distributed, using 128-bit unique IDs to identify each resource in the system (e.g. files, devices, and processes). To access a resource, you would just provide the 128-bit ID, and the system would figure out where on the network the resource was. The developer had no idea whether the resource was local to the current host, on some other host across the network, or whether it had recently moved from one place to another. This resource-location approach provided the appearance of a single large machine, rather than multiple smaller machines connected via a network.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hLmJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6efdbd2-65a2-4693-a520-c848a5b08562_400x200.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hLmJ!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6efdbd2-65a2-4693-a520-c848a5b08562_400x200.gif 424w, https://substackcdn.com/image/fetch/$s_!hLmJ!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6efdbd2-65a2-4693-a520-c848a5b08562_400x200.gif 848w, https://substackcdn.com/image/fetch/$s_!hLmJ!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6efdbd2-65a2-4693-a520-c848a5b08562_400x200.gif 1272w, https://substackcdn.com/image/fetch/$s_!hLmJ!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6efdbd2-65a2-4693-a520-c848a5b08562_400x200.gif 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!hLmJ!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6efdbd2-65a2-4693-a520-c848a5b08562_400x200.gif" width="472" height="236" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f6efdbd2-65a2-4693-a520-c848a5b08562_400x200.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:200,&quot;width&quot;:400,&quot;resizeWidth&quot;:472,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Distributed Operating System Amoeba - general&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Distributed Operating System Amoeba - general" title="Distributed Operating System Amoeba - general" srcset="https://substackcdn.com/image/fetch/$s_!hLmJ!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6efdbd2-65a2-4693-a520-c848a5b08562_400x200.gif 424w, https://substackcdn.com/image/fetch/$s_!hLmJ!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6efdbd2-65a2-4693-a520-c848a5b08562_400x200.gif 848w, https://substackcdn.com/image/fetch/$s_!hLmJ!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6efdbd2-65a2-4693-a520-c848a5b08562_400x200.gif 1272w, https://substackcdn.com/image/fetch/$s_!hLmJ!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6efdbd2-65a2-4693-a520-c848a5b08562_400x200.gif 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>In contrast, I&#8217;d already learned 
about the Unix system, and how it manages resources on a per-host basis. To access a resource on the local machine, you make a local system call. However, if the resource is on a remote machine, you must explicitly send a TCP or UDP message to the remote host&#8217;s unique IP address, where a server process accesses the resource for you. In this sense, accessing a local resource (a system call on the current host) and accessing a remote resource (via a network message) are very different operations.</p><p>Amoeba was an elegant and full-featured operating system that impacted my thinking about system design, and I&#8217;m sad it never got the traction it deserved. Although the elegance of the OS design was eye-opening, it did suffer from performance issues, and wasn&#8217;t fully compatible with existing Unix software. In the end, it was eclipsed by the Linux system. (As a side note, I had the good fortune of reading the famous <a href="https://en.wikipedia.org/wiki/Tanenbaum%E2%80%93Torvalds_debate">Tanenbaum-Torvalds debate</a> of Minix vs Linux as it was happening!)</p><div><hr></div><h4>HTML, The WWW and Mosaic Browser (1993)</h4><p>When I moved to Canada in 1993 to attend graduate school, I was fortunate to have an early glimpse of the WWW (World-Wide Web). My research lab installed the <a href="https://en.wikipedia.org/wiki/Mosaic_(web_browser)">Mosaic Browser</a> on their servers, allowing access to the fledgling web. I recall the NASA web site, some museum sites (perhaps the Smithsonian?), and the Internet Movie Database (IMDB). 
We also had the ability to create our own personal web sites, using a new description language known as HTML.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!duqO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6c54aa7-ebdc-4e2d-a19b-2e50d7154a57_1600x1180.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!duqO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6c54aa7-ebdc-4e2d-a19b-2e50d7154a57_1600x1180.png 424w, https://substackcdn.com/image/fetch/$s_!duqO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6c54aa7-ebdc-4e2d-a19b-2e50d7154a57_1600x1180.png 848w, https://substackcdn.com/image/fetch/$s_!duqO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6c54aa7-ebdc-4e2d-a19b-2e50d7154a57_1600x1180.png 1272w, https://substackcdn.com/image/fetch/$s_!duqO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6c54aa7-ebdc-4e2d-a19b-2e50d7154a57_1600x1180.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!duqO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6c54aa7-ebdc-4e2d-a19b-2e50d7154a57_1600x1180.png" width="590" height="435.20604395604397" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c6c54aa7-ebdc-4e2d-a19b-2e50d7154a57_1600x1180.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1074,&quot;width&quot;:1456,&quot;resizeWidth&quot;:590,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;undefined&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="undefined" title="undefined" srcset="https://substackcdn.com/image/fetch/$s_!duqO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6c54aa7-ebdc-4e2d-a19b-2e50d7154a57_1600x1180.png 424w, https://substackcdn.com/image/fetch/$s_!duqO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6c54aa7-ebdc-4e2d-a19b-2e50d7154a57_1600x1180.png 848w, https://substackcdn.com/image/fetch/$s_!duqO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6c54aa7-ebdc-4e2d-a19b-2e50d7154a57_1600x1180.png 1272w, https://substackcdn.com/image/fetch/$s_!duqO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6c54aa7-ebdc-4e2d-a19b-2e50d7154a57_1600x1180.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Mosaic was one of the original web browsers</figcaption></figure></div><p>The jaw-dropping moment for me was seeing hyperlinking - the ability to click a link on a web page and instantly be transferred to another page, possibly on a completely different web site. 
Everything on the web had a unique URL, and I could easily add my own content for others to link to.</p><p>It was exciting, but little did we realize the WWW would become synonymous with &#8220;the internet&#8221;, and that everybody in 2023 would spend so many hours each day surfing the web.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DGw8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12b5507d-470a-453a-a24c-e6791f9f9584_994x1324.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DGw8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12b5507d-470a-453a-a24c-e6791f9f9584_994x1324.jpeg 424w, https://substackcdn.com/image/fetch/$s_!DGw8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12b5507d-470a-453a-a24c-e6791f9f9584_994x1324.jpeg 848w, https://substackcdn.com/image/fetch/$s_!DGw8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12b5507d-470a-453a-a24c-e6791f9f9584_994x1324.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!DGw8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12b5507d-470a-453a-a24c-e6791f9f9584_994x1324.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DGw8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12b5507d-470a-453a-a24c-e6791f9f9584_994x1324.jpeg" width="402" height="535.4607645875252" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/12b5507d-470a-453a-a24c-e6791f9f9584_994x1324.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1324,&quot;width&quot;:994,&quot;resizeWidth&quot;:402,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DGw8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12b5507d-470a-453a-a24c-e6791f9f9584_994x1324.jpeg 424w, https://substackcdn.com/image/fetch/$s_!DGw8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12b5507d-470a-453a-a24c-e6791f9f9584_994x1324.jpeg 848w, https://substackcdn.com/image/fetch/$s_!DGw8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12b5507d-470a-453a-a24c-e6791f9f9584_994x1324.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!DGw8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12b5507d-470a-453a-a24c-e6791f9f9584_994x1324.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">My daily commute in 2017 - almost everybody was staring at their phones as they waited for the ferry.</figcaption></figure></div><div><hr></div><h4>The Java Programming Language (1996) and Java Duke</h4><p>I first heard of the Java programming language at the <a href="https://www.sigplan.org/Conferences/OOPSLA/">OOPSLA &#8216;96 conference</a>. People were raving about this newly-designed language that addressed many of C++&#8217;s weaknesses, and incorporated ideas from Smalltalk and LISP. For me, it was the first &#8220;new&#8221; language I&#8217;d seen, as languages like BASIC, C, C++, and Pascal were well-established by the time I learned them.</p><p>The jaw-dropping experience was the ability to execute Java code inside somebody else&#8217;s web browser (aka <a href="https://en.wikipedia.org/wiki/Java_applet">Java Applets</a>) running on top of a virtual machine (VM). 
It was only a couple of years since I&#8217;d learned HTML, but now I could run general-purpose program code inside an end-user&#8217;s browser. I&#8217;d always loved the idea of &#8220;remote procedure calls&#8221;, but this was going to a new level.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5Z9U!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b2d2b03-6dbe-4ee4-b8ab-f579d6e38bb3_568x1023.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5Z9U!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b2d2b03-6dbe-4ee4-b8ab-f579d6e38bb3_568x1023.png 424w, https://substackcdn.com/image/fetch/$s_!5Z9U!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b2d2b03-6dbe-4ee4-b8ab-f579d6e38bb3_568x1023.png 848w, https://substackcdn.com/image/fetch/$s_!5Z9U!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b2d2b03-6dbe-4ee4-b8ab-f579d6e38bb3_568x1023.png 1272w, https://substackcdn.com/image/fetch/$s_!5Z9U!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b2d2b03-6dbe-4ee4-b8ab-f579d6e38bb3_568x1023.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5Z9U!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b2d2b03-6dbe-4ee4-b8ab-f579d6e38bb3_568x1023.png" width="120" height="216.1267605633803" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1b2d2b03-6dbe-4ee4-b8ab-f579d6e38bb3_568x1023.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1023,&quot;width&quot;:568,&quot;resizeWidth&quot;:120,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;File:Duke (Java mascot) waving.svg - Wikimedia Commons&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="File:Duke (Java mascot) waving.svg - Wikimedia Commons" title="File:Duke (Java mascot) waving.svg - Wikimedia Commons" srcset="https://substackcdn.com/image/fetch/$s_!5Z9U!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b2d2b03-6dbe-4ee4-b8ab-f579d6e38bb3_568x1023.png 424w, https://substackcdn.com/image/fetch/$s_!5Z9U!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b2d2b03-6dbe-4ee4-b8ab-f579d6e38bb3_568x1023.png 848w, https://substackcdn.com/image/fetch/$s_!5Z9U!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b2d2b03-6dbe-4ee4-b8ab-f579d6e38bb3_568x1023.png 1272w, https://substackcdn.com/image/fetch/$s_!5Z9U!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b2d2b03-6dbe-4ee4-b8ab-f579d6e38bb3_568x1023.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>One example was <a href="https://dev.java/duke/">Java Duke</a>, who had previously appeared as a fixed image on a web page. 
But now, it was possible to animate Duke, and have him actually wave his hand! My jaw dropped when I first saw that particular web page on the Sun Microsystems site! Gone were the days when HTML pages would load once, then remain fixed until the next page load.</p><div><hr></div><h4>Handheld GPS (1998)</h4><p>The <a href="https://en.wikipedia.org/wiki/Global_Positioning_System">Global Positioning System</a> (GPS) was first launched in 1978, but exclusively for the US military. In the late 1990s it had become accessible to the public, and I purchased a hand-held receiver unit. This was my first exposure to yet another technology that&#8217;s now in everyone&#8217;s pockets.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!calF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89bc050d-63e7-4788-b11b-28f44df64b2b_650x650.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!calF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89bc050d-63e7-4788-b11b-28f44df64b2b_650x650.jpeg 424w, https://substackcdn.com/image/fetch/$s_!calF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89bc050d-63e7-4788-b11b-28f44df64b2b_650x650.jpeg 848w, https://substackcdn.com/image/fetch/$s_!calF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89bc050d-63e7-4788-b11b-28f44df64b2b_650x650.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!calF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89bc050d-63e7-4788-b11b-28f44df64b2b_650x650.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!calF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89bc050d-63e7-4788-b11b-28f44df64b2b_650x650.jpeg" width="334" height="334" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/89bc050d-63e7-4788-b11b-28f44df64b2b_650x650.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:650,&quot;width&quot;:650,&quot;resizeWidth&quot;:334,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;How Does GPS Work? | NASA Space Place &#8211; NASA Science for Kids&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="How Does GPS Work? | NASA Space Place &#8211; NASA Science for Kids" title="How Does GPS Work? 
| NASA Space Place &#8211; NASA Science for Kids" srcset="https://substackcdn.com/image/fetch/$s_!calF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89bc050d-63e7-4788-b11b-28f44df64b2b_650x650.jpeg 424w, https://substackcdn.com/image/fetch/$s_!calF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89bc050d-63e7-4788-b11b-28f44df64b2b_650x650.jpeg 848w, https://substackcdn.com/image/fetch/$s_!calF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89bc050d-63e7-4788-b11b-28f44df64b2b_650x650.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!calF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89bc050d-63e7-4788-b11b-28f44df64b2b_650x650.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" 
fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>My GPS receiver was very simple, taking about five minutes to find your location. The small LCD display listed the satellites it was syncing with, and after a minimum number (I think it was 3?) it would report your longitude and latitude (no map drawing yet). We didn&#8217;t yet have <a href="https://en.wikipedia.org/wiki/Assisted_GNSS">Assisted GPS</a>, so the start-up time was still fairly high.</p><p>Around that same time, I attended a university lecture given by the Chief Strategy Officer of <a href="https://en.wikipedia.org/wiki/Sun_Microsystems">Sun Microsystems</a>. He presented a wild idea of having GPS in your car, which would direct you to the nearest McDonald&#8217;s restaurant and provide you with discount coupons. What an awesome futuristic idea!</p><p>I recall riding the local buses and trains with my GPS device in hand, taking note of the coordinates of the local landmarks. Sadly that&#8217;s as far as I got with that idea, but it was certainly a revolution in the making.</p><div><hr></div><h4>YouTube&#8217;s Storage Size (2005)</h4><p>One of my grad school colleagues did his research on video streaming, so I was no stranger to the topic. I also had plenty of short videos downloaded to my home computer, which were costly to store given their multi-megabyte file size. </p><p>What shocked me about <a href="https://en.wikipedia.org/wiki/YouTube">YouTube</a> was the magnitude of the storage required. If anybody could create and upload their own videos, possibly 15-30 minutes in duration, I was perplexed by the number of disks required to store all that data! 
Wouldn&#8217;t it be terabytes of new disk space required every hour?</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!beUc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95850b1c-7f3f-4c2e-8ae4-ce02b5dff186_280x60.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!beUc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95850b1c-7f3f-4c2e-8ae4-ce02b5dff186_280x60.jpeg 424w, https://substackcdn.com/image/fetch/$s_!beUc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95850b1c-7f3f-4c2e-8ae4-ce02b5dff186_280x60.jpeg 848w, https://substackcdn.com/image/fetch/$s_!beUc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95850b1c-7f3f-4c2e-8ae4-ce02b5dff186_280x60.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!beUc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95850b1c-7f3f-4c2e-8ae4-ce02b5dff186_280x60.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!beUc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95850b1c-7f3f-4c2e-8ae4-ce02b5dff186_280x60.jpeg" width="280" height="60" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/95850b1c-7f3f-4c2e-8ae4-ce02b5dff186_280x60.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:60,&quot;width&quot;:280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;undefined&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="undefined" title="undefined" srcset="https://substackcdn.com/image/fetch/$s_!beUc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95850b1c-7f3f-4c2e-8ae4-ce02b5dff186_280x60.jpeg 424w, https://substackcdn.com/image/fetch/$s_!beUc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95850b1c-7f3f-4c2e-8ae4-ce02b5dff186_280x60.jpeg 848w, https://substackcdn.com/image/fetch/$s_!beUc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95850b1c-7f3f-4c2e-8ae4-ce02b5dff186_280x60.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!beUc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95850b1c-7f3f-4c2e-8ae4-ce02b5dff186_280x60.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>I&#8217;m still not clear how they manage all that storage, but I&#8217;ve always imagined a convoy of trucks heading toward each data centre, carrying nothing but disk drives. You&#8217;d then require dozens of people working around the clock to plug them in, just to keep up with the amount of data being uploaded. 
It must be more of a problem these days with modern social media and photo applications, constantly uploading new images to the cloud.</p><div><hr></div><h4>A First Glance at Amazon EC2 (2006)</h4><p>As far back as 1993, I had installed the Linux operating system on numerous hosts, mostly from floppy disk or CD-ROM. The first distribution I used was <a href="https://en.wikipedia.org/wiki/Slackware">Slackware</a>, requiring many hours of installation effort, and a lot of manual typing. Years later I purchased a license for <a href="https://en.wikipedia.org/wiki/Red_Hat_Linux">Red Hat Linux</a>, and received the CD-ROMs in the mail. Even as recently as 2017, I downloaded a Linux ISO file and ran <a href="https://en.wikipedia.org/wiki/CentOS">CentOS</a> in a virtual machine on my desktop computer. </p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2JK0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe92f7e23-4595-4b44-9481-38cfd065c8cc_401x257.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2JK0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe92f7e23-4595-4b44-9481-38cfd065c8cc_401x257.png 424w, https://substackcdn.com/image/fetch/$s_!2JK0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe92f7e23-4595-4b44-9481-38cfd065c8cc_401x257.png 848w, https://substackcdn.com/image/fetch/$s_!2JK0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe92f7e23-4595-4b44-9481-38cfd065c8cc_401x257.png 1272w, 
https://substackcdn.com/image/fetch/$s_!2JK0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe92f7e23-4595-4b44-9481-38cfd065c8cc_401x257.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2JK0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe92f7e23-4595-4b44-9481-38cfd065c8cc_401x257.png" width="285" height="182.65586034912718" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e92f7e23-4595-4b44-9481-38cfd065c8cc_401x257.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:257,&quot;width&quot;:401,&quot;resizeWidth&quot;:285,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;What is Amazon EC2 in AWS? | DevOps Automateinfra Learning&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="What is Amazon EC2 in AWS? | DevOps Automateinfra Learning" title="What is Amazon EC2 in AWS? 
| DevOps Automateinfra Learning" srcset="https://substackcdn.com/image/fetch/$s_!2JK0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe92f7e23-4595-4b44-9481-38cfd065c8cc_401x257.png 424w, https://substackcdn.com/image/fetch/$s_!2JK0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe92f7e23-4595-4b44-9481-38cfd065c8cc_401x257.png 848w, https://substackcdn.com/image/fetch/$s_!2JK0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe92f7e23-4595-4b44-9481-38cfd065c8cc_401x257.png 1272w, https://substackcdn.com/image/fetch/$s_!2JK0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe92f7e23-4595-4b44-9481-38cfd065c8cc_401x257.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>When I started using <a href="https://aws.amazon.com/ec2/">Amazon EC2</a> in 2014, I had memories of an article I read in <a href="https://www.linux-magazine.com/">Linux Magazine</a>, probably from 2006. The article talked about the new EC2 service from Amazon allowing you to create a Linux VM within minutes. You would ssh to your new VM to perform your work (no direct console access). Not only was installation time impressive, but you were only charged for the number of minutes the server was running, with no upfront hardware setup cost! When you were done, you&#8217;d simply delete the VM and no longer be charged.</p><p>I didn&#8217;t need EC2 at the time, because I already had my Linux hardware configured and &#8220;paid for&#8221;. 
It wasn&#8217;t until 2013 that I fully appreciated the value of the cloud - I was working for a company that insisted on their own private data centres, literally requiring months of careful project management and cross-team communication to get a new server running. The cloud changed all of that.</p><div><hr></div><h4>Interactivity of Google Maps</h4><p>Google Maps has improved dramatically over the years, but the one improvement I remember the most is interactive scrolling. Before this feature, you needed to explicitly click the left, right, up, or down button to see the map scroll to a different location, triggering another page load each time. This was typical of web applications at the time.</p><p>One day though, I discovered you could simply drag the map left, right, up, or down, and the page would smoothly scroll, with new parts of the map appearing like magic. In fact, you could scroll infinitely in any direction. We&#8217;d now reached the time when JavaScript (running in the browser) could dynamically load data, and redraw parts of the canvas, without reloading the full web page.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!djwc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98f6ac7f-1e72-4ca2-993a-6fd3f865ef9c_864x1156.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!djwc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98f6ac7f-1e72-4ca2-993a-6fd3f865ef9c_864x1156.png 424w, https://substackcdn.com/image/fetch/$s_!djwc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98f6ac7f-1e72-4ca2-993a-6fd3f865ef9c_864x1156.png 848w, 
https://substackcdn.com/image/fetch/$s_!djwc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98f6ac7f-1e72-4ca2-993a-6fd3f865ef9c_864x1156.png 1272w, https://substackcdn.com/image/fetch/$s_!djwc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98f6ac7f-1e72-4ca2-993a-6fd3f865ef9c_864x1156.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!djwc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98f6ac7f-1e72-4ca2-993a-6fd3f865ef9c_864x1156.png" width="402" height="537.8611111111111" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/98f6ac7f-1e72-4ca2-993a-6fd3f865ef9c_864x1156.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1156,&quot;width&quot;:864,&quot;resizeWidth&quot;:402,&quot;bytes&quot;:641599,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!djwc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98f6ac7f-1e72-4ca2-993a-6fd3f865ef9c_864x1156.png 424w, https://substackcdn.com/image/fetch/$s_!djwc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98f6ac7f-1e72-4ca2-993a-6fd3f865ef9c_864x1156.png 848w, 
https://substackcdn.com/image/fetch/$s_!djwc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98f6ac7f-1e72-4ca2-993a-6fd3f865ef9c_864x1156.png 1272w, https://substackcdn.com/image/fetch/$s_!djwc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98f6ac7f-1e72-4ca2-993a-6fd3f865ef9c_864x1156.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I&#8217;m really not sure why this feature impressed me so much, but for some reason it did. 
It didn&#8217;t fit my mental model of how web browsers work, so this feature caused me to rethink what was possible.</p><div><hr></div><h4>The iPhone (2007)</h4><p>It&#8217;s clear now that the iPhone has changed the world, but it was hardly the first device of its kind. I remember seeing the <a href="https://en.wikipedia.org/wiki/Apple_Newton">Apple Newton</a> in 1993, and I personally owned a <a href="https://en.wikipedia.org/wiki/PalmPilot">Palm Pilot</a> in the early 2000s. Both of these devices used a stylus for data input, and had a crude form of handwriting recognition instead of a keyboard. Although some of the later Palm devices had wireless connectivity, they were generally standalone devices.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sApn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f2be51f-3bc0-4bf6-86e7-b7e4c41f4d2a_1280x600.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sApn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f2be51f-3bc0-4bf6-86e7-b7e4c41f4d2a_1280x600.jpeg 424w, https://substackcdn.com/image/fetch/$s_!sApn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f2be51f-3bc0-4bf6-86e7-b7e4c41f4d2a_1280x600.jpeg 848w, https://substackcdn.com/image/fetch/$s_!sApn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f2be51f-3bc0-4bf6-86e7-b7e4c41f4d2a_1280x600.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!sApn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f2be51f-3bc0-4bf6-86e7-b7e4c41f4d2a_1280x600.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sApn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f2be51f-3bc0-4bf6-86e7-b7e4c41f4d2a_1280x600.jpeg" width="506" height="237.1875" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6f2be51f-3bc0-4bf6-86e7-b7e4c41f4d2a_1280x600.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:600,&quot;width&quot;:1280,&quot;resizeWidth&quot;:506,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;All the New Swipe Gestures on Your New iPhone XS, XS Max, or XR &#171; iOS &amp;  iPhone :: Gadget Hacks&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="All the New Swipe Gestures on Your New iPhone XS, XS Max, or XR &#171; iOS &amp;  iPhone :: Gadget Hacks" title="All the New Swipe Gestures on Your New iPhone XS, XS Max, or XR &#171; iOS &amp;  iPhone :: Gadget Hacks" srcset="https://substackcdn.com/image/fetch/$s_!sApn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f2be51f-3bc0-4bf6-86e7-b7e4c41f4d2a_1280x600.jpeg 424w, https://substackcdn.com/image/fetch/$s_!sApn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f2be51f-3bc0-4bf6-86e7-b7e4c41f4d2a_1280x600.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!sApn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f2be51f-3bc0-4bf6-86e7-b7e4c41f4d2a_1280x600.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!sApn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f2be51f-3bc0-4bf6-86e7-b7e4c41f4d2a_1280x600.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>The iPhone was jaw-dropping for several reasons, but for me it was the use of finger-swiping gestures instead of the stylus (which I often feared losing). I was amazed the first time I saw somebody in a coffee shop making flicking and pinching motions on their phone screen, with the display updating via smooth animation to give immediate feedback on the operation.</p><div><hr></div><h4>IBM Watson (2011)</h4><p>With all the news about ChatGPT, I would be remiss in not mentioning <a href="https://en.wikipedia.org/wiki/IBM_Watson">IBM Watson</a> from 2011. I excitedly watched the episode of Jeopardy where a computer faced off against two human contestants, and did extremely well. 
What boggled my mind was that Watson had so much knowledge about so many things, and was able to interpret English sentences and respond in mere seconds.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qTOK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee77d2e4-b299-4e8b-9e4b-b6252e3ca1aa_422x236.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qTOK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee77d2e4-b299-4e8b-9e4b-b6252e3ca1aa_422x236.jpeg 424w, https://substackcdn.com/image/fetch/$s_!qTOK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee77d2e4-b299-4e8b-9e4b-b6252e3ca1aa_422x236.jpeg 848w, https://substackcdn.com/image/fetch/$s_!qTOK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee77d2e4-b299-4e8b-9e4b-b6252e3ca1aa_422x236.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!qTOK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee77d2e4-b299-4e8b-9e4b-b6252e3ca1aa_422x236.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qTOK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee77d2e4-b299-4e8b-9e4b-b6252e3ca1aa_422x236.jpeg" width="422" height="236" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ee77d2e4-b299-4e8b-9e4b-b6252e3ca1aa_422x236.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:236,&quot;width&quot;:422,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;undefined&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="undefined" title="undefined" srcset="https://substackcdn.com/image/fetch/$s_!qTOK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee77d2e4-b299-4e8b-9e4b-b6252e3ca1aa_422x236.jpeg 424w, https://substackcdn.com/image/fetch/$s_!qTOK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee77d2e4-b299-4e8b-9e4b-b6252e3ca1aa_422x236.jpeg 848w, https://substackcdn.com/image/fetch/$s_!qTOK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee77d2e4-b299-4e8b-9e4b-b6252e3ca1aa_422x236.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!qTOK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee77d2e4-b299-4e8b-9e4b-b6252e3ca1aa_422x236.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Of course, ChatGPT has taken this one step further, being accessible to the general public. 
But I&#8217;ll never forget the first time I saw IBM Watson in action.</p><div><hr></div><h4>This Person Does Not Exist (2020)</h4><p>The most recent technology that blew my mind was <a href="https://thispersondoesnotexist.com/">This Person Does Not Exist</a>. You can literally spend hours looking for new friends! Give it a try!</p>]]></content:encoded></item><item><title><![CDATA[Software Naming— The Test of Time]]></title><description><![CDATA[Thoughts on how to choose better names in your software, so they don&#8217;t cause confusion in the longer term.]]></description><link>https://www.petersmith.net/p/software-naming-the-test-of-time-1f636bc309dd</link><guid isPermaLink="false">https://www.petersmith.net/p/software-naming-the-test-of-time-1f636bc309dd</guid><dc:creator><![CDATA[Peter Smith]]></dc:creator><pubDate>Mon, 12 Jul 2021 00:01:04 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/a2725126-8525-4eeb-b2a0-cb9fbc09b2d6_800x473.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!Yk1S!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F024d9c79-f36e-4869-ad1d-947504c56bb1_800x473.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Yk1S!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F024d9c79-f36e-4869-ad1d-947504c56bb1_800x473.png 424w, https://substackcdn.com/image/fetch/$s_!Yk1S!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F024d9c79-f36e-4869-ad1d-947504c56bb1_800x473.png 848w, https://substackcdn.com/image/fetch/$s_!Yk1S!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F024d9c79-f36e-4869-ad1d-947504c56bb1_800x473.png 1272w, https://substackcdn.com/image/fetch/$s_!Yk1S!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F024d9c79-f36e-4869-ad1d-947504c56bb1_800x473.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Yk1S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F024d9c79-f36e-4869-ad1d-947504c56bb1_800x473.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/024d9c79-f36e-4869-ad1d-947504c56bb1_800x473.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Yk1S!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F024d9c79-f36e-4869-ad1d-947504c56bb1_800x473.png 424w, https://substackcdn.com/image/fetch/$s_!Yk1S!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F024d9c79-f36e-4869-ad1d-947504c56bb1_800x473.png 848w, https://substackcdn.com/image/fetch/$s_!Yk1S!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F024d9c79-f36e-4869-ad1d-947504c56bb1_800x473.png 1272w, https://substackcdn.com/image/fetch/$s_!Yk1S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F024d9c79-f36e-4869-ad1d-947504c56bb1_800x473.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p>You&#8217;ll have a hard time finding an experienced software engineer who doesn&#8217;t have a preferred naming convention. This includes function or method names, classes, files, directories, projects, products, or potentially any concept that we talk about, or any identifier in our source code.</p><p>This blog post isn&#8217;t about typical naming conventions. 
We won&#8217;t discuss whether to use <code>camelCase</code> or <code>snake_case</code>, or whether an identifier should contain a noun or a verb. We won&#8217;t specify how to pluralize, or whether suffixing with <code>Manager</code> or <code>Util</code> is a bad idea. The Internet is full of blog posts covering those topics.</p><p>Instead, we&#8217;ll focus on the long-term effects of your naming choices. The identifiers you select today may still be in use 5&#8211;10 years from now. Most developers want their software to be used long term, yet they choose identifiers that are not resilient to the passage of time. This discussion is based on more than 20 years of experience seeing these issues in the software industry, so the problems are real!</p><p>What&#8217;s the impact of non-resilient names? To be honest, there&#8217;s plenty of other technical debt you could focus on first, but learning how to select a <em>resilient</em> name avoids confusion in the long term, once your product is successful and mature.</p><h3>What in the World Might&nbsp;Change?</h3><p>To illustrate the passage of time, we&#8217;ll use an example of a SaaS application. Imagine that <em>Kitty Incorporated</em> is a small software company with a single web-based product, <em>KittyPics</em>, allowing customers to upload and share their favourite cat photos.</p><p>Here are some events in the life of <em>Kitty Incorporated</em>:</p><ul><li><p><strong>Scaling Web Traffic</strong>&#8202;&#8212;&#8202;Initially <em>KittyPics</em> only required a single web server, with a single disk drive to store images. Due to the site&#8217;s growing popularity, there&#8217;s now a need for multiple web servers and multiple disks.</p></li><li><p><strong>Projects Come and Go</strong>&#8202;&#8212;&#8202;After successfully serving cat pictures, a new internal project (code-named <em>Tiger</em>) is started at <em>Kitty Inc</em>. 
The goal of the <em>Tiger</em> project is to allow uploading of cat videos, rather than just photos. The project is active for six months, with all code being added to the same <em>KittyPics</em> code base.</p></li><li><p><strong>More Flexibility</strong>&#8202;&#8212;&#8202;Although the original version of <em>KittyPics</em> did one thing (and did it well), countless new features and configuration settings were added over time. Users now have many ways to adjust their experience, requiring updated algorithms in the code.</p></li><li><p><strong>Changing Technology&#8202;</strong>&#8212;&#8202;The CTO decides that maintaining a large number of disk drives isn&#8217;t cost-effective, so a project is initiated to move all cat pictures/videos to Amazon Web Services using S3 buckets. Similarly, metadata storage is switched from MySQL to the more scalable DynamoDB database.</p></li><li><p><strong>Kitty Incorporated is Acquired&#8202;</strong>&#8212;&#8202;After successfully operating for two years, <em>Kitty Inc</em> is acquired by a competitor, <em>Cat Pictures Corporation</em>.</p></li><li><p><strong>KittyPics is Renamed&#8202;</strong>&#8212;&#8202;After careful market analysis, it&#8217;s determined that <em>KittyPics</em> should be renamed to a much trendier name: <em>SabreShots</em>.</p></li></ul><p>These events are common in a software product&#8217;s lifecycle. We&#8217;ll now focus on how these changes cause our identifiers to become outdated. 
If they&#8217;re outdated, the code becomes harder to read and more costly to maintain, therefore increasing the technical debt load.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TCkM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71857788-cc88-4376-a691-369e7c417663_250x354.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TCkM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71857788-cc88-4376-a691-369e7c417663_250x354.png 424w, https://substackcdn.com/image/fetch/$s_!TCkM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71857788-cc88-4376-a691-369e7c417663_250x354.png 848w, https://substackcdn.com/image/fetch/$s_!TCkM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71857788-cc88-4376-a691-369e7c417663_250x354.png 1272w, https://substackcdn.com/image/fetch/$s_!TCkM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71857788-cc88-4376-a691-369e7c417663_250x354.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TCkM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71857788-cc88-4376-a691-369e7c417663_250x354.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/71857788-cc88-4376-a691-369e7c417663_250x354.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TCkM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71857788-cc88-4376-a691-369e7c417663_250x354.png 424w, https://substackcdn.com/image/fetch/$s_!TCkM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71857788-cc88-4376-a691-369e7c417663_250x354.png 848w, https://substackcdn.com/image/fetch/$s_!TCkM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71857788-cc88-4376-a691-369e7c417663_250x354.png 1272w, https://substackcdn.com/image/fetch/$s_!TCkM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F71857788-cc88-4376-a691-369e7c417663_250x354.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>How Names Become Confusing or Inconsistent</h3><p>Let&#8217;s look at some real-world examples of how the names we choose can become confusing or inconsistent, largely due to the passage of time. For each example, we&#8217;ll see a naming choice that made sense when the software was first written, but stopped making sense at a later time. 
We&#8217;ll also see alternative names that are more resilient to change.</p><h4>1&#8202;&#8212;&#8202;Expect New Implementations, but Avoid Calling Them&nbsp;&#8220;New&#8221;</h4><p>Let&#8217;s start with a simple example. Imagine our <em>KittyPics</em> software contains an algorithm for automatically positioning cat pictures on the screen. The algorithm has worked well, but we want to add a second algorithm, while still keeping the first as an optional feature.</p><p>Suppose we started with the following code:</p><pre><code>def layoutAlgorithm { ... }</code></pre><p>There&#8217;s a natural tendency to define the second algorithm as:</p><pre><code>def newLayoutAlgorithm { ... }</code></pre><p>This makes sense during the transition from the old algorithm to the new one, but the word <code>new</code> becomes less relevant over time. After a few years, developers consider both implementations &#8220;old&#8221;, and there&#8217;s no hint about what the algorithms actually do. Also, if a third implementation were added, would it be called <code>muchNewerLayoutAlgorithm</code>?</p><p>A better practice is to name the algorithms in a descriptive way to indicate how they differ. For example:</p><pre><code>def gridLayoutAlgorithm { ... }
def circularLayoutAlgorithm { ... }</code></pre><p>These names remain relevant as time passes, since their purpose is well-described, even years after their &#8220;new-ness&#8221; fades.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7t_B!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe06ded2a-5411-4e6d-b5b3-58408382677f_185x191.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7t_B!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe06ded2a-5411-4e6d-b5b3-58408382677f_185x191.png 424w, https://substackcdn.com/image/fetch/$s_!7t_B!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe06ded2a-5411-4e6d-b5b3-58408382677f_185x191.png 848w, https://substackcdn.com/image/fetch/$s_!7t_B!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe06ded2a-5411-4e6d-b5b3-58408382677f_185x191.png 1272w, https://substackcdn.com/image/fetch/$s_!7t_B!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe06ded2a-5411-4e6d-b5b3-58408382677f_185x191.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7t_B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe06ded2a-5411-4e6d-b5b3-58408382677f_185x191.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e06ded2a-5411-4e6d-b5b3-58408382677f_185x191.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7t_B!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe06ded2a-5411-4e6d-b5b3-58408382677f_185x191.png 424w, https://substackcdn.com/image/fetch/$s_!7t_B!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe06ded2a-5411-4e6d-b5b3-58408382677f_185x191.png 848w, https://substackcdn.com/image/fetch/$s_!7t_B!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe06ded2a-5411-4e6d-b5b3-58408382677f_185x191.png 1272w, https://substackcdn.com/image/fetch/$s_!7t_B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe06ded2a-5411-4e6d-b5b3-58408382677f_185x191.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h4>2&#8202;&#8212;&#8202;Remember That Names Last Longer Than&nbsp;Projects</h4><p><em>Projects</em> and <em>Products</em> differ in significant ways. <em>Projects</em> are time-bounded (for example, six months in duration) and have the primary purpose of adding a discrete set of features to an existing code base. 
In contrast, a <em>Product</em> has a much longer life cycle, possibly existing for 5&#8211;10 years, or even longer.</p><p>It&#8217;s important to keep in mind that any name you choose in your software will last beyond the scope of the current project. In our example of the <em>Tiger</em> project (to add cat videos to the <em>KittyPics</em> product), any use of the name <code>Tiger</code> becomes increasingly meaningless after the project finishes.</p><pre><code>module Tiger
def tigerAlgorithm {
   count = getTigerCount()
   ...
}</code></pre><p>Newly-hired developers will struggle to wrap their minds around what <code>Tiger</code> means, especially as the Tiger project is now ancient history.</p><p>Instead, use names that are meaningful after the project has completed, when the newly-added features have simply become part of the long-term product code base. For example:</p><pre><code>module VideoManagement
def videoPlacementAlgorithm {
  count = getVideoCount()
  ...
}</code></pre><p>This advice may seem obvious, but it&#8217;s very common to see modules, classes, algorithms, databases, or DNS names with project names embedded into them. With the passage of time, code bases are littered with temporary project names in their identifiers.</p><p>One example of this problem is using the project&#8217;s name to identify newly-created software modules. For example, <code>Tiger</code> may become the internal code name for the new software module that handles videos. Over time, people think of the Tiger module, rather than the Tiger project, when they hear the name <code>Tiger</code>, even if the original project had a much wider scope.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Hk5S!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda94ed18-d646-4796-9754-c1f5baccf771_190x182.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Hk5S!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda94ed18-d646-4796-9754-c1f5baccf771_190x182.png 424w, https://substackcdn.com/image/fetch/$s_!Hk5S!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda94ed18-d646-4796-9754-c1f5baccf771_190x182.png 848w, https://substackcdn.com/image/fetch/$s_!Hk5S!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda94ed18-d646-4796-9754-c1f5baccf771_190x182.png 1272w, https://substackcdn.com/image/fetch/$s_!Hk5S!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda94ed18-d646-4796-9754-c1f5baccf771_190x182.png 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Hk5S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda94ed18-d646-4796-9754-c1f5baccf771_190x182.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/da94ed18-d646-4796-9754-c1f5baccf771_190x182.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Hk5S!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda94ed18-d646-4796-9754-c1f5baccf771_190x182.png 424w, https://substackcdn.com/image/fetch/$s_!Hk5S!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda94ed18-d646-4796-9754-c1f5baccf771_190x182.png 848w, https://substackcdn.com/image/fetch/$s_!Hk5S!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda94ed18-d646-4796-9754-c1f5baccf771_190x182.png 1272w, https://substackcdn.com/image/fetch/$s_!Hk5S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda94ed18-d646-4796-9754-c1f5baccf771_190x182.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h4>3&#8202;&#8212;&#8202;Consider That Technology Changes, but Names Don&#8217;t Need&nbsp;To</h4><p>Throughout a product&#8217;s 
lifespan, it&#8217;s common to change the underlying technology used. For example, our <em>KittyPics</em> product moved from using MySQL as the primary database, to using the more scalable DynamoDB. Ideally, this change should be transparent to most of the software, but that&#8217;s not always the case.</p><p>In the first implementation of KittyPics it made sense to write code such as:</p><pre><code>mySql.getImageCount()</code></pre><p>but that clearly doesn&#8217;t make sense when you switch to using DynamoDB. Instead, use a more generic name for your data store, such as:</p><pre><code>metaData.getImageCount() </code></pre><p>You&#8217;ll still need to modify the internal code for the <code>getImageCount()</code> method, but if you choose your names more carefully, you won&#8217;t need to modify the code everywhere the <code>getImageCount()</code> method is called.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kKCn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29897642-0dce-441e-a55b-c6bef0016eb7_190x208.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kKCn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29897642-0dce-441e-a55b-c6bef0016eb7_190x208.png 424w, https://substackcdn.com/image/fetch/$s_!kKCn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29897642-0dce-441e-a55b-c6bef0016eb7_190x208.png 848w, https://substackcdn.com/image/fetch/$s_!kKCn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29897642-0dce-441e-a55b-c6bef0016eb7_190x208.png 1272w, 
https://substackcdn.com/image/fetch/$s_!kKCn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29897642-0dce-441e-a55b-c6bef0016eb7_190x208.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kKCn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29897642-0dce-441e-a55b-c6bef0016eb7_190x208.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/29897642-0dce-441e-a55b-c6bef0016eb7_190x208.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kKCn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29897642-0dce-441e-a55b-c6bef0016eb7_190x208.png 424w, https://substackcdn.com/image/fetch/$s_!kKCn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29897642-0dce-441e-a55b-c6bef0016eb7_190x208.png 848w, https://substackcdn.com/image/fetch/$s_!kKCn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29897642-0dce-441e-a55b-c6bef0016eb7_190x208.png 1272w, 
https://substackcdn.com/image/fetch/$s_!kKCn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29897642-0dce-441e-a55b-c6bef0016eb7_190x208.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h4>4&#8202;&#8212;&#8202;Marketing Names Can Change Outside Your&nbsp;Control</h4><p>The name of your product (or the domain objects within the product) can change due to marketing or sales decisions. In our example, <em>KittyPics</em> was rebranded to <em>SabreShots</em>, with the goal of increasing sales. Likewise, names of domain objects within the product (such as <code>Collection</code> or <code>Image</code>) may change to something new (such as <code>Album</code> or <code>Photo</code>). Experience shows that products can go through 2&#8211;3 name changes in a ten-year span, largely due to corporate acquisitions or rebranding.</p><p>To solve the problem, you might choose to go through a major refactoring exercise to keep the internal code base aligned with the latest external names. This makes sense for small code bases, but for larger products it&#8217;ll only be a source of frustration and newly-introduced bugs.</p><p>Also, be very cautious about mixing the old name with the new name in the same code base (for example, passing a <code>Photo</code> object into a method expecting an <code>Image</code> object, even if they&#8217;re the same thing). Intermixing the old and new names can be very confusing for developers, especially when they&#8217;re developing in the code base for the first time.</p><p>Perhaps the best advice is to avoid adapting the code base at all, but instead be comfortable with internal names being different from external names. That is, always use the original names of <code>Collection</code> or <code>Image</code>, regardless of the current customer-facing name. 
Of course, developers need to remember the mapping from internal name to external name, but at least the code will be consistent and easier to work with.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!A5QO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63df87f3-e63e-43c5-9b43-a5391221d826_204x180.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!A5QO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63df87f3-e63e-43c5-9b43-a5391221d826_204x180.png 424w, https://substackcdn.com/image/fetch/$s_!A5QO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63df87f3-e63e-43c5-9b43-a5391221d826_204x180.png 848w, https://substackcdn.com/image/fetch/$s_!A5QO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63df87f3-e63e-43c5-9b43-a5391221d826_204x180.png 1272w, https://substackcdn.com/image/fetch/$s_!A5QO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63df87f3-e63e-43c5-9b43-a5391221d826_204x180.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!A5QO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63df87f3-e63e-43c5-9b43-a5391221d826_204x180.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/63df87f3-e63e-43c5-9b43-a5391221d826_204x180.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!A5QO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63df87f3-e63e-43c5-9b43-a5391221d826_204x180.png 424w, https://substackcdn.com/image/fetch/$s_!A5QO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63df87f3-e63e-43c5-9b43-a5391221d826_204x180.png 848w, https://substackcdn.com/image/fetch/$s_!A5QO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63df87f3-e63e-43c5-9b43-a5391221d826_204x180.png 1272w, https://substackcdn.com/image/fetch/$s_!A5QO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63df87f3-e63e-43c5-9b43-a5391221d826_204x180.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h4>5&#8202;&#8212;&#8202;Your Company Name Can Also&nbsp;Change</h4><p>It probably comes as no surprise, but it&#8217;s good advice to not include your company&#8217;s name in source code identifiers. 
Mergers and acquisitions are common, and acquired company names disappear into history.</p><p>Regardless of whether your company name changes, you should always think twice before using it in identifiers. What value does it add to have <code>Kitty</code> in these names? You probably want more descriptive names than these:</p><pre><code>kittyCollectionCount = 100
def displayKittyImage() { ... }
kittyDb.executeSql(...)</code></pre><p>Of course, DNS names are an obvious place where the company name is required, so it&#8217;s not a hard rule. Interestingly, many smaller companies sell only a single product, so their company name and product name end up being synonymous, with a lot of work involved to break that association as the company grows. Was <code>kitty.com</code> the product, or the company?</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!q3Un!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4c33b8c-5f54-4d27-92cd-d4820ea8db18_216x182.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!q3Un!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4c33b8c-5f54-4d27-92cd-d4820ea8db18_216x182.png 424w, https://substackcdn.com/image/fetch/$s_!q3Un!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4c33b8c-5f54-4d27-92cd-d4820ea8db18_216x182.png 848w, https://substackcdn.com/image/fetch/$s_!q3Un!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4c33b8c-5f54-4d27-92cd-d4820ea8db18_216x182.png 1272w, https://substackcdn.com/image/fetch/$s_!q3Un!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4c33b8c-5f54-4d27-92cd-d4820ea8db18_216x182.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!q3Un!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4c33b8c-5f54-4d27-92cd-d4820ea8db18_216x182.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b4c33b8c-5f54-4d27-92cd-d4820ea8db18_216x182.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!q3Un!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4c33b8c-5f54-4d27-92cd-d4820ea8db18_216x182.png 424w, https://substackcdn.com/image/fetch/$s_!q3Un!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4c33b8c-5f54-4d27-92cd-d4820ea8db18_216x182.png 848w, https://substackcdn.com/image/fetch/$s_!q3Un!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4c33b8c-5f54-4d27-92cd-d4820ea8db18_216x182.png 1272w, https://substackcdn.com/image/fetch/$s_!q3Un!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4c33b8c-5f54-4d27-92cd-d4820ea8db18_216x182.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h4>6&#8202;&#8212;&#8202;Avoid Being Too Specific About a&nbsp;Purpose</h4><p>When implementing code, there&#8217;s a tendency to focus on solving the specific problem you have at that moment. However, as the product changes, there&#8217;s a desire to reuse code you wrote in the past. 
Refactoring has become a common activity for most developers, including the selection of more suitable names.</p><p>For example, imagine you had written a <code>PhotoArranger</code> class, providing the ability for users to manually change the order of the photos on the product home page. However, now that the home page also supports videos, the <code>PhotoArranger</code> name becomes confusing. One solution would be to rename the class to <code>PhotoAndVideoArranger</code>, but that gets even more problematic when the ability to add static text and hyperlinks is added to this class (should we now call it <code>PhotoAndVideoAndStaticTextWithHyperlinksArranger</code>&nbsp;?)</p><p>Perhaps the original name should have been something less specific, making it more flexible and resilient to change. Perhaps <code>HomePageArranger</code>? However, something like <code>MediaLayoutArranger</code> would be even more generic, allowing it to be used on other pages (not just the home page).</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VkKO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc17b5e42-f2c6-443b-901c-a9b4fe787752_190x151.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VkKO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc17b5e42-f2c6-443b-901c-a9b4fe787752_190x151.png 424w, https://substackcdn.com/image/fetch/$s_!VkKO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc17b5e42-f2c6-443b-901c-a9b4fe787752_190x151.png 848w, 
https://substackcdn.com/image/fetch/$s_!VkKO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc17b5e42-f2c6-443b-901c-a9b4fe787752_190x151.png 1272w, https://substackcdn.com/image/fetch/$s_!VkKO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc17b5e42-f2c6-443b-901c-a9b4fe787752_190x151.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VkKO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc17b5e42-f2c6-443b-901c-a9b4fe787752_190x151.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c17b5e42-f2c6-443b-901c-a9b4fe787752_190x151.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VkKO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc17b5e42-f2c6-443b-901c-a9b4fe787752_190x151.png 424w, https://substackcdn.com/image/fetch/$s_!VkKO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc17b5e42-f2c6-443b-901c-a9b4fe787752_190x151.png 848w, 
https://substackcdn.com/image/fetch/$s_!VkKO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc17b5e42-f2c6-443b-901c-a9b4fe787752_190x151.png 1272w, https://substackcdn.com/image/fetch/$s_!VkKO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc17b5e42-f2c6-443b-901c-a9b4fe787752_190x151.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h4>7&#8202;&#8212;&#8202;You&#8217;ll Likely Have More Than&nbsp;One</h4><p>Software is often written with time-to-market as a driving factor, rather than worrying about long-term growth (which may never happen if you don&#8217;t get to market quickly). One downside of this approach is that software is designed to support <em>only one</em> of a particular component, rather than many.</p><p>In our <em>KittyPics</em> example, the following components needed to go from one instance to many instances:</p><ul><li><p>The number of web application servers running the software.</p></li><li><p>The number of disks used for storing image files.</p></li><li><p>The number of countries in which servers are running.</p></li><li><p>The number of databases required for storing meta-information.</p></li></ul><p>Aside from all the software design decisions you&#8217;ll need to make (such as using an array of server names, rather than a single server name), there are naming issues too.</p><p>For example, for API server names, you&#8217;d probably start with:</p><pre><code>api.kitty.com</code></pre><p>but when you need to support servers in multiple countries, what do you call them? The common solution is:</p><pre><code>api.kitty.com
api-nz.kitty.com
api-uk.kitty.com
...</code></pre><p>This example is certainly on the low end of the technical debt scale, but it does look odd that there&#8217;s no country tag on the first entry, and it might cause confusion in your automated deployment scripts. To avoid these scenarios, you should keep expansion in mind and add a descriptive tag to the first DNS name you use. In the case of countable things, add a <code>-1</code> to the name by default, such as <code>api-1.kitty.com</code>.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2cGT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02d0365f-7cab-46f1-970d-46d48676fe88_190x203.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2cGT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02d0365f-7cab-46f1-970d-46d48676fe88_190x203.png 424w, https://substackcdn.com/image/fetch/$s_!2cGT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02d0365f-7cab-46f1-970d-46d48676fe88_190x203.png 848w, https://substackcdn.com/image/fetch/$s_!2cGT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02d0365f-7cab-46f1-970d-46d48676fe88_190x203.png 1272w, https://substackcdn.com/image/fetch/$s_!2cGT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02d0365f-7cab-46f1-970d-46d48676fe88_190x203.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!2cGT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02d0365f-7cab-46f1-970d-46d48676fe88_190x203.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/02d0365f-7cab-46f1-970d-46d48676fe88_190x203.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2cGT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02d0365f-7cab-46f1-970d-46d48676fe88_190x203.png 424w, https://substackcdn.com/image/fetch/$s_!2cGT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02d0365f-7cab-46f1-970d-46d48676fe88_190x203.png 848w, https://substackcdn.com/image/fetch/$s_!2cGT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02d0365f-7cab-46f1-970d-46d48676fe88_190x203.png 1272w, https://substackcdn.com/image/fetch/$s_!2cGT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02d0365f-7cab-46f1-970d-46d48676fe88_190x203.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Summary&#8202;&#8212;&#8202;How Much Should You&nbsp;Care?</h3><p>In this blog post we&#8217;ve seen a number of different ways in which names fail to be 
resilient over time, therefore becoming stale, confusing, or inconsistent.</p><p>What should you do to fix these problems?</p><p>To be honest, although these problems occur in numerous products, in almost every company, people always manage to adjust to the inconsistencies. There&#8217;s confusion when internal names don&#8217;t match external names, especially for new employees who are wrapping their minds around a new code base. However, given enough time, people always seem to overcome the learning curve and just live with the inconsistencies.</p><p>Should we refactor our product to resolve these inconsistencies?</p><p>That decision is totally up to you, based on how much effort it&#8217;ll be to fix the problem versus how much pain it&#8217;ll be to live with it. This is the ever-present <em>technical debt</em> problem we all face with legacy code. Refactoring an internal class name is a simple and pragmatic improvement, but changing a customer-facing DNS name is a challenge you might choose to ignore.</p><p>Perhaps the best advice is to try to use meaningful names when you first write the code. However, no matter how much effort you put into choosing names for the long term, you&#8217;ll never quite get it right. 
In the end, you must be prepared to accept and ignore the remaining problems.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LHDE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ba7337c-218f-4770-b116-4d7321a34a75_236x308.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LHDE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ba7337c-218f-4770-b116-4d7321a34a75_236x308.png 424w, https://substackcdn.com/image/fetch/$s_!LHDE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ba7337c-218f-4770-b116-4d7321a34a75_236x308.png 848w, https://substackcdn.com/image/fetch/$s_!LHDE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ba7337c-218f-4770-b116-4d7321a34a75_236x308.png 1272w, https://substackcdn.com/image/fetch/$s_!LHDE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ba7337c-218f-4770-b116-4d7321a34a75_236x308.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LHDE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ba7337c-218f-4770-b116-4d7321a34a75_236x308.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0ba7337c-218f-4770-b116-4d7321a34a75_236x308.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LHDE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ba7337c-218f-4770-b116-4d7321a34a75_236x308.png 424w, https://substackcdn.com/image/fetch/$s_!LHDE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ba7337c-218f-4770-b116-4d7321a34a75_236x308.png 848w, https://substackcdn.com/image/fetch/$s_!LHDE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ba7337c-218f-4770-b116-4d7321a34a75_236x308.png 1272w, https://substackcdn.com/image/fetch/$s_!LHDE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ba7337c-218f-4770-b116-4d7321a34a75_236x308.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div>]]></content:encoded></item><item><title><![CDATA[Fast Authorization with DynamoDB]]></title><description><![CDATA[Using AWS DynamoDB to manage access to a SaaS platform&#8217;s domain objects.]]></description><link>https://www.petersmith.net/p/fast-authorization-with-dynamodb-cd1f133437e3</link><guid 
isPermaLink="false">https://www.petersmith.net/p/fast-authorization-with-dynamodb-cd1f133437e3</guid><dc:creator><![CDATA[Peter Smith]]></dc:creator><pubDate>Mon, 14 Jun 2021 12:37:10 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/704d1720-9546-4b07-b3df-2cde67440e11_800x492.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uJiK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45e0abcf-d809-46c9-a5f0-24b23bf199fc_800x492.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uJiK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45e0abcf-d809-46c9-a5f0-24b23bf199fc_800x492.png 424w, https://substackcdn.com/image/fetch/$s_!uJiK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45e0abcf-d809-46c9-a5f0-24b23bf199fc_800x492.png 848w, https://substackcdn.com/image/fetch/$s_!uJiK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45e0abcf-d809-46c9-a5f0-24b23bf199fc_800x492.png 1272w, https://substackcdn.com/image/fetch/$s_!uJiK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45e0abcf-d809-46c9-a5f0-24b23bf199fc_800x492.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uJiK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45e0abcf-d809-46c9-a5f0-24b23bf199fc_800x492.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/45e0abcf-d809-46c9-a5f0-24b23bf199fc_800x492.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uJiK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45e0abcf-d809-46c9-a5f0-24b23bf199fc_800x492.png 424w, https://substackcdn.com/image/fetch/$s_!uJiK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45e0abcf-d809-46c9-a5f0-24b23bf199fc_800x492.png 848w, https://substackcdn.com/image/fetch/$s_!uJiK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45e0abcf-d809-46c9-a5f0-24b23bf199fc_800x492.png 1272w, https://substackcdn.com/image/fetch/$s_!uJiK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45e0abcf-d809-46c9-a5f0-24b23bf199fc_800x492.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p>This blog post describes a high-performance micro-service we created at <a href="https://www.wegalvanize.com/">Galvanize</a> to meet the authorization needs of our SaaS platform. 
This new micro-service, code-named &#8220;Authy&#8221; (not related to the commercial product of the same name), provides platform-wide management of users and groups, while tracking permissions assigned to all the platform&#8217;s domain objects. Amazon&#8217;s DynamoDB is used to manage the authorization data in our platform, typically allowing authorization decisions to be made within 20&#8211;30&nbsp;ms.</p><p>In the past, authorization was handled separately by each service, so different parts of our SaaS platform solved the problem in different ways. With our new Authy service, we&#8217;ve unified the approach, providing a consistent model for controlling access to resources. This was especially important as we added new micro-services and didn&#8217;t want each of them to solve the same challenges in its own unique way.</p><p>The following diagram shows the authorization workflow. First, a user requests some data from our platform, typically asking for an HTML page or a JSON response to an API call. The backend services each call upon Authy to validate that the current user has access to the resource they&#8217;re requesting. 
Based on Authy&#8217;s response, the calling service (Authy&#8217;s client) either approves or rejects the request.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cflg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a5c9321-862e-419e-a7ac-349873e45602_800x552.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cflg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a5c9321-862e-419e-a7ac-349873e45602_800x552.png 424w, https://substackcdn.com/image/fetch/$s_!cflg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a5c9321-862e-419e-a7ac-349873e45602_800x552.png 848w, https://substackcdn.com/image/fetch/$s_!cflg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a5c9321-862e-419e-a7ac-349873e45602_800x552.png 1272w, https://substackcdn.com/image/fetch/$s_!cflg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a5c9321-862e-419e-a7ac-349873e45602_800x552.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cflg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a5c9321-862e-419e-a7ac-349873e45602_800x552.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5a5c9321-862e-419e-a7ac-349873e45602_800x552.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cflg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a5c9321-862e-419e-a7ac-349873e45602_800x552.png 424w, https://substackcdn.com/image/fetch/$s_!cflg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a5c9321-862e-419e-a7ac-349873e45602_800x552.png 848w, https://substackcdn.com/image/fetch/$s_!cflg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a5c9321-862e-419e-a7ac-349873e45602_800x552.png 1272w, https://substackcdn.com/image/fetch/$s_!cflg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5a5c9321-862e-419e-a7ac-349873e45602_800x552.png 1456w" sizes="100vw"></picture><div></div></div></a></figure></div><p>Note that Authy provides <em>Authorization</em> services, but doesn&#8217;t handle <em>Authentication</em>. All HTTP requests from the user are accompanied by a JWT token confirming the user&#8217;s identity. Authy then determines whether the resource being accessed (e.g. 
<code>/projects/1234</code>) is available to the user.</p><h3>The Requirements</h3><p>To understand how Authy works, let&#8217;s start by learning more about our SaaS product. Galvanize operates in the GRC (Governance, Risk management, and Compliance) space, helping companies be fiscally and socially responsible. This includes fighting accounting fraud, and detecting corporate waste.</p><p>From a software perspective, our product is similar to an ERP or CRM solution. It has a beautiful user experience via the web interface, with well-defined REST APIs for programmatic access to data. Multiple tenants access the same platform, but can only see their own organization&#8217;s data. Finally, each of our customers potentially has hundreds of users, so we have a powerful user-management system with complex user-defined workflows.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mPDm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05512a4b-04c3-493b-a4c5-de1762ecdc03_563x409.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mPDm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05512a4b-04c3-493b-a4c5-de1762ecdc03_563x409.png 424w, https://substackcdn.com/image/fetch/$s_!mPDm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05512a4b-04c3-493b-a4c5-de1762ecdc03_563x409.png 848w, https://substackcdn.com/image/fetch/$s_!mPDm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05512a4b-04c3-493b-a4c5-de1762ecdc03_563x409.png 1272w, 
https://substackcdn.com/image/fetch/$s_!mPDm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05512a4b-04c3-493b-a4c5-de1762ecdc03_563x409.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mPDm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05512a4b-04c3-493b-a4c5-de1762ecdc03_563x409.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/05512a4b-04c3-493b-a4c5-de1762ecdc03_563x409.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mPDm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05512a4b-04c3-493b-a4c5-de1762ecdc03_563x409.png 424w, https://substackcdn.com/image/fetch/$s_!mPDm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05512a4b-04c3-493b-a4c5-de1762ecdc03_563x409.png 848w, https://substackcdn.com/image/fetch/$s_!mPDm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05512a4b-04c3-493b-a4c5-de1762ecdc03_563x409.png 1272w, 
https://substackcdn.com/image/fetch/$s_!mPDm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05512a4b-04c3-493b-a4c5-de1762ecdc03_563x409.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Let&#8217;s now learn about the GRC domain model, as well as how users and groups are granted permission to access the domain objects (aka &#8220;resources&#8221;).</p><h4>The Domain&nbsp;Model</h4><p>Our SaaS product has a large number of <em>Domain Objects</em>, representing items of interest to the customer. The most important top-level domain objects include:</p><ul><li><p><strong>Project</strong>&#8202;&#8212;&#8202;Represents a body of work performed by a user. This includes a schedule for undertaking the project work, <em><strong>Risks</strong></em> identified within the project, <em><strong>Controls</strong></em> to help manage the Risks, <em><strong>Issues</strong></em> identified, and <em><strong>Actions</strong></em> to resolve the Issues.</p></li><li><p><strong>Asset</strong>&#8202;&#8212;&#8202;Represents something of interest to the company, such as a third-party vendor, an IT server, or a piece of software.</p></li><li><p><strong>Collection</strong>&#8202;&#8212;&#8202;A collection of data tables, each containing many rows of data. For example, the access log from a security system showing when people entered/exited a building.</p></li><li><p><strong>Toolkit</strong>&#8202;&#8212;&#8202;A template of product content, such as a pre-defined project with typical risks and controls already defined for common scenarios.</p></li></ul><p>Inside each of these top-level domain objects is a large number of fine-grained domain objects. 
For example, within a Project, a user would add Risks, Controls, and Issues, each with their own ID number.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!soFC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F559fe165-2925-4eb7-a2f1-179346a6b480_800x674.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!soFC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F559fe165-2925-4eb7-a2f1-179346a6b480_800x674.png 424w, https://substackcdn.com/image/fetch/$s_!soFC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F559fe165-2925-4eb7-a2f1-179346a6b480_800x674.png 848w, https://substackcdn.com/image/fetch/$s_!soFC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F559fe165-2925-4eb7-a2f1-179346a6b480_800x674.png 1272w, https://substackcdn.com/image/fetch/$s_!soFC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F559fe165-2925-4eb7-a2f1-179346a6b480_800x674.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!soFC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F559fe165-2925-4eb7-a2f1-179346a6b480_800x674.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/559fe165-2925-4eb7-a2f1-179346a6b480_800x674.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!soFC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F559fe165-2925-4eb7-a2f1-179346a6b480_800x674.png 424w, https://substackcdn.com/image/fetch/$s_!soFC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F559fe165-2925-4eb7-a2f1-179346a6b480_800x674.png 848w, https://substackcdn.com/image/fetch/$s_!soFC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F559fe165-2925-4eb7-a2f1-179346a6b480_800x674.png 1272w, https://substackcdn.com/image/fetch/$s_!soFC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F559fe165-2925-4eb7-a2f1-179346a6b480_800x674.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>The other top-level domain objects, such as Assets, Collections, or Toolkits, are also container-like, holding their own fine-grained domain objects.</p><h4>Users, Groups, and Organizations</h4><p>Our SaaS platform supports multiple tenants (aka Organizations), where each customer&#8217;s data is fully isolated from that of other customers, even though they reside in the same 
database. All database queries include an <code>orgId</code> field to ensure the correct data is returned.</p><p>Within each organization there are multiple <em><strong>Users</strong></em>, which for large customers will number in the thousands. Most customers make use of <em><strong>Groups</strong></em> to help manage those users.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GUDj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd69ad904-f0f8-4ae8-b9ec-1af97c019103_800x600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GUDj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd69ad904-f0f8-4ae8-b9ec-1af97c019103_800x600.png 424w, https://substackcdn.com/image/fetch/$s_!GUDj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd69ad904-f0f8-4ae8-b9ec-1af97c019103_800x600.png 848w, https://substackcdn.com/image/fetch/$s_!GUDj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd69ad904-f0f8-4ae8-b9ec-1af97c019103_800x600.png 1272w, https://substackcdn.com/image/fetch/$s_!GUDj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd69ad904-f0f8-4ae8-b9ec-1af97c019103_800x600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GUDj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd69ad904-f0f8-4ae8-b9ec-1af97c019103_800x600.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d69ad904-f0f8-4ae8-b9ec-1af97c019103_800x600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GUDj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd69ad904-f0f8-4ae8-b9ec-1af97c019103_800x600.png 424w, https://substackcdn.com/image/fetch/$s_!GUDj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd69ad904-f0f8-4ae8-b9ec-1af97c019103_800x600.png 848w, https://substackcdn.com/image/fetch/$s_!GUDj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd69ad904-f0f8-4ae8-b9ec-1af97c019103_800x600.png 1272w, https://substackcdn.com/image/fetch/$s_!GUDj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd69ad904-f0f8-4ae8-b9ec-1af97c019103_800x600.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Users and groups can both be assigned permissions to access various domain objects in the system. For example, User John may be assigned write access to a specific Project, whereas the Finance Group may only have read access. 
Note that Administrators are a special class of Users who automatically have access to everything.</p><h4>Domain Objects are Containers for Permissions</h4><p>When it comes to granting permissions for users or groups, this can only be done on top-level domain objects (such as Projects) which are containers for fine-grained objects, such as Risks or Controls. It is not possible, nor is it desirable, to grant permission on fine-grained objects.</p><p>This is enforced for several reasons:</p><ol><li><p>Experience shows that when customers can grant user permissions on every domain object, no matter how small, they end up with a lot of support challenges when permissions become confusing to manage.</p></li><li><p>We have billions of fine-grained domain objects in the system, which would explode the number of permissions that Authy must manage.</p></li><li><p>Likewise, the number of calls from a backend service to Authy would be excessive if every domain object must be authorized.</p></li></ol><p>Therefore, we chose to only allow permissions on top-level domain objects, while requiring fine-grained objects to inherit permissions from their parent container. 
To make this work in a practical way, the permissions provided on the top-level objects are very expressive.</p><p>For example, let&#8217;s imagine that John has the following permissions, set on a specific Project:</p><ul><li><p><code>CAN_CREATE_RISKS</code>&#8202;&#8212;&#8202;This allows John to create a new (fine-grained) Risk object, but only in the context of this specific Project.</p></li><li><p><code>CAN_CREATE_ISSUES</code>&#8202;&#8212;&#8202;Similar to the previous example, John can also create Issue objects inside this Project.</p></li></ul><p>Note that John cannot create other fine-grained objects, such as Controls or Actions, since he doesn&#8217;t have those permissions on the parent Project.</p><p>Next, John may also have permissions to <em>read</em> the existing domain objects in the Project:</p><ul><li><p><code>CAN_READ_RISKS</code>&#8202;&#8212;&#8202;This permits John to read the content of any Risk object in the Project.</p></li><li><p><code>CAN_READ_ISSUE_IF_OWNER</code>&#8202;&#8212;&#8202;This allows John to read Issues in the Project, but only if his name appears in the <code>owner</code> field of the Issue. He doesn&#8217;t have permission to read Issues owned by anyone else.</p></li></ul><p>This second example illustrates how permissions become more fine-grained, while still only being set on the Project itself. Once we&#8217;ve moved fully over to Authy, we anticipate having hundreds of possible permissions. At a minimum, this includes create, read, update, delete, and list permissions for each of the possible fine-grained objects.</p><p>Now that we understand the domain objects we&#8217;re authorizing access to, let&#8217;s learn about the queries that Authy must respond to.</p><h3>The Queries</h3><p>An important goal for Authy is to respond to queries within 20&#8211;30ms, so as to avoid negative impact on the end-user response time. 
Authy&#8217;s algorithms were optimized not only to reduce the run time of a single query, but also to reduce the number of calls necessary to validate an incoming customer request.</p><h4>Common Queries</h4><p>The following are typical queries to Authy, based on the type of information a backend service needs when authorizing an incoming request. We did a comprehensive review of our platform&#8217;s authorization requirements to ensure that Authy would cover them all.</p><p>The common queries are:</p><ol><li><p><strong>Give me the personal details for a user&#8202;</strong>&#8212;&#8202;This is needed throughout the platform when displaying a user&#8217;s name or sending them email.</p></li><li><p><strong>Give me the list of groups in this organization, and their members</strong>&#8202;&#8212;&#8202;This is used for display purposes only, not for authorization.</p></li><li><p><strong>Give me the list of groups that this user belongs to</strong>&#8202;&#8212;&#8202;Also used for display purposes only, not for authorization.</p></li><li><p><strong>Can this user perform this operation on this resource (aka domain object)?</strong>&#8202;&#8212;&#8202;This is a very common question, accounting for most of the queries to Authy. The user&#8217;s ID comes from the JWT of the original HTTP request, and the resource ID comes from the URL of the request (such as <code>/projects/1234</code>). Finally, the necessary permission (for example, <code>CAN_READ_PROJECT</code>) depends on the semantics of the HTTP operation, such as <code>GET</code> versus <code>PUT</code>.</p></li><li><p><strong>What operations can this user perform on this resource?</strong>&#8202;&#8212;&#8202;This is similar to the previous query, but rather than providing a single yes/no answer, Authy provides the complete set of permissions a user has for that resource (such as <code>[ CAN_READ_PROJECT, CAN_UPDATE_PROJECT, CAN_READ_RISKS,&nbsp;&#8230;]</code>). 
This reduces the need to send multiple requests to inquire about multiple permissions.</p></li><li><p><strong>Which users in the organization can perform this operation on this resource?</strong>&#8202;&#8212;&#8202;This is not used as much as the previous queries, but is very useful to identify users in the organization with a base set of permissions. For example, to populate a drop-down list of users who could be asked to review an Issue object, we&#8217;ll ask Authy for all users who have the <code>CAN_REVIEW_ISSUE</code> permission. There&#8217;s no point in assigning a user to review something if they don&#8217;t have permission to do so.</p></li><li><p><strong>Which resources can this user perform this operation on?</strong>&#8202;&#8212;&#8202;This query is very important for listing the objects a user can act upon. For example, when the user requests a list of Projects, they should only see the Projects they have <code>CAN_READ_PROJECT</code> permission for. Other Projects should not be listed.</p></li></ol><h4>Queries That Shouldn&#8217;t Be&nbsp;Asked</h4><p>When designing Authy, we identified several queries that were commonly asked, but ended up being anti-patterns.</p><ol><li><p><strong>Is the user an administrator?</strong>&#8202;&#8212;&#8202;In our platform, administrators have the power to perform any operation on any resource in the system. However, instead of explicitly asking Authy whether the user is an administrator, the query should only ask about the specific permission, such as <code>CAN_DELETE_PROJECT</code>. If Authy responds with &#8220;Yes&#8221;, then it doesn&#8217;t matter whether the user is an administrator, or whether they got the permission some other way. 
All that matters is that they have the permission.</p></li><li><p><strong>Does this group have a specific permission for this resource?</strong>&#8202;&#8212;&#8202;Initially this seemed like an important variation on asking whether a user has permission, but in reality, both users and groups get folded into the same query anyway. For example, although the Sales group might have <code>CAN_READ_ISSUE</code> permission, the request to Authy will actually ask about Frank, the logged-in user who is a member of the Sales group. Therefore, Frank may get the <code>CAN_READ_ISSUE</code> permission either by being part of Sales, or by explicitly being assigned that permission on the parent Project. In either case, he has the permission.</p></li><li><p><strong>Can this user access this Risk?</strong>&#8202;&#8212;&#8202;As mentioned above, a Risk is a fine-grained domain object that resides inside a Project, and Authy doesn&#8217;t track permissions on fine-grained objects. To answer this query, the backend service first determines the ID of the Project containing the Risk, then asks whether the <code>CAN_READ_RISK</code> permission is set on the Project.</p></li></ol><p>Now that we understand the domain objects, their permissions, and the common authorization queries, it&#8217;s time to dive into the implementation, using the DynamoDB database.</p><h3>The DynamoDB&nbsp;Solution</h3><p>When describing how Authy stores permission assignments in DynamoDB, we&#8217;ll focus exclusively on these common queries:</p><ol><li><p>What operations can the user perform on a specific resource?</p></li><li><p>Who are all the users in the organization who can perform this operation on this resource?</p></li><li><p>Which resources can this user perform this operation on?</p></li></ol><p>Our DynamoDB table has four main attributes of interest, allowing these queries to be answered:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!BKLT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F929f252a-220c-4aa2-b2e0-1f607e946d3c_800x123.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BKLT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F929f252a-220c-4aa2-b2e0-1f607e946d3c_800x123.png 424w, https://substackcdn.com/image/fetch/$s_!BKLT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F929f252a-220c-4aa2-b2e0-1f607e946d3c_800x123.png 848w, https://substackcdn.com/image/fetch/$s_!BKLT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F929f252a-220c-4aa2-b2e0-1f607e946d3c_800x123.png 1272w, https://substackcdn.com/image/fetch/$s_!BKLT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F929f252a-220c-4aa2-b2e0-1f607e946d3c_800x123.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BKLT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F929f252a-220c-4aa2-b2e0-1f607e946d3c_800x123.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/929f252a-220c-4aa2-b2e0-1f607e946d3c_800x123.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BKLT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F929f252a-220c-4aa2-b2e0-1f607e946d3c_800x123.png 424w, https://substackcdn.com/image/fetch/$s_!BKLT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F929f252a-220c-4aa2-b2e0-1f607e946d3c_800x123.png 848w, https://substackcdn.com/image/fetch/$s_!BKLT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F929f252a-220c-4aa2-b2e0-1f607e946d3c_800x123.png 1272w, https://substackcdn.com/image/fetch/$s_!BKLT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F929f252a-220c-4aa2-b2e0-1f607e946d3c_800x123.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h4><strong>Attribute: </strong><code>orgID</code></h4><p>This is the table&#8217;s Partition Key, used to separate one organization&#8217;s data from another. 
This is simply an organization number, such as <code>47</code>.</p><h4><strong>Attribute: </strong><code>identity</code></h4><p>This field, the DynamoDB Sort Key, identifies the user or group the permission assignment is for.</p><ul><li><p>For a user, the value will be <code>u|&lt;user-id&gt;|&lt;resource&gt;</code>.</p></li><li><p>For a group, the value will be <code>g|&lt;group-id&gt;|&lt;resource&gt;</code>.</p></li><li><p>To assign the permission to everybody in the organization: <code>*|&lt;resource&gt;</code>.</p></li></ul><p>Note that the <code>&lt;resource&gt;</code> suffix is only required because DynamoDB needs each record to have a unique Partition Key / Sort Key combination, and the suffix provides that uniqueness. The next section explains the format for <code>&lt;resource&gt;</code>.</p><h4><strong>Attribute: </strong><code>resource</code></h4><p>This field specifies which resource (aka domain object) the permissions are for. We have multiple top-level domain objects, so we combine the resource&#8217;s type with the resource&#8217;s ID. The format will be <code>&lt;resource-type&gt;|&lt;resource-id&gt;</code> where <code>&lt;resource-type&gt;</code> is a digit identifying the type of domain object that <code>&lt;resource-id&gt;</code> refers to.</p><ul><li><p><code>0</code> = The whole organization.</p></li><li><p><code>1|&lt;project-id&gt;</code> = A Project domain object.</p></li><li><p><code>2|&lt;collection-id&gt;</code> = A Collection domain object.</p></li><li><p><code>3|&lt;asset-type-id&gt;</code> = An Asset Type domain object.</p></li><li><p><code>4|&lt;toolkit-id&gt;</code> = A Toolkit domain object.</p></li></ul><p>This list can easily be expanded to include new top-level domain objects.</p><h4><strong>Attribute: </strong><code>grantedSet</code></h4><p>This is the field in which all the permissions are stored. In other words, the user is <em>granted</em> the set of permissions listed in this field. 
For the sake of efficiency, we encode all possible permissions into an enumeration:</p><pre><code>export enum Permission {
  CAN_CREATE_PROJECT     = 0,
  CAN_READ_PROJECT       = 1,
  CAN_UPDATE_PROJECT     = 2,
  CAN_DELETE_PROJECT     = 3,
  CAN_CREATE_RISK        = 4,
  CAN_READ_RISK          = 5,
  CAN_UPDATE_RISK        = 6,
  CAN_DELETE_RISK        = 7,
  CAN_READ_RISK_IF_OWNER = 8,
  ...
}</code></pre><p>There could in theory be hundreds or thousands of permissions, with any combination of permissions being valid. We therefore introduce the concept of a <em><strong>Permission Set</strong></em>, allowing multiple permissions to be specified.</p><p>In essence, a Permission Set is implemented as a set of binary bits. For example, if a user is allowed to read and update projects, but not to create or delete them, their permission set will be <code>CAN_READ_PROJECT</code> (bit 1) + <code>CAN_UPDATE_PROJECT</code> (bit 2) = 0b0110, which is 6 in decimal.</p><p>To support potentially thousands of permissions in this Permission Set, we use a DynamoDB <code>List</code> field type, with each entry being a 32-bit number. List index 0 contains bits 0&#8211;31, with list index 1 containing bits 32&#8211;63, and so on. In practice, we only have a few hundred permissions, so this <code>grantedSet</code> field is typically quite short.</p><h3>An Example</h3><p>Now that we know the format of the DynamoDB table, let&#8217;s see an example of how it&#8217;s queried. To start with, here&#8217;s a subset of the permissions we&#8217;d expect to see for a small customer:</p><ol><li><p>Everybody in Organization 47 has permission to read all Projects.</p></li><li><p>Members of the Sales group can update Project 234.</p></li><li><p>John can create new Projects and delete all existing Projects.</p></li><li><p>Mary is an administrator for Organization 47, and can do anything.</p></li></ol><p>Here&#8217;s the DynamoDB table, with these rules encoded using the attributes we described above. 
Before reading further, take a minute to study each line of this table, comparing them to our four permission rules.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!IcYA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27a9c007-d76e-488d-b8c3-d3d4be074b31_800x372.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!IcYA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27a9c007-d76e-488d-b8c3-d3d4be074b31_800x372.png 424w, https://substackcdn.com/image/fetch/$s_!IcYA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27a9c007-d76e-488d-b8c3-d3d4be074b31_800x372.png 848w, https://substackcdn.com/image/fetch/$s_!IcYA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27a9c007-d76e-488d-b8c3-d3d4be074b31_800x372.png 1272w, https://substackcdn.com/image/fetch/$s_!IcYA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27a9c007-d76e-488d-b8c3-d3d4be074b31_800x372.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!IcYA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27a9c007-d76e-488d-b8c3-d3d4be074b31_800x372.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/27a9c007-d76e-488d-b8c3-d3d4be074b31_800x372.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!IcYA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27a9c007-d76e-488d-b8c3-d3d4be074b31_800x372.png 424w, https://substackcdn.com/image/fetch/$s_!IcYA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27a9c007-d76e-488d-b8c3-d3d4be074b31_800x372.png 848w, https://substackcdn.com/image/fetch/$s_!IcYA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27a9c007-d76e-488d-b8c3-d3d4be074b31_800x372.png 1272w, https://substackcdn.com/image/fetch/$s_!IcYA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27a9c007-d76e-488d-b8c3-d3d4be074b31_800x372.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Now, let&#8217;s run some of our common queries to see how the data is accessed. 
We won&#8217;t discuss the full Authy algorithm, but these examples should give you an idea of how it works.</p><h4><strong>Query 1: What permissions does Frank have for Project&nbsp;567?</strong></h4><p>To answer this query, we need to consider the different ways that Frank could be given access to Project 567.</p><ol><li><p>He might be directly assigned permissions to Project 567&#8202;&#8212;&#8202;In this example, no, he isn&#8217;t.</p></li><li><p>He might be a member of a group that is directly assigned permissions to Project 567&#8202;&#8212;&#8202;He&#8217;s in Sales, but that group is not explicitly assigned permissions to this Project.</p></li><li><p>He might be directly assigned permissions on the whole Organization, which implies he has those permissions for all Projects in the Organization&#8202;&#8212;&#8202;No, not in this example.</p></li><li><p>Similarly, he might be in a group that has Organization-level permissions&#8202;&#8212;&#8202;No, not in this case either.</p></li><li><p>All members of the Organization might be given permissions to Project 567&#8202;&#8212;&#8202;No, not in this example.</p></li><li><p>All members of the Organization might have the permissions for the whole Organization&#8202;&#8212;&#8202;Yes, all users have <code>CAN_READ_PROJECT</code> set, as shown on Line 1 of the DynamoDB table.</p></li></ol><p>That&#8217;s certainly a lot of things to check in order to determine that Frank has <code>CAN_READ_PROJECT</code> permission for Project 567. Let&#8217;s see how we&#8217;d do this in DynamoDB. 
Clearly we want to minimize the number of queries we perform.</p><p>We start by performing three DynamoDB queries in parallel:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!P3JR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5e06168-3411-4d97-9e78-612e9adf210d_800x409.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!P3JR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5e06168-3411-4d97-9e78-612e9adf210d_800x409.png 424w, https://substackcdn.com/image/fetch/$s_!P3JR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5e06168-3411-4d97-9e78-612e9adf210d_800x409.png 848w, https://substackcdn.com/image/fetch/$s_!P3JR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5e06168-3411-4d97-9e78-612e9adf210d_800x409.png 1272w, https://substackcdn.com/image/fetch/$s_!P3JR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5e06168-3411-4d97-9e78-612e9adf210d_800x409.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!P3JR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5e06168-3411-4d97-9e78-612e9adf210d_800x409.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d5e06168-3411-4d97-9e78-612e9adf210d_800x409.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!P3JR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5e06168-3411-4d97-9e78-612e9adf210d_800x409.png 424w, https://substackcdn.com/image/fetch/$s_!P3JR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5e06168-3411-4d97-9e78-612e9adf210d_800x409.png 848w, https://substackcdn.com/image/fetch/$s_!P3JR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5e06168-3411-4d97-9e78-612e9adf210d_800x409.png 1272w, https://substackcdn.com/image/fetch/$s_!P3JR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5e06168-3411-4d97-9e78-612e9adf210d_800x409.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><ol><li><p>We fetch the list of groups that Frank belongs to. We didn&#8217;t show the DynamoDB schema for group membership, but it&#8217;s fairly straightforward.</p></li><li><p>In parallel, request all the records related to Project 567, using a DynamoDB Secondary Index. 
This returns the list of permission assignments, regardless of whether they&#8217;re for Frank (starting with <code>u|frank|</code>), groups (starting with <code>g|</code>), or for everybody in the organization (the <code>*</code> wildcard value). We then use DynamoDB&#8217;s query filter to discard user records that aren&#8217;t for Frank, but we still need to return all the group records because we don&#8217;t yet know which groups Frank belongs to&#8202;&#8212;&#8202;not until Step 1 (above) completes.</p></li><li><p>In parallel, we request all records for the organization-level resource. This is where we specify permissions that apply to all Projects, not just to a single Project.</p></li></ol><p>Once these three DynamoDB queries have completed, we perform an in-memory filter on the two resource lists (steps 2 and 3 above) by keeping only the records for Frank, or for any of the groups that Frank belongs to. All other records are discarded.</p><p>For all records that we didn&#8217;t throw out, we perform a bitwise OR operation to merge the individual permission sets into a single permission set, which is then returned from Authy. In this particular example, only Line 1 of our table is relevant to Frank, so the final answer is <code>[ 2 ]</code> (<code>CAN_READ_PROJECT</code>).</p><p>Note that we&#8217;re assuming here that each resource will have a limited number of rows in the table; otherwise this operation would be very intensive. This is typically true, since our smaller customers have only a few users, and our larger customers tend to place their users into a small number of groups. 
The merge algorithm will therefore not be too complex.</p><h4>Query 2: Can Jenny, a member of the Sales team, update Project&nbsp;234?</h4><p>This is a similar example, but since Jenny is a member of the Sales team, Line 2 of our DynamoDB table will also be used when calculating the final permission set.</p><ul><li><p>From Line 1&#8202;&#8212;&#8202;Jenny gets <code>CAN_READ_PROJECT</code> [2].</p></li><li><p>From Line 2&#8202;&#8212;&#8202;Jenny gets <code>CAN_UPDATE_PROJECT</code> [4].</p></li></ul><p>The bitwise OR of [4] and [2] is [6], so Jenny has the permission set <code>[CAN_READ_PROJECT, CAN_UPDATE_PROJECT]</code>. So yes, she can update this Project.</p><h4>Query 3: What can John, not a member of Sales, do on Project&nbsp;234?</h4><p>For John, there are two lines in the DynamoDB table that provide permissions:</p><ul><li><p>Line 1&#8202;&#8212;&#8202;John gets <code>CAN_READ_PROJECT</code> [2] on all Projects.</p></li><li><p>Line 3&#8202;&#8212;&#8202;John gets <code>CAN_CREATE_PROJECT</code> and <code>CAN_DELETE_PROJECT</code> [9] on all Projects.</p></li></ul><p>Therefore, the bitwise OR of [2] and [9] is [11], so John can do <code>[CAN_READ_PROJECT, CAN_CREATE_PROJECT, CAN_DELETE_PROJECT]</code>. If you&#8217;re observant, you&#8217;ll realize that <code>CAN_CREATE_PROJECT</code> can only be assigned at the organization level, since you can never assign create permissions directly on a resource that doesn&#8217;t yet exist.</p><h4>Query 4: Can Mary delete Project&nbsp;135?</h4><p>Yes, Mary is an administrator, and according to Line 4 of the DynamoDB table, she has all the permission bits set. However, our algorithm must still perform all the other queries (including finding Mary&#8217;s groups). This is because the DynamoDB queries are performed in parallel, so they are already in progress (or may have completed) by the time we learn that Mary is an administrator.</p><h3>Summary</h3><p>So that&#8217;s how Authy works! 
This blog post has covered the high-level requirements, and the solution using DynamoDB. We saw the optimized DynamoDB schema for storing permission assignments, and learned how to query the database to answer permission-related questions about our SaaS platform&#8217;s authorization rules.</p><p>Although we&#8217;ve covered a lot of material, we&#8217;ve barely scratched the surface on all that Authy is capable of doing. We&#8217;re constantly adding new features, and enabling new ways for our customers to control access to their domain objects.</p><p>If you like what you&#8217;ve read, and you&#8217;d like to learn about Authy, then <a href="https://www.wegalvanize.com/careers/">come join us at Galvanize!</a></p>]]></content:encoded></item><item><title><![CDATA[Calculating 1 + 1 in JavaScript — Part 5]]></title><description><![CDATA[I&#8217;m a compiler enthusiast who has been learning how the V8 JavaScript Engine works. Of course, the best way to learn something is to write&#8230;]]></description><link>https://www.petersmith.net/p/calculating-1-1-in-javascript-part-5-79abef791670</link><guid isPermaLink="false">https://www.petersmith.net/p/calculating-1-1-in-javascript-part-5-79abef791670</guid><dc:creator><![CDATA[Peter Smith]]></dc:creator><pubDate>Tue, 18 May 2021 00:15:43 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/7ccacd60-b678-40ca-bb85-5b3740b35e77_498x304.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>I&#8217;m a compiler enthusiast who has been learning how the<a href="https://v8.dev/"> V8 JavaScript Engine</a> works. Of course, the best way to learn something is to write about it, so that&#8217;s why I&#8217;m sharing my experiences here. 
I hope this might be interesting to others too.</em></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!n6ae!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4233c9a9-cb07-4fa1-9e82-1943afe306f7_498x304.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!n6ae!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4233c9a9-cb07-4fa1-9e82-1943afe306f7_498x304.png 424w, https://substackcdn.com/image/fetch/$s_!n6ae!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4233c9a9-cb07-4fa1-9e82-1943afe306f7_498x304.png 848w, https://substackcdn.com/image/fetch/$s_!n6ae!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4233c9a9-cb07-4fa1-9e82-1943afe306f7_498x304.png 1272w, https://substackcdn.com/image/fetch/$s_!n6ae!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4233c9a9-cb07-4fa1-9e82-1943afe306f7_498x304.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!n6ae!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4233c9a9-cb07-4fa1-9e82-1943afe306f7_498x304.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4233c9a9-cb07-4fa1-9e82-1943afe306f7_498x304.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!n6ae!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4233c9a9-cb07-4fa1-9e82-1943afe306f7_498x304.png 424w, https://substackcdn.com/image/fetch/$s_!n6ae!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4233c9a9-cb07-4fa1-9e82-1943afe306f7_498x304.png 848w, https://substackcdn.com/image/fetch/$s_!n6ae!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4233c9a9-cb07-4fa1-9e82-1943afe306f7_498x304.png 1272w, https://substackcdn.com/image/fetch/$s_!n6ae!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4233c9a9-cb07-4fa1-9e82-1943afe306f7_498x304.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p>This blog post discusses how JavaScript is compiled differently when entered into the REPL or via the <code>eval</code> command, versus being in a function body.</p><p>This is the fifth part in a series that dives into how the <a href="https://v8.dev/">V8 JavaScript Engine</a> computes the expression <code>1 + 1</code>. 
This seems like a simple task, but it utilizes a large portion of the JavaScript run-time environment, and so far has required four blog posts to describe. There will likely be four more to complete the full story, so kudos to anyone who reads them all!</p><p>Since I started writing this blog series, I&#8217;ve learned a lot about the unusual JavaScript quirks you wouldn&#8217;t normally think about. If you&#8217;re curious about the entire blog series, here&#8217;s what we&#8217;ve seen so far:</p><ul><li><p><a href="https://medium.com/compilers/calculating-1-1-in-javascript-1cecb6e9610">Part 1&#8202;&#8212;&#8202;How the <code>1 + 1</code> string is stored in the JavaScript heap.</a></p></li><li><p><a href="https://medium.com/compilers/calculating-1-1-in-javascript-part-2-e01f336503d0">Part 2&#8202;&#8212;&#8202;How byte codes are cached to avoid unnecessary compilation.</a></p></li><li><p><a href="https://medium.com/compilers/calculating-1-1-in-javascript-part-3-710f686d9a40">Part 3&#8202;&#8212;&#8202;How the string <code>1 + 1</code> is scanned into lexical tokens.</a></p></li><li><p><a href="https://medium.com/compilers/calculating-1-1-in-javascript-part-4-42ca49f45ac5">Part 4&#8202;&#8212;&#8202;Parsing the Expression into an Abstract Syntax Tree</a></p></li></ul><p>Let&#8217;s now see how V8 rewrites certain statements if they&#8217;re entered into the REPL or passed into the <code>eval</code> command.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FWfC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2b94e0-409b-4d9a-a26b-77ff2cb32ee4_60x60.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!FWfC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2b94e0-409b-4d9a-a26b-77ff2cb32ee4_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!FWfC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2b94e0-409b-4d9a-a26b-77ff2cb32ee4_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!FWfC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2b94e0-409b-4d9a-a26b-77ff2cb32ee4_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!FWfC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2b94e0-409b-4d9a-a26b-77ff2cb32ee4_60x60.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FWfC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2b94e0-409b-4d9a-a26b-77ff2cb32ee4_60x60.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ec2b94e0-409b-4d9a-a26b-77ff2cb32ee4_60x60.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!FWfC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2b94e0-409b-4d9a-a26b-77ff2cb32ee4_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!FWfC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2b94e0-409b-4d9a-a26b-77ff2cb32ee4_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!FWfC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2b94e0-409b-4d9a-a26b-77ff2cb32ee4_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!FWfC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec2b94e0-409b-4d9a-a26b-77ff2cb32ee4_60x60.png 1456w" sizes="100vw"></picture><div></div></div></a></figure></div><h3>Entering Statements into the&nbsp;REPL</h3><p>Here&#8217;s a simple example. What&#8217;s the result of the following statement, when entered into the V8 JavaScript REPL (such as Node.js, or Chrome console)?</p><pre><code>if (3 &gt; 5) {
  10
} else {
  20
}</code></pre><p>Alternatively, what&#8217;s the expected value when passed to <code>eval</code>?</p><pre><code>eval("if (3 &gt; 5) { 10 } else { 20 }")</code></pre><p>It should be no surprise that <code>20</code> is returned in both cases. How about this similar example, using a function definition and call?</p><pre><code>function f() {
  if (3 &gt; 5) {
    10
  } else {
    20
  }
}</code></pre><pre><code>f()</code></pre><p>You might think it also returns <code>20</code>, but that&#8217;s not the case. This function-based approach instead returns <code>undefined</code>. If you think about it, the constants <code>10</code> and <code>20</code> are effectively dead code since they&#8217;re executed without side-effects (not assigned to a variable), and they&#8217;re not explicitly returned, so the result will immediately be discarded.</p><p>Unlike some languages, JavaScript does not return the value of the <code>then</code> or <code>else</code> code paths as the value returned by the <code>if</code> statement itself. That is, the following code will fail:</p><pre><code>&gt; const max = if (a &gt; b) { a } else { b }
              ^^
Uncaught SyntaxError: Unexpected token 'if'</code></pre><p>Additionally, JavaScript does not support the implicit return of the last expression in a statement:</p><pre><code>&gt; function f(a, b) { a + b }
&gt; f(10, 20)
undefined</code></pre><p>Although many people think of JavaScript as a great language for functional programming, these examples demonstrate some weaknesses in that vision. As an aside, using the ternary&nbsp;<code>?&nbsp;:</code> operator solves the first limitation, and lambda/arrow functions solve the second case, so at least some aspects of JavaScript allow functional style.</p><p>Back to our problem&#8202;&#8212;&#8202;why did our first example return <code>20</code>, while the second example (the same code inside a function) gave us <code>undefined</code>? Let&#8217;s investigate.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WGJw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad42ffa6-0045-4231-a9a8-77afd753e398_60x60.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WGJw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad42ffa6-0045-4231-a9a8-77afd753e398_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!WGJw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad42ffa6-0045-4231-a9a8-77afd753e398_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!WGJw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad42ffa6-0045-4231-a9a8-77afd753e398_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!WGJw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad42ffa6-0045-4231-a9a8-77afd753e398_60x60.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!WGJw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad42ffa6-0045-4231-a9a8-77afd753e398_60x60.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ad42ffa6-0045-4231-a9a8-77afd753e398_60x60.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WGJw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad42ffa6-0045-4231-a9a8-77afd753e398_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!WGJw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad42ffa6-0045-4231-a9a8-77afd753e398_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!WGJw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad42ffa6-0045-4231-a9a8-77afd753e398_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!WGJw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad42ffa6-0045-4231-a9a8-77afd753e398_60x60.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Rewriting the Immediate Code</h3><p>Our example worked because the JavaScript REPL, as well as the built-in <code>eval</code> feature, both modify the standard 
JavaScript semantics to be more functional in nature. Let&#8217;s face it, the REPL wouldn&#8217;t be useful if you couldn&#8217;t evaluate simple expressions. Consider what would happen if side-effect-free expressions were basically discarded:</p><pre><code>&gt; 1 + 1
undefined</code></pre><pre><code>&gt; eval("1 + 1")
undefined</code></pre><p>To make this practical, V8 rewrites JavaScript code (as long as it&#8217;s not inside a function) to actually return the value. Here&#8217;s our first example again, but with the code rewritten to include the necessary side-effects.</p><pre><code>function f() {
  let result = undefined
  if (3 &gt; 5) {
    result = 10
  } else {
    result = 20
  }
  return result
}</code></pre><pre><code>f()</code></pre><p>Note the use of the explicit <code>result</code> variable, with a single <code>return</code> statement at the end of the function. You might feel that adding <code>return 10</code> and <code>return 20</code> statements would be more efficient, but consider the following example:</p><pre><code>if (3 &gt; 5) {
  10
} else {
  20
  25
}</code></pre><p>The answer should be <code>25</code>, but rewriting to use <code>return 20; return 25;</code> would incorrectly return <code>20</code>.</p><p>Similarly, an upfront <code>result = undefined</code> statement might be necessary if there&#8217;s a path where no valid expression will be seen:</p><pre><code>function f() {
  let result = undefined
  if (3 &gt; 5) {
    result = 10
  }
  return result
}</code></pre><pre><code>f()</code></pre><p>In this case, the lack of an <code>else</code> clause results in <code>undefined</code> being returned if the false path is taken.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4QD-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4da4a170-68c0-4ca9-a899-532613b1ba0d_60x60.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4QD-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4da4a170-68c0-4ca9-a899-532613b1ba0d_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!4QD-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4da4a170-68c0-4ca9-a899-532613b1ba0d_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!4QD-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4da4a170-68c0-4ca9-a899-532613b1ba0d_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!4QD-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4da4a170-68c0-4ca9-a899-532613b1ba0d_60x60.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4QD-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4da4a170-68c0-4ca9-a899-532613b1ba0d_60x60.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4da4a170-68c0-4ca9-a899-532613b1ba0d_60x60.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4QD-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4da4a170-68c0-4ca9-a899-532613b1ba0d_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!4QD-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4da4a170-68c0-4ca9-a899-532613b1ba0d_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!4QD-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4da4a170-68c0-4ca9-a899-532613b1ba0d_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!4QD-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4da4a170-68c0-4ca9-a899-532613b1ba0d_60x60.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>How V8 Rewrites the Abstract Syntax&nbsp;Tree</h3><p>As we saw in previous blog posts, V8 encapsulates all standalone code (such as <code>1 + 1</code>) into an implicit function (with no parameters and no properties). 
This allows V8 to use the <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L2108"><code>FunctionLiteral</code></a> AST node as the basic unit of compilation in a consistent way across the V8 engine.</p><p>However, simply inserting the standalone code into a function body causes the computed values to be lost, and <code>undefined</code> to be returned. To avoid this, V8 rewrites the AST using the <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/rewriter.cc#L379"><code>Rewriter::Rewrite()</code></a> method (see <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/rewriter.cc">src/parsing/rewriter.cc</a>) to assign expressions to a temporary <code>result</code> variable, then automatically inserts a <code>return</code> statement at the end of the function body.</p><p>In terms of the actual AST changes required for our earlier <code>if</code> example, <code>Rewrite()</code> takes the following high-level steps:</p><ol><li><p>Allocate a new temporary variable to contain the function&#8217;s result. This will be named&nbsp;<code>.result</code>, with a period at the start of the name so it won&#8217;t conflict with any user-defined variable names.</p></li><li><p>For each child of the <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L719"><code>IfStatement</code></a> AST node (the <code>true</code> path and the <code>false</code> path), search backward through the list of child statements and insert a new <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L1977"><code>Assignment</code></a> AST node for the last statement that creates a value. 
We&#8217;ll learn more about this shortly.</p></li><li><p>At the end of the main <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L2108"><code>FunctionLiteral</code></a>'s list of child statements, add a new <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L620"><code>ReturnStatement</code></a> node to return the value of the temporary variable (<code>.result</code>).</p></li></ol><p>As you&#8217;d expect, the end result will be:</p><pre><code>function f() {
  let .result = undefined
  if (3 &gt; 5) {
    .result = 10
  } else {
    .result = 20
  }
  return .result
}</code></pre><p>Let&#8217;s examine this process in more detail.</p><h4>Creating the Temporary Variable</h4><p>The rewriter code starts in the <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/rewriter.cc#L379"><code>Rewriter::Rewrite()</code></a> method, which calls <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/rewriter.cc#L401"><code>RewriteBody()</code></a> with the list of statements (aka &#8220;body&#8221;) that appears as the child of the <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L2108"><code>FunctionLiteral</code></a> AST node.</p><p>Other than a few sanity checks, the most important step here is to create the temporary variable,&nbsp;<code>.result</code>, using <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/rewriter.cc#L409">the following code</a>:</p><pre><code>Variable* result = 
  scope-&gt;AsDeclarationScope()-&gt;
    NewTemporary(info-&gt;ast_value_factory()-&gt;dot_result_string());</code></pre><p>Once created, the temporary variable and the function&#8217;s body of statements are passed into a new instance of the <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/rewriter.cc#L17"><code>Processor</code></a> class, which proceeds to rewrite the AST to include the necessary variable assignments. This is all done by calling <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/rewriter.cc#L125"><code>Processor::Process()</code></a>.</p><h4>Visiting the&nbsp;AST</h4><p>The <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/rewriter.cc#L17"><code>Processor</code></a> class is a subclass of <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L2627"><code>AstVisitor</code></a> which provides common &#8220;tree walking&#8221; functionality for traversing the AST in various ways. For example, <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/prettyprinter.h#L88"><code>AstVisitor&lt;AstPrinter&gt;</code></a> is used when pretty-printing the AST (see the <code>--print-ast</code> option), and <a href="https://github.com/v8/v8/blob/8.8.276/src/interpreter/bytecode-generator.h#L32"><code>AstVisitor&lt;BytecodeGenerator&gt;</code></a> is used when generating executable byte codes. In our case, <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/rewriter.cc#L17"><code>AstVisitor&lt;Processor&gt;</code></a> is used to traverse the AST to add the&nbsp;<code>.result =</code> statements.</p><p><a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L2627"><code>AstVisitor</code></a> and its subclasses all have a <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L2629"><code>Visit()</code></a> method that takes an <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L137"><code>AstNode</code></a><code> *</code> as input. 
This method determines the type of that <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L137"><code>AstNode</code></a>, then dispatches to the appropriate tree-walker method. For example, if the AST node has type <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L568"><code>ExpressionStatement</code></a>, then the <code>VisitExpressionStatement()</code> method is called. Likewise, seeing an <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L719"><code>IfStatement</code></a> AST node results in the <code>VisitIfStatement()</code> method being called.</p><p>Inside each of the tree-walker methods, the code does whatever is necessary to process the information in that node. This likely involves a recursive call to <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L2629"><code>Visit()</code></a> to walk through the child AST nodes. For example, <code>VisitIfStatement()</code> calls <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L2629"><code>Visit()</code></a> on each of its children, to visit the <code>true</code> statement body, and then the <code>else</code> statement body, if it exists.</p><p>As you can see from eyeballing the <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/rewriter.cc"><code>src/parsing/rewriter.cc</code></a> file, there are numerous <code>VisitX()</code> methods implemented, with each one handling a different type of AST node. Not all AST nodes have a corresponding <code>VisitX()</code> method, but anything required for rewriting our AST will be present.</p><h4>Walking Backwards Through&nbsp;Blocks</h4><p>One interesting part of the rewriting algorithm is that it intelligently considers when to add the&nbsp;<code>.result =</code> assignment. 
In the following contrived example, we can see that only <code>10</code> and <code>25</code> need to be assigned to the&nbsp;<code>.result</code> variable, whereas assigning <code>20</code> would be pointless.</p><pre><code>if (3 &gt; 5) {
  10
} else {
  20
  25
}</code></pre><p>Our AST traversal code therefore walks backwards through the list of statements, only assigning the final expression to the&nbsp;<code>.result</code> variable. However, this assumes that the block is not &#8220;breakable&#8221; in the sense that the flow of control could leave the block earlier than the last statement.</p><p>For example, here&#8217;s another contrived example:</p><pre><code>while (true) { 
  10
  break
  20
}</code></pre><p>In this case, if we had stopped after setting&nbsp;<code>.result = 20</code>, we&#8217;d miss the fact that the code exits the loop before getting to that point, so&nbsp;<code>.result = 10</code> must also be assigned.</p><p>Inside the <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/rewriter.cc#L125"><code>Processor::Process()</code></a> method, the algorithm walks backward through the list of statements in the <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L2108"><code>FunctionLiteral</code></a>'s body. It calls <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L2629"><code>Visit()</code></a> on each statement, then replaces the statement with the rewritten version (in the case where&nbsp;<code>.result =</code> was added). If rewriting wasn&#8217;t necessary, the corresponding <code>VisitX()</code> method simply sets <code>replacement_</code> to be the original statement, unchanged.</p><p>Here&#8217;s <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/rewriter.cc#L125">the code</a>:</p><pre><code>void Processor::Process(ZonePtrList&lt;Statement&gt;* statements) {
  for (int i = statements-&gt;length() - 1; 
            i &gt;= 0 &amp;&amp; (breakable_ || !is_set_); --i) {
    Visit(statements-&gt;at(i));
    statements-&gt;Set(i, replacement_);
  }
}</code></pre><p>Note the use of <code>is_set_</code> and <code>breakable_</code> variables in the loop. These ensure the loop terminates as soon as one of the statements (starting at the end of the list) is rewritten, unless the block of statements could potentially contain a <code>break</code> statement.</p><h4>Assigning Values to the Temporary Variable</h4><p>Let&#8217;s now see how each individual statement is rewritten. First, note that constants like <code>10</code> or <code>20</code> have type <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L568"><code>ExpressionStatement</code></a> when they appear in the <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L2108"><code>FunctionLiteral</code></a>'s statement body.</p><p>The following <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/rewriter.cc#L156"><code>VisitExpressionStatement()</code></a> method is invoked whenever an <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L568"><code>ExpressionStatement</code></a> node is seen:</p><pre><code>void Processor::VisitExpressionStatement(ExpressionStatement* node){
  // Rewrite : &lt;x&gt;; -&gt; .result = &lt;x&gt;;
  if (!is_set_) {
    node-&gt;set_expression(SetResult(node-&gt;expression()));
    is_set_ = true;
  }
  replacement_ = node;
}</code></pre><p>Note that the code only has an effect if <code>is_set_</code> is false, indicating that no value has yet been assigned to&nbsp;<code>.result</code> in the current statement body (working backwards from the last statement, to the first). In this case, <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/rewriter.cc#L55"><code>SetResult()</code></a> is called to insert a new <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L1977"><code>Assignment</code></a> AST node into the AST.</p><pre><code>// Returns ".result = value"
Expression* SetResult(Expression* value) {
  result_assigned_ = true;
  
  VariableProxy* result_proxy = 
      factory()-&gt;NewVariableProxy(result_);
  
  return factory()-&gt;NewAssignment(
      Token::ASSIGN, result_proxy, value, kNoSourcePosition);
}</code></pre><p>The <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L3099"><code>NewAssignment()</code></a> method creates a new <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L1977"><code>Assignment</code></a> AST node, with a <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L1416"><code>VariableProxy</code></a> AST node as the left child (the thing being assigned to), and the original expression as the right child.</p><h4>A More Complex Example: If Statements</h4><p>Now for a more complex scenario: the case where an <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L719"><code>IfStatement</code></a> node appears in the AST. We must handle the case where both a <code>true</code> branch and a <code>false</code> branch are present, and also the case where the optional <code>false</code> branch is omitted, in which case the statement may produce <code>undefined</code>.</p><pre><code>void Processor::VisitIfStatement(IfStatement* node) {
  // Rewrite both branches.
  bool set_after = is_set_;

  Visit(node-&gt;then_statement());
  node-&gt;set_then_statement(replacement_);
  bool set_in_then = is_set_;

  is_set_ = set_after;
  Visit(node-&gt;else_statement());
  node-&gt;set_else_statement(replacement_);

  replacement_ = set_in_then &amp;&amp; is_set_ ?
      node : AssignUndefinedBefore(node);

  is_set_ = true;
}</code></pre><p>In this code, we see two calls to <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L2629"><code>Visit()</code></a>, once for the <code>true</code> (then) case, and once for the <code>false</code> (else) case. After each call, the corresponding branch is replaced with <code>replacement_</code>, which is either the rewritten statement or the original statement if nothing needed rewriting. Note how the <code>is_set_</code> variable is reset to its original value before visiting the <code>false</code> case.</p><p>The <code>replacement_</code> variable is then set depending on whether both the <code>true</code> and <code>false</code> branches assigned a value. If either branch did not, the entire <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L719"><code>IfStatement</code></a> AST node is preceded by an assignment of&nbsp;<code>.result = undefined</code>, to account for the path where no expression value was assigned to&nbsp;<code>.result</code>.</p><p>Finally, the <code>is_set_</code> flag is always set to <code>true</code> at the end, indicating that a value must have been set by this point in the code. Either the value came from one of the two branches, or the <code>if</code> statement will have produced <code>undefined</code>, for example when there was no <code>else</code> case.</p><h4>Adding the Final&nbsp;Return</h4><p>To finish up the rewriting process, <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/rewriter.cc#L421">at the end of <code>Rewriter::RewriteBody</code></a>, the following code adds a <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L620"><code>ReturnStatement</code></a> AST node at the end of the <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L2108"><code>FunctionLiteral</code></a>&#8217;s statement list.</p><pre><code>Statement* result_statement =
    processor.factory()-&gt;NewReturnStatement(result_value, pos);
body-&gt;Add(result_statement, info-&gt;zone());</code></pre><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!52JE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F356cc107-6af0-4bc6-b442-0c0ec4faad3b_60x60.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!52JE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F356cc107-6af0-4bc6-b442-0c0ec4faad3b_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!52JE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F356cc107-6af0-4bc6-b442-0c0ec4faad3b_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!52JE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F356cc107-6af0-4bc6-b442-0c0ec4faad3b_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!52JE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F356cc107-6af0-4bc6-b442-0c0ec4faad3b_60x60.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!52JE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F356cc107-6af0-4bc6-b442-0c0ec4faad3b_60x60.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/356cc107-6af0-4bc6-b442-0c0ec4faad3b_60x60.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!52JE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F356cc107-6af0-4bc6-b442-0c0ec4faad3b_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!52JE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F356cc107-6af0-4bc6-b442-0c0ec4faad3b_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!52JE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F356cc107-6af0-4bc6-b442-0c0ec4faad3b_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!52JE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F356cc107-6af0-4bc6-b442-0c0ec4faad3b_60x60.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Back to Our 1 + 1&nbsp;Example</h3><p>Now that we understand the concept of rewriting the AST, and why it&#8217;s necessary, let&#8217;s return back to our over-arching story&#8202;&#8212;&#8202;how JavaScript calculates <code>1 + 1</code>. 
It was necessary to branch off to give examples using <code>if</code> statements, otherwise it wouldn&#8217;t be clear why rewriting the AST was necessary.</p><p>If you recall from the <a href="https://medium.com/compilers/calculating-1-1-in-javascript-part-4-42ca49f45ac5">previous blog post</a>, we&#8217;ve already used <em>constant folding</em> to simplify our <code>1 + 1</code> expression into the even simpler <code>2</code> expression. Here&#8217;s the AST we ended up with:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!28T0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd01d43-db37-44c1-9722-8d1a24e8ccd9_338x600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!28T0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd01d43-db37-44c1-9722-8d1a24e8ccd9_338x600.png 424w, https://substackcdn.com/image/fetch/$s_!28T0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd01d43-db37-44c1-9722-8d1a24e8ccd9_338x600.png 848w, https://substackcdn.com/image/fetch/$s_!28T0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd01d43-db37-44c1-9722-8d1a24e8ccd9_338x600.png 1272w, https://substackcdn.com/image/fetch/$s_!28T0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd01d43-db37-44c1-9722-8d1a24e8ccd9_338x600.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!28T0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd01d43-db37-44c1-9722-8d1a24e8ccd9_338x600.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/acd01d43-db37-44c1-9722-8d1a24e8ccd9_338x600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!28T0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd01d43-db37-44c1-9722-8d1a24e8ccd9_338x600.png 424w, https://substackcdn.com/image/fetch/$s_!28T0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd01d43-db37-44c1-9722-8d1a24e8ccd9_338x600.png 848w, https://substackcdn.com/image/fetch/$s_!28T0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd01d43-db37-44c1-9722-8d1a24e8ccd9_338x600.png 1272w, https://substackcdn.com/image/fetch/$s_!28T0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Facd01d43-db37-44c1-9722-8d1a24e8ccd9_338x600.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Notice how V8 converted the simple expression into a function definition (with no arguments or properties), implying that the <a 
href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L2108"><code>FunctionLiteral</code></a> class is the basic unit of compilation.</p><p>Next, the <code>Rewriter</code> class modifies our code to:</p><pre><code>function() {
  let .result = 2
  return .result
}</code></pre><p>which gives us the following AST:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!R8--!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f85e3da-d57e-470a-8243-56e20d3a0e4c_800x603.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!R8--!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f85e3da-d57e-470a-8243-56e20d3a0e4c_800x603.png 424w, https://substackcdn.com/image/fetch/$s_!R8--!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f85e3da-d57e-470a-8243-56e20d3a0e4c_800x603.png 848w, https://substackcdn.com/image/fetch/$s_!R8--!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f85e3da-d57e-470a-8243-56e20d3a0e4c_800x603.png 1272w, https://substackcdn.com/image/fetch/$s_!R8--!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f85e3da-d57e-470a-8243-56e20d3a0e4c_800x603.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!R8--!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f85e3da-d57e-470a-8243-56e20d3a0e4c_800x603.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9f85e3da-d57e-470a-8243-56e20d3a0e4c_800x603.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!R8--!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f85e3da-d57e-470a-8243-56e20d3a0e4c_800x603.png 424w, https://substackcdn.com/image/fetch/$s_!R8--!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f85e3da-d57e-470a-8243-56e20d3a0e4c_800x603.png 848w, https://substackcdn.com/image/fetch/$s_!R8--!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f85e3da-d57e-470a-8243-56e20d3a0e4c_800x603.png 1272w, https://substackcdn.com/image/fetch/$s_!R8--!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f85e3da-d57e-470a-8243-56e20d3a0e4c_800x603.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>This function is now ready for compilation, and eventually execution. 
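</p><p>To observe this from the outside, here&#8217;s a small sketch (my own illustration, not code from V8) that passes a few snippets to <code>eval</code> and collects the value each one produces. The <code>snippets</code> and <code>results</code> names are made up for this example, and I&#8217;ve used <code>false</code> as the condition in place of the earlier <code>3 &gt; 5</code> to keep the snippets short. Assuming a standards-compliant engine, the four results should be <code>2</code>, <code>25</code>, <code>10</code>, and <code>undefined</code>, which are the values that, as I understand it, V8 routes through&nbsp;<code>.result</code>.</p><pre><code>// Sketch only: eval() returns the completion value of the code it runs.
// Nothing here calls into V8 internals; the snippets mirror the earlier examples.
const snippets = [
  "1 + 1",                              // a single expression
  "if (false) { 10 } else { 20; 25 }",  // only the last expression counts
  "while (true) { 10; break; 20 }",     // break exits before 20 is reached
  "if (false) { 10 }"                   // no else branch, so no value is produced
];

// Evaluate each snippet and keep the value it produced.
const results = snippets.map(function (source) {
  return eval(source);
});

console.log(results);</code></pre><p>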
The result can be displayed on the console, or returned from the <code>eval</code> command.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kUOU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6576e50c-59ef-4f22-a717-f448436cdce1_60x60.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kUOU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6576e50c-59ef-4f22-a717-f448436cdce1_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!kUOU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6576e50c-59ef-4f22-a717-f448436cdce1_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!kUOU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6576e50c-59ef-4f22-a717-f448436cdce1_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!kUOU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6576e50c-59ef-4f22-a717-f448436cdce1_60x60.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kUOU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6576e50c-59ef-4f22-a717-f448436cdce1_60x60.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6576e50c-59ef-4f22-a717-f448436cdce1_60x60.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kUOU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6576e50c-59ef-4f22-a717-f448436cdce1_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!kUOU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6576e50c-59ef-4f22-a717-f448436cdce1_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!kUOU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6576e50c-59ef-4f22-a717-f448436cdce1_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!kUOU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6576e50c-59ef-4f22-a717-f448436cdce1_60x60.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Next Time&#8230;</h3><p>In the next blog post, we&#8217;ll start with this rewritten AST, then generate byte codes. 
There&#8217;s a lot of complexity in generating byte codes, so it&#8217;ll surely be another detailed blog post.</p>]]></content:encoded></item><item><title><![CDATA[Calculating 1 + 1 in JavaScript — Part 4]]></title><description><![CDATA[Discovering how the V8 JavaScript Engine computes the 1 + 1 expression.]]></description><link>https://www.petersmith.net/p/calculating-1-1-in-javascript-part-4-42ca49f45ac5</link><guid isPermaLink="false">https://www.petersmith.net/p/calculating-1-1-in-javascript-part-4-42ca49f45ac5</guid><dc:creator><![CDATA[Peter Smith]]></dc:creator><pubDate>Mon, 29 Mar 2021 11:02:28 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/09357799-7b72-4d25-92a4-14841d9f376a_493x294.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>I&#8217;m a compiler enthusiast who has been learning how the <a href="https://v8.dev/">V8 JavaScript Engine</a> works. Of course, the best way to learn something is to write about it, so that&#8217;s why I&#8217;m sharing my experiences here. 
I hope this might be interesting to others too.</em></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dF80!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F551dce56-4226-4180-82ed-950572bd88a6_493x294.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dF80!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F551dce56-4226-4180-82ed-950572bd88a6_493x294.png 424w, https://substackcdn.com/image/fetch/$s_!dF80!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F551dce56-4226-4180-82ed-950572bd88a6_493x294.png 848w, https://substackcdn.com/image/fetch/$s_!dF80!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F551dce56-4226-4180-82ed-950572bd88a6_493x294.png 1272w, https://substackcdn.com/image/fetch/$s_!dF80!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F551dce56-4226-4180-82ed-950572bd88a6_493x294.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dF80!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F551dce56-4226-4180-82ed-950572bd88a6_493x294.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/551dce56-4226-4180-82ed-950572bd88a6_493x294.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dF80!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F551dce56-4226-4180-82ed-950572bd88a6_493x294.png 424w, https://substackcdn.com/image/fetch/$s_!dF80!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F551dce56-4226-4180-82ed-950572bd88a6_493x294.png 848w, https://substackcdn.com/image/fetch/$s_!dF80!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F551dce56-4226-4180-82ed-950572bd88a6_493x294.png 1272w, https://substackcdn.com/image/fetch/$s_!dF80!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F551dce56-4226-4180-82ed-950572bd88a6_493x294.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p>This is the fourth part in a series that dives into how the <a href="https://v8.dev/">V8 JavaScript Engine</a> computes the expression <code>1 + 1</code>. This may seem like a simple task, but it utilizes a large portion of the JavaScript run-time environment. 
In previous blog posts, we saw:</p><ul><li><p><a href="https://medium.com/compilers/calculating-1-1-in-javascript-1cecb6e9610">Part 1&#8202;&#8212;&#8202;How the <code>1 + 1</code> string is stored in the JavaScript heap.</a></p></li><li><p><a href="https://medium.com/compilers/calculating-1-1-in-javascript-part-2-e01f336503d0">Part 2&#8202;&#8212;&#8202;How byte codes are cached to avoid unnecessary compilation.</a></p></li><li><p><a href="https://medium.com/compilers/calculating-1-1-in-javascript-part-3-710f686d9a40">Part 3&#8202;&#8212;&#8202;How the string <code>1 + 1</code> is scanned into lexical tokens.</a></p></li></ul><p>Now, in Part 4 we&#8217;ll learn how <code>1 + 1</code> is parsed to validate it against the official JavaScript grammar, with the goal of creating an in-memory Abstract Syntax Tree (AST).</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-AGO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe36bf97d-95b2-445a-9b77-03ffed42301f_60x60.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-AGO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe36bf97d-95b2-445a-9b77-03ffed42301f_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!-AGO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe36bf97d-95b2-445a-9b77-03ffed42301f_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!-AGO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe36bf97d-95b2-445a-9b77-03ffed42301f_60x60.png 1272w, 
https://substackcdn.com/image/fetch/$s_!-AGO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe36bf97d-95b2-445a-9b77-03ffed42301f_60x60.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-AGO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe36bf97d-95b2-445a-9b77-03ffed42301f_60x60.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e36bf97d-95b2-445a-9b77-03ffed42301f_60x60.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-AGO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe36bf97d-95b2-445a-9b77-03ffed42301f_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!-AGO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe36bf97d-95b2-445a-9b77-03ffed42301f_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!-AGO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe36bf97d-95b2-445a-9b77-03ffed42301f_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!-AGO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe36bf97d-95b2-445a-9b77-03ffed42301f_60x60.png 
1456w" sizes="100vw"></picture><div></div></div></a></figure></div><h3>The ECMAScript Grammar</h3><p>As you probably know, the JavaScript language is officially defined by the <a href="https://262.ecma-international.org/11.0/">ECMAScript Standard</a>, also known as ECMA-262. However, unless you&#8217;ve studied the standard, you probably didn&#8217;t know that chapters 12&#8211;15 spend roughly 200 pages (out of 860) defining the syntax of the language using a detailed <a href="https://en.wikipedia.org/wiki/Context-free_grammar">Context Free Grammar</a>.</p><ul><li><p><a href="https://262.ecma-international.org/11.0/#sec-ecmascript-language-expressions">Chapter 12</a>&#8202;&#8212;&#8202;Provides the grammar for all the possible expressions in the JavaScript language, including an extensive list of mathematical and logical operators.</p></li><li><p><a href="https://262.ecma-international.org/11.0/#sec-ecmascript-language-statements-and-declarations">Chapter 13</a>&#8202;&#8212;&#8202;Provides the grammar for statements (such as <code>if</code>, <code>for</code>, and <code>while</code>) and declarations (such as <code>let</code> and <code>const</code>).</p></li><li><p><a href="https://262.ecma-international.org/11.0/#sec-ecmascript-language-functions-and-classes">Chapter 14</a>&#8202;&#8212;&#8202;Provides the grammar for function and class definitions.</p></li><li><p><a href="https://262.ecma-international.org/11.0/#sec-ecmascript-language-scripts-and-modules">Chapter 15</a>&#8202;&#8212;&#8202;Provides the grammar for top-level scripts and modules.</p></li></ul><p>If you study these chapters, you&#8217;ll see the entire definition of the <em>grammatical syntax</em> of the language, describing the valid ordering of tokens in a JavaScript program. However, there&#8217;s nothing in these chapters to describe the actual <em>meaning</em> of a valid program. 
Instead, the <em>semantics</em> of the language is described in the later chapters of the specification (and not discussed in this blog post either).</p><h4>The Grammar for <code>1 +&nbsp;1</code></h4><p>Let&#8217;s take a quick journey through JavaScript&#8217;s syntactical grammar to see how <code>1 + 1</code> will be parsed, at least on the theoretical level (we&#8217;ll later see how it&#8217;s done in V8). The grammar starts with the <em><a href="https://262.ecma-international.org/11.0/#prod-Script">Script</a></em> symbol, initiating a long chain of grammar rules used to derive a valid JavaScript program.</p><p>For the sake of illustration, we&#8217;ll only show the sequence of rules relevant when parsing <code>1 + 1</code>. As with any context free grammar, the symbols on the right hand side are either <em>terminals</em>, representing tokens in our language (such as <code>1</code> or <code>+</code>), or are <em>non-terminals</em>, which can be recursively replaced using other rules.</p><ul><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-Script">Script</a>&nbsp;::= ScriptBody</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-ScriptBody">ScriptBody</a>&nbsp;::= StatementList</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-StatementList">StatementList</a>&nbsp;::= StatementListItem | StatementList StatementListItem</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-StatementListItem">StatementListItem</a>&nbsp;::= Statement | Declaration</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-Statement">Statement</a>&nbsp;::= ExpressionStatement</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-ExpressionStatement">ExpressionStatement</a>&nbsp;::= Expression&nbsp;</em><code>;</code></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-Expression">Expression</a>&nbsp;::= 
AssignmentExpression | Expression&nbsp;</em><code>,</code><em> AssignmentExpression</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-AssignmentExpression">AssignmentExpression</a>&nbsp;::= ConditionalExpression</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-ConditionalExpression">ConditionalExpression</a>&nbsp;::= ShortCircuitExpression</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-ShortCircuitExpression">ShortCircuitExpression</a>&nbsp;::= LogicalORExpression</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-LogicalORExpression">LogicalORExpression</a>&nbsp;::= LogicalANDExpression</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-LogicalANDExpression">LogicalANDExpression</a>&nbsp;::= BitwiseORExpression</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-BitwiseORExpression">BitwiseORExpression</a>&nbsp;::= BitwiseXORExpression</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-BitwiseXORExpression">BitwiseXORExpression</a>&nbsp;::= BitwiseANDExpression</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-BitwiseANDExpression">BitwiseANDExpression</a>&nbsp;::= EqualityExpression</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-EqualityExpression">EqualityExpression</a>&nbsp;::= RelationalExpression</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-RelationalExpression">RelationalExpression</a>&nbsp;::= ShiftExpression</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-ShiftExpression">ShiftExpression</a>&nbsp;::= AdditiveExpression</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-AdditiveExpression">AdditiveExpression</a>&nbsp;::= AdditiveExpression </em><code>+</code><em> 
MultiplicativeExpression</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-MultiplicativeExpression">MultiplicativeExpression</a>&nbsp;::= ExponentiationExpression</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-ExponentiationExpression">ExponentiationExpression</a>&nbsp;::= UnaryExpression</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-UnaryExpression">UnaryExpression</a>&nbsp;::= UpdateExpression</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-UpdateExpression">UpdateExpression</a>&nbsp;::= LeftHandSideExpression</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-LeftHandSideExpression">LeftHandSideExpression</a>&nbsp;::= NewExpression</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-NewExpression">NewExpression</a>&nbsp;::= MemberExpression</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-MemberExpression">MemberExpression</a>&nbsp;::= PrimaryExpression</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-PrimaryExpression">PrimaryExpression</a>&nbsp;::= Literal</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-Literal">Literal</a>&nbsp;::= NumericLiteral</em></p></li><li><p><em><a href="https://262.ecma-international.org/11.0/#prod-NumericLiteral">NumericLiteral</a>&nbsp;::= </em><code>1</code><em> </em>(this part was abbreviated)</p></li></ul><p>That&#8217;s a long chain of rules for a simple expression, but it&#8217;s even longer for realistic programs. We&#8217;ve only seen the relevant rules for our <code>1 + 1</code> example. 
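</p><p>To see what &#8220;recursively replaced&#8221; means in practice, here&#8217;s a minimal sketch that stores part of the chain above as plain data and follows it from one operand of <code>+</code> down to the literal. The <code>OperandRules()</code> and <code>Derive()</code> helpers are hypothetical names invented for this illustration. They aren&#8217;t part of V8 or the specification, and a real parser doesn&#8217;t store its grammar this way.</p><pre><code>#include &lt;map&gt;
#include &lt;string&gt;
#include &lt;vector&gt;

// Hypothetical helper, not part of V8 or the specification: the unit
// productions from the list above that lead from one operand of "+" down to
// a numeric literal, stored as plain data.
const std::map&lt;std::string, std::string&gt;&amp; OperandRules() {
  static const std::map&lt;std::string, std::string&gt; rules = {
      {"MultiplicativeExpression", "ExponentiationExpression"},
      {"ExponentiationExpression", "UnaryExpression"},
      {"UnaryExpression", "UpdateExpression"},
      {"UpdateExpression", "LeftHandSideExpression"},
      {"LeftHandSideExpression", "NewExpression"},
      {"NewExpression", "MemberExpression"},
      {"MemberExpression", "PrimaryExpression"},
      {"PrimaryExpression", "Literal"},
      {"Literal", "NumericLiteral"},
  };
  return rules;
}

// Repeatedly replace the current non-terminal with the right-hand side of
// its rule, recording every symbol visited, until no rule applies.
std::vector&lt;std::string&gt; Derive(const std::string&amp; start) {
  std::vector&lt;std::string&gt; chain = {start};
  const auto&amp; rules = OperandRules();
  auto it = rules.find(start);
  while (it != rules.end()) {
    chain.push_back(it-&gt;second);
    it = rules.find(it-&gt;second);
  }
  return chain;
}</code></pre><p>Calling <code>Derive("MultiplicativeExpression")</code> returns ten symbols, ending with <code>NumericLiteral</code>, which is the path each <code>1</code> takes through the grammar.</p><p>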
You&#8217;re strongly encouraged to click on the hyperlinks (above), and you&#8217;ll see a larger set of rules, including operators such as <code>&amp;&amp;</code>, <code>&amp;</code>, <code>|</code>, <code>||</code>, <code>*</code>, or <code>+</code>.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zRks!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ced5e8b-847a-4248-ba2d-18c71eee4a62_60x60.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zRks!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ced5e8b-847a-4248-ba2d-18c71eee4a62_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!zRks!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ced5e8b-847a-4248-ba2d-18c71eee4a62_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!zRks!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ced5e8b-847a-4248-ba2d-18c71eee4a62_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!zRks!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ced5e8b-847a-4248-ba2d-18c71eee4a62_60x60.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zRks!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ced5e8b-847a-4248-ba2d-18c71eee4a62_60x60.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7ced5e8b-847a-4248-ba2d-18c71eee4a62_60x60.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zRks!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ced5e8b-847a-4248-ba2d-18c71eee4a62_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!zRks!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ced5e8b-847a-4248-ba2d-18c71eee4a62_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!zRks!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ced5e8b-847a-4248-ba2d-18c71eee4a62_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!zRks!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ced5e8b-847a-4248-ba2d-18c71eee4a62_60x60.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>The Abstract Syntax Tree&nbsp;(AST)</h3><p>One of the goals of parsing the input tokens is to create an Abstract Syntax Tree. This provides the compiler with an in-memory representation of the program, which is much easier to manipulate than a one-dimensional sequence of tokens. 
The AST is designed so that tree-traversal algorithms can validate the input program, optimize the program structure, and eventually generate byte codes.</p><p>The AST is an <em>n-ary tree</em>, with each node representing some portion of the input program. Nodes have various properties describing the program in more detail (for example, variable names or integer values), and usually have child nodes to reflect the nesting structure of the program.</p><p>In V8, all of the AST nodes are defined as C++ classes in <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L113"><code>src/ast/ast.h</code></a>, with <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L137"><code>AstNode</code></a> being the parent class for all other node types. Each class contains internal fields for decorating the node (with variable names, or literal values), as well as pointers to child AST nodes.</p><p>The following diagram shows a portion of the class hierarchy for AST nodes. Note that <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L180"><code>Statement</code></a> is a sub-class of <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L137"><code>AstNode</code></a>, with each of its sub-classes representing a different type of JavaScript statement. 
As we&#8217;ll see later, we&#8217;ll be using <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L568"><code>ExpressionStatement</code></a> in our <code>1 + 1</code> example.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GTLe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc4f1b42-a640-4db3-9a5c-a8e447faae48_758x639.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GTLe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc4f1b42-a640-4db3-9a5c-a8e447faae48_758x639.png 424w, https://substackcdn.com/image/fetch/$s_!GTLe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc4f1b42-a640-4db3-9a5c-a8e447faae48_758x639.png 848w, https://substackcdn.com/image/fetch/$s_!GTLe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc4f1b42-a640-4db3-9a5c-a8e447faae48_758x639.png 1272w, https://substackcdn.com/image/fetch/$s_!GTLe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc4f1b42-a640-4db3-9a5c-a8e447faae48_758x639.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GTLe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc4f1b42-a640-4db3-9a5c-a8e447faae48_758x639.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bc4f1b42-a640-4db3-9a5c-a8e447faae48_758x639.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GTLe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc4f1b42-a640-4db3-9a5c-a8e447faae48_758x639.png 424w, https://substackcdn.com/image/fetch/$s_!GTLe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc4f1b42-a640-4db3-9a5c-a8e447faae48_758x639.png 848w, https://substackcdn.com/image/fetch/$s_!GTLe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc4f1b42-a640-4db3-9a5c-a8e447faae48_758x639.png 1272w, https://substackcdn.com/image/fetch/$s_!GTLe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc4f1b42-a640-4db3-9a5c-a8e447faae48_758x639.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Here&#8217;s another portion of the class hierarchy, focusing on the <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L186"><code>Expression</code></a> class, which is also a sub-class of <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L137"><code>AstNode</code></a>. Each of the sub-classes represents a different type of JavaScript expression. 
We&#8217;ll be using the <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L915"><code>Literal</code></a> class in our example.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!W9V4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F567f6428-10cc-40ac-8691-1af23bffeed1_490x506.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!W9V4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F567f6428-10cc-40ac-8691-1af23bffeed1_490x506.png 424w, https://substackcdn.com/image/fetch/$s_!W9V4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F567f6428-10cc-40ac-8691-1af23bffeed1_490x506.png 848w, https://substackcdn.com/image/fetch/$s_!W9V4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F567f6428-10cc-40ac-8691-1af23bffeed1_490x506.png 1272w, https://substackcdn.com/image/fetch/$s_!W9V4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F567f6428-10cc-40ac-8691-1af23bffeed1_490x506.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!W9V4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F567f6428-10cc-40ac-8691-1af23bffeed1_490x506.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/567f6428-10cc-40ac-8691-1af23bffeed1_490x506.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!W9V4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F567f6428-10cc-40ac-8691-1af23bffeed1_490x506.png 424w, https://substackcdn.com/image/fetch/$s_!W9V4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F567f6428-10cc-40ac-8691-1af23bffeed1_490x506.png 848w, https://substackcdn.com/image/fetch/$s_!W9V4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F567f6428-10cc-40ac-8691-1af23bffeed1_490x506.png 1272w, https://substackcdn.com/image/fetch/$s_!W9V4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F567f6428-10cc-40ac-8691-1af23bffeed1_490x506.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h4>The AST For 1 +&nbsp;1</h4><p>For our particular <code>1 + 1</code> example, the AST is quite simple, with only three nodes involved: <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L2108"><code>FunctionLiteral</code></a>, <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L568"><code>ExpressionStatement</code></a>, and <a 
href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L915"><code>Literal</code></a>.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZQMv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8b229b0-d486-4628-b61c-af8d7d2a4d87_246x434.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZQMv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8b229b0-d486-4628-b61c-af8d7d2a4d87_246x434.png 424w, https://substackcdn.com/image/fetch/$s_!ZQMv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8b229b0-d486-4628-b61c-af8d7d2a4d87_246x434.png 848w, https://substackcdn.com/image/fetch/$s_!ZQMv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8b229b0-d486-4628-b61c-af8d7d2a4d87_246x434.png 1272w, https://substackcdn.com/image/fetch/$s_!ZQMv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8b229b0-d486-4628-b61c-af8d7d2a4d87_246x434.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZQMv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8b229b0-d486-4628-b61c-af8d7d2a4d87_246x434.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e8b229b0-d486-4628-b61c-af8d7d2a4d87_246x434.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZQMv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8b229b0-d486-4628-b61c-af8d7d2a4d87_246x434.png 424w, https://substackcdn.com/image/fetch/$s_!ZQMv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8b229b0-d486-4628-b61c-af8d7d2a4d87_246x434.png 848w, https://substackcdn.com/image/fetch/$s_!ZQMv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8b229b0-d486-4628-b61c-af8d7d2a4d87_246x434.png 1272w, https://substackcdn.com/image/fetch/$s_!ZQMv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8b229b0-d486-4628-b61c-af8d7d2a4d87_246x434.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>At the top of the AST is the <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L2108"><code>FunctionLiteral</code></a> node. Our expression doesn&#8217;t look like a function, but V8 has wrapped the expression this way to make byte code compilation more feasible. 
In our case, the function has zero formal parameters, with zero JavaScript object properties expected.</p><p>At the second level is an <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L568"><code>ExpressionStatement</code></a> node, which is the single item in a list of <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L180"><code>Statement</code></a> nodes referenced by the <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L2108"><code>FunctionLiteral</code>&#8217;s</a> <code>body</code> field. If this were a more typical function, we&#8217;d expect to see multiple <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L180"><code>Statement</code></a> nodes in that list.</p><p>Finally, the <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L568"><code>ExpressionStatement</code>&#8217;s</a> <code>expression_</code> field refers to the one-and-only expression node. In our case, the <code>1 + 1</code> expression will be &#8220;folded&#8221; into the single small integer value of <code>2</code> (we&#8217;ll see this shortly).</p><p>To help understand how each of these AST nodes is represented in memory, here&#8217;s a heavily abbreviated (and annotated) version of the <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L915"><code>Literal</code></a> C++ class, which is capable of storing literal values of any type (numbers, strings, booleans, etc.).</p><pre><code>class Literal final : public Expression {</code></pre><pre><code>public:</code></pre><pre><code>  // All the possible literal types
  enum Type {
    kSmi,
    kHeapNumber,
    kBigInt,
    kString,
    kBoolean,
    kUndefined,
    kNull,
    kTheHole,
  };</code></pre><pre><code>  // Return the type of this Literal
  Type type() const { ... }</code></pre><pre><code>  // Methods for testing the Literal type
  bool IsNumber() const { ... }
  bool IsString() const { ... }</code></pre><pre><code>  // Methods for fetching the Literal value in various ways
  AstRawString* AsRawPropertyName() { ... }
  Smi AsSmiLiteral() { ... }
  double AsNumber() { ... }
  AstBigInt AsBigInt() { ... }
  AstRawString* AsRawString() { ... }
  bool ToBooleanIsTrue();
  bool ToBooleanIsFalse() { ... }
  bool ToUint32(uint32_t* value) const;</code></pre><pre><code>private:</code></pre><pre><code>  // C++ union for storing the value.
  union {
    const AstRawString* string_;
    int smi_;
    double number_;
    AstBigInt bigint_;
    bool boolean_;
  };
};</code></pre><p>The basic structure of <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L915"><code>Literal</code></a> includes a type-tag to distinguish the various possible values, followed by a range of accessor functions for fetching the value itself. Finally, the private section of the class allows for storage of each type of value. For the full source code, see <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L915"><code>src/ast/ast.h</code></a>.</p><h4>Zone Memory Allocation</h4><p>One interesting fact we&#8217;ve glossed over in our discussion of AST nodes is exactly where in memory they&#8217;re stored. We already know about the JavaScript heap (from <a href="https://medium.com/compilers/calculating-1-1-in-javascript-1cecb6e9610">Part 1</a> of this series), but that&#8217;s not where the AST nodes are stored. Instead, the concept of a Zone comes into play.</p><p>The <a href="https://github.com/v8/v8/blob/8.8.276/src/zone/zone.h#L38"><code>Zone</code></a> class provides very fast allocation of small blocks of memory, such as AST nodes. However, rather than using a garbage collection algorithm, or explicit <code>delete</code> calls to deallocate the memory, the entire zone is discarded at once. This makes sense when you think of an AST being created incrementally, being held in memory for a period of time, and then being discarded when the AST is no longer required. There&#8217;s no concept of partially deallocating an AST.</p><p>To allocate blocks of memory from the Zone, the <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L2721"><code>AstNodeFactory</code></a> class (using the &#8220;factory&#8221; pattern) provides convenience methods for creating different types of AST node. 
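</p><p>To make the &#8220;discard everything at once&#8221; idea concrete, here&#8217;s a minimal sketch of a bump allocator in the spirit of a zone. It is not V8&#8217;s <code>Zone</code> implementation: <code>ToyZone</code>, <code>ToyLiteral</code>, and the default segment size are hypothetical names and values chosen purely for illustration.</p><pre><code>#include &lt;cassert&gt;
#include &lt;cstddef&gt;
#include &lt;memory&gt;
#include &lt;new&gt;
#include &lt;utility&gt;
#include &lt;vector&gt;

// Hypothetical bump allocator, loosely modelled on the idea of a zone.
// Names, sizes and layout are illustrative assumptions, not V8's code.
class ToyZone {
 public:
  explicit ToyZone(std::size_t segment_size = 4096)
      : segment_size_(segment_size) {}

  // Construct a T inside the zone. There is no matching Delete(): memory is
  // released only when the whole ToyZone is destroyed. T's destructor is
  // never run, so T should be trivially destructible.
  template &lt;typename T, typename... Args&gt;
  T* New(Args&amp;&amp;... args) {
    void* memory = Allocate(sizeof(T), alignof(T));
    return new (memory) T(std::forward&lt;Args&gt;(args)...);
  }

  std::size_t segment_count() const { return segments_.size(); }

 private:
  void* Allocate(std::size_t size, std::size_t alignment) {
    assert(size &lt;= segment_size_);
    // Round the current offset up to the requested (power-of-two) alignment.
    std::size_t offset = (used_ + alignment - 1) &amp; ~(alignment - 1);
    if (segments_.empty() || offset + size &gt; segment_size_) {
      // The current segment is full (or missing), so start a fresh one.
      segments_.push_back(std::make_unique&lt;unsigned char[]&gt;(segment_size_));
      offset = 0;
    }
    used_ = offset + size;
    return segments_.back().get() + offset;
  }

  std::size_t segment_size_;
  std::size_t used_ = 0;  // Bytes consumed in the newest segment.
  std::vector&lt;std::unique_ptr&lt;unsigned char[]&gt;&gt; segments_;
};

// Hypothetical stand-in for an AST node that holds a small integer.
struct ToyLiteral {
  explicit ToyLiteral(int v) : value(v) {}
  int value;
};</code></pre><p>Calling <code>zone.New&lt;ToyLiteral&gt;(1)</code> hands back a pointer into the current segment, and every node vanishes together when the <code>ToyZone</code> goes out of scope. In V8 itself, the parser doesn&#8217;t allocate nodes directly but goes through the factory methods mentioned above. 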
For example, here&#8217;s part of the <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser.cc#L315"><code>ExpressionFromLiteral()</code></a> method that recognizes the next scanner token as a <code>Token::SMI</code>, fetches the integer value from that token, then creates a new <a href="https://github.com/v8/v8/blob/8.8.276/src/ast/ast.h#L915"><code>Literal</code></a> node with that integer value.</p><pre><code>...
case Token::SMI: {
  uint32_t value = scanner()-&gt;smi_value();
  return factory()-&gt;NewSmiLiteral(value, pos);
}
...</code></pre><p>Now that we&#8217;ve seen the ECMAScript grammar, and we&#8217;ve learned how the AST is constructed, we have enough knowledge to step through the process of parsing <code>1 + 1</code>. As we learned in <a href="https://medium.com/compilers/calculating-1-1-in-javascript-part-3-710f686d9a40">Part 3</a>, this input will be streamed from the scanner to the parser as a sequence of <code>Token::SMI</code> (value 1), <code>Token::ADD</code>, and <code>Token::SMI</code> (value 1).</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!A2hi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e2c1aac-1aa7-4bda-a349-0bf833f2f50c_60x60.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!A2hi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e2c1aac-1aa7-4bda-a349-0bf833f2f50c_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!A2hi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e2c1aac-1aa7-4bda-a349-0bf833f2f50c_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!A2hi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e2c1aac-1aa7-4bda-a349-0bf833f2f50c_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!A2hi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e2c1aac-1aa7-4bda-a349-0bf833f2f50c_60x60.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!A2hi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e2c1aac-1aa7-4bda-a349-0bf833f2f50c_60x60.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1e2c1aac-1aa7-4bda-a349-0bf833f2f50c_60x60.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!A2hi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e2c1aac-1aa7-4bda-a349-0bf833f2f50c_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!A2hi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e2c1aac-1aa7-4bda-a349-0bf833f2f50c_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!A2hi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e2c1aac-1aa7-4bda-a349-0bf833f2f50c_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!A2hi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e2c1aac-1aa7-4bda-a349-0bf833f2f50c_60x60.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Recursive Descent Parsing of 1 +&nbsp;1</h3><p>To construct an AST, the V8 JavaScript Engine uses the <a 
href="https://en.wikipedia.org/wiki/Recursive_descent_parser">Recursive Descent</a> parsing technique. This is one of the easiest parsing algorithms to understand, as it simply uses recursive method calls to emulate the chain of grammar rules. That is, each non-terminal in our syntactical grammar is mapped to a C++ method. When called, the method looks ahead at the next scanner token (or tokens) to decide how to proceed, recursively calling other C++ methods as it descends through the grammar.</p><p>Let&#8217;s see how this is done in our <code>1 + 1</code> example.</p><h4>Recursively Parsing Downwards</h4><p>We&#8217;ll start our journey from within the <a href="https://github.com/v8/v8/blob/8.8.276/src/codegen/compiler.cc#L1295"><code>CompileTopLevel()</code></a> method, which calls upon <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parsing.cc#L40"><code>ParseProgram()</code></a> to convert the sequence of scanner tokens into an AST. To achieve this goal, the method initializes the scanner, then recursively invokes <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser.cc#L562"><code>DoParseProgram()</code></a> (we&#8217;ll see a lot of recursive calls in this discussion!).</p><pre><code>...
scanner_.Initialize();
FunctionLiteral* result = DoParseProgram(isolate, info);
...</code></pre><p>If we think back to the ECMAScript grammar we saw earlier, the <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser.cc#L562"><code>DoParseProgram()</code></a> method is equivalent to the <em><a href="https://262.ecma-international.org/11.0/#prod-ScriptBody">ScriptBody</a></em> rule. In particular, it includes code for parsing a <em><a href="https://262.ecma-international.org/11.0/#prod-StatementList">StatementList</a></em>, which is a sequence of zero or more <em><a href="https://262.ecma-international.org/11.0/#prod-StatementListItem">StatementListItem</a></em> items:</p><pre><code>...
while (peek() != end_token) {
  StatementT stat = ParseStatementListItem();
  if (impl()-&gt;IsNull(stat)) return;
  if (stat-&gt;IsEmptyStatement()) continue;
  body-&gt;Add(stat);
}
...</code></pre><p>At the end of the <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser.cc#L562"><code>DoParseProgram()</code></a> method, <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser.cc#L674">there&#8217;s code</a> for creating a new <a href="https://github.com/v8/v8/blob/master/src/ast/ast.h#L2109"><code>FunctionLiteral</code></a> AST node, which incorporates our list of <a href="https://github.com/v8/v8/blob/master/src/ast/ast.h#L181"><code>Statement</code></a> nodes. This new node should look familiar, as it&#8217;s the first of three nodes we&#8217;re expecting to see in our final AST.</p><pre><code>result = factory()-&gt;NewScriptOrEvalFunctionLiteral(... body ...);</code></pre><p>To continue recursively, the <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser-base.h#L4937"><code>ParseStatementListItem()</code></a> method is called. This method, like many others we&#8217;re about to see, uses a consistent pattern of checking the next scanner token, then determining which additional method to call recursively. For example, if <code>Token::CLASS</code> is the next token in the input stream, we consume that token and recursively call <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser-base.h#L4052"><code>ParseClassDeclaration()</code></a>.</p><pre><code>...
switch (peek()) {
  case Token::FUNCTION:
    return ParseHoistableDeclaration(...);</code></pre><pre><code>  case Token::CLASS:
    Consume(Token::CLASS);
    return ParseClassDeclaration(...);</code></pre><pre><code>  case Token::VAR:
  case Token::CONST:
    return ParseVariableStatement(...);</code></pre><pre><code>  case Token::LET:
    return ...</code></pre><pre><code>  case Token::ASYNC:
    return ...</code></pre><pre><code>  default:
    break;
}</code></pre><pre><code>/* none of the tokens match, continue along rule chain */
return ParseStatement(...)</code></pre><p>In our simple <code>1 + 1</code> example, none of these look-ahead tokens are relevant, so we continue to the default case of calling <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser-base.h#L4983"><code>ParseStatement()</code></a>. In fact, this pattern continues for a while with more recursive calls:</p><ul><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser-base.h#L5189"><code>ParseExpressionOrLabelledStatement()</code></a></p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser-base.h#L1951"><code>ParseExpressionCoverGrammar()</code></a></p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser-base.h#L2745"><code>ParseAssignmentExpressionCoverGrammar()</code></a></p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser-base.h#L2944"><code>ParseConditionalExpression()</code></a></p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser-base.h#L2958"><code>ParseLogicalExpression()</code></a></p></li></ul><p>This list should look remarkably similar to the ECMAScript grammar we saw earlier. Perhaps the only discrepancy is that some rules are very similar, such as <em>ExpressionStatement</em> and <em>LabelledStatement</em>, so they&#8217;re merged into a single <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser-base.h#L5189"><code>ParseExpressionOrLabelledStatement()</code></a> method capable of handling both.</p><p>Once we reach <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser-base.h#L3091"><code>ParseBinaryExpression()</code></a>, the pattern changes slightly because many different grammar rules, from <em>LogicalOrExpression</em> down to <em>ExponentiationExpression,</em> are merged into a single C++ method. 
As we&#8217;ll discuss shortly, this is handled by passing in a precedence parameter, and using precedence rules to group sub-expressions together as appropriate.</p><p>Finally, we continue down the grammar rule chain by calling more C++ methods:</p><ul><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser-base.h#L3184"><code>ParseUnaryExpression()</code></a></p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser-base.h#L3208"><code>ParsePostfixExpression()</code></a></p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser-base.h#L3242"><code>ParseLeftHandSideExpression()</code></a></p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser-base.h#L3518"><code>ParseMemberExpression()</code></a></p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser-base.h#L1774"><code>ParsePrimaryExpression()</code></a></p></li></ul><h4>Parsing the&nbsp;Literals</h4><p>Once we reach the <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser-base.h#L1774"><code>ParsePrimaryExpression()</code></a> method, the recursion ends. We no longer have any more rules to apply, and we&#8217;re now ready to identify <code>1</code> as a literal value. Here&#8217;s the <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser-base.h#L1831">relevant part of the code</a>:</p><pre><code>...
if (Token::IsLiteral(token)) {
  return impl()-&gt;ExpressionFromLiteral(Next(), beg_pos);
}
...</code></pre><p>Within the <a href="https://github.com/v8/v8/blob/master/src/parsing/parser.cc#L316"><code>ExpressionFromLiteral()</code></a> method, there&#8217;s code for fetching the token&#8217;s SMI value (<code>1</code>), and then using the AST factory to create a new <a href="https://github.com/v8/v8/blob/master/src/ast/ast.h#L916"><code>Literal</code></a> AST node, as we saw earlier.</p><pre><code>...
case Token::SMI: {
  uint32_t value = scanner()-&gt;smi_value();
  return factory()-&gt;NewSmiLiteral(value, pos);
}
...</code></pre><p>The return value from this code has type <code>Expression *</code>, which will now be passed back up the C++ call stack.</p><h4>Parsing of Binary Expressions With Precedence</h4><p>Now that we&#8217;ve reached the bottom of the call stack and are returning through the long list of recursive methods, it&#8217;s not simply a matter of returning immediately. Each method must decide whether it&#8217;s appropriate to consume more input tokens, or whether its job is complete and returning to the caller is the correct approach.</p><p>To see why this matters, consider the expression: <code>1 + 2 * 3</code>. If we&#8217;ve already seen <code>1 + 2</code>, the question to be asked is whether <code>1 + 2</code> is a complete expression, or whether there&#8217;s more to parse. Clearly <code>2 * 3</code> should be evaluated first, before adding <code>1</code>, so we need to consume the remaining tokens to form the binary expression <code>2 * 3</code> before returning to produce the binary expression <code>1 + (2 * 3)</code>.</p><p>To state this another way, consider the ECMAScript rule we saw earlier:</p><p><em><a href="https://262.ecma-international.org/11.0/#prod-AdditiveExpression">AdditiveExpression</a>&nbsp;::= AdditiveExpression </em><code>+</code><em> MultiplicativeExpression</em></p><p>Given our example, we need to fully parse <em>MultiplicativeExpression</em> before &#8220;returning&#8221; to the previous method to complete the <em>AdditiveExpression</em>.</p><p>Inside V8, this is all done with the <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser-base.h#L3091"><code>ParseBinaryExpression()</code></a> method, which expects the operator precedence as an argument. Once the first operand has been parsed, the precedence of the next token is compared against the precedence that was passed in, to see whether it&#8217;s at least as high. 
If so, parsing continues at the same level in the grammar, in this case by calling <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser-base.h#L3040"><code>ParseBinaryContinuation()</code></a>.</p><p>Here&#8217;s the <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser-base.h#L3095">relevant code</a>:</p><pre><code>    ...
    x = ParseUnaryExpression();
  }</code></pre><pre><code>  int prec1 = Token::Precedence(peek(), accept_IN_);
  if (prec1 &gt;= prec) {
    return ParseBinaryContinuation(x, prec, prec1);
  }
  return x;
}</code></pre><p>In our <code>1 + 1</code> example, <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser-base.h#L3091"><code>ParseBinaryExpression()</code></a> was called with <code>prec</code> equal to 6, and since the precedence of <code>+</code> is 12, we decide that <code>prec1 &gt; prec</code> and the expression should continue at the same level in the grammar. We therefore call <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser-base.h#L3040"><code>ParseBinaryContinuation()</code></a> rather than returning up the stack.</p><h4>Constant Folding</h4><p>Another thing that <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser-base.h#L3040"><code>ParseBinaryContinuation()</code></a> does is attempt to &#8220;fold&#8221; constants, simplifying the expression at compile time. We already have a <a href="https://github.com/v8/v8/blob/master/src/ast/ast.h#L916"><code>Literal</code></a> for the first <code>1</code>, the operator <code>+</code>, and a second <a href="https://github.com/v8/v8/blob/master/src/ast/ast.h#L916"><code>Literal</code></a> for the second <code>1</code>. We then call <a href="https://github.com/v8/v8/blob/master/src/parsing/parser.cc#L152"><code>ShortcutNumericLiteralBinaryExpression()</code></a> with all of these values to see if they can be folded into a single <a href="https://github.com/v8/v8/blob/master/src/ast/ast.h#L916"><code>Literal</code></a>.</p><p>Here&#8217;s the <a href="https://github.com/v8/v8/blob/master/src/parsing/parser.cc#L155">relevant code</a>:</p><pre><code>...
if ((*x)-&gt;IsNumberLiteral() &amp;&amp; y-&gt;IsNumberLiteral()) {
  double x_val = (*x)-&gt;AsLiteral()-&gt;AsNumber();
  double y_val = y-&gt;AsLiteral()-&gt;AsNumber();</code></pre><pre><code>  switch (op) {
    case Token::ADD:
      *x = factory()-&gt;NewNumberLiteral(x_val + y_val, pos);
      return true;
    ...</code></pre><p>To summarize, if both of the expressions are literal numbers, then we add them together and replace them with a single <a href="https://github.com/v8/v8/blob/master/src/ast/ast.h#L916"><code>Literal</code></a> containing the sum.</p><h4>Finishing Up</h4><p>We&#8217;ve now seen the entire expression, with no further input tokens remaining, except for the final <code>Token::EOS</code> marking the end of the input stream. As each recursive C++ method returns, it checks the next token but only sees <code>Token::EOS</code>, causing it to return immediately.</p><p>Although this is fairly routine by now, there&#8217;s one interesting case worth mentioning. In the <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser-base.h#L5189"><code>ParseExpressionOrLabelledStatement()</code></a> method, the <a href="https://github.com/v8/v8/blob/master/src/ast/ast.h#L916"><code>Literal</code></a> AST node, with a value of <code>2</code>, is wrapped in an <a href="https://github.com/v8/v8/blob/master/src/ast/ast.h#L569"><code>ExpressionStatement</code></a> AST node:</p><pre><code>return factory()-&gt;NewExpressionStatement(expr, pos);</code></pre><p>Finally, as we saw earlier, a <a href="https://github.com/v8/v8/blob/master/src/ast/ast.h#L2109"><code>FunctionLiteral</code></a> AST node is created, wrapping the <a href="https://github.com/v8/v8/blob/master/src/ast/ast.h#L569"><code>ExpressionStatement</code></a>. 
This leads us back to the AST diagram we&#8217;ve been expecting to see:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KrcA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187b3a67-96e9-4bbe-9389-5ca897847052_246x434.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KrcA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187b3a67-96e9-4bbe-9389-5ca897847052_246x434.png 424w, https://substackcdn.com/image/fetch/$s_!KrcA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187b3a67-96e9-4bbe-9389-5ca897847052_246x434.png 848w, https://substackcdn.com/image/fetch/$s_!KrcA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187b3a67-96e9-4bbe-9389-5ca897847052_246x434.png 1272w, https://substackcdn.com/image/fetch/$s_!KrcA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187b3a67-96e9-4bbe-9389-5ca897847052_246x434.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KrcA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187b3a67-96e9-4bbe-9389-5ca897847052_246x434.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/187b3a67-96e9-4bbe-9389-5ca897847052_246x434.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KrcA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187b3a67-96e9-4bbe-9389-5ca897847052_246x434.png 424w, https://substackcdn.com/image/fetch/$s_!KrcA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187b3a67-96e9-4bbe-9389-5ca897847052_246x434.png 848w, https://substackcdn.com/image/fetch/$s_!KrcA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187b3a67-96e9-4bbe-9389-5ca897847052_246x434.png 1272w, https://substackcdn.com/image/fetch/$s_!KrcA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F187b3a67-96e9-4bbe-9389-5ca897847052_246x434.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4QLB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80ae367d-d81b-4bb8-9bed-0a2f5c4a19ed_60x60.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source 
type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4QLB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80ae367d-d81b-4bb8-9bed-0a2f5c4a19ed_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!4QLB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80ae367d-d81b-4bb8-9bed-0a2f5c4a19ed_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!4QLB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80ae367d-d81b-4bb8-9bed-0a2f5c4a19ed_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!4QLB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80ae367d-d81b-4bb8-9bed-0a2f5c4a19ed_60x60.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4QLB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80ae367d-d81b-4bb8-9bed-0a2f5c4a19ed_60x60.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/80ae367d-d81b-4bb8-9bed-0a2f5c4a19ed_60x60.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!4QLB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80ae367d-d81b-4bb8-9bed-0a2f5c4a19ed_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!4QLB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80ae367d-d81b-4bb8-9bed-0a2f5c4a19ed_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!4QLB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80ae367d-d81b-4bb8-9bed-0a2f5c4a19ed_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!4QLB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80ae367d-d81b-4bb8-9bed-0a2f5c4a19ed_60x60.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Next Time&#8230;</h3><p>We&#8217;ve now seen a large part of the process involved in parsing <code>1 + 1</code>, including allocation of data on the JavaScript heap, caching of byte codes, scanning of tokens, and now parsing against the ECMAScript grammar. <a href="https://medium.com/compilers/calculating-1-1-in-javascript-part-5-79abef791670">In the next blog post</a>, we&#8217;ll continue by examining how <code>1 + 1</code> (actually, now <code>2</code>) is restructured into a function, which will later be converted to byte codes.</p>]]></content:encoded></item><item><title><![CDATA[Calculating 1 + 1 in JavaScript — Part 3]]></title><description><![CDATA[I&#8217;m a compiler enthusiast who has been learning how the V8 JavaScript Engine works. 
Of course, the best way to learn something is to write&#8230;]]></description><link>https://www.petersmith.net/p/calculating-1-1-in-javascript-part-3-710f686d9a40</link><guid isPermaLink="false">https://www.petersmith.net/p/calculating-1-1-in-javascript-part-3-710f686d9a40</guid><dc:creator><![CDATA[Peter Smith]]></dc:creator><pubDate>Thu, 11 Mar 2021 15:50:54 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/5147564c-ddd1-400e-91ad-ab56994b7bba_491x360.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>I&#8217;m a compiler enthusiast who has been learning how the<a href="https://v8.dev/"> V8 JavaScript Engine</a> works. Of course, the best way to learn something is to write about it, so that&#8217;s why I&#8217;m sharing my experiences here. I hope this might be interesting to others too.</em></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3Pq-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b976339-fe28-476a-bf97-d8720124b72b_491x360.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3Pq-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b976339-fe28-476a-bf97-d8720124b72b_491x360.png 424w, https://substackcdn.com/image/fetch/$s_!3Pq-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b976339-fe28-476a-bf97-d8720124b72b_491x360.png 848w, https://substackcdn.com/image/fetch/$s_!3Pq-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b976339-fe28-476a-bf97-d8720124b72b_491x360.png 1272w, 
https://substackcdn.com/image/fetch/$s_!3Pq-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b976339-fe28-476a-bf97-d8720124b72b_491x360.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3Pq-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b976339-fe28-476a-bf97-d8720124b72b_491x360.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6b976339-fe28-476a-bf97-d8720124b72b_491x360.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3Pq-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b976339-fe28-476a-bf97-d8720124b72b_491x360.png 424w, https://substackcdn.com/image/fetch/$s_!3Pq-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b976339-fe28-476a-bf97-d8720124b72b_491x360.png 848w, https://substackcdn.com/image/fetch/$s_!3Pq-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b976339-fe28-476a-bf97-d8720124b72b_491x360.png 1272w, 
https://substackcdn.com/image/fetch/$s_!3Pq-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b976339-fe28-476a-bf97-d8720124b72b_491x360.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p>This is the third part of a multi-part series describing how the <a href="https://v8.dev/">V8 JavaScript Engine</a> calculates <code>1 + 1</code>. If you haven&#8217;t read the previous posts in this series, you might like to start with <a href="https://medium.com/compilers/calculating-1-1-in-javascript-1cecb6e9610">Part 1</a> (storing the source code string in the JavaScript heap), and <a href="https://medium.com/compilers/calculating-1-1-in-javascript-part-2-e01f336503d0">Part 2</a> (checking if the byte codes are already cached). However, since this blog post is fairly independent from the previous two, you can likely understand it in isolation.</p><p>In this part of our story of how V8 calculates <code>1 + 1</code>, we&#8217;ll learn how the input characters are scanned into tokens, which are then used as input to V8&#8217;s parser. 
This concept will be familiar to anyone who has read an introductory compiler textbook.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xVl4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e236f6b-e25b-4a44-b235-749320abc934_60x60.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xVl4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e236f6b-e25b-4a44-b235-749320abc934_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!xVl4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e236f6b-e25b-4a44-b235-749320abc934_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!xVl4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e236f6b-e25b-4a44-b235-749320abc934_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!xVl4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e236f6b-e25b-4a44-b235-749320abc934_60x60.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xVl4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e236f6b-e25b-4a44-b235-749320abc934_60x60.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3e236f6b-e25b-4a44-b235-749320abc934_60x60.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xVl4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e236f6b-e25b-4a44-b235-749320abc934_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!xVl4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e236f6b-e25b-4a44-b235-749320abc934_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!xVl4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e236f6b-e25b-4a44-b235-749320abc934_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!xVl4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e236f6b-e25b-4a44-b235-749320abc934_60x60.png 1456w" sizes="100vw"></picture><div></div></div></a></figure></div><h3>The Scanning&nbsp;Process</h3><p>With our <code>1 + 1</code> example, the output from the <em>scanner</em> we&#8217;re expecting to see is the following sequence of <em>tokens</em>:</p><pre><code>Token::SMI (value 1)
Token::ADD
Token::SMI (value 1)</code></pre><p>Where <code>Token::SMI</code> is a special variant of <code>Token::NUMBER</code> representing small integer values, and <code>Token::ADD</code> unsurprisingly represents addition. Note also, the white space characters between <code>1</code>, <code>+</code>, and <code>1</code> have been ignored, since they don&#8217;t provide any further value.</p><p>Here&#8217;s the overall flow of V8, as it scans and parses the input stream:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tNFE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a47bde2-6ea9-4f39-8bbe-93ce51b495bf_800x257.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tNFE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a47bde2-6ea9-4f39-8bbe-93ce51b495bf_800x257.png 424w, https://substackcdn.com/image/fetch/$s_!tNFE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a47bde2-6ea9-4f39-8bbe-93ce51b495bf_800x257.png 848w, https://substackcdn.com/image/fetch/$s_!tNFE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a47bde2-6ea9-4f39-8bbe-93ce51b495bf_800x257.png 1272w, https://substackcdn.com/image/fetch/$s_!tNFE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a47bde2-6ea9-4f39-8bbe-93ce51b495bf_800x257.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!tNFE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a47bde2-6ea9-4f39-8bbe-93ce51b495bf_800x257.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6a47bde2-6ea9-4f39-8bbe-93ce51b495bf_800x257.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tNFE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a47bde2-6ea9-4f39-8bbe-93ce51b495bf_800x257.png 424w, https://substackcdn.com/image/fetch/$s_!tNFE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a47bde2-6ea9-4f39-8bbe-93ce51b495bf_800x257.png 848w, https://substackcdn.com/image/fetch/$s_!tNFE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a47bde2-6ea9-4f39-8bbe-93ce51b495bf_800x257.png 1272w, https://substackcdn.com/image/fetch/$s_!tNFE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a47bde2-6ea9-4f39-8bbe-93ce51b495bf_800x257.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>The first step is for the <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.h#L40"><code>v8::internal::Utf16CharacterStream</code></a> 
class to read the individual characters from the JavaScript heap (as seen in <a href="https://medium.com/compilers/calculating-1-1-in-javascript-1cecb6e9610">Part 1</a>, the string is stored as a <code>SeqOneByteString</code> object). Next, the <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.h#L210"><code>v8::internal::Scanner</code></a> class converts sequences of characters into tokens (of type <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/token.h#L207"><code>v8::internal::Token</code></a>). Finally, the <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parser.h#L128"><code>v8::internal::Parser</code></a> class (which we&#8217;ll examine in a later blog post) uses these tokens to validate the input stream and build an in-memory Abstract Syntax Tree (AST).</p><p>What&#8217;s important to understand is that all these activities happen in a streaming fashion. It all starts when the <code>Parser</code> requests the next token, which causes the <code>Scanner</code> to request the next character (or characters) from the <code>Utf16CharacterStream</code>, which in turn reads the input string from the JavaScript heap. 
As with any stream-based solution, the temporary storage space requirement is minimal, as tokens are not scanned until they&#8217;re actually required downstream.</p><p>Let&#8217;s look at this process in more detail.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cQst!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c832ea4-0eaa-43eb-bb3e-7e741fb8cd5b_60x60.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cQst!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c832ea4-0eaa-43eb-bb3e-7e741fb8cd5b_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!cQst!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c832ea4-0eaa-43eb-bb3e-7e741fb8cd5b_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!cQst!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c832ea4-0eaa-43eb-bb3e-7e741fb8cd5b_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!cQst!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c832ea4-0eaa-43eb-bb3e-7e741fb8cd5b_60x60.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cQst!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c832ea4-0eaa-43eb-bb3e-7e741fb8cd5b_60x60.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5c832ea4-0eaa-43eb-bb3e-7e741fb8cd5b_60x60.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cQst!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c832ea4-0eaa-43eb-bb3e-7e741fb8cd5b_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!cQst!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c832ea4-0eaa-43eb-bb3e-7e741fb8cd5b_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!cQst!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c832ea4-0eaa-43eb-bb3e-7e741fb8cd5b_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!cQst!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c832ea4-0eaa-43eb-bb3e-7e741fb8cd5b_60x60.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>The v8::internal::Utf16CharacterStream Class</h3><p>The <code>Utf16CharacterStream</code> class is responsible for reading Unicode characters from the input stream, then providing them one-by-one to the <code>Scanner</code> class. This seems like a trivial exercise, but as we&#8217;ll see there are interesting edge-cases to consider. 
First, the scanner might need to look ahead into the input stream before deciding on what the current token should be. Second, the scanner doesn&#8217;t know (or care) where the characters come from, or how they&#8217;re stored in memory.</p><h4>Utf16CharacterStream Methods</h4><p>Let&#8217;s look at some of the <code>Utf16CharacterStream</code> methods, to understand how this class is used.</p><ul><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.h#L65"><code>stream-&gt;Advance()</code></a>&#8202;&#8212;&#8202;This method returns the next Unicode character in the input, fully removing that character from the input stream. This is the standard behaviour you expect when reading characters in a sequence.</p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.h#L53"><code>stream-&gt;Peek()</code></a>&#8202;&#8212;&#8202;This method returns the next Unicode character in the stream, but without actually consuming that character. Therefore, it&#8217;ll still be available when <code>Peek()</code> or <code>Advance()</code> is next called. This is useful for looking ahead into the input, without committing to consume the character until you know for sure it&#8217;s part of the current token.</p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.h#L75"><code>stream-&gt;AdvanceUntil(func)</code></a>&#8202;&#8212;&#8202;Continually read (aka &#8220;use up&#8221;) characters until the <code>func</code> function returns true. This is useful for consuming characters until a certain point is reached, such as the end of a line.</p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.h#L98"><code>stream-&gt;Back()</code></a>&#8202;&#8212;&#8202;Essentially the opposite of <code>Advance()</code>. Returns the character back to the input stream so it&#8217;s available for future <code>Peek()</code> or <code>Advance()</code> calls. 
This is useful when the scanner has read ahead, but then decides the next character isn&#8217;t actually part of the current token.</p></li></ul><p>There are several more methods available, but these are the most important. As we&#8217;ll see when we examine some of the methods in the <code>Scanner</code> class, the ability to <em>read ahead</em>, yet also <em>push back</em> characters is vital for correctly scanning input tokens.</p><h4>Utf16CharacterStream is&nbsp;Abstract</h4><p>A second interesting topic is how the characters are retrieved from memory. It turns out that <code>Utf16CharacterStream</code> is an abstract class, with a range of different implementations available, each focusing on a specific storage layout of the source string. In our case, the <code>1 + 1</code> string was stored on JavaScript&#8217;s heap using one byte to store each character. Other options include reading from two-byte strings, as well as from strings that are stored outside the JavaScript heap.</p><p>Selection of the appropriate <code>Utf16CharacterStream</code> sub-class is performed within the <code>ParseProgram</code> method (see <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/parsing.cc#L51"><code>src/parsing/parsing.cc</code></a>). <code>ParseProgram</code> does many different things, but as far as scanning is concerned the most relevant line of code is:</p><pre><code>std::unique_ptr&lt;Utf16CharacterStream&gt; stream(
    ScannerStream::For(isolate, source));</code></pre><p>Using our <code>1 + 1</code> example, the <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner-character-streams.cc#L751"><code>ScannerStream::For</code></a> method examines the source string and determines it has type <code>SeqOneByteString</code> (one-byte string, stored in the JavaScript heap). The <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner-character-streams.cc#L775">code then returns an instance</a> of the <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner-character-streams.cc#L235"><code>BufferedCharacterStream</code></a> class, which is the particular sub-class of <code>Utf16CharacterStream</code> capable of reading <code>SeqOneByteString</code> objects.</p><p>The most interesting part of the <code>BufferedCharacterStream</code> sub-class is the <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner-character-streams.cc#L253"><code>ReadBlock</code></a><code>()</code> method. 
This method is called upon by higher-level methods such as <code>Peek()</code> or <code>Advance()</code> to fetch the next block of characters from the input stream, however it may be stored.</p><p>Let&#8217;s now move forward in the scanner pipeline to learn how tokens are represented in V8.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uzpv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21a03995-7aca-4958-b24e-eca5f9911192_60x60.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uzpv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21a03995-7aca-4958-b24e-eca5f9911192_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!uzpv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21a03995-7aca-4958-b24e-eca5f9911192_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!uzpv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21a03995-7aca-4958-b24e-eca5f9911192_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!uzpv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21a03995-7aca-4958-b24e-eca5f9911192_60x60.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uzpv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21a03995-7aca-4958-b24e-eca5f9911192_60x60.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/21a03995-7aca-4958-b24e-eca5f9911192_60x60.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uzpv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21a03995-7aca-4958-b24e-eca5f9911192_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!uzpv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21a03995-7aca-4958-b24e-eca5f9911192_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!uzpv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21a03995-7aca-4958-b24e-eca5f9911192_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!uzpv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21a03995-7aca-4958-b24e-eca5f9911192_60x60.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>The v8::internal::Token Class</h3><p>Similar to other compilers, the <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/token.h#L207"><code>v8::internal::Token</code></a> class provides an enumeration of all token values recognized by the scanner. 
The definition of these tokens (see <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/token.h#L210"><code>src/parsing/token.h</code></a>) is managed by a clever C++ macro (<code>TOKEN_LIST</code>) containing a list of all tokens, combined with a second macro (<code>T</code>) which extracts the name portion.</p><p>Here&#8217;s the definition of the token enumeration, using C++ macros (warning: it&#8217;s not very easy to read):</p><pre><code>#define T(name, string, precedence) name,
   enum Value : uint8_t { TOKEN_LIST(T, T) NUM_TOKENS };
#undef T</code></pre><p>And here is the definition of <code>TOKEN_LIST</code>, abbreviated from code in <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/token.h#L56"><code>src/parsing/token.h</code></a>.</p><pre><code>#define TOKEN_LIST(T, K) \
    T(TEMPLATE_SPAN, nullptr, 0) \
    T(TEMPLATE_TAIL, nullptr, 0) \
    T(PERIOD, ".", 0) \
    T(LBRACK, "[", 0) \
    T(QUESTION_PERIOD, "?.", 0) \
    T(LPAREN, "(", 0) \
    T(RPAREN, ")", 0) \
    T(RBRACK, "]", 0) \
    ...</code></pre><p>When macros are expanded, this becomes a more readable enumeration.</p><pre><code>enum Value : uint8_t {
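   // Each T(name, string, precedence) entry in TOKEN_LIST contributes only
   // its name here, because T was defined above to expand to "name,".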
   TEMPLATE_SPAN,
   TEMPLATE_TAIL,
   PERIOD,
   LBRACK,
   QUESTION_PERIOD,
   LPAREN,
   RPAREN,
   ...
   ADD,
   ...
   SMI,
   ...
   WHITESPACE,
   UNINITIALIZED,
   REGEXP_LITERAL,
   NUM_TOKENS
}</code></pre><p>As we&#8217;ll see later, token values are referenced in the <code>Scanner</code> and <code>Parser</code> classes using the syntax <code>Token::SMI</code> or <code>Token::ADD</code>.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EL-P!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4de327a9-5eef-44d3-a21a-e57b55ee36b3_60x60.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EL-P!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4de327a9-5eef-44d3-a21a-e57b55ee36b3_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!EL-P!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4de327a9-5eef-44d3-a21a-e57b55ee36b3_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!EL-P!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4de327a9-5eef-44d3-a21a-e57b55ee36b3_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!EL-P!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4de327a9-5eef-44d3-a21a-e57b55ee36b3_60x60.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EL-P!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4de327a9-5eef-44d3-a21a-e57b55ee36b3_60x60.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4de327a9-5eef-44d3-a21a-e57b55ee36b3_60x60.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EL-P!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4de327a9-5eef-44d3-a21a-e57b55ee36b3_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!EL-P!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4de327a9-5eef-44d3-a21a-e57b55ee36b3_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!EL-P!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4de327a9-5eef-44d3-a21a-e57b55ee36b3_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!EL-P!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4de327a9-5eef-44d3-a21a-e57b55ee36b3_60x60.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>The v8::internal::Scanner Class</h3><p>Let&#8217;s now dive into the internals of the <code>Scanner</code> class, responsible for reading characters from the input, and generating tokens for the output. 
We&#8217;ll see examples of scanning operators (such as <code>+</code>), as well as scanning numbers (such as <code>1</code>).</p><h4>Scanner Methods</h4><p>First, here are some interesting methods from the <code>Scanner</code> class. They are somewhat similar to the methods on the <code>Utf16CharacterStream</code> class, but operate on whole tokens, rather than individual characters.</p><ul><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.cc#L158"><code>scanner-&gt;Next()</code></a>&#8202;&#8212;&#8202;Returns the next token from the input stream, and advances the input pointer. Naturally, this calls the <code>stream-&gt;Advance()</code> method, possibly several times, to fetch characters from the <code>Utf16CharacterStream</code>, but returns only a single token value. In our <code>1 + 1</code> example, each token is only one character long, but normally this isn&#8217;t the case.</p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.h#L317"><code>scanner-&gt;peek()</code></a>&#8202;&#8212;&#8202;Peek ahead to see what the next token will be (beyond what <code>Next()</code> returns) without advancing the input. This is used by the parser to check the upcoming tokens to decide whether or not the current parser rule is matched by the input.</p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.cc#L181"><code>scanner-&gt;PeekAhead()</code></a>&#8202;&#8212;&#8202;Peek even further ahead, necessary for parsing some of JavaScript&#8217;s more complicated syntax.</p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.h#L289"><code>scanner-&gt;location()</code></a>&#8202;&#8212;&#8202;Return the location of the current token. 
This provides the start and end character positions within the source string.</p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.h#L377"><code>scanner-&gt;smi_value()</code></a>&#8202;&#8212;&#8202;Returns the Smi (small integer) value of the current token, if any. In our example, this will return the integer 1 for both of our <code>Token::SMI</code> tokens.</p></li></ul><p>And as you might expect, there are many other <code>Scanner</code> methods, mostly focused on error handling, but also for fetching the token&#8217;s associated value. As a sneak peek, here&#8217;s a small section of the <code>Parser</code> code using these methods:</p><pre><code>...</code></pre><pre><code>if (peek() == Token::PERIOD &amp;&amp; PeekAhead() == Token::PRIVATE_NAME) {
    Consume(Token::PERIOD);
    Consume(Token::PRIVATE_NAME);
    ...
}
...</code></pre><p>We&#8217;ll learn more about the <code>Parser</code> class in the next blog post, but this code snippet gives you a sense of how <code>Parser</code> calls upon <code>Scanner</code> to return the upcoming token values.</p><h4>The TokenDesc Structure</h4><p>Although we&#8217;ve already discussed the token enumerated values, allowing us to write <code>Token::SMI</code> (for the number <code>1</code>) or <code>Token::ADD</code> (for the <code>+</code> symbol), that&#8217;s only part of what&#8217;s required for the scanner to represent a token. In addition, the scanner cares about the token&#8217;s location, the literal characters, any possible error cases, and of course, the actual numeric value of the token.</p><p>To store all this extra information, the scanner uses a <code>TokenDesc</code> structure:</p><pre><code>struct TokenDesc {
  Location location = {0, 0};
  LiteralBuffer literal_chars;
  LiteralBuffer raw_literal_chars;
  Token::Value token = Token::UNINITIALIZED;
  MessageTemplate invalid_template_escape_message = 
      MessageTemplate::kNone;
  Location invalid_template_escape_location;
  uint32_t smi_value_ = 0;
  bool after_line_terminator = false;
};</code></pre><p>The fields are:</p><ul><li><p><code>location</code>&#8202;&#8212;&#8202;The numeric start and end positions within the source string. For example, our first <code>Token::SMI</code> is at position 0, and our <code>Token::ADD</code> is at position 2.</p></li><li><p><code>literal_chars</code>&#8202;&#8212;&#8202;These are the actual characters that make up the token, whether it be a number, a string, an identifier, or something else. This is important, since knowing that <code>total_cost</code> is a <code>Token::IDENTIFIER</code> is only part of the story. We also need the identifier&#8217;s name (<code>total_cost</code>) to distinguish it from other <code>Token::IDENTIFIER</code> values.</p></li><li><p><code>raw_literal_chars</code>&#8202;&#8212;&#8202;This is similar to <code>literal_chars</code> but is used for <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals">template literals</a>. In this case, we don&#8217;t want escape sequences (e.g. <code>\064</code>) to be replaced by the corresponding character (e.g. <code>4</code>), but instead require that the <em>raw</em> literal characters be passed to the template function.</p></li><li><p><code>token</code>&#8202;&#8212;&#8202;The token&#8217;s enumerated value, as before.</p></li><li><p><code>invalid_template_escape_message</code> / <code>invalid_template_escape_location</code>&#8202;&#8212;&#8202;If an invalid escape sequence is discovered while scanning a template literal, these fields store the error message and its location.</p></li><li><p><code>smi_value_</code>&#8202;&#8212;&#8202;In the case of <code>Token::SMI</code>, this field stores the number&#8217;s actual integer value. This is returned by <code>scanner-&gt;smi_value()</code>.</p></li><li><p><code>after_line_terminator</code>&#8202;&#8212;&#8202;Indicates whether the token appears as the first token of a new line. 
This is useful for automatically inserting semicolons.</p></li></ul><p>Now that we understand all the building blocks, let&#8217;s continue along the scanner pipeline, to see how characters are actually converted into token values.</p><h4>Example: Scanning for Operator&nbsp;Tokens</h4><p>At this point, we&#8217;re ready to trace through the operation of scanning <code>1 + 1</code>. Since scanning the operator is easier than scanning the numbers, let&#8217;s start by seeing what <code>scanner-&gt;Next()</code> does when the upcoming input character is the <code>+</code> sign.</p><p>The bulk of the scanning mechanism is in the private <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner-inl.h#L343"><code>ScanSingleToken()</code></a> method. The scanning starts with a very simple lookup into the <code>one_char_tokens[128]</code> array, which contains a direct mapping from the first 128 Unicode characters (aka ASCII characters) to a corresponding &#8220;guess&#8221; of what the token might be. Here&#8217;s the <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner-inl.h#L127"><code>GetOneCharToken()</code></a> function, used to populate the <code>one_char_tokens[128]</code> array:</p><pre><code>constexpr Token::Value GetOneCharToken(char c) {
   return
      c == '(' ? Token::LPAREN :
      c == ')' ? Token::RPAREN :
      c == '{' ? Token::LBRACE :
      c == '}' ? Token::RBRACE :
      c == '[' ? Token::LBRACK :
      c == ']' ? Token::RBRACK :
      c == '?' ? Token::CONDITIONAL :
      c == ':' ? Token::COLON :
      ...
      c == '+' ? Token::ADD :
      ...
}</code></pre><p>In our example, the character <code>+</code> is mapped to <code>Token::ADD</code>. However, this is only a guess. What if the actual input was <code>++</code> or <code>+=</code>, which are both legal tokens in JavaScript? To handle this, the code looks ahead to see if the following character is also a <code>+</code> (in which case <code>Token::INC</code> is returned), or perhaps an <code>=</code> (returning <code>Token::ASSIGN_ADD</code>). If neither case is true, the original <code>Token::ADD</code> is returned.</p><p>Here&#8217;s <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner-inl.h#L418">the code in question</a>:</p><pre><code>case Token::ADD:
  // + ++ +=
  Advance();
  if (c0_ == '+') return Select(Token::INC);
  if (c0_ == '=') return Select(Token::ASSIGN_ADD);
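  // Neither lookahead matched, so the character now held in c0_ is left
  // unconsumed and becomes the start of the next token.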
  return Token::ADD;</code></pre><p>Note that the <code>Advance()</code> method reads the next character into the scanner&#8217;s <code>c0_</code> member variable, and <code>Select</code> is shorthand for consuming that character and returning the specified token.</p><p>Let&#8217;s now see the more complicated case of scanning numbers.</p><h4>Example: Scanning for Number&nbsp;Tokens</h4><p>When scanning the string <code>1</code>, the process starts the same way as before. We look up the character in the <code>one_char_tokens[128]</code> array, which provides <code>Token::NUMBER</code> as the initial guess. The <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner-inl.h#L519">code then immediately calls</a> the <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.cc#L744"><code>ScanNumber</code></a> method to look more deeply at the characters appearing in the input stream.</p><pre><code>case Token::NUMBER:
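    // The false argument indicates that no leading '.' was seen before
    // this digit, so the token is not a fraction-only number such as .123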
    return ScanNumber(false);</code></pre><p>Here&#8217;s a breakdown of how <code>ScanNumber</code> works:</p><ul><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.cc#L752">Line 752</a>&#8202;&#8212;&#8202;A decision is made about whether the number starts with a decimal point (the&nbsp;<code>.</code> character). If this is true, the number contains only a fractional portion (such as&nbsp;<code>.123</code>), so further scanning is delegated to the <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.cc#L647"><code>ScanDecimalDigits()</code></a> method. In our <code>1 + 1</code> example, we don&#8217;t take this code path.</p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.cc#L762">Line 762</a>&#8212;Next, we check whether the first character in the number is a <code>0</code>. If so, we check the following character for <code>x</code>, <code>o</code>, or <code>b</code>, delegating to <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.cc#L733"><code>ScanHexDigits()</code></a>, <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.cc#L709"><code>ScanOctalDigits()</code></a>, or <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.cc#L705"><code>ScanBinaryDigits()</code></a> respectively. However, if the next character was actually an octal digit (from <code>0</code> to <code>7</code>), then <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.cc#L713"><code>ScanImplicitOctalDigits()</code></a> is called to handle cases such as <code>077</code> (as opposed to the more explicit <code>0o77</code>). 
Finally, a number like <code>088</code> (out of range for an octal) is treated as a regular decimal <code>88</code>.</p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.cc#L797">Line 797</a>&#8202;&#8212;&#8202;Since our input string (<code>1 + 1</code>) does not start with a <code>0</code> digit, we consider the number to be a decimal (base 10), possibly containing underscores (e.g. <code>1_000</code>).</p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.cc#L803">Line 803</a>&#8202;&#8212;&#8202;Given that most number literals are small, we take our chances and scan the number as a Smi (small integer), delegating the work to <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.cc#L691"><code>ScanDecimalAsSmi()</code></a>, returning the value as a C++ <code>uint64_t</code>.</p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.cc#L807">Line 807</a>&#8202;&#8212;&#8202;A Smi must be small enough to fit into 31 bits. If it does, we set the <code>smi_value_</code> field of the token to the integer&#8217;s value, then return <code>Token::SMI</code>. This is the path taken in our <code>1 + 1</code> example.</p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.cc#L819">Line 819</a>&#8202;&#8212;&#8202;If the number wasn&#8217;t a Smi, we continue to scan the decimal number, calling <a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.cc#L647"><code>ScanDecimalDigits()</code></a> to do so. Additionally, we check for a trailing decimal point (the&nbsp;<code>.</code> character), followed by another decimal number (the fractional part). 
Note that this code simply validates the number is well-formed, rather than extracting the actual value itself (as we did in the Smi case).</p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.cc#L833">Line 833</a>&#8202;&#8212;&#8202;This is where we handle the BigInt scenario, where the number has a trailing <code>n</code>. For example, <code>12345678n</code>.</p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/parsing/scanner.cc#L848">Line 848</a>&#8202;&#8212;&#8202;Finally, we handle the exponent case, where the number has a trailing <code>e</code> followed by the exponent. For example: <code>123e5</code>. This is not relevant in our <code>1 + 1</code> case, since a Smi can&#8217;t have an exponent.</p></li></ul><p>And that ends the scanning process for operators and numbers in JavaScript. Given multi-character operators, different bases (hex, octal, binary), underscores used as digit separators, fractional portions, BigInts, and exponents, the whole process is quite complicated.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ddi4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac6e5455-57f2-4a94-adbf-6b9be2d5b819_60x60.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ddi4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac6e5455-57f2-4a94-adbf-6b9be2d5b819_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!ddi4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac6e5455-57f2-4a94-adbf-6b9be2d5b819_60x60.png 848w, 
https://substackcdn.com/image/fetch/$s_!ddi4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac6e5455-57f2-4a94-adbf-6b9be2d5b819_60x60.png 1272w, https://substackcdn.com/image/fetch/$s_!ddi4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac6e5455-57f2-4a94-adbf-6b9be2d5b819_60x60.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ddi4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac6e5455-57f2-4a94-adbf-6b9be2d5b819_60x60.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ac6e5455-57f2-4a94-adbf-6b9be2d5b819_60x60.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ddi4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac6e5455-57f2-4a94-adbf-6b9be2d5b819_60x60.png 424w, https://substackcdn.com/image/fetch/$s_!ddi4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac6e5455-57f2-4a94-adbf-6b9be2d5b819_60x60.png 848w, https://substackcdn.com/image/fetch/$s_!ddi4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac6e5455-57f2-4a94-adbf-6b9be2d5b819_60x60.png 1272w, 
https://substackcdn.com/image/fetch/$s_!ddi4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac6e5455-57f2-4a94-adbf-6b9be2d5b819_60x60.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Next Time&#8230;</h3><p>In <a href="https://medium.com/compilers/calculating-1-1-in-javascript-part-4-42ca49f45ac5">the next blog post</a>, we&#8217;ll continue our story of calculating <code>1 + 1</code>. Now that we have a sequence of tokens (<code>Token::SMI</code>, <code>Token::ADD</code>, and <code>Token::SMI</code>) we can see how the <code>Parser</code> class validates the input against the JavaScript language definition. Finally, we&#8217;ll see how an AST (Abstract Syntax Tree) is created as an in-memory representation of our program.</p>]]></content:encoded></item><item><title><![CDATA[Calculating 1 + 1 in JavaScript — Part 2]]></title><description><![CDATA[This blog post describes how the V8 JavaScript Engine calculates 1 + 1, considering the V8 API, the heap, and garbage collection.]]></description><link>https://www.petersmith.net/p/calculating-1-1-in-javascript-part-2-e01f336503d0</link><guid isPermaLink="false">https://www.petersmith.net/p/calculating-1-1-in-javascript-part-2-e01f336503d0</guid><dc:creator><![CDATA[Peter Smith]]></dc:creator><pubDate>Mon, 01 Mar 2021 16:35:30 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/731a8b48-e394-4d2f-be9e-384b5071c89a_512x330.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>I&#8217;m a compiler enthusiast who has been learning how the <a href="https://v8.dev/">V8 JavaScript Engine</a> works. Of course, the best way to learn something is to write about it, so that&#8217;s why I&#8217;m sharing my experiences here. 
I hope this might be interesting to others too.</em></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hZ8X!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60a5ed1b-4ab5-4036-b12b-2b34eaa338f7_512x330.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hZ8X!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60a5ed1b-4ab5-4036-b12b-2b34eaa338f7_512x330.png 424w, https://substackcdn.com/image/fetch/$s_!hZ8X!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60a5ed1b-4ab5-4036-b12b-2b34eaa338f7_512x330.png 848w, https://substackcdn.com/image/fetch/$s_!hZ8X!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60a5ed1b-4ab5-4036-b12b-2b34eaa338f7_512x330.png 1272w, https://substackcdn.com/image/fetch/$s_!hZ8X!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60a5ed1b-4ab5-4036-b12b-2b34eaa338f7_512x330.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hZ8X!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60a5ed1b-4ab5-4036-b12b-2b34eaa338f7_512x330.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/60a5ed1b-4ab5-4036-b12b-2b34eaa338f7_512x330.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hZ8X!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60a5ed1b-4ab5-4036-b12b-2b34eaa338f7_512x330.png 424w, https://substackcdn.com/image/fetch/$s_!hZ8X!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60a5ed1b-4ab5-4036-b12b-2b34eaa338f7_512x330.png 848w, https://substackcdn.com/image/fetch/$s_!hZ8X!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60a5ed1b-4ab5-4036-b12b-2b34eaa338f7_512x330.png 1272w, https://substackcdn.com/image/fetch/$s_!hZ8X!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60a5ed1b-4ab5-4036-b12b-2b34eaa338f7_512x330.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p>This is the second part of a multi-part series describing how the <a href="https://v8.dev/">V8 JavaScript Engine</a> calculates <code>1 + 1</code>. 
It&#8217;s a very simple expression with an obvious answer, but it still requires the full mechanism of scanning and parsing the input string, generating and executing byte codes, then displaying the result, all while maintaining data on the JavaScript heap.</p><p>If you haven&#8217;t read <a href="https://medium.com/compilers/calculating-1-1-in-javascript-1cecb6e9610">Part 1</a> yet, it&#8217;s important to start there. It also helps to come with a passion for compiler technology.</p><h3>Last Time&#8230;</h3><p>Last time we saw our example client program, including how it calls C++ methods in the standard V8 libraries. Our program stores the literal string <code>1 + 1</code> in the JavaScript heap (as a <code>SeqOneByteString</code> object), then compiles the expression to byte codes, executes those byte codes, then displays the result on the console.</p><pre><code>// Create a string containing the JavaScript source code.
Local&lt;String&gt; source = String::NewFromUtf8Literal(isolate, "1 + 1");</code></pre><pre><code>// Compile the source code.
Local&lt;Script&gt; script = 
    Script::Compile(context, source).ToLocalChecked();</code></pre><pre><code>// Run the script to get the result.
Local&lt;Value&gt; result = script-&gt;Run(context).ToLocalChecked();</code></pre><pre><code>// Convert the result to Number and print it.
Local&lt;Number&gt; number = Local&lt;Number&gt;::Cast(result);
printf("%f\n", number-&gt;Value());</code></pre><p>In the <a href="https://medium.com/compilers/calculating-1-1-in-javascript-1cecb6e9610">first blog post</a> we traced the full <code>String::NewFromUtf8Literal()</code> method, so this time we&#8217;ll continue with the <code>Script::Compile()</code> method:</p><pre><code>Local&lt;Script&gt; script = 
    Script::Compile(context, source).ToLocalChecked();</code></pre><p><code>Script::Compile()</code> is responsible for a large number of activities:</p><ol><li><p>Checking the <em>Compilation Cache</em> to see if the same script has already been compiled. This saves us from repeatedly generating byte codes for commonly used scripts.</p></li><li><p>Scanning the input string into <em>Tokens</em>. As we&#8217;ll see, <code>1 + 1</code> is converted to a sequence of token values: <code>Token::SMI</code> (small integer), <code>Token::ADD</code>, and then a second <code>Token::SMI</code>.</p></li><li><p>Parsing the tokens into an Abstract Syntax Tree (AST), providing an in-memory view of the program.</p></li><li><p>Generating the corresponding V8 byte codes, while performing some amount of optimization.</p></li></ol><p>The return value from <code>Script::Compile()</code> is a <code>Local&lt;Script&gt;</code> handle, referring to byte codes to be executed by the V8 virtual machine. For now though, we&#8217;ll focus exclusively on the first step above: checking whether the compiled code is already available in a cache.</p><p>It shouldn&#8217;t come as a surprise that almost all JavaScript code is downloaded multiple times, either in the same browser session, or in different sessions over a period of time. To avoid recompiling source code that hasn&#8217;t changed, V8 provides two purpose-built cache mechanisms. The first is the per-Isolate cache, storing compiled byte codes directly in V8&#8217;s local memory. The second approach allows embedder applications (such as Chromium or NodeJS) to save their own copy of the compiled byte codes, most likely in a disk-based format. Let&#8217;s look at each approach.</p><h3>Approach 1&#8202;&#8212;&#8202;The Per-Isolate Cache</h3><p>The per-Isolate cache is built into V8, and is enabled by default. 
In V8 terminology, an <em>Isolate</em> is an instance of a JavaScript virtual machine, complete with its own heap memory. When V8 is embedded into applications, such as a browser, it&#8217;s common to use different V8 Isolates as a means of separating (aka &#8220;isolating&#8221;) one JavaScript run-time environment from another. Perhaps the best example is browser tabs, where code running inside one tab must not impact the code in other tabs.</p><p>When a script is submitted to an Isolate for compilation, the source code string (such as <code>1 + 1</code>) is used as a key for an in-memory hash table. If that exact source code had been compiled before, a <code>SharedFunctionInfo</code> object, containing the script&#8217;s byte codes, is read from the cache and returned to the caller. However, if there&#8217;s a cache miss, the script must be compiled from scratch, with the generated byte codes inserted into the cache for next time.</p><p>The per-Isolate cache (in the <code>CompilationCache</code> class, see <a href="https://github.com/v8/v8/blob/8.8.276/src/codegen/compilation-cache.h#L185"><code>src/codegen/compilation-cache.h</code></a>) is not just a simple hash table, but has a number of features catering for different types of script. For example, the <code>LookupScript()</code> and <code>PutScript()</code> methods cache &#8220;normal&#8221; JavaScript source code, delegating their work to the <a href="https://github.com/v8/v8/blob/8.8.276/src/codegen/compilation-cache.h#L80"><code>CompilationCacheScript</code></a> class. In contrast, the <code>LookupEval()</code> and <code>PutEval()</code> methods manage the cache of JavaScript strings passed into the <code>eval()</code> function, delegating their work to the <a href="https://github.com/v8/v8/blob/8.8.276/src/codegen/compilation-cache.h#L117"><code>CompilationCacheEval</code></a> class. 
Likewise, there are sub-caches for regular expressions (regexes) and other code objects.</p><p>In addition, each sub-cache in the per-Isolate cache has multiple <em>generations</em>, allowing older cached items to be <em>aged out</em> over time if they haven&#8217;t been used recently. There has clearly been a lot of thought and optimization put into the design of this in-memory cache system.</p><p>Here&#8217;s an example of how the compilation cache is laid out in memory, showing the hierarchical hash tables:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BygM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bd631e1-bb2a-4593-bd09-a5ecdcfd2f45_800x328.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BygM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bd631e1-bb2a-4593-bd09-a5ecdcfd2f45_800x328.png 424w, https://substackcdn.com/image/fetch/$s_!BygM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bd631e1-bb2a-4593-bd09-a5ecdcfd2f45_800x328.png 848w, https://substackcdn.com/image/fetch/$s_!BygM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bd631e1-bb2a-4593-bd09-a5ecdcfd2f45_800x328.png 1272w, https://substackcdn.com/image/fetch/$s_!BygM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bd631e1-bb2a-4593-bd09-a5ecdcfd2f45_800x328.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!BygM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bd631e1-bb2a-4593-bd09-a5ecdcfd2f45_800x328.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2bd631e1-bb2a-4593-bd09-a5ecdcfd2f45_800x328.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BygM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bd631e1-bb2a-4593-bd09-a5ecdcfd2f45_800x328.png 424w, https://substackcdn.com/image/fetch/$s_!BygM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bd631e1-bb2a-4593-bd09-a5ecdcfd2f45_800x328.png 848w, https://substackcdn.com/image/fetch/$s_!BygM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bd631e1-bb2a-4593-bd09-a5ecdcfd2f45_800x328.png 1272w, https://substackcdn.com/image/fetch/$s_!BygM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bd631e1-bb2a-4593-bd09-a5ecdcfd2f45_800x328.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>To see the per-Isolate cache in action, enter <code>1 + 1</code> into the <code>d8</code> interpreter multiple times:</p><pre><code>$ ./out/x64.debug/d8 
--print-bytecode
V8 version 8.8.0 (candidate)
d8&gt; 1 + 1
... lots of output given, including byte codes ...
2
d8&gt; 1 + 1
2
d8&gt; 1 + 1
2</code></pre><p>As expected, a large amount of compilation output is generated the first time (thanks to the <code>--print-bytecode</code> flag), but no byte codes are generated the second (or third) time. In contrast, if you specify the <code>--no-compilation-cache</code> command-line flag, you&#8217;ll instead see the code being recompiled every time.</p><h3>Approach 2&#8202;&#8212;&#8202;Caching Byte Codes in the&nbsp;Embedder</h3><p>There are several limitations of the per-Isolate cache mentioned above. In particular, an in-memory cache will not survive an application restart (such as shutting down your browser). Additionally, the cache is not shared between different instances of V8, meaning that a web page loaded in one browser tab does not share the cache with other browser tabs.</p><p>To solve these issues, a second type of cache is available. As discussed in <a href="https://v8.dev/blog/code-caching">Code caching</a>, the application can request that V8 provide a serialized version of the compiled code, which is saved in the application&#8217;s own cache (such as the Chromium browser cache). 
This serialized data is passed back to the application using the <code>GetCacheData()</code> method of the <code>Source</code> object (see <a href="https://github.com/v8/v8/blob/8.8.276/include/v8.h#L1756"><code>include/v8.h</code></a>) then saved in the application&#8217;s own cache.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iAZR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda3873b5-a9d4-4131-893a-06b7f2a6937f_800x299.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iAZR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda3873b5-a9d4-4131-893a-06b7f2a6937f_800x299.png 424w, https://substackcdn.com/image/fetch/$s_!iAZR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda3873b5-a9d4-4131-893a-06b7f2a6937f_800x299.png 848w, https://substackcdn.com/image/fetch/$s_!iAZR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda3873b5-a9d4-4131-893a-06b7f2a6937f_800x299.png 1272w, https://substackcdn.com/image/fetch/$s_!iAZR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda3873b5-a9d4-4131-893a-06b7f2a6937f_800x299.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iAZR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda3873b5-a9d4-4131-893a-06b7f2a6937f_800x299.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/da3873b5-a9d4-4131-893a-06b7f2a6937f_800x299.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iAZR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda3873b5-a9d4-4131-893a-06b7f2a6937f_800x299.png 424w, https://substackcdn.com/image/fetch/$s_!iAZR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda3873b5-a9d4-4131-893a-06b7f2a6937f_800x299.png 848w, https://substackcdn.com/image/fetch/$s_!iAZR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda3873b5-a9d4-4131-893a-06b7f2a6937f_800x299.png 1272w, https://substackcdn.com/image/fetch/$s_!iAZR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda3873b5-a9d4-4131-893a-06b7f2a6937f_800x299.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>When the application attempts to compile the same script again, such as when a web page downloads the same&nbsp;<code>.js</code> file multiple times, the browser passes the <a href="https://github.com/v8/v8/blob/8.8.276/include/v8.h#L1710"><code>CachedData</code></a> back to V8 to avoid regenerating the byte codes.</p><div class="captioned-image-container"><figure><a 
class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!X2lD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb30f479-1a8b-407c-9fa2-ee1f40543131_800x298.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!X2lD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb30f479-1a8b-407c-9fa2-ee1f40543131_800x298.png 424w, https://substackcdn.com/image/fetch/$s_!X2lD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb30f479-1a8b-407c-9fa2-ee1f40543131_800x298.png 848w, https://substackcdn.com/image/fetch/$s_!X2lD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb30f479-1a8b-407c-9fa2-ee1f40543131_800x298.png 1272w, https://substackcdn.com/image/fetch/$s_!X2lD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb30f479-1a8b-407c-9fa2-ee1f40543131_800x298.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!X2lD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb30f479-1a8b-407c-9fa2-ee1f40543131_800x298.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fb30f479-1a8b-407c-9fa2-ee1f40543131_800x298.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!X2lD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb30f479-1a8b-407c-9fa2-ee1f40543131_800x298.png 424w, https://substackcdn.com/image/fetch/$s_!X2lD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb30f479-1a8b-407c-9fa2-ee1f40543131_800x298.png 848w, https://substackcdn.com/image/fetch/$s_!X2lD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb30f479-1a8b-407c-9fa2-ee1f40543131_800x298.png 1272w, https://substackcdn.com/image/fetch/$s_!X2lD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb30f479-1a8b-407c-9fa2-ee1f40543131_800x298.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>The clear advantage is that code can be cached for long periods of time, even if the application restarts. 
However, the downside is that byte codes must be serialized (see the <a href="https://github.com/v8/v8/blob/8.8.276/src/snapshot/code-serializer.h#L47"><code>CodeSerializer</code></a> class) from V8&#8217;s in-memory format to a sequence of bytes more suited for on-disk storage. At a later point in time, this serialized data must be deserialized again before it can be executed. All of this requires extra time, slightly negating the value of caching byte codes in the first place.</p><p>Because of this extra overhead, V8 only serializes the data the second time it&#8217;s compiled, ensuring it&#8217;s not just a one-time script that will never be seen again. Also, V8 defers that serialization work until <em>after</em> the code has been executed, ensuring the serialization does not diminish the user&#8217;s experience.</p><h3>Tracing the Code&#8202;&#8212;&#8202;Making the Cache&nbsp;Decision</h3><p>To see how these caching techniques fit into our big picture of computing the <code>1 + 1</code> expression, let&#8217;s walk through the full code path. As mentioned earlier, we start by calling upon V8&#8217;s <code>Script::Compile()</code> method with <code>1 + 1</code> as an input parameter. Although this method initiates the entire compilation process, we&#8217;ll only look at how the caching mechanisms are involved.</p><pre><code>Local&lt;Script&gt; script = 
    Script::Compile(context, source).ToLocalChecked();</code></pre><p>As we saw in the first blog post, this calls into V8&#8217;s API layer (see <a href="https://github.com/v8/v8/blob/8.8.276/src/api/api.cc#L2723"><code>src/api/api.cc</code></a>) to validate the input arguments, add a few more important values (such as the pointer to the <code>Isolate</code> object), and translate between external <code>Local</code> handles and their corresponding V8 internal objects.</p><p>Before too long, we reach the <code>Compiler::GetSharedFunctionInfoForScript()</code> method, which is where the caching decisions are made (see <a href="https://github.com/v8/v8/blob/8.8.276/src/codegen/compiler.cc#L2639"><code>src/codegen/compiler.cc</code></a>). Here are the basic steps:</p><ol><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/codegen/compiler.cc#L2647">Line 2647</a>&#8202;&#8212;&#8202;One of the parameters for <code>GetSharedFunctionInfoForScript()</code> is <code>compile_options</code>, specifying how the embedder cache should be used. If the caller passes <code>kConsumeCodeCache</code> as the value for <code>compile_options</code>, V8 is asked to consider using the serialized byte codes that were saved in the embedder&#8217;s cache (available in the <code>cached_data</code> parameter). In our case though, this defaulted to <code>kNoCompileOptions</code>, indicating that no serialized data is available.</p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/codegen/compiler.cc#L2655">Line 2655&#8202;</a>&#8212;&#8202;For tracking purposes, we record the number of bytes loaded and compiled for this Isolate. 
There are 5 bytes in <code>1 + 1</code>.</p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/codegen/compiler.cc#L2659">Line 2659</a>&#8202;&#8212;&#8202;We must take JavaScript&#8217;s language mode into account, since it impacts code generation and therefore the byte codes that are cached. The options are <code>kSloppy</code> and <code>kStrict</code>, representing the traditional JavaScript syntax and the newer strict mode, respectively.</p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/codegen/compiler.cc#L2676">Line 2676&#8202;</a>&#8212;&#8202;Regardless of whether the embedder provided a <code>cached_data</code> parameter, we check whether the source code is already cached in the per-Isolate cache. This dives into the <a href="https://github.com/v8/v8/blob/8.8.276/src/codegen/compilation-cache.cc#L333"><code>CompilationCache::LookupScript()</code></a> method, which delegates to the <a href="https://github.com/v8/v8/blob/8.8.276/src/codegen/compilation-cache.cc#L147"><code>CompilationCacheScript::Lookup()</code></a> method in the &#8220;script&#8221; sub-cache. Eventually, that code performs a lookup in the <a href="https://github.com/v8/v8/blob/8.8.276/src/codegen/compilation-cache.cc#L160">multi-generational hash table</a>. Checking this cache (even if we were passed <code>cached_data</code> by the embedder) is super fast, given that the byte codes are already in V8&#8217;s memory.</p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/codegen/compiler.cc#L2682">Line 2686</a>&#8202;&#8212;&#8202;Given that our <code>1 + 1</code> script had not previously been compiled and cached in V8&#8217;s memory, we now consider using <code>cached_data</code> from the embedder. In our example though, we weren&#8217;t given any <code>cached_data</code> by our embedder (our simple example program), so neither of the caches provides a hit. 
However, if <code>cached_data</code> had been provided, we&#8217;d need to <a href="https://github.com/v8/v8/blob/8.8.276/src/codegen/compiler.cc#L2691">deserialize it into the in-memory format</a>, and then <a href="https://github.com/v8/v8/blob/8.8.276/src/codegen/compiler.cc#L2698">insert it into V8&#8217;s per-Isolate cache</a>.</p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/codegen/compiler.cc#L2727">Line 2727&#8202;</a>&#8212;&#8202;Given that neither of our caches contained the pre-compiled byte codes for <code>1 + 1</code>, we now proceed to compile the source code. This is done by the <code>CompileScriptOnMainThread()</code> method. As we&#8217;ll see in the next blog post, this is where all the complexity of scanning, parsing, and code generation takes place.</p></li><li><p><a href="https://github.com/v8/v8/blob/8.8.276/src/codegen/compiler.cc#L2736">Line 2736&#8202;</a>&#8212;&#8202;If the compilation was successful, the <code>SharedFunctionInfo</code> object (containing the generated byte codes) is inserted into the per-Isolate cache, ready for the next time that <code>1 + 1</code> is evaluated.</p></li></ol><p>So, that&#8217;s an overview of the V8 code caching mechanism. If you&#8217;re interested, there are several really great articles and presentations from the V8 team on how caching works, including the very comprehensive <a href="https://v8.dev/blog/code-caching-for-devs">Code caching for JavaScript developers</a> and <a href="https://www.youtube.com/watch?v=YqHOUy2rYZ8">BlinkOn 9: Caching (more) JavaScript code in Chrome</a>.</p><h3>Next Time&#8230;</h3><p>In <a href="https://medium.com/compilers/calculating-1-1-in-javascript-part-3-710f686d9a40">Part 3 of this blog post series</a>, we&#8217;ll continue by tracing further along into the <code>Script::Compile()</code> method. 
That is, we&#8217;ll learn more about how V8&#8217;s lexical scanner reads a sequence of input characters (in our case, <code>1 + 1</code>) and forms them into tokens to use as input into the parsing process.</p>]]></content:encoded></item><item><title><![CDATA[Calculating 1 + 1 in JavaScript]]></title><description><![CDATA[Ever wonder how JavaScript calculates 1 + 1? It&#8217;s not as simple as you think.]]></description><link>https://www.petersmith.net/p/calculating-1-1-in-javascript-1cecb6e9610</link><guid isPermaLink="false">https://www.petersmith.net/p/calculating-1-1-in-javascript-1cecb6e9610</guid><dc:creator><![CDATA[Peter Smith]]></dc:creator><pubDate>Fri, 13 Nov 2020 17:34:42 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/da41c140-ff76-4659-99aa-ca9cea78dc92_526x268.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>I&#8217;m a compiler enthusiast who has been learning how the <a href="https://v8.dev/">V8 JavaScript Engine</a> works. Of course, the best way to learn something is to write about it, so that&#8217;s why I&#8217;m sharing my experiences here. 
I hope this might be interesting to others too (<a href="https://juejin.cn/post/6948243728842620959">also available in Chinese</a>, thanks to </em><a href="https://medium.com/@qqqqqcy">@qqqqqqcy</a> for the translation).</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aznC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6b3c6b9-8c42-47e0-ade9-472abdd57535_526x268.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aznC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6b3c6b9-8c42-47e0-ade9-472abdd57535_526x268.png 424w, https://substackcdn.com/image/fetch/$s_!aznC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6b3c6b9-8c42-47e0-ade9-472abdd57535_526x268.png 848w, https://substackcdn.com/image/fetch/$s_!aznC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6b3c6b9-8c42-47e0-ade9-472abdd57535_526x268.png 1272w, https://substackcdn.com/image/fetch/$s_!aznC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6b3c6b9-8c42-47e0-ade9-472abdd57535_526x268.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aznC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6b3c6b9-8c42-47e0-ade9-472abdd57535_526x268.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c6b3c6b9-8c42-47e0-ade9-472abdd57535_526x268.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aznC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6b3c6b9-8c42-47e0-ade9-472abdd57535_526x268.png 424w, https://substackcdn.com/image/fetch/$s_!aznC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6b3c6b9-8c42-47e0-ade9-472abdd57535_526x268.png 848w, https://substackcdn.com/image/fetch/$s_!aznC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6b3c6b9-8c42-47e0-ade9-472abdd57535_526x268.png 1272w, https://substackcdn.com/image/fetch/$s_!aznC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6b3c6b9-8c42-47e0-ade9-472abdd57535_526x268.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p>Yes, obviously the answer is that <code>1 + 1 = 2</code>, but how does the V8 JavaScript Engine compute this answer?</p><p>Stepping back for a moment&#8230; one of my favourite interview questions is:</p><p>&#8220;<em>When I type a URL into a web browser, and then press enter, what are all the things that happen to load the web page? 
Go into as much detail as you like</em>&#8221;.</p><p>It&#8217;s a great question, because it shows the depth and breadth of a person&#8217;s understanding, allowing them to illustrate which parts of the process are the most interesting for them.</p><p>This is the first in a series of blog posts that will look at everything V8 does when <code>1 + 1</code> is entered. To start with, we&#8217;ll focus on how V8 stores the <code>1 + 1</code> string in its heap memory. It sounds simple, but it&#8217;s an entire blog post on its own!</p><h3>The Client Application</h3><p>To compute <code>1 + 1</code>, the most likely approach you&#8217;d take is to fire up NodeJS, or the Chrome developer console, and simply enter <code>1 + 1</code>. That will work, but in order to show the V8 internals, I&#8217;ve decided to modify <a href="https://github.com/v8/v8/blob/8.8.276/samples/hello-world.cc"><code>hello-world.cc</code></a>, one of the standard sample applications in the V8 source code.</p><p>I took the <a href="https://github.com/v8/v8/blob/8.8.276/samples/hello-world.cc#L39">original code, which printed &#8220;Hello World&#8221;</a>, and replaced it with the expression <code>1 + 1</code>:</p><pre><code>// Create a string containing the JavaScript source code.
Local&lt;String&gt; source = String::NewFromUtf8Literal(isolate, "1 + 1");</code></pre><pre><code>// Compile the source code.
Local&lt;Script&gt; script = 
    Script::Compile(context, source).ToLocalChecked();</code></pre><pre><code>// Run the script to get the result.
Local&lt;Value&gt; result = script-&gt;Run(context).ToLocalChecked();</code></pre><pre><code>// Convert the result to Number and print it.
Local&lt;Number&gt; number = Local&lt;Number&gt;::Cast(result);
printf("%f\n", number-&gt;Value());</code></pre><p>Take a quick look at this code to get an idea of what it does. The lines of C++ might be fairly cryptic, but the comments will help. For this blog post we&#8217;ll focus exclusively on the first statement, allocating a new <code>1 + 1</code> string in the V8 heap:</p><pre><code>Local&lt;String&gt; source = String::NewFromUtf8Literal(isolate, "1 + 1");</code></pre><p>To understand this code, let&#8217;s start with the high-level sequence of V8 modules involved. In this diagram, the flow of execution is from left-to-right, with the return value being passed back from right-to-left to be inserted into the <code>source</code> variable.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RBv_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F476cf9e5-0e0d-457d-b3d3-1bdfa58341fd_800x177.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RBv_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F476cf9e5-0e0d-457d-b3d3-1bdfa58341fd_800x177.png 424w, https://substackcdn.com/image/fetch/$s_!RBv_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F476cf9e5-0e0d-457d-b3d3-1bdfa58341fd_800x177.png 848w, https://substackcdn.com/image/fetch/$s_!RBv_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F476cf9e5-0e0d-457d-b3d3-1bdfa58341fd_800x177.png 1272w, 
https://substackcdn.com/image/fetch/$s_!RBv_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F476cf9e5-0e0d-457d-b3d3-1bdfa58341fd_800x177.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RBv_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F476cf9e5-0e0d-457d-b3d3-1bdfa58341fd_800x177.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/476cf9e5-0e0d-457d-b3d3-1bdfa58341fd_800x177.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!RBv_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F476cf9e5-0e0d-457d-b3d3-1bdfa58341fd_800x177.png 424w, https://substackcdn.com/image/fetch/$s_!RBv_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F476cf9e5-0e0d-457d-b3d3-1bdfa58341fd_800x177.png 848w, https://substackcdn.com/image/fetch/$s_!RBv_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F476cf9e5-0e0d-457d-b3d3-1bdfa58341fd_800x177.png 1272w, 
https://substackcdn.com/image/fetch/$s_!RBv_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F476cf9e5-0e0d-457d-b3d3-1bdfa58341fd_800x177.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><ul><li><p><strong>Application</strong>&#8202;&#8212;&#8202;This represents the client of V8. In our example, it&#8217;s the <code>hello-world.cc</code> program, although more realistically it could be the entire Chrome Browser, the NodeJS run-time system, or any other piece of software that embeds the V8 JavaScript Engine.</p></li><li><p><strong>V8 External API</strong>&#8202;&#8212;&#8202;This is the <a href="https://v8docs.nodesource.com/">client-facing API</a> providing access to the functionality of V8. Although it&#8217;s implemented in C++, the API is shaped around various JavaScript concepts, such as numbers, strings, arrays, functions, and objects, allowing them to be created and manipulated in various ways.</p></li><li><p><strong>Heap Factory</strong>&#8202;&#8212;&#8202;Internal to the V8 Engine (not exposed via the API) is a &#8220;factory&#8221; for creating various data objects on the heap. Quite surprisingly, the set of factory methods available is very different to what the external API provides, so a lot of translation is done within the API layer.</p></li><li><p><strong>New Space</strong>&#8202;&#8212;&#8202;V8&#8217;s heap is very complicated, but newly-allocated objects are usually stored in the <em>New Space</em>, often known as the <em>Young Generation</em>. 
We won&#8217;t cover the details here, but the new space is managed using <a href="https://en.wikipedia.org/wiki/Cheney%27s_algorithm">Cheney&#8217;s Algorithm</a>, a well-known algorithm for performing garbage collection.</p></li></ul><p>Let&#8217;s now go into more detail on this flow, focusing on:</p><ul><li><p>How the API layer decides what type of string to create, and where it should be stored within the heap.</p></li><li><p>What the string&#8217;s internal memory layout will be. This depends on the range of characters appearing in the string.</p></li><li><p>How the storage is allocated from the heap. In our example, 20 bytes are required.</p></li><li><p>Finally, how the pointer to the string is returned to the application in a way that allows the string to be garbage collected later.</p></li></ul><h3>Determining How and Where to Store the&nbsp;String</h3><p>As mentioned above, there&#8217;s a fair amount of translation that must happen between the client application and the heap factory, where the object is actually created. The bulk of this work is performed in <a href="https://github.com/v8/v8/blob/8.8.276/src/api/api.cc"><code>src/api/api.cc</code></a>.</p><p>Let&#8217;s start with the client application&#8217;s call:</p><pre><code>String::NewFromUtf8Literal(isolate, "1 + 1");</code></pre><p>The first argument is an &#8220;Isolate&#8221;, V8&#8217;s main internal data structure representing the state of the run-time system, isolated from other possible V8 instances. To understand this, think about having multiple browser windows open, where each window has a completely separate instance of V8 running, each with its own isolated heap. 
We won&#8217;t talk about the <code>isolate</code> argument much, other than noting that a very large number of API calls expect this parameter.</p><p>The <code>String::NewFromUtf8Literal</code> method (see <a href="https://github.com/v8/v8/blob/2f384600ac9bef4d8142d5b599590ea490550811/src/api/api.cc#L6437"><code>src/api/api.cc</code></a>) starts with basic string length checking, but also decides how to store the string in memory. Given that we only provided two arguments to the call, the third <code>type</code> argument defaults to <code>NewStringType::kNormal</code>, indicating the string should be allocated as a regular object on the heap. The alternative would have been to pass <code>NewStringType::kInternalized</code>, indicating that de-duplication of the string is desired. This feature is useful for avoiding multiple copies of the same constant string.</p><p>The next step is to call the <code>NewString</code> method (see <a href="https://github.com/v8/v8/blob/2f384600ac9bef4d8142d5b599590ea490550811/src/api/api.cc#L6384"><code>src/api/api.cc</code></a>), which invokes <code>factory-&gt;NewStringFromUtf8(string)</code>. Note that <code>string</code> here has been mapped into an internal <code>Vector</code> data structure, instead of a regular C++ string, because the heap factory has quite a different set of methods than the external API. This difference will become more apparent later when the return value is passed back to the client application.</p><p>Inside <code>NewStringFromUtf8</code> (see <a href="https://github.com/v8/v8/blob/2f384600ac9bef4d8142d5b599590ea490550811/src/heap/factory.cc#L606"><code>src/heap/factory.cc</code></a>), a decision is made on the optimal storage format for the string. Naturally, UTF-8 is a convenient format for storing a wide range of Unicode characters, but when only basic ASCII characters are used (such as <code>1 + 1</code>), V8 stores the string in &#8220;one byte&#8221; format. 
To make this decision, the string&#8217;s characters are passed into <code>Utf8Decoder decoder(utf8_data)</code> (declared in <a href="https://github.com/v8/v8/blob/2f384600ac9bef4d8142d5b599590ea490550811/src/strings/unicode-decoder.h#L51"><code>src/strings/unicode-decoder.h</code></a>).</p><p>Now that we&#8217;ve decided to allocate a one-byte string, using the normal (not internalized) approach, the next step is to invoke <code>NewRawOneByteString</code> (see <a href="https://github.com/v8/v8/blob/2f384600ac9bef4d8142d5b599590ea490550811/src/heap/factory-base.cc#L526"><code>src/heap/factory-base.cc</code></a>), where the heap memory is allocated, and the string&#8217;s content is written into that memory.</p><h3>The String&#8217;s In-Memory Structure</h3><p>Inside V8, our <code>1 + 1</code> string is represented as an instance of the <code>v8::internal::SeqOneByteString</code> class (see <a href="https://github.com/v8/v8/blob/2f384600ac9bef4d8142d5b599590ea490550811/src/objects/string.h#L565"><code>src/objects/string.h</code></a>). If you&#8217;re like most object-oriented developers, you&#8217;d expect that <code>SeqOneByteString</code> would have a number of public methods, as well as several private members, such as an array of characters or an integer storing the string&#8217;s length. However, that&#8217;s not the case! Instead, all internal object classes are actually just pointers to the heap where that data is stored.</p><p>As you can see from the code comment in <a href="https://github.com/v8/v8/blob/2f384600ac9bef4d8142d5b599590ea490550811/src/objects/objects.h#L39"><code>src/objects/objects.h</code></a>, there are roughly 150 internal classes that have the common parent class of <code>v8::internal::Object</code>. 
Each of these classes consists solely of a single 8-byte value (on a 64-bit machine) referring to the object&#8217;s heap location.</p><p>Keeping this in mind, here&#8217;s what our string looks like in memory:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!radi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe11584c6-484a-4ace-b8b0-d3bf6beed7da_800x551.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!radi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe11584c6-484a-4ace-b8b0-d3bf6beed7da_800x551.png 424w, https://substackcdn.com/image/fetch/$s_!radi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe11584c6-484a-4ace-b8b0-d3bf6beed7da_800x551.png 848w, https://substackcdn.com/image/fetch/$s_!radi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe11584c6-484a-4ace-b8b0-d3bf6beed7da_800x551.png 1272w, https://substackcdn.com/image/fetch/$s_!radi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe11584c6-484a-4ace-b8b0-d3bf6beed7da_800x551.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!radi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe11584c6-484a-4ace-b8b0-d3bf6beed7da_800x551.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e11584c6-484a-4ace-b8b0-d3bf6beed7da_800x551.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!radi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe11584c6-484a-4ace-b8b0-d3bf6beed7da_800x551.png 424w, https://substackcdn.com/image/fetch/$s_!radi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe11584c6-484a-4ace-b8b0-d3bf6beed7da_800x551.png 848w, https://substackcdn.com/image/fetch/$s_!radi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe11584c6-484a-4ace-b8b0-d3bf6beed7da_800x551.png 1272w, https://substackcdn.com/image/fetch/$s_!radi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe11584c6-484a-4ace-b8b0-d3bf6beed7da_800x551.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>There are many interesting parts here:</p><h4>The SeqOneByteString Object</h4><p>As mentioned, this is not a fully-fledged string class, but is instead a pointer to the on-heap location of the string&#8217;s actual content. On a 64-bit machine, this &#8220;pointer&#8221; will be an 8-byte <code>unsigned long</code>, with the type alias of <code>Address</code>. 
Note that the data on the heap (on the right side of the diagram) is not actually a real C++ object, so there&#8217;s no point in treating this <code>Address</code> as if it were a pointer to something strongly-typed (such as <code>String *</code>).</p><p>But you might be wondering why the additional level of indirection exists in the first place. Why not simply access the heap block directly? This approach makes sense when you consider that garbage collection can result in objects being moved around the heap. It&#8217;s important that data can move without the client application getting confused.</p><p>To clarify, in <em>Generational Garbage Collection</em>, objects are first allocated in the <em>Young Generation (New Space)</em>, and if they survive long enough, they&#8217;ll be moved to the <em>Old Generation (Old Space)</em>. To make this work, the garbage collector copies the heap block to the new heap space, then updates the <code>Address</code> value to point to the new memory location. Given that the <code>SeqOneByteString</code> object itself is still at exactly the same memory address as before, the client software won&#8217;t notice the change.</p><h4>Compressed Pointer To Map (Bytes 0&#8211;3 of Heap&nbsp;Block)</h4><p>JavaScript is a dynamically-typed language, which means that <em>variables</em> don&#8217;t have types, yet the <em>values stored in those variables</em> do have types. The &#8220;map&#8221; is V8&#8217;s way of associating each object in the heap with a description of the object&#8217;s data type. After all, if the object weren&#8217;t tagged with its type, the heap block would be a meaningless sequence of bytes.</p><p>We won&#8217;t go into much detail about the map for our <code>1 + 1</code> string, other than mentioning that maps are also a type of heap object, stored in the <em>Read Only Space</em>. 
Maps (also known as <a href="https://mathiasbynens.be/notes/shapes-ics">Shapes or Hidden Classes</a>) can become very complex, although our constant string uses a pre-defined map, obtained by calling <code>read_only_roots().one_byte_string_map()</code> (see <a href="https://github.com/v8/v8/blob/2f384600ac9bef4d8142d5b599590ea490550811/src/heap/factory-base.cc#L536"><code>src/heap/factory-base.cc</code></a>).</p><p>Interestingly, although this map field is a pointer to another heap object, it cleverly uses <a href="https://v8.dev/blog/pointer-compression">Pointer Compression</a> to store a 64-bit pointer value in a 32-bit field.</p><h4>Object Hash Value (Bytes 4&#8211;7 of Heap&nbsp;Block)</h4><p>Every object has an internal hash value, but in this example it defaults to <code>kEmptyHashField</code> (value of 3) to indicate the hash is not yet computed.</p><h4>String Length (Bytes 8&#8211;11 of Heap&nbsp;Block)</h4><p>This is the number of bytes in the string (5).</p><h4>The Characters and the Padding (Bytes 12&#8211;19 of Heap&nbsp;Block)</h4><p>As you&#8217;d expect, the five single-byte characters are stored next. To ensure that future heap objects are aligned based on the CPU&#8217;s architecture requirements, three bytes of padding follow, rounding the object up to a 4-byte boundary. That gives 12 bytes of header, 5 bytes of characters, and 3 bytes of padding, for a total of 20 bytes.</p><h3>Allocating Memory From the&nbsp;Heap</h3><p>We briefly mentioned that the factory class allocates a block of memory from the heap (20 bytes in our case), then fills that block with the object&#8217;s data. One remaining question is <em>how</em> those 20 bytes are allocated.</p><p>In <a href="https://en.wikipedia.org/wiki/Cheney%27s_algorithm">Cheney&#8217;s algorithm</a> for garbage collection, the <em>Young Generation (New Space)</em> is divided into two semi-spaces. 
To allocate a block of memory in the heap, the allocator determines if there are enough free bytes between the current <code>top</code> of the current semi-space and its <code>limit</code>. If there&#8217;s enough room, the algorithm returns the address of the next block, then increments the <code>top</code> pointer by the requested number of bytes.</p><p>This basic case is illustrated here, with the <em>before</em> and <em>after</em> states of the current semi-space:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0uIB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e00ef56-3730-48fb-b7ab-34b30e64d396_800x440.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0uIB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e00ef56-3730-48fb-b7ab-34b30e64d396_800x440.png 424w, https://substackcdn.com/image/fetch/$s_!0uIB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e00ef56-3730-48fb-b7ab-34b30e64d396_800x440.png 848w, https://substackcdn.com/image/fetch/$s_!0uIB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e00ef56-3730-48fb-b7ab-34b30e64d396_800x440.png 1272w, https://substackcdn.com/image/fetch/$s_!0uIB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e00ef56-3730-48fb-b7ab-34b30e64d396_800x440.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!0uIB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e00ef56-3730-48fb-b7ab-34b30e64d396_800x440.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7e00ef56-3730-48fb-b7ab-34b30e64d396_800x440.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0uIB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e00ef56-3730-48fb-b7ab-34b30e64d396_800x440.png 424w, https://substackcdn.com/image/fetch/$s_!0uIB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e00ef56-3730-48fb-b7ab-34b30e64d396_800x440.png 848w, https://substackcdn.com/image/fetch/$s_!0uIB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e00ef56-3730-48fb-b7ab-34b30e64d396_800x440.png 1272w, https://substackcdn.com/image/fetch/$s_!0uIB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e00ef56-3730-48fb-b7ab-34b30e64d396_800x440.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>If the current semi-space were to run out of free memory (<code>top</code> and <code>limit</code> get too close), then the <em>collection</em> portion of 
Cheney&#8217;s algorithm starts. Once collection is complete, all the <em>live</em> objects will have been copied to the beginning of the second semi-space, and all <em>dead</em> objects (remaining in the first semi-space) will be discarded. No matter what, a semi-space is guaranteed to have all its <em>used</em> space at the bottom, and all its <em>free</em> space at the top, so it&#8217;ll always look like the above diagram.</p><p>In our case though, there&#8217;s plenty of free memory in the current semi-space, so we carve off 20 bytes, then increase the <code>top</code> pointer. There&#8217;s no need for garbage collection, and the second semi-space isn&#8217;t involved. In the V8 code, there are numerous special cases to consider, but the final allocation of 20 bytes is handled by the <code>NewSpace::AllocateFastUnaligned</code> method in <a href="https://github.com/v8/v8/blob/2f384600ac9bef4d8142d5b599590ea490550811/src/heap/new-spaces-inl.h#L109"><code>src/heap/new-spaces-inl.h</code></a>.</p><h3>Returning a&nbsp;Handle</h3><p>Now that we have a pointer to a heap block, fully populated with the string&#8217;s content (including length, hash, and map), the pointer must be returned to the client application. If you recall, the client invoked this line of code:</p><pre><code>Local&lt;String&gt; source = String::NewFromUtf8Literal(isolate, "1 + 1");</code></pre><p>But what exactly is the type of <code>source</code>, and what does <code>Local&lt;String&gt;</code> actually mean? There are two key observations here:</p><h4>Translating Internal to External&nbsp;Classes</h4><p>First, it&#8217;s interesting to recall that V8 stored our string object using the <code>v8::internal::SeqOneByteString</code> class, which is simply a pointer to the data on the heap. 
However, the client application expects the data to be of type <code>v8::String</code>, which is part of the V8 API.</p><p>What may surprise you is that <code>v8::internal::SeqOneByteString</code> (a subclass of <code>v8::internal::String</code>) is in a completely different class hierarchy than <code>v8::String</code>. In fact, all of the internal classes are defined in the <a href="https://github.com/v8/v8/tree/8.8.276/src/objects"><code>src/objects</code></a> directory using the <code>v8::internal</code> namespace, whereas the external classes are defined in <a href="https://github.com/v8/v8/blob/8.8.276/include/v8.h"><code>include/v8.h</code></a> using the <code>v8</code> namespace.</p><p>Revisiting the <code>NewFromUtf8Literal</code> method we discussed earlier (see <a href="https://github.com/v8/v8/blob/2f384600ac9bef4d8142d5b599590ea490550811/src/api/api.cc#L6447"><code>src/api/api.cc</code></a>), the very last step before returning the object pointer to the client application is to cast the result from a <code>v8::internal::String</code> to a <code>v8::String</code>.</p><pre><code>return Utils::ToLocal(handle_result);</code></pre><p>This conversion magic is done using macros defined in <a href="https://github.com/v8/v8/blob/2f384600ac9bef4d8142d5b599590ea490550811/src/api/api-inl.h#L74"><code>src/api/api-inl.h</code></a>.</p><h4>Managing the &#8220;Roots&#8221; for Garbage Collection</h4><p>Second, let&#8217;s discuss what <code>Local&lt;String&gt;</code> means (which incidentally is an abbreviation for <code>v8::Local&lt;v8::String&gt;</code>). The <em>Local</em> concept is how we deal with garbage collection of the string object when it&#8217;s no longer needed.</p><p>As any JavaScript developer will know, objects are garbage collected when there are no remaining references to them. The collection algorithm starts at the &#8220;roots&#8221;, then traverses the entire heap to find all reachable objects. 
A root is a non-heap reference, such as a global variable, or a stack-based local variable that&#8217;s still in scope. If these variables are assigned new values, or if they were to go out of scope (their enclosing function ends), the data they once pointed to is potentially now garbage.</p><p>In the case of the <code>hello-world.cc</code> program, we also have pointers on the C++ stack that can refer to heap objects. These have no corresponding JavaScript variable name, since they only exist in the context of the C++ application (such as <code>hello-world.cc</code>, or Chrome, or NodeJS). For example:</p><pre><code>Local&lt;String&gt; source = ...</code></pre><p>In this case, <code>source</code> is a reference to a heap object, although now with an additional level of indirection. This diagram will explain:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VXho!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd16e479-b4cd-4a9d-85b0-d33235de7f15_800x732.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VXho!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd16e479-b4cd-4a9d-85b0-d33235de7f15_800x732.png 424w, https://substackcdn.com/image/fetch/$s_!VXho!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd16e479-b4cd-4a9d-85b0-d33235de7f15_800x732.png 848w, https://substackcdn.com/image/fetch/$s_!VXho!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd16e479-b4cd-4a9d-85b0-d33235de7f15_800x732.png 1272w, 
https://substackcdn.com/image/fetch/$s_!VXho!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd16e479-b4cd-4a9d-85b0-d33235de7f15_800x732.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VXho!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd16e479-b4cd-4a9d-85b0-d33235de7f15_800x732.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bd16e479-b4cd-4a9d-85b0-d33235de7f15_800x732.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VXho!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd16e479-b4cd-4a9d-85b0-d33235de7f15_800x732.png 424w, https://substackcdn.com/image/fetch/$s_!VXho!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd16e479-b4cd-4a9d-85b0-d33235de7f15_800x732.png 848w, https://substackcdn.com/image/fetch/$s_!VXho!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd16e479-b4cd-4a9d-85b0-d33235de7f15_800x732.png 1272w, 
https://substackcdn.com/image/fetch/$s_!VXho!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd16e479-b4cd-4a9d-85b0-d33235de7f15_800x732.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>On the left side is the C++ stack, which grows from top to bottom as the program executes, and the right side is the heap memory we saw earlier. As the client application executes, it pushes a <code>HandleScope</code> object onto the local C++ stack (see <a href="https://github.com/v8/v8/blob/2f384600ac9bef4d8142d5b599590ea490550811/samples/hello-world.cc#L29"><code>samples/hello-world.cc</code></a>). Next, the return value from calling <code>String::NewFromUtf8Literal()</code> is stored on the C++ stack as a <code>Local&lt;String&gt;</code> object.</p><p>It looks like we&#8217;ve added yet another level of indirection, but there are benefits to doing this:</p><ul><li><p><strong>Finding roots is easier</strong>&#8202;&#8212;&#8202;The <code>HandleScope</code> object is a place to store &#8220;Handles&#8221; (aka pointers) to heap objects. As you recall, this is exactly what our <code>SeqOneByteString</code> object was, an 8-byte pointer to the underlying heap data. When garbage collection is initiated, V8 quickly scans the <code>HandleScope</code> object to find all the root pointers. It can then update those pointers if the underlying heap data is moved.</p></li><li><p><strong>Local pointers are easy to manage</strong>&#8202;&#8212;&#8202;In contrast to <code>HandleScope</code>, which is quite large, the <code>Local&lt;String&gt;</code> object is an 8-byte value on the C++ stack, which can be used in the same context as any other 8-byte value, such as pointers or integers. In particular, it can be stored in CPU registers, be passed to functions, or provided as return values. 
What&#8217;s notable is that the garbage collector is not required to locate or update these values when garbage collection occurs.</p></li><li><p><strong>Eliminating scopes is easy</strong>&#8202;&#8212;&#8202;Finally, when the C++ function in the client application finishes, the <code>HandleScope</code> and <code>Local</code> objects on the C++ stack are removed, but only after their C++ object destructors have been called. These destructors remove all the handles from the garbage collector&#8217;s list of roots. They&#8217;re no longer in scope, so the underlying heap objects may have become garbage.</p></li></ul><p>To conclude the story, the <code>source</code> variable, referring to our <code>1 + 1</code> string, is now ready to be passed to the next line in our client application:</p><pre><code>Local&lt;Script&gt; script = 
    Script::Compile(context, source).ToLocalChecked();</code></pre><h3>Next Time&#8230;</h3><p>There was clearly a lot of work to allocate the <code>1 + 1</code> string on the heap. Hopefully it illustrated some parts of V8&#8217;s internal architecture, as well as how data is represented in different parts of the system. In future blog posts, I&#8217;ll look more into how our simple expression is parsed and executed, which will expose a lot more about how V8 operates.</p><p>In <a href="https://medium.com/compilers/calculating-1-1-in-javascript-part-2-e01f336503d0">Part 2 of this blog post series</a>, I&#8217;ll dig into how the <em>Compilation Cache</em> works, to avoid compiling code more than necessary.</p>]]></content:encoded></item><item><title><![CDATA[Testing the V8 JavaScript Engine]]></title><description><![CDATA[In this blog post, I&#8217;ll summarize the different test suites included with the source code for the V8 JavaScript engine.]]></description><link>https://www.petersmith.net/p/testing-the-v8-javascript-engine-cbda7d9272e6</link><guid isPermaLink="false">https://www.petersmith.net/p/testing-the-v8-javascript-engine-cbda7d9272e6</guid><dc:creator><![CDATA[Peter Smith]]></dc:creator><pubDate>Sun, 27 Sep 2020 19:46:36 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/bfc2c16a-3cab-4d23-bf8b-8978ad2f4710_800x468.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>I&#8217;m a compiler enthusiast, who has been learning how the <a href="https://v8.dev/">V8 JavaScript Engine</a> works. Of course, the best way to learn something is to write about it, so that&#8217;s why I&#8217;m sharing my experiences here. 
I hope this might be interesting to others too.</em></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nvbo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c8e810f-788e-4d74-90bf-02667fa572a0_800x468.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nvbo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c8e810f-788e-4d74-90bf-02667fa572a0_800x468.png 424w, https://substackcdn.com/image/fetch/$s_!nvbo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c8e810f-788e-4d74-90bf-02667fa572a0_800x468.png 848w, https://substackcdn.com/image/fetch/$s_!nvbo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c8e810f-788e-4d74-90bf-02667fa572a0_800x468.png 1272w, https://substackcdn.com/image/fetch/$s_!nvbo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c8e810f-788e-4d74-90bf-02667fa572a0_800x468.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nvbo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c8e810f-788e-4d74-90bf-02667fa572a0_800x468.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2c8e810f-788e-4d74-90bf-02667fa572a0_800x468.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nvbo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c8e810f-788e-4d74-90bf-02667fa572a0_800x468.png 424w, https://substackcdn.com/image/fetch/$s_!nvbo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c8e810f-788e-4d74-90bf-02667fa572a0_800x468.png 848w, https://substackcdn.com/image/fetch/$s_!nvbo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c8e810f-788e-4d74-90bf-02667fa572a0_800x468.png 1272w, https://substackcdn.com/image/fetch/$s_!nvbo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c8e810f-788e-4d74-90bf-02667fa572a0_800x468.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p>Given how widespread the <a href="https://v8.dev/">V8 JavaScript Engine</a> has become, being a major part of Google Chrome, Microsoft Edge, and NodeJS, it&#8217;s obviously important to test it carefully. 
In this blog post, I&#8217;ll summarize the different test suites included with the V8 source code.</p><p>If you&#8217;re following along at home, you&#8217;ll find these test suites in the <code>v8/test</code> directory within the <a href="https://chromium.googlesource.com/v8/v8">V8 source code repository</a>. Each subdirectory within <code>v8/test</code> is considered a test suite if it contains a <code>testcfg.py</code> file (not all of them do), although I excluded a few suites that don&#8217;t seem to do much. Each suite can be invoked with the&nbsp;<code>./tools/run-tests.py</code> command.</p><pre><code>% ls -1 test/*/testcfg.py</code></pre><pre><code>test/benchmarks/testcfg.py
test/cctest/testcfg.py
test/debugger/testcfg.py
...
test/test262/testcfg.py
test/unittests/testcfg.py
test/wasm-api-tests/testcfg.py
test/wasm-js/testcfg.py
test/wasm-spec-tests/testcfg.py
test/webkit/testcfg.py</code></pre><pre><code>% ./tools/run-tests.py --outdir=out/x64.release benchmarks</code></pre><pre><code>Build found: /Users/peter_smith/CompilerProjects/v8/out/x64.release
&gt;&gt;&gt; Autodetected:
pointer_compression
&gt;&gt;&gt; Running tests for x64.release
&gt;&gt;&gt; Running with test processors
[00:06|% 100|+  55|-   0]: Done                                               
&gt;&gt;&gt; 55 base tests produced 55 (100%) non-filtered tests
&gt;&gt;&gt; 55 tests ran</code></pre><p>We&#8217;ll do a quick tour of all 15 test suites in the <code>v8/test</code> directory:</p><ul><li><p><code>benchmarks</code>&#8202;&#8212;&#8202;Standard performance tuning benchmarks.</p></li><li><p><code>test262</code>&#8202;&#8212;&#8202;Conformance tests against the ECMAScript specification.</p></li><li><p><code>mjsunit</code>&#8202;&#8212;&#8202;Unit tests written in JavaScript.</p></li><li><p><code>cctest</code> / <code>unittests</code>&#8202;&#8212;&#8202;C++ unit tests for internal V8 classes.</p></li><li><p><code>fuzzer</code>&#8202;&#8212;&#8202;Input fuzzer tests providing invalid input, possibly crashing V8.</p></li><li><p><code>intl</code>&#8202;&#8212;&#8202;Tests for Internationalization features of ECMAScript.</p></li><li><p><code>message</code>&#8202;&#8212;&#8202;Validates error messages produced by invalid JavaScript code.</p></li><li><p><code>webkit</code>&#8202;&#8212;&#8202;Test cases borrowed from the WebKit JavaScript Engine.</p></li><li><p><code>mozilla</code>&#8202;&#8212;&#8202;Test cases borrowed from the Mozilla JavaScript Engine.</p></li><li><p><code>wasm-js</code>&#8202;&#8212;&#8202;Validation of WebAssembly, using the JavaScript API.</p></li><li><p><code>wasm-api-tests</code>&#8202;&#8212;&#8202;Validation of WebAssembly, using the C++ API.</p></li><li><p><code>wasm-spec-tests</code>&#8202;&#8212;&#8202;Conformance to the WebAssembly specification.</p></li><li><p><code>inspector</code>&#8202;&#8212;&#8202;Validates the V8 inspector interface (for debugging).</p></li><li><p><code>debugger</code>&#8202;&#8212;&#8202;Validates the built-in debugger command.</p></li></ul><p>If you refer to my previous <a href="https://medium.com/compilers/v8-javascript-engine-compiling-with-gn-and-ninja-8673e7c5e14a">blog post on building V8 from source code</a>, you&#8217;ll know that <code>run-tests.py</code> is invoked by the <code>gm.py</code> build script. All of the test suites depend on binary executables first being compiled. 
Many suites use the <code>d8</code> executable (a simple JavaScript command shell) for executing JavaScript programs and validating the results. However, other test suites, such as the code-level unit tests, require a special-purpose test driver.</p><p>Let&#8217;s dig into the detail&#8230;</p><h3>Test Suite: benchmarks</h3><ul><li><p><strong>Run time: 32 seconds (single threaded with <code>run-tests.py -j 1</code> on a 2015 MacBook Pro)</strong></p></li><li><p><strong>Test binary: <code>d8</code></strong></p></li></ul><p>This first test suite is focused on three performance-tuning benchmarks. The goal of each benchmark is to provide a comparison between different JavaScript engines when faced with typical code scenarios, such as processing JSON input, decompressing data, or rendering graphics. The competing JavaScript engines (such as V8, <a href="https://en.wikipedia.org/wiki/WebKit#JavaScriptCore">JavaScriptCore</a>, or <a href="https://en.wikipedia.org/wiki/SpiderMonkey">SpiderMonkey</a>) are evaluated side-by-side to show how quickly they can compile and evaluate each benchmark. As a result, a lot of time has been spent on optimizing V8 to out-perform the competing JavaScript engines.</p><p>Unfortunately, <a href="https://v8.dev/blog/retiring-octane">experience shows</a> that relying too much on specific benchmarks leads to over-fitting of the optimizations, with too much emphasis placed on the exact benchmark code. More recently, effort has been put into optimizing <a href="https://v8.dev/blog/real-world-performance">for real-world scenarios</a> that are more representative of a web browser&#8217;s overall workload. 
For example, observing the loading time of common applications such as Facebook or Google Maps leads to optimizations that are more applicable to everyday use.</p><p>Inside the <code>v8/test/benchmarks</code> directory, there is code for three important benchmarks, each with its own origin:</p><ul><li><p><strong>The &#8220;SunSpider&#8221; Benchmark</strong>&#8202;&#8212;&#8202;Originally created by Apple in 2007, as part of their <a href="https://webkit.org/">WebKit</a> project, the <a href="https://webkit.org/perf/sunspider/sunspider.html">SunSpider</a> benchmark focuses on intensive algorithms such as cryptography, string manipulation, and ray tracing. According to their website, this benchmark is no longer supported (as of 2015) and has been replaced by the <a href="https://browserbench.org/JetStream/">JetStream</a> benchmark.</p></li><li><p><strong>The &#8220;Kraken&#8221; Benchmark</strong>&#8202;&#8212;&#8202;Created as part of the Mozilla project in 2010, the <a href="https://wiki.mozilla.org/Kraken">Kraken</a> benchmark also focuses on complex algorithms that were extracted from real-world workloads (but are not the full workload itself). Kraken still appears to be maintained, and can even be <a href="https://krakenbenchmark.mozilla.org/">executed in your browser</a>.</p></li><li><p><strong>The &#8220;Octane&#8221; Benchmark</strong>&#8202;&#8212;&#8202;First released by Google in 2012, and then retired in 2017, the <a href="https://developers.google.com/octane/">Octane</a> benchmark similarly focuses on computationally complex algorithms. 
It can also be <a href="http://chromium.github.io/octane/">executed inside a browser</a>.</p></li></ul><p>Let&#8217;s now look at how V8 is tested for conformance against the ECMAScript specification.</p><h3>Test Suite:&nbsp;test262</h3><ul><li><p><strong>Run time: 37 minutes (also single threaded, on a 2015 MacBook Pro)</strong></p></li><li><p><strong>Test binary: <code>d8</code></strong></p></li></ul><p>The official name for the JavaScript language is <em><a href="https://en.wikipedia.org/wiki/ECMAScript">ECMAScript</a></em>, where <em>JavaScript </em>is more of a marketing name. The <a href="https://www.ecma-international.org/publications/standards/Ecma-262.htm">ECMA262 standard</a> provides an exact specification of the language and standard libraries. All JavaScript engines are required to conform to this standard, with all major browser vendors being involved in the ECMA262 committee (<a href="https://www.ecma-international.org/memento/tc39.htm">known as TC39</a>). Obviously, loading a web page into Chrome must have the same effect as loading it into Firefox, Safari, or Edge, so conforming to this specification is vital.</p><p>To help with this conformance, a test suite known as <em><a href="https://github.com/tc39/test262">Test262</a></em> has been created. This is a browser-agnostic test suite maintained by supporters of the ECMA262 standard. Test262 contains test cases to validate the ECMAScript language and libraries, the Internationalization API, and the JSON Data Interchange Format. Although the maintainers claim there&#8217;s always room for improvement, Test262 does an excellent job of validating conformance to ECMA262.</p><p>At the language level, every aspect of the specification is covered, including grammar definition, expressions, statements, modules, and pretty much everything else. To be specific, Test262 contains 43665 individual JavaScript test files, resulting in 74677 test cases that are run through the <code>d8</code> interpreter. 
On my machine, executing these test cases took 37 minutes, using a single CPU core.</p><p>As an example, the following test case validates the new <em>Optional Chain</em> feature in JavaScript. In the header comment, a reference is made to the exact part of the ECMAScript specification, showing how optional chains can appear within loops (in this case, within a <code>for-in</code> statement):</p><pre><code>/*---
esid: prod-OptionalExpression
description: &gt;
  optional chain in test portion of do while statement
info: |
  IterationStatement
    for (LeftHandSideExpression in Expression) Statement
features: [optional-chaining]
---*/</code></pre><pre><code>const obj = {
    inner: {
        a: 1,
        b: 2
    }
};</code></pre><pre><code>let str = '';
for (const key in obj?.inner) {
    str += key;
}</code></pre><pre><code>assert.sameValue('ab', str);</code></pre><p>When this&nbsp;<code>.js</code> file is passed into V8 (specifically the <code>d8</code> executable), the code snippet executes, with the final <code>assert.sameValue</code> validating whether the behaviour was correct or not.</p><h3>Test Suite:&nbsp;mjsunit</h3><ul><li><p><strong>Run time: 3 minutes 37 seconds</strong></p></li><li><p><strong>Test binary: <code>d8</code></strong></p></li></ul><p>The <code>mjsunit</code> suite is similar to Test262, although it was written specifically for V8 rather than being browser-agnostic. There are 5068 test cases implemented in&nbsp;<code>.js</code> or&nbsp;<code>.mjs</code> files, taking several minutes to execute.</p><p>As an example, the <code>function-arguments-duplicate.js</code> file validates V8&#8217;s behaviour when the same parameter name is used twice in the same function. This is legal in sloppy (non-strict) mode, where the last parameter with that name wins.</p><pre><code>function f(a, a) {
  assertEquals(2, a);
  assertEquals(1, arguments[0]);
  assertEquals(2, arguments[1]);
  assertEquals(2, arguments.length);
  %HeapObjectVerify(arguments);
}</code></pre><pre><code>f(1, 2);</code></pre><p>The familiar <code>assertEquals</code> function is available (as are many other matchers), and this particular code shows the <code>%HeapObjectVerify</code> function, which is built in to V8 but not very well documented. The <code>%</code> prefix marks a V8 runtime function, which is only recognized when the <code>--allow-natives-syntax</code> flag is enabled.</p><h3>Test Suite:&nbsp;cctest</h3><ul><li><p><strong>Run time: 5 minutes 40 seconds</strong></p></li><li><p><strong>Test binary: <code>cctest</code></strong></p></li></ul><p>This test suite contains almost 7000 unit tests, spread across 246 C++ files. These are code-level unit tests directly invoking methods within the V8 core. As such, a special <code>cctest</code> executable is first compiled, which is in contrast to other test suites relying on the <code>d8</code> executable to parse and validate JavaScript code.</p><p>Each C++ file has one or more test methods, each conforming to the <code>TEST(testName)</code> signature. Methods call the V8 internal classes, then use macros such as <code>CHECK</code>, <code>CHECK_EQ</code>, or <code>CHECK_GE</code> to validate the results. For example, in <code>test-heap.cc</code>:</p><pre><code>TEST(InitialObjects) {
   LocalContext env;
   HandleScope scope(CcTest::i_isolate());
   Handle&lt;Context&gt; context = v8::Utils::OpenHandle(*env);</code></pre><pre><code>   // Initial ArrayIterator prototype.
   CHECK_EQ(
      context-&gt;initial_array_iterator_prototype(),
      *v8::Utils::OpenHandle(
          *CompileRun("[][Symbol.iterator]().__proto__")));</code></pre><pre><code>   ...
   
   // Initial Object prototype.
   CHECK_EQ(context-&gt;initial_object_prototype(),
      *v8::Utils::OpenHandle(*CompileRun("Object.prototype")));
}</code></pre><p>If a test fails, a stack trace is displayed, making it easy to debug the problem.</p><p>The <code>run-tests.py</code> script allows invocation of individual test cases. For example:&nbsp;<code>./tools/run-tests.py cctest/test-code-pages/*</code> runs the seven test methods in <code>test-code-pages.cc</code>, whereas&nbsp;<code>./tools/run-tests.py cctest/test-code-pages/OptimizedCodeWithCodePages</code> invokes only that single test case.</p><p>These unit tests are clearly designed with developers in mind. Each test provides concise examples of how to call the V8 APIs, as well as the internal methods and data structures. I&#8217;ve found these particular test cases to be invaluable for learning the V8 internals. I&#8217;m sure I&#8217;ll be writing more about them in the future.</p><h3>Test Suite: unittests</h3><ul><li><p><strong>Run time: 2 minutes 58 seconds</strong></p></li><li><p><strong>Test binary: <code>unittests</code></strong></p></li></ul><p>The <code>unittests</code> suite is very similar to the <code>cctest</code> suite, providing 3763 test cases spread across 237 different C++ source files. It&#8217;s actually not clear if there&#8217;s any fundamental difference between the two suites, although perhaps the distinction is purely historical.</p><h3>Test Suite:&nbsp;fuzzer</h3><ul><li><p><strong>Run time: unknown</strong></p></li><li><p><strong>Test binary: many (see below)</strong></p></li></ul><p>The <code>fuzzer</code> test suite allows for <a href="https://en.wikipedia.org/wiki/Fuzzing">fuzz testing</a> of the input passed into V8. These tests randomly modify valid JavaScript programs, surgically generating <em>invalid </em>inputs in the hopes of crashing the V8 engine. 
This doesn&#8217;t just cause a JavaScript-level exception, but could instead cause corruption in the actual C++ code, possibly allowing for security breaches.</p><p>For example, to identify a potential bug in V8&#8217;s <em>expression evaluation</em>, the fuzzer modifies only <em>that</em> part of the code, yet provides valid input for the remainder of the program. Starting with valid code:</p><pre><code>function f(a, b) {
  console.log(a + b)
}</code></pre><p>the fuzzer creates an erroneous program:</p><pre><code>function f(a, b) {
  console.log(a + %)
}</code></pre><p>All portions of this code before, and after, the <code>a + %</code> expression must be valid, otherwise V8 simply rejects the program before reaching the expression evaluation code. For more detail, see how <a href="https://chromium.googlesource.com/chromium/src/+/master/testing/libfuzzer/README.md">fuzzing is done</a> with Chromium.</p><p>This <code>fuzzer</code> test suite generates several different executable programs, to support the range of different input-types that can be fuzzed.</p><pre><code>v8_simple_json_fuzzer
v8_simple_multi_return_fuzzer
v8_simple_parser_fuzzer
v8_simple_regexp_builtins_fuzzer
v8_simple_regexp_fuzzer
v8_simple_wasm_async_fuzzer
v8_simple_wasm_code_fuzzer
v8_simple_wasm_compile_fuzzer
v8_simple_wasm_fuzzer</code></pre><p>Each of these executables has a main program (a C++ file), taking the fuzzed input and passing it into V8 using the necessary test fixtures, such as providing JSON as input, a regular expression, or WASM code.</p><pre><code>int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {</code></pre><pre><code>   ... code for calling V8 functions that might crash ...</code></pre><pre><code>   return 0;
}</code></pre><p>If the C++ function returns 0, the program ran (or was rejected) correctly, but if the fuzz-attack was successful, the V8 engine would have already crashed.</p><p>Clearly this type of testing could take a very long time to execute, especially with all the possible ways of mutating the input. Therefore tests are performed on a <a href="https://google.github.io/clusterfuzz/">large test cluster</a>, running for an extended period of time (many hours or days).</p><h3>Test Suite:&nbsp;intl</h3><ul><li><p><strong>Run time: 9 seconds</strong></p></li><li><p><strong>Test binary: <code>d8</code></strong></p></li></ul><p>This test suite, consisting of 218 JavaScript source files, performs validation of the <em>internationalization </em>features of V8. For example, there are test cases for time and date formats, time zone manipulation, character set collation (sort orders), as well as numeric data formats. Each test case uses JavaScript functions such as <code>assertEquals</code> or <code>assertFalse</code> to validate their results.</p><h3>Test Suite:&nbsp;message</h3><ul><li><p><strong>Run time: 8 seconds</strong></p></li><li><p><strong>Test binary: <code>d8</code></strong></p></li></ul><p>This test suite provides validation of error messages. Each test case has a single&nbsp;<code>.js</code> (or&nbsp;<code>.mjs</code>) file, and a corresponding&nbsp;<code>.out</code> file sharing the same base file name. For example, here&#8217;s the content of <code>arrow-formal-parameters.js</code>, which contains invalid JavaScript code.</p><pre><code>(b, a, a, d) =&gt; a</code></pre><p>and the corresponding <code>arrow-formal-parameters.out</code> file specifies the expected error message when <code>arrow-formal-parameters.js</code> is passed through the <code>d8</code> interpreter.</p><pre><code>*%(basename)s:5: SyntaxError: Duplicate parameter name not allowed in this context
(b, a, a, d) =&gt; a
       ^
SyntaxError: Duplicate parameter name not allowed in this context</code></pre><p>If the actual output doesn&#8217;t match the expected output, the test case is considered a failure. A very simple, yet very effective test suite.</p><h3>Test Suite:&nbsp;webkit</h3><ul><li><p><strong>Run time: 20 seconds</strong></p></li><li><p><strong>Test binary: <code>d8</code></strong></p></li></ul><p>This test suite is borrowed from the <a href="https://webkit.org/">WebKit</a> project, the basis of the Safari Browser. It consists of 543 JavaScript files (with&nbsp;<code>.js</code> suffix), each paired with a corresponding <code>-expected.txt</code> file. Each JavaScript file is passed through the <code>d8</code> executable, with the actual console output being captured and compared against the expected output.</p><h3>Test Suite:&nbsp;mozilla</h3><ul><li><p><strong>Run time: unknown</strong></p></li><li><p><strong>Test binary: <code>d8</code></strong></p></li></ul><p>This test suite appears to be a clone of Mozilla&#8217;s JavaScript regression tests. According to the <a href="https://chromium.googlesource.com/v8/deps/third_party/mozilla-tests.git">repository commits</a>, the snapshot taken from Mozilla is at least five years old, possibly even ten years old.</p><p>This repository contains 3481 individual&nbsp;<code>.js</code> files, as well as some&nbsp;<code>.java</code> files! After running the test suite, <code>run-tests.py</code> reported that 1921 tests were executed, although it also showed a number of test failures, with a total completion of 0%. 
I suspect this test suite isn&#8217;t actively maintained.</p><h3>Test Suite:&nbsp;wasm-js</h3><ul><li><p><strong>Run time: 16 seconds</strong></p></li><li><p><strong>Test binary: <code>d8</code></strong></p></li></ul><p>This suite validates the standard <code>WebAssembly</code> object, used for accessing the <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/WebAssembly">WebAssembly functionality</a> within V8. There are 94 JavaScript source files, each exercising the <code>WebAssembly</code> object in some way.</p><h3>Test Suite: wasm-api-tests</h3><ul><li><p><strong>Run time: 1 second</strong></p></li><li><p><strong>Test binary: <code>wasm_api_tests</code></strong></p></li></ul><p>Similar to the previous test cases, these validate the WebAssembly functionality within V8. However, rather than expressing the tests in JavaScript (using the <code>WebAssembly</code> object), they directly call V8&#8217;s C++ API. There are 17 such test cases in this suite.</p><h3>Test Suite: wasm-spec-tests</h3><ul><li><p><strong>Run time: 22 seconds</strong></p></li><li><p><strong>Test binary: <code>d8</code></strong></p></li></ul><p>This third WebAssembly-related test suite provides 190 different JavaScript source files, and the same number of matching&nbsp;<code>.wast</code> files (a human-readable <a href="https://webassembly.github.io/spec/core/text/index.html">WebAssembly format</a>). Here&#8217;s an example of this format:</p><pre><code>(module
  (memory 1)
  (data (i32.const 0) "abcdefghijklmnopqrstuvwxyz")
  (func (export "8u_good1") (param $i i32) (result i32)
    (i32.load8_u offset=0 (local.get $i))             ;; 97 'a'
  )
  ...
)</code></pre><p>Presumably, these test cases are derived from the <a href="https://webassembly.github.io/spec/core/">WebAssembly specification</a>.</p><h3>Test Suite: inspector</h3><ul><li><p><strong>Run time: 12 seconds</strong></p></li><li><p><strong>Test binary: <code>inspector-test</code></strong></p></li></ul><p>This suite validates the <a href="https://v8.dev/docs/inspector">Inspector Protocol</a>, used by external debuggers (such as Chrome DevTools) to inspect and control the state of the JavaScript engine. There are 282 individual test cases (with&nbsp;<code>.js</code> file extension) paired up with the same number of <code>-expected.txt</code> files. The JavaScript file is executed, and the expected output is compared with the actual behaviour.</p><p>For example, here&#8217;s the content of the <code>scoped-variables.js</code> test case, showing how snippets of code can be injected into a running V8 engine:</p><pre><code>InspectorTest.log('Evaluating \'let a = 42;\'');
var {result:{result}} = await Protocol.Runtime.evaluate({  
    expression:'let a = 42;'});
InspectorTest.logMessage(result);</code></pre><pre><code>InspectorTest.log('Evaluating \'a\'');
var {result:{result}} = await Protocol.Runtime.evaluate({
    expression:'a'});
InspectorTest.logMessage(result);</code></pre><p>The output of this test run is compared against the expected output in <code>scoped-variables-expected.txt</code>:</p><pre><code>Evaluating 'let a = 42;'
{
    type : undefined
}
Evaluating 'a'
{
    description : 42
    type : number
    value : 42
}</code></pre><p>As you&#8217;d expect, if the output doesn&#8217;t match, the test case fails.</p><h3>Test Suite:&nbsp;debugger</h3><ul><li><p><strong>Run time: 18 seconds</strong></p></li><li><p><strong>Test binary: <code>d8</code></strong></p></li></ul><p>This final test suite validates the built-in debugger. All 316 JavaScript test files invoke the <code>debugger</code> statement (or a similar feature) to halt the program execution. The script then uses the standard debugger features to introspect the state of the program, ensuring that breakpoint debugging works as expected.</p><h3>Summary</h3><p>That&#8217;s it! A total of 15 different test suites for validating various aspects of the V8 JavaScript engine. Some of the test suites are written in JavaScript, whereas others are written directly in C++. Some of the test suites were written by the V8 maintainers, whereas others came from third parties. In all, these test suites are a major reason why V8 is such a high-quality and performant product.</p>]]></content:encoded></item><item><title><![CDATA[V8 JavaScript Engine: Compiling with GN and Ninja]]></title><description><![CDATA[I&#8217;m a compiler enthusiast, who has been learning how the V8 JavaScript Engine works. Of course, the best way to learn something is to&#8230;]]></description><link>https://www.petersmith.net/p/v8-javascript-engine-compiling-with-gn-and-ninja-8673e7c5e14a</link><guid isPermaLink="false">https://www.petersmith.net/p/v8-javascript-engine-compiling-with-gn-and-ninja-8673e7c5e14a</guid><dc:creator><![CDATA[Peter Smith]]></dc:creator><pubDate>Sun, 30 Aug 2020 16:14:06 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/f9278b7f-8c6f-4d2f-8fa7-13eec9c67028_800x466.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>I&#8217;m a compiler enthusiast, who has been learning how the <a href="https://v8.dev/">V8 JavaScript Engine</a> works. 
Of course, the best way to learn something is to write about it, so that&#8217;s why I&#8217;m sharing my experiences here. I hope this might be interesting to others too.</em></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YWZC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4276bce4-8edc-4670-acdc-b520e6f4458e_800x466.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YWZC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4276bce4-8edc-4670-acdc-b520e6f4458e_800x466.png 424w, https://substackcdn.com/image/fetch/$s_!YWZC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4276bce4-8edc-4670-acdc-b520e6f4458e_800x466.png 848w, https://substackcdn.com/image/fetch/$s_!YWZC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4276bce4-8edc-4670-acdc-b520e6f4458e_800x466.png 1272w, https://substackcdn.com/image/fetch/$s_!YWZC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4276bce4-8edc-4670-acdc-b520e6f4458e_800x466.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YWZC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4276bce4-8edc-4670-acdc-b520e6f4458e_800x466.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4276bce4-8edc-4670-acdc-b520e6f4458e_800x466.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YWZC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4276bce4-8edc-4670-acdc-b520e6f4458e_800x466.png 424w, https://substackcdn.com/image/fetch/$s_!YWZC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4276bce4-8edc-4670-acdc-b520e6f4458e_800x466.png 848w, https://substackcdn.com/image/fetch/$s_!YWZC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4276bce4-8edc-4670-acdc-b520e6f4458e_800x466.png 1272w, https://substackcdn.com/image/fetch/$s_!YWZC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4276bce4-8edc-4670-acdc-b520e6f4458e_800x466.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p>This first blog post is an overview of how V8 is compiled. As you can see from the <a href="https://chromium.googlesource.com/v8/v8">V8 source code repository</a>, the V8 Engine is mostly written in C++, requiring source code to be compiled into executable files. 
This should be no surprise, given that V8&#8217;s primary purpose is fast compilation and execution of JavaScript programs.</p><p>I&#8217;ll be discussing three main topics related to compiling the V8 executables:</p><ul><li><p>The <code>gm.py</code> wrapper script, providing a convenient approach to compile V8 from source, and for invoking the test suites.</p></li><li><p>The <a href="https://gn.googlesource.com/gn/">GN meta-build system</a> (invoked by <code>gm.py</code>) taking an easy-to-read description of the software components, then auto-generating a machine-readable build description suitable for the Ninja build tool.</p></li><li><p>Finally, the <a href="https://ninja-build.org/">Ninja build tool</a> uses that same machine-readable build description to analyze inter-file dependencies and invoke the relevant compilers.</p></li></ul><p>The earlier diagram (at the top of this blog post) illustrates the overall flow of tool invocation, and shows which files are <em>read</em>, <em>generated</em>, and <em>invoked</em>.</p><p>Let&#8217;s examine each step in detail. If you&#8217;re new to this type of compilation process, I&#8217;ll put in a shameless plug for this book on <a href="https://play.google.com/store/books/details/Peter_Smith_PhD_Software_Build_Systems?id=vQodKA7C-EAC">Software Build Systems</a>. 
It&#8217;s been almost ten years since I wrote that book, but the underlying concepts remain the same.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lmN-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bac9b98-19c7-440c-9940-fec97f82d201_95x95.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lmN-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bac9b98-19c7-440c-9940-fec97f82d201_95x95.png 424w, https://substackcdn.com/image/fetch/$s_!lmN-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bac9b98-19c7-440c-9940-fec97f82d201_95x95.png 848w, https://substackcdn.com/image/fetch/$s_!lmN-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bac9b98-19c7-440c-9940-fec97f82d201_95x95.png 1272w, https://substackcdn.com/image/fetch/$s_!lmN-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bac9b98-19c7-440c-9940-fec97f82d201_95x95.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lmN-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bac9b98-19c7-440c-9940-fec97f82d201_95x95.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8bac9b98-19c7-440c-9940-fec97f82d201_95x95.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lmN-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bac9b98-19c7-440c-9940-fec97f82d201_95x95.png 424w, https://substackcdn.com/image/fetch/$s_!lmN-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bac9b98-19c7-440c-9940-fec97f82d201_95x95.png 848w, https://substackcdn.com/image/fetch/$s_!lmN-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bac9b98-19c7-440c-9940-fec97f82d201_95x95.png 1272w, https://substackcdn.com/image/fetch/$s_!lmN-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bac9b98-19c7-440c-9940-fec97f82d201_95x95.png 1456w" sizes="100vw"></picture><div></div></div></a></figure></div><h3>The gm.py&nbsp;Script</h3><p>The first time you compile V8, you should use the recommended <code>gm.py</code> script to fully compile all the object files, libraries, and executables.</p><pre><code>$ ./tools/dev/gm.py x64.release.check</code></pre><p>This is described in the <a href="https://v8.dev/docs/build-gn">V8 documentation</a> as a <em>convenience script</em> because it&#8217;s a one-step 
solution for all the steps you need to get started. It takes about 20 minutes to run to completion (on my MacBook). Here&#8217;s what it&#8217;s doing:</p><h4><strong>1. Creating and Configuring the Build Output Directories</strong></h4><p>Using the best practice that object and executable files should be stored separately from the source code, the <code>gm.py</code> script creates the <code>v8/out/x64.release</code> directory. In this example, we&#8217;ve asked for V8 to be compiled for the <code>x64.release</code> target (Intel x86 64-bit for release images), although if you also want to compile for different targets (such as <code>x64.debug</code> or <code>arm64.debug</code>), separate directories would be created for those.</p><p>This step also generates the <code>args.gn</code> file in the <code>v8/out/x64.release</code> directory, specifying the build options for this configuration (most notable are the <code>is_debug</code> and <code>target_cpu</code> options).</p><pre><code>is_component_build = false
is_debug = false
target_cpu = "x64"
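# Annotation (not part of the generated file): a debug target such as
# x64.debug would be expected to set is_debug = true here, and an arm64
# target to set target_cpu = "arm64".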
use_goma = false
goma_dir = "None"
v8_enable_backtrace = true
v8_enable_disassembler = true
v8_enable_object_print = true
v8_enable_verify_heap = true</code></pre><h4>2. Auto-Generating Ninja Files from the GN Build Specification</h4><p>The next step in the build process is for <code>gm.py</code> to invoke the GN tool to translate the human-readable <code>BUILD.gn</code> file into lower-level files for the Ninja tool (with&nbsp;<code>.ninja</code> suffix).</p><p>The <code>BUILD.gn</code> file contains easy-to-read directives specifying the content of each build target. In the following example, the <code>d8</code> executable is constructed from a small number of C++ source files, linked together with additional libraries that contain the core JavaScript engine.</p><pre><code>v8_executable("d8") {
  sources = [
    "src/d8/async-hooks-wrapper.cc",
    "src/d8/async-hooks-wrapper.h",
    "src/d8/d8-console.cc",
    ...
    "src/d8/d8.cc",
    "src/d8/d8.h",
  ]</code></pre><pre><code>  ...</code></pre><pre><code>  deps = [
    ":v8",
    ":v8_libbase",
    ":v8_libplatform",
    ...
  ]
}</code></pre><p>Later in this blog post, there&#8217;ll be more detail about this file format. For now, let&#8217;s look at what happens when the <code>gn gen</code> command generates all the&nbsp;<code>.ninja</code> files from the hand-written <code>BUILD.gn</code> file.</p><pre><code>$ gn gen out/x64.release</code></pre><p>This results in a collection of roughly 100&nbsp;<code>.ninja</code> files in the <code>out/x64.release</code> directory. Each&nbsp;<code>.ninja</code> file corresponds to one of the build targets described in the <code>BUILD.gn</code> file.</p><pre><code>./toolchain.ninja
./build.ninja
./obj/d8.ninja
./obj/v8_libbase.ninja
./obj/v8_simple_wasm_compile_fuzzer.ninja
./obj/v8_libplatform.ninja
./obj/v8_simple_multi_return_fuzzer.ninja
...
./obj/test/unittests/cppgc_unittests_sources.ninja
./obj/test/unittests/unittests_sources.ninja
./obj/test/wasm-api-tests/wasm_api_tests.ninja
./obj/test/common_test_headers.ninja
./obj/test/cctest/generate-bytecode-expectations.ninja
./obj/test/cctest/cctest_sources.ninja
./obj/test/cctest/cctest_headers.ninja
./obj/test/cctest/cctest.ninja</code></pre><p>As an example, here&#8217;s the content of the <code>d8.ninja</code> file. At first glance, this output is quite similar to an old-style Makefile (that is, not very readable!).</p><pre><code>defines = -D_LIBCPP_HAS_NO_ALIGNED_ALLOCATION -DCR_XCODE_VERSION=1160 -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D_FORTIFY_SOURCE=2 -D_LIBCPP_ABI_UNSTABLE -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -D_LIBCXXABI_DISABLE_VISIBILITY_ANNOTATIONS -D_LIBCPP_ENABLE_NODISCARD -DCR_LIBCXX_REVISION=375504 ...</code></pre><pre><code>include_dirs = -I../.. -Igen -I../.. -I../../include -Igen -I../../include -Igen/include -I../../third_party/icu/source/common -I../../third_party/icu/source/i18n -I../../include
cflags = -fno-strict-aliasing -fstack-protector -fcolor-diagnostics -fmerge-all-constants -fcrash-diagnostics-dir=../../tools/clang/crashreports -mllvm -instcombine-lower-dbg-declare=0 -fcomplete-member-pointers -arch x86_64 -Wno-builtin-macro-redefined ...</code></pre><pre><code>build obj/d8/async-hooks-wrapper.o: cxx ../../src/d8/async-hooks-wrapper.cc || obj/d8.inputdeps.stamp
build obj/d8/d8-console.o: cxx ../../src/d8/d8-console.cc || obj/d8.inputdeps.stamp
build obj/d8/d8-js.o: cxx ../../src/d8/d8-js.cc || obj/d8.inputdeps.stamp
build obj/d8/d8-platforms.o: cxx ../../src/d8/d8-platforms.cc || obj/d8.inputdeps.stamp
build obj/d8/d8.o: cxx ../../src/d8/d8.cc || obj/d8.inputdeps.stamp
build obj/d8/d8-posix.o: cxx ../../src/d8/d8-posix.cc || obj/d8.inputdeps.stamp
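# Annotation (not part of the generated file): each "build" line reads as
#   build OUTPUT: RULE INPUTS || ORDER_ONLY_DEPS
# Here "cxx" names a compile rule, assumed to be defined in toolchain.ninja.
# Inputs after "||" are order-only: they must be built before the compile
# runs, but a change to them alone does not force the output to be rebuilt.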
...</code></pre><p>Now that we have all the&nbsp;<code>.ninja</code> files, we can start to compile the source code.</p><h4>3. Using the Ninja Build Tool to Compile the Objects and Executables</h4><p>The next step in the build process is for <code>gm.py</code> to invoke the C++ compiler (amongst other tools). This is done by invoking the <code>autoninja</code> command, which itself is a wrapper for the <code>ninja</code> command.</p><pre><code>$ autoninja -C out/x64.release d8</code></pre><p>This command reads the relevant&nbsp;<code>.ninja</code> files, determines which object files are missing (or out of date), then invokes the C++ compiler to create them. This process is familiar to anyone who has used the Make build tool (or similar).</p><p>After roughly 20 minutes (on my MacBook), we end up with a fully populated build tree of roughly 2700 files, including auto-generated source files (<code>.cc</code> and&nbsp;<code>.h</code> suffixes), object files (<code>.o</code> suffix), library files (<code>.a</code> suffix), and a small number of executable files:</p><pre><code>out/x64.release/obj
out/x64.release/obj/v8_libbase/time.o
out/x64.release/obj/v8_libbase/semaphore.o
out/x64.release/obj/v8_libbase/platform-macos.o
out/x64.release/obj/v8_libbase/condition-variable.o
out/x64.release/obj/v8_libbase/ieee754.o
out/x64.release/obj/v8_libbase/file-utils.o
...
out/x64.release/obj/v8_compiler/effect-control-linearizer.o
out/x64.release/obj/v8_compiler/js-native-context-specialization.o
out/x64.release/obj/v8_compiler/store-store-elimination.o
out/x64.release/obj/v8_compiler/code-assembler.o
...
out/x64.release/gen/torque-generated/src/wasm/wasm-objects-tq-csa.h
out/x64.release/gen/torque-generated/src/wasm/wasm-objects-tq-csa.cc
out/x64.release/gen/torque-generated/src/objects/map-tq-csa.h
out/x64.release/gen/torque-generated/src/objects/code-tq-csa.h
...
out/x64.release/obj/libv8_libplatform.a
out/x64.release/obj/libwee8.a
out/x64.release/obj/third_party/zlib/libchrome_zlib.a
out/x64.release/obj/third_party/icu/libicui18n.a
out/x64.release/obj/third_party/icu/libicuuc.a
out/x64.release/obj/libv8_libbase.a
...
out/x64.release/obj/d8/d8.o
out/x64.release/obj/d8/d8-posix.o
out/x64.release/obj/d8/d8-console.o
out/x64.release/obj/d8/async-hooks-wrapper.o
out/x64.release/obj/d8/d8-js.o
out/x64.release/obj/d8/d8-platforms.o
out/x64.release/d8
...</code></pre><p>Everything is now compiled, so I can run the <code>d8</code> program to execute some JavaScript code:</p><pre><code>$ ./out/x64.release/d8 
V8 version 8.6.0 (candidate)
d8&gt; console.log(2 + 2);
4
undefined
d8&gt;</code></pre><p>Looking at this example output, you might think that <code>d8</code> is the same thing as NodeJS (and the <code>node</code> command), but it&#8217;s actually a simple wrapper around the core V8 libraries. It doesn&#8217;t add any of the <a href="https://nodejs.org/en/docs/">additional functionality that NodeJS provides</a>, but instead just supports the core JavaScript language. It&#8217;s this core library that&#8217;s linked into NodeJS, the Chrome browser, and any other software that needs to compile and run JavaScript.</p><h4>4. Executing the Unit&nbsp;Tests</h4><p>The final step of the <code>gm.py</code> wrapper script is to execute the unit tests. This is done using the <code>run-tests.py</code> script.</p><pre><code>$ ./tools/run-tests.py --outdir=out/x64.release \
       debugger intl mjsunit cctest message unittests</code></pre><p>I plan to talk about V8 testing in another blog post, so I won&#8217;t give more detail here. Let&#8217;s instead dig deeper into both the GN build tool, and the Ninja build tool.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fpJL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14f140a2-e039-459e-bc15-2c9e5951fd50_95x95.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fpJL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14f140a2-e039-459e-bc15-2c9e5951fd50_95x95.png 424w, https://substackcdn.com/image/fetch/$s_!fpJL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14f140a2-e039-459e-bc15-2c9e5951fd50_95x95.png 848w, https://substackcdn.com/image/fetch/$s_!fpJL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14f140a2-e039-459e-bc15-2c9e5951fd50_95x95.png 1272w, https://substackcdn.com/image/fetch/$s_!fpJL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14f140a2-e039-459e-bc15-2c9e5951fd50_95x95.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fpJL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14f140a2-e039-459e-bc15-2c9e5951fd50_95x95.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/14f140a2-e039-459e-bc15-2c9e5951fd50_95x95.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fpJL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14f140a2-e039-459e-bc15-2c9e5951fd50_95x95.png 424w, https://substackcdn.com/image/fetch/$s_!fpJL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14f140a2-e039-459e-bc15-2c9e5951fd50_95x95.png 848w, https://substackcdn.com/image/fetch/$s_!fpJL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14f140a2-e039-459e-bc15-2c9e5951fd50_95x95.png 1272w, https://substackcdn.com/image/fetch/$s_!fpJL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14f140a2-e039-459e-bc15-2c9e5951fd50_95x95.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>The GN Meta Build&nbsp;Tool</h3><p>The <a href="https://gn.googlesource.com/gn/">GN Build Tool </a>is classified as a &#8220;meta build&#8221; tool in that it doesn&#8217;t actually invoke the C++ compiler directly, but instead converts a human-readable build description into a lower-level format suitable for the Ninja build tool. 
This concept was popularized by <a href="https://cmake.org/">CMake</a>, which (amongst other things) is capable of auto-generating a tree of <code>Makefile</code> files, to be used by the Make tool.</p><h4>GN Command Line&nbsp;Options</h4><p>We&#8217;ve already seen how GN is used (with the <code>gn gen</code> command) to generate all the&nbsp;<code>.ninja</code> files, but what else can it do? Here are some interesting examples:</p><p>First, we can list all the possible build targets for V8:</p><pre><code>$ gn ls out/x64.release</code></pre><pre><code>//:bytecode_builtins_list_generator
//:cppgc
//:cppgc_base
//:cppgc_for_testing
//:cppgc_for_v8_embedders
//:cppgc_standalone
//:d8
//:fuzzer_support
//:gen-regexp-special-case
//:generate_bytecode_builtins_list
//:gn_all
//:json_fuzzer
//:lib_wasm_fuzzer_common
...</code></pre><p>Next, we can show all the compilation flags, input files, and dependent libraries for one of these targets:</p><pre><code>$ gn desc out/x64.release //:d8</code></pre><pre><code>type: executable
toolchain: //build/toolchain/mac:clang_x64</code></pre><pre><code>...</code></pre><pre><code>sources
  //src/d8/async-hooks-wrapper.cc
  //src/d8/async-hooks-wrapper.h
  //src/d8/d8-console.cc
  //src/d8/d8-console.h
  //src/d8/d8-js.cc
  //src/d8/d8-platforms.cc
  //src/d8/d8-platforms.h
  //src/d8/d8.cc
  //src/d8/d8.h
  //src/d8/d8-posix.cc
  ...</code></pre><pre><code>cflags
  -fno-strict-aliasing
  -fstack-protector
  -fcolor-diagnostics
  -fmerge-all-constants
  ...</code></pre><pre><code>defines
  _LIBCPP_HAS_NO_ALIGNED_ALLOCATION
  CR_XCODE_VERSION=1160
  CR_CLANG_REVISION="llvmorg-12-init-1771-g1bd7046e-3"
  __STDC_CONSTANT_MACROS
  __STDC_FORMAT_MACROS
  ...</code></pre><pre><code>Direct dependencies
  //:v8
  //:v8_dump_build_config
  //:v8_libbase
  //:v8_libplatform
  //:v8_tracing
  //build/config:executable_deps
  //build/win:default_exe_manifest</code></pre><p>Finally, the reverse operation is to show which targets will be built from a specific source file.</p><pre><code>$ gn refs out/x64.release //src/d8/d8-platforms.cc</code></pre><pre><code>//:d8
//tools/gcmole:v8_run_gcmole</code></pre><p>Note that none of these commands actually tells you which targets are currently out of date. As we&#8217;ll see later, that&#8217;s the responsibility of the Ninja tool.</p><h4>Understanding BUILD.gn for the &#8220;d8&#8221;&nbsp;Target</h4><p>The commands shown above are very useful, but how does GN know about the targets and their dependencies? Let&#8217;s spend some time looking at the highlights of the <code>v8/BUILD.gn</code> file. If you want more information on the <code>BUILD.gn</code> syntax, <a href="https://docs.google.com/presentation/d/15Zwb53JcncHfEwHpnG_PoIbbzQ3GQi_cpujYwbpcbZo/htmlpresent">an excellent introductory presentation</a> is also available.</p><p>We&#8217;ll be looking at how the <code>d8</code> executable is constructed. The following is the code starting at line 4744 of my copy of <code>BUILD.gn</code> (it&#8217;s a long file!). I&#8217;ve added the &#8220;section&#8221; comments to make the code easier to refer to.</p><pre><code># Section 1 - The v8_executable Template
v8_executable("d8") {</code></pre><pre><code>  # Section 2 - Defining the Sources
  sources = [
    "src/d8/async-hooks-wrapper.cc",
    "src/d8/async-hooks-wrapper.h",
    "src/d8/d8-console.cc",
    "src/d8/d8-console.h",
    "src/d8/d8-js.cc",
    "src/d8/d8-platforms.cc",
    "src/d8/d8-platforms.h",
    "src/d8/d8.cc",
    "src/d8/d8.h",
  ]</code></pre><pre><code>  # Section 3 - Optional Sources
  if (v8_fuzzilli) {
    sources += [
      "src/d8/cov.cc",
      "src/d8/cov.h",
    ]
  }</code></pre><pre><code>  # Section 4 - Compilation Configuration
  configs = [
    ":internal_config_base",
    ":v8_tracing_config",
  ]</code></pre><pre><code>  # Section 5 - Additional Dependencies
  deps = [
    ":v8",
    ":v8_libbase",
    ":v8_libplatform",
    ":v8_tracing",
    "//build/win:default_exe_manifest",
  ]</code></pre><pre><code>  ...
}</code></pre><p>Let&#8217;s learn some of the main concepts of GN by walking through this example.</p><p><strong>Section 1&#8202;&#8212;&#8202;The <code>v8_executable</code> Template:</strong></p><p>Out of the box, GN provides the <code>executable</code> command for describing how to construct an executable program. For V8, we actually use the <code>v8_executable</code> &#8220;template&#8221; (a GN concept) that wraps the basic <code>executable</code> command, providing some additional functionality for compiling V8 executables. This template is made available by the <code>import("gni/v8.gni")</code> statement at the top of the <code>BUILD.gn</code> file. The <code>v8.gni</code> file itself contains this snippet of code:</p><pre><code>...</code></pre><pre><code>template("v8_executable") {
   executable(target_name) {
     ...
   }
   ...
}</code></pre><pre><code>...</code></pre><p>This file also contains similar templates for <code>v8_static_library</code>, <code>v8_shared_library</code>, and <code>v8_source_set</code> that build upon the corresponding GN standard commands. In addition, the main <code>BUILD.gn</code> file also contains some template definitions, making the build description more concise by abstracting away the complexity.</p><p><strong>Section 2&#8202;&#8212;&#8202;Defining the Sources:</strong></p><p>To specify the C++ source files to be included in the <code>d8</code> executable, we define a variable that contains a list of file paths. The GN tool supports a simple programming language, including the concept of variables and values, as well as lists of values.</p><pre><code>  sources = [
    "src/d8/async-hooks-wrapper.cc",
    "src/d8/async-hooks-wrapper.h",
    ...
  ]</code></pre><p>Note that unlike many build tools, we&#8217;re only required to list the file paths. We don&#8217;t need to write file-name patterns, or specify dependencies between files. The mechanism for doing that is hidden from you in the auto-generated&nbsp;<code>.ninja</code> files.</p><p><strong>Section 3&#8202;&#8212;&#8202;Optional Sources:</strong></p><p>There are numerous build variants for V8, supporting a wide range of host platforms, target CPUs, optimization choices, JavaScript language-level selection, and additional feature libraries. To support all these variants, GN provides an <code>if</code> statement for us to test variables and conditionally modify the list of sources (using <code>sources +=</code>).</p><pre><code>  if (v8_fuzzilli) {
    sources += [
      "src/d8/cov.cc",
      "src/d8/cov.h",
    ]
  }</code></pre><p>In this particular example, we&#8217;re adding support for the <a href="https://github.com/googleprojectzero/fuzzilli">Fuzzilli fuzzing tool</a> which requires additional code-coverage functionality.</p><p><strong>Section 4&#8202;&#8212;&#8202;Compilation Configuration:</strong></p><p>To specify additional compilation flags for the <code>d8</code> target, we make reference to a couple of &#8220;configs&#8221;:</p><pre><code>configs = [
  ":internal_config_base",
  ":v8_tracing_config",
]</code></pre><p>Here&#8217;s the definition of <code>internal_config_base</code> that appears earlier in the <code>BUILD.gn</code> file.</p><pre><code>config("internal_config_base") {
  visibility = [ ":*" ]</code></pre><pre><code>  configs = [ ":v8_tracing_config" ]</code></pre><pre><code>  include_dirs = [
    ".",
    "include",
    "$target_gen_dir",
  ]
}</code></pre><p>A &#8220;config&#8221; is a way to package together include paths, C++ symbol definitions, compiler flags, and additional libraries. These configs can obviously become quite complex, especially with support for multiple host platforms. But luckily, build targets simply need to reference the config by name, rather than worrying about all of those details.</p><p><strong>Section 5&#8202;&#8212;&#8202;Additional Dependencies:</strong></p><p>Finally, to specify additional source files, or libraries to be linked into the <code>d8</code> executable, we define the <code>deps</code> variable. Each entry in the list specifies a V8 build target, which itself provides a static/shared library, or a set of source files to include.</p><pre><code>deps = [
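    # Annotation (not in the original file): labels starting with ":" name
    # targets defined in this same BUILD.gn file, while a label such as
    # "//build/win:default_exe_manifest" is resolved from the source root.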
    ":v8",
    ":v8_libbase",
    ":v8_libplatform",
    ":v8_tracing",
    "//build/win:default_exe_manifest",
  ]</code></pre><p>That&#8217;s it! A relatively simple way of specifying how to construct the <code>d8</code> executable, without burdening the developer with the complexities of compilation flags, dependencies, and file pattern matching. There are plenty of other GN commands/directives that we haven&#8217;t discussed, but the GN documentation shows them all.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ifAU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa67f7328-715c-4e39-8fbf-9ed2bb0b4bb7_95x95.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ifAU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa67f7328-715c-4e39-8fbf-9ed2bb0b4bb7_95x95.png 424w, https://substackcdn.com/image/fetch/$s_!ifAU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa67f7328-715c-4e39-8fbf-9ed2bb0b4bb7_95x95.png 848w, https://substackcdn.com/image/fetch/$s_!ifAU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa67f7328-715c-4e39-8fbf-9ed2bb0b4bb7_95x95.png 1272w, https://substackcdn.com/image/fetch/$s_!ifAU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa67f7328-715c-4e39-8fbf-9ed2bb0b4bb7_95x95.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ifAU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa67f7328-715c-4e39-8fbf-9ed2bb0b4bb7_95x95.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a67f7328-715c-4e39-8fbf-9ed2bb0b4bb7_95x95.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ifAU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa67f7328-715c-4e39-8fbf-9ed2bb0b4bb7_95x95.png 424w, https://substackcdn.com/image/fetch/$s_!ifAU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa67f7328-715c-4e39-8fbf-9ed2bb0b4bb7_95x95.png 848w, https://substackcdn.com/image/fetch/$s_!ifAU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa67f7328-715c-4e39-8fbf-9ed2bb0b4bb7_95x95.png 1272w, https://substackcdn.com/image/fetch/$s_!ifAU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa67f7328-715c-4e39-8fbf-9ed2bb0b4bb7_95x95.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>The Ninja Build&nbsp;Tool</h3><p>The last step in the V8 build process (with the exception of running tests) is to invoke the <a href="https://ninja-build.org/">Ninja Build Tool</a> to generate the object files, libraries, and executables. 
Given that users aren&#8217;t expected to look at the auto-generated&nbsp;<code>.ninja</code> files, there&#8217;s no need to look at further examples. However, it&#8217;s worth learning more about how to invoke Ninja, and about the command-line options it provides.</p><h4>Speed is Everything</h4><p>One of Ninja&#8217;s main selling points is its raw speed. Given my extensive history of using build tools like Make, I was very curious about what makes Ninja so responsive. When dealing with hundreds (or thousands) of source files, many build tools will &#8220;pause&#8221; for 20&#8211;30 seconds as they determine which files are out of date. With Ninja, incremental builds seem to start instantly.</p><p>Here are two of the reasons why:</p><ul><li><p>First, the build description files (with a&nbsp;<code>.ninja</code> suffix) are very simple. There is no complicated language to be parsed, and no advanced features requiring time to execute. For this reason, the documentation compares Ninja to an &#8220;assembler&#8221;, where other build systems are high-level languages. The&nbsp;<code>.ninja</code> files are also very compact, often with minimal white space. Keeping them small and simple makes them fast to read into memory, and to parse.</p></li><li><p>Second, implicit dependencies are stored in a single cache file, the&nbsp;<code>.ninja_deps</code> file (a 2&nbsp;MB binary file on my computer). In a Make-based environment, it&#8217;s common to have a separate&nbsp;<code>.d</code> file for each&nbsp;<code>.cc</code> file, listing the C++ header files that the&nbsp;<code>.cc</code> file depends on. As a result, the build tool parses a very large number of files each time an incremental build is invoked. For Ninja, however, reading the one-and-only&nbsp;<code>.ninja_deps</code> file is extremely fast!</p></li></ul><p>In the case of V8, the <code>ninja</code> tool starts by reading <code>build.ninja</code>, which then imports the <code>toolchain.ninja</code> file. 
It&#8217;s this second file that recursively imports all the other&nbsp;<code>.ninja</code> files in the <code>out/x64.release</code> directory (shown earlier). Although there are roughly 100&nbsp;<code>.ninja</code> files, reading and processing them is very fast.</p><h4>Ninja Command-Line Options</h4><p>To finish off, let&#8217;s look at some of the commonly-used Ninja commands. To list all the available build targets, use the <code>targets</code> tool:</p><pre><code>$ ninja -t targets</code></pre><pre><code>build.ninja: gn
obj/test/common_test_headers.inputdeps.stamp: stamp
obj/test/unittests/unittests.inputdeps.stamp: stamp
cppgc: phony
cppgc_base: phony
cppgc_for_testing: phony
fuzzer_support: phony
generate_bytecode_builtins_list: phony
...
:d8: phony
:fuzzer_support: phony
:gen-regexp-special-case: phony
:generate_bytecode_builtins_list: phony
:gn_all: phony</code></pre><p>Naturally, these build targets are similar to what was shown in the <code>BUILD.gn</code> file, and reported by the <code>gn ls</code> command.</p><p>To compile a specific target, just mention it on the command line, optionally with the <code>-v</code> flag if you want to see the underlying C++ compiler invocations.</p><pre><code>$ ninja -v d8</code></pre><pre><code>[1/1506] ../../third_party/llvm-build/Release+Asserts/bin/clang++ -MMD -MF ...
[2/1506] ../../third_party/llvm-build/Release+Asserts/bin/clang++ -MMD -MF ...</code></pre><pre><code>...</code></pre><pre><code>[1506/1506] ...</code></pre><p>To show all of the commands required to build a target, without actually running them, use the <code>commands</code> tool:</p><pre><code>$ ninja -t commands d8</code></pre><pre><code>...</code></pre><p>Finally, to show how a particular file fits into the build graph (that is, which inputs it&#8217;s built from, and which libraries or executables depend on it), use the <code>query</code> tool:</p><pre><code>$ ninja -t query ./obj/v8_libbase/mutex.o</code></pre><pre><code>obj/v8_libbase/mutex.o:
  input: cxx
    ../../src/base/platform/mutex.cc
  outputs:
    obj/libv8_libbase.a
    obj/libwee8.a</code></pre><p>These are the basics; for more advanced options, see the Ninja documentation.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TsWE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4760214-989b-4026-8fac-813c9855e503_95x95.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TsWE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4760214-989b-4026-8fac-813c9855e503_95x95.png 424w, https://substackcdn.com/image/fetch/$s_!TsWE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4760214-989b-4026-8fac-813c9855e503_95x95.png 848w, https://substackcdn.com/image/fetch/$s_!TsWE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4760214-989b-4026-8fac-813c9855e503_95x95.png 1272w, https://substackcdn.com/image/fetch/$s_!TsWE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4760214-989b-4026-8fac-813c9855e503_95x95.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TsWE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4760214-989b-4026-8fac-813c9855e503_95x95.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e4760214-989b-4026-8fac-813c9855e503_95x95.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TsWE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4760214-989b-4026-8fac-813c9855e503_95x95.png 424w, https://substackcdn.com/image/fetch/$s_!TsWE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4760214-989b-4026-8fac-813c9855e503_95x95.png 848w, https://substackcdn.com/image/fetch/$s_!TsWE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4760214-989b-4026-8fac-813c9855e503_95x95.png 1272w, https://substackcdn.com/image/fetch/$s_!TsWE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4760214-989b-4026-8fac-813c9855e503_95x95.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Summary</h3><p>The V8 JavaScript engine has an excellent build system, comprised of a top-level convenience script (<code>gm.py</code>), which invokes the GN meta-build tool to generate lower-level build description files to be executed by the Ninja build tool. 
This combination of tools allows developers to work with the human-readable <code>BUILD.gn</code> file format, while still executing the build steps quickly, especially for incremental builds.</p>]]></content:encoded></item><item><title><![CDATA[Message Queues in Database Transactions]]></title><description><![CDATA[Like many scalable SaaS applications, the HighBond platform from Galvanize uses an event-driven architecture. When a state change is made&#8230;]]></description><link>https://www.petersmith.net/p/message-queues-in-database-transactions-f830718f4f12</link><guid isPermaLink="false">https://www.petersmith.net/p/message-queues-in-database-transactions-f830718f4f12</guid><dc:creator><![CDATA[Peter Smith]]></dc:creator><pubDate>Tue, 03 Sep 2019 13:01:01 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/a7c94dd5-7358-4f78-9e1a-3f643977cba3_698x363.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!INQ2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80b328a6-934f-4e21-8207-df9c1098654e_698x363.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!INQ2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80b328a6-934f-4e21-8207-df9c1098654e_698x363.png 424w, https://substackcdn.com/image/fetch/$s_!INQ2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80b328a6-934f-4e21-8207-df9c1098654e_698x363.png 848w, 
https://substackcdn.com/image/fetch/$s_!INQ2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80b328a6-934f-4e21-8207-df9c1098654e_698x363.png 1272w, https://substackcdn.com/image/fetch/$s_!INQ2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80b328a6-934f-4e21-8207-df9c1098654e_698x363.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!INQ2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80b328a6-934f-4e21-8207-df9c1098654e_698x363.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/80b328a6-934f-4e21-8207-df9c1098654e_698x363.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!INQ2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80b328a6-934f-4e21-8207-df9c1098654e_698x363.png 424w, https://substackcdn.com/image/fetch/$s_!INQ2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80b328a6-934f-4e21-8207-df9c1098654e_698x363.png 848w, 
https://substackcdn.com/image/fetch/$s_!INQ2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80b328a6-934f-4e21-8207-df9c1098654e_698x363.png 1272w, https://substackcdn.com/image/fetch/$s_!INQ2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80b328a6-934f-4e21-8207-df9c1098654e_698x363.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p>Like many scalable SaaS applications, the <a href="https://www.wegalvanize.com/highbond/">HighBond</a> platform from <a href="https://www.wegalvanize.com">Galvanize</a> uses an <em>event-driven architecture</em>. When a state change is made within one service (such as updating a record in the user module), other services are notified via descriptive events posted to a message queue. This allows for loose-coupling of services, leading to better scalability.</p><p>In this blog post, we&#8217;ll describe our approach for ensuring that events are sent reliably, even in the presence of failure. 
At Galvanize, we use <a href="https://rubyonrails.org/">Ruby on Rails</a> as the main framework for application development, in conjunction with <a href="https://www.postgresql.org/">PostgreSQL</a> (via <a href="https://guides.rubyonrails.org/active_record_basics.html">Active Record</a>) and Amazon&#8217;s <a href="https://aws.amazon.com/sqs/">SQS</a> message queue service.</p><p>Here&#8217;s the basic workflow:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6c6k!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dc5cd6c-3f6c-48c3-9bf2-648a304d1bfa_800x312.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6c6k!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dc5cd6c-3f6c-48c3-9bf2-648a304d1bfa_800x312.png 424w, https://substackcdn.com/image/fetch/$s_!6c6k!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dc5cd6c-3f6c-48c3-9bf2-648a304d1bfa_800x312.png 848w, https://substackcdn.com/image/fetch/$s_!6c6k!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dc5cd6c-3f6c-48c3-9bf2-648a304d1bfa_800x312.png 1272w, https://substackcdn.com/image/fetch/$s_!6c6k!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dc5cd6c-3f6c-48c3-9bf2-648a304d1bfa_800x312.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6c6k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dc5cd6c-3f6c-48c3-9bf2-648a304d1bfa_800x312.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6dc5cd6c-3f6c-48c3-9bf2-648a304d1bfa_800x312.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6c6k!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dc5cd6c-3f6c-48c3-9bf2-648a304d1bfa_800x312.png 424w, https://substackcdn.com/image/fetch/$s_!6c6k!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dc5cd6c-3f6c-48c3-9bf2-648a304d1bfa_800x312.png 848w, https://substackcdn.com/image/fetch/$s_!6c6k!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dc5cd6c-3f6c-48c3-9bf2-648a304d1bfa_800x312.png 1272w, https://substackcdn.com/image/fetch/$s_!6c6k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dc5cd6c-3f6c-48c3-9bf2-648a304d1bfa_800x312.png 1456w" sizes="100vw"></picture><div></div></div></a></figure></div><ol><li><p>An HTTP request comes into the web application server (Ruby on Rails).</p></li><li><p>The application makes a transactional state change to its local database (PostgreSQL).</p></li><li><p>A notification event is transmitted to other services, via a message queue (Amazon SQS).</p></li><li><p>An HTTP response is provided back to the user&#8217;s browser.</p></li></ol><p>Although we use 
Rails, PostgreSQL, and SQS, the concepts are quite general and apply to other web development frameworks, database servers, and message queues.</p><h3>The Problem</h3><p>Although it appears simple to generate an SQS message whenever there&#8217;s a state change in a service, it&#8217;s much harder to ensure it&#8217;s done reliably, especially in the face of failure. Here&#8217;s the most important rule:</p><pre><code>An event should be sent to the message queue if, and only if, the state change has successfully been committed to the originating database.</code></pre><p>That is, we must avoid having the application change its local database without the message queue receiving notification of that change. Conversely, the event notification must not be sent if the expected state change didn&#8217;t actually happen. Either of these scenarios causes the SaaS application&#8217;s services to provide conflicting views of the data.</p><p>Why is this challenging? Here are three common causes of failure:</p><ol><li><p>The database transaction, within the source application, may fail and be rolled back.</p></li><li><p>The message queue may be temporarily unavailable, most likely due to network glitches.</p></li><li><p>Our application may fail, due to coding issues, or if the underlying operating system becomes unavailable.</p></li></ol><p>Given these potential failures, we tried several approaches to ensuring <em>atomicity</em>: either the data changes consistently throughout the entire system, or none of the changes occur at all. The system should not allow partial changes to be visible.</p><h3>Attempt 1&#8202;&#8212;&#8202;Send to the Queue From Within the Database Transaction</h3><p>The first approach we tried, which doesn&#8217;t actually work correctly, is to treat the message queue event as part of the database transaction. 
For example, in Rails we&#8217;d use the following syntax:</p><pre><code>user.transaction do
  user.update!(salary: user.salary + 100)
  queue.send(user.name, user.salary)
  # ... other updates ...
end</code></pre><p>In this case, we update the user&#8217;s salary in the PostgreSQL database, then send a notification to the message queue indicating the user&#8217;s new salary. Once the <code>end</code> statement is reached, the entire database change is committed atomically, which is the expected behaviour from PostgreSQL.</p><p>This <em>happy path</em> sequence can be visualized as follows:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5Bmv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c6db0d5-a0f7-492b-94fa-b11cdd0411a1_550x456.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5Bmv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c6db0d5-a0f7-492b-94fa-b11cdd0411a1_550x456.png 424w, https://substackcdn.com/image/fetch/$s_!5Bmv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c6db0d5-a0f7-492b-94fa-b11cdd0411a1_550x456.png 848w, https://substackcdn.com/image/fetch/$s_!5Bmv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c6db0d5-a0f7-492b-94fa-b11cdd0411a1_550x456.png 1272w, 
https://substackcdn.com/image/fetch/$s_!5Bmv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c6db0d5-a0f7-492b-94fa-b11cdd0411a1_550x456.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5Bmv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c6db0d5-a0f7-492b-94fa-b11cdd0411a1_550x456.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9c6db0d5-a0f7-492b-94fa-b11cdd0411a1_550x456.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5Bmv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c6db0d5-a0f7-492b-94fa-b11cdd0411a1_550x456.png 424w, https://substackcdn.com/image/fetch/$s_!5Bmv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c6db0d5-a0f7-492b-94fa-b11cdd0411a1_550x456.png 848w, https://substackcdn.com/image/fetch/$s_!5Bmv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c6db0d5-a0f7-492b-94fa-b11cdd0411a1_550x456.png 1272w, 
https://substackcdn.com/image/fetch/$s_!5Bmv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c6db0d5-a0f7-492b-94fa-b11cdd0411a1_550x456.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Here&#8217;s what happens if the transaction fails&#8202;&#8212;&#8202;that is, if the&nbsp;<em><code>... other updates&nbsp;...</code></em> cause an error.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wEIO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9c01b0c-cd28-467f-b477-9ef233461c47_559x453.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wEIO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9c01b0c-cd28-467f-b477-9ef233461c47_559x453.png 424w, https://substackcdn.com/image/fetch/$s_!wEIO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9c01b0c-cd28-467f-b477-9ef233461c47_559x453.png 848w, https://substackcdn.com/image/fetch/$s_!wEIO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9c01b0c-cd28-467f-b477-9ef233461c47_559x453.png 1272w, https://substackcdn.com/image/fetch/$s_!wEIO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9c01b0c-cd28-467f-b477-9ef233461c47_559x453.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!wEIO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9c01b0c-cd28-467f-b477-9ef233461c47_559x453.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c9c01b0c-cd28-467f-b477-9ef233461c47_559x453.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wEIO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9c01b0c-cd28-467f-b477-9ef233461c47_559x453.png 424w, https://substackcdn.com/image/fetch/$s_!wEIO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9c01b0c-cd28-467f-b477-9ef233461c47_559x453.png 848w, https://substackcdn.com/image/fetch/$s_!wEIO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9c01b0c-cd28-467f-b477-9ef233461c47_559x453.png 1272w, https://substackcdn.com/image/fetch/$s_!wEIO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9c01b0c-cd28-467f-b477-9ef233461c47_559x453.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>However, this solution fails because the event was already sent to the message queue, and other services believe the change took place, even though it 
didn&#8217;t. Not what we wanted!</p><h3>Attempt 2&#8202;&#8212;&#8202;Send to Queue After Transaction Commits</h3><p>To work around this failed-transaction problem, our second approach sends the event only once the transaction is known to have succeeded. Our code example is now:</p><pre><code>user.transaction do
  user.update!(salary: user.salary + 100)
  # ... other updates ...
end
# If we get here, the transaction was successful.
queue.send(user.name, user.salary)</code></pre><p>If this code is successful (no rollback exception was thrown), we proceed to send the event on the message queue. There are several ways to do this, but in our case we attached an <a href="https://apidock.com/rails/ActiveRecord/Transactions/ClassMethods/after_commit"><code>after_commit</code></a> hook on the Active Record model. This hook is triggered only when the change is actually committed to the database, but not if it was rolled back.</p><p>This sequence can be visualized as:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4L4b!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6ddae7e-33b6-48d7-9a2e-4feec22e0185_581x455.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4L4b!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6ddae7e-33b6-48d7-9a2e-4feec22e0185_581x455.png 424w, https://substackcdn.com/image/fetch/$s_!4L4b!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6ddae7e-33b6-48d7-9a2e-4feec22e0185_581x455.png 848w, https://substackcdn.com/image/fetch/$s_!4L4b!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6ddae7e-33b6-48d7-9a2e-4feec22e0185_581x455.png 1272w, https://substackcdn.com/image/fetch/$s_!4L4b!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6ddae7e-33b6-48d7-9a2e-4feec22e0185_581x455.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!4L4b!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6ddae7e-33b6-48d7-9a2e-4feec22e0185_581x455.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e6ddae7e-33b6-48d7-9a2e-4feec22e0185_581x455.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4L4b!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6ddae7e-33b6-48d7-9a2e-4feec22e0185_581x455.png 424w, https://substackcdn.com/image/fetch/$s_!4L4b!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6ddae7e-33b6-48d7-9a2e-4feec22e0185_581x455.png 848w, https://substackcdn.com/image/fetch/$s_!4L4b!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6ddae7e-33b6-48d7-9a2e-4feec22e0185_581x455.png 1272w, https://substackcdn.com/image/fetch/$s_!4L4b!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6ddae7e-33b6-48d7-9a2e-4feec22e0185_581x455.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>As you&#8217;d expect, the failure case is now handled correctly. 
If the transaction is rolled back, no update is made to the database and no event is sent on the message queue.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JQYH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31a4d24e-81e2-400a-995e-f828f5f8f8b6_544x459.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JQYH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31a4d24e-81e2-400a-995e-f828f5f8f8b6_544x459.png 424w, https://substackcdn.com/image/fetch/$s_!JQYH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31a4d24e-81e2-400a-995e-f828f5f8f8b6_544x459.png 848w, https://substackcdn.com/image/fetch/$s_!JQYH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31a4d24e-81e2-400a-995e-f828f5f8f8b6_544x459.png 1272w, https://substackcdn.com/image/fetch/$s_!JQYH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31a4d24e-81e2-400a-995e-f828f5f8f8b6_544x459.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JQYH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31a4d24e-81e2-400a-995e-f828f5f8f8b6_544x459.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/31a4d24e-81e2-400a-995e-f828f5f8f8b6_544x459.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JQYH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31a4d24e-81e2-400a-995e-f828f5f8f8b6_544x459.png 424w, https://substackcdn.com/image/fetch/$s_!JQYH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31a4d24e-81e2-400a-995e-f828f5f8f8b6_544x459.png 848w, https://substackcdn.com/image/fetch/$s_!JQYH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31a4d24e-81e2-400a-995e-f828f5f8f8b6_544x459.png 1272w, https://substackcdn.com/image/fetch/$s_!JQYH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31a4d24e-81e2-400a-995e-f828f5f8f8b6_544x459.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>However, this solution isn&#8217;t perfect either. 
What if the database transaction is committed, but for some reason we&#8217;re unable to send the event afterwards?</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!X5tX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0237ab42-f447-4836-bd78-75a193e41204_555x456.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!X5tX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0237ab42-f447-4836-bd78-75a193e41204_555x456.png 424w, https://substackcdn.com/image/fetch/$s_!X5tX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0237ab42-f447-4836-bd78-75a193e41204_555x456.png 848w, https://substackcdn.com/image/fetch/$s_!X5tX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0237ab42-f447-4836-bd78-75a193e41204_555x456.png 1272w, https://substackcdn.com/image/fetch/$s_!X5tX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0237ab42-f447-4836-bd78-75a193e41204_555x456.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!X5tX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0237ab42-f447-4836-bd78-75a193e41204_555x456.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0237ab42-f447-4836-bd78-75a193e41204_555x456.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!X5tX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0237ab42-f447-4836-bd78-75a193e41204_555x456.png 424w, https://substackcdn.com/image/fetch/$s_!X5tX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0237ab42-f447-4836-bd78-75a193e41204_555x456.png 848w, https://substackcdn.com/image/fetch/$s_!X5tX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0237ab42-f447-4836-bd78-75a193e41204_555x456.png 1272w, https://substackcdn.com/image/fetch/$s_!X5tX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0237ab42-f447-4836-bd78-75a193e41204_555x456.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>There are two main reasons this might happen:</p><ol><li><p>The application might fail <em>immediately after</em> committing the database change, but <em>before</em> sending the event. This could happen due to badly-written code, or if the underlying OS fails. 
Although this seems unlikely, it&#8217;s sure to eventually happen, especially on a busy system.</p></li><li><p>A more common problem is when the message queue is unreachable for a period of time, possibly due to network glitches. In this case, the application should repeatedly try to resend the event, until it succeeds.</p></li></ol><p>In the first failure case, the application server restarts and continues processing HTTP requests. However, at that point in time we&#8217;ve already lost track of the state change, with no way to know the event hadn&#8217;t been sent. In the second failure case, how long should we retry for? Web page requests are supposed to be fast, and the browser is expecting a quick response, so we can&#8217;t retry indefinitely.</p><h3>Attempt 3&#8202;&#8212;&#8202;Send Events in a Delayed&nbsp;Job</h3><p>The main problem with our previous approach is that events were not <em>durable</em> (stored on disk) if they hadn&#8217;t yet reached the message queue. Therefore, any programming error or system failure made us completely lose track of the event.</p><p>We need a solution that guarantees the events are durable whenever the database transaction commits. Luckily, the Rails <em>delayed job</em> mechanism provides this ability.</p><pre><code>user.transaction do</code></pre><pre><code>  user.update!(salary: user.salary + 100)</code></pre><pre><code>  user.delay.send_event()</code></pre><pre><code>  ... other updates ...</code></pre><pre><code>end</code></pre><p>In Rails, adding <code>delay</code> to your method call implicitly inserts a serialized description of that call into a special <code>delayed_jobs</code> database table, which acts as a queue of work to be completed at a future time. 
A separate Rails worker process repeatedly pulls jobs from this table, executing them in the background, usually after the original HTTP request has completed.</p><p>This solution looks similar to our previous cases, but with the addition of a delayed job mechanism:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jPNG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa5583f9-127b-44f7-a8b1-10a0aecd9cb5_734x454.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jPNG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa5583f9-127b-44f7-a8b1-10a0aecd9cb5_734x454.png 424w, https://substackcdn.com/image/fetch/$s_!jPNG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa5583f9-127b-44f7-a8b1-10a0aecd9cb5_734x454.png 848w, https://substackcdn.com/image/fetch/$s_!jPNG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa5583f9-127b-44f7-a8b1-10a0aecd9cb5_734x454.png 1272w, https://substackcdn.com/image/fetch/$s_!jPNG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa5583f9-127b-44f7-a8b1-10a0aecd9cb5_734x454.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jPNG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa5583f9-127b-44f7-a8b1-10a0aecd9cb5_734x454.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/aa5583f9-127b-44f7-a8b1-10a0aecd9cb5_734x454.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jPNG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa5583f9-127b-44f7-a8b1-10a0aecd9cb5_734x454.png 424w, https://substackcdn.com/image/fetch/$s_!jPNG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa5583f9-127b-44f7-a8b1-10a0aecd9cb5_734x454.png 848w, https://substackcdn.com/image/fetch/$s_!jPNG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa5583f9-127b-44f7-a8b1-10a0aecd9cb5_734x454.png 1272w, https://substackcdn.com/image/fetch/$s_!jPNG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa5583f9-127b-44f7-a8b1-10a0aecd9cb5_734x454.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>As it turns out, this approach solves all the problems we&#8217;ve discussed so far:</p><ol><li><p>By creating a delayed job inside a Rails transaction, the delayed job is <em>only</em> scheduled if the database transaction commits. In Rails, delayed jobs are queued for execution by inserting a record into a PostgreSQL table. 
If the transaction fails, the job is never committed to the table, so the job is never executed.</p></li><li><p>Delayed jobs are durable, so even if the application crashes and restarts, the delayed job is still in the database, and the event is transmitted reliably.</p></li><li><p>If the message queue is unavailable for a period of time, the delayed job can re-execute until it&#8217;s successful. Given that it&#8217;s running in the background, it will not impact the performance of the end user&#8217;s HTTP request. Obviously there must be a limit to the number of retries (it can&#8217;t wait forever), but most intermittent failures can be resolved this way.</p></li></ol><p>Naturally, if the database transaction is rolled back, the delayed job won&#8217;t be scheduled and the event is never sent:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!onMs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F339071ad-8ac3-4987-9710-1d72375ee5c1_733x459.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!onMs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F339071ad-8ac3-4987-9710-1d72375ee5c1_733x459.png 424w, https://substackcdn.com/image/fetch/$s_!onMs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F339071ad-8ac3-4987-9710-1d72375ee5c1_733x459.png 848w, https://substackcdn.com/image/fetch/$s_!onMs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F339071ad-8ac3-4987-9710-1d72375ee5c1_733x459.png 1272w, 
https://substackcdn.com/image/fetch/$s_!onMs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F339071ad-8ac3-4987-9710-1d72375ee5c1_733x459.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!onMs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F339071ad-8ac3-4987-9710-1d72375ee5c1_733x459.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/339071ad-8ac3-4987-9710-1d72375ee5c1_733x459.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!onMs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F339071ad-8ac3-4987-9710-1d72375ee5c1_733x459.png 424w, https://substackcdn.com/image/fetch/$s_!onMs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F339071ad-8ac3-4987-9710-1d72375ee5c1_733x459.png 848w, https://substackcdn.com/image/fetch/$s_!onMs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F339071ad-8ac3-4987-9710-1d72375ee5c1_733x459.png 1272w, 
https://substackcdn.com/image/fetch/$s_!onMs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F339071ad-8ac3-4987-9710-1d72375ee5c1_733x459.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Is this solution good enough? Yes, it can certainly work, and parts of our system are using this in production. However, we&#8217;ve since built something better.</p><h3>Final Solution&#8202;&#8212;&#8202;Queue the Message Inside the Database, Send it&nbsp;Later</h3><p>Our final solution is very similar to using delayed jobs, but is more efficient as the system scales. The key problem with delayed jobs is that scheduling a separate job for each event is quite expensive. Ideally, we&#8217;d like to group multiple events and send them to the queue as a single batch, a feature SQS provides to reduce network round-trips.</p><p>Our final solution involves the use of an <code>outbox</code> database table, which is similar to <code>delayed_jobs</code>, but is exclusively for persisting outgoing events. As you&#8217;d expect, this table is updated as part of the same database transaction, so changes are rolled back if the main database transaction fails.</p><p>Next, instead of using a generic delayed job server that can execute arbitrary code, we have a dedicated <code>outbox_flush</code> loop, focused on grouping events together and posting them to SQS in a batch. By batching into groups of 10 events at a time, the cost of posting to SQS drops from 20ms per event (sometimes as high as 50ms) to an average of 2&#8211;5ms per event. 
In case of failure, the events are always durable, with the flush loop retrying until the events are sent successfully.</p><p>Here&#8217;s the new architecture:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LkiA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17538ba8-646b-43b4-b53a-0986dd84a234_800x361.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LkiA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17538ba8-646b-43b4-b53a-0986dd84a234_800x361.png 424w, https://substackcdn.com/image/fetch/$s_!LkiA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17538ba8-646b-43b4-b53a-0986dd84a234_800x361.png 848w, https://substackcdn.com/image/fetch/$s_!LkiA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17538ba8-646b-43b4-b53a-0986dd84a234_800x361.png 1272w, https://substackcdn.com/image/fetch/$s_!LkiA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17538ba8-646b-43b4-b53a-0986dd84a234_800x361.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LkiA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17538ba8-646b-43b4-b53a-0986dd84a234_800x361.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/17538ba8-646b-43b4-b53a-0986dd84a234_800x361.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LkiA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17538ba8-646b-43b4-b53a-0986dd84a234_800x361.png 424w, https://substackcdn.com/image/fetch/$s_!LkiA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17538ba8-646b-43b4-b53a-0986dd84a234_800x361.png 848w, https://substackcdn.com/image/fetch/$s_!LkiA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17538ba8-646b-43b4-b53a-0986dd84a234_800x361.png 1272w, https://substackcdn.com/image/fetch/$s_!LkiA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17538ba8-646b-43b4-b53a-0986dd84a234_800x361.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">The application persists events to the &#8220;outbox&#8221; table, while the flush loop batches them and sends them to the&nbsp;queue.</figcaption></figure></div><p>As before, we have a similar flow of data. 
We transmit message queue events if, and only if, the main database transaction completes.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iN90!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a012828-cb49-4485-8782-80d3f4b9049d_748x456.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iN90!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a012828-cb49-4485-8782-80d3f4b9049d_748x456.png 424w, https://substackcdn.com/image/fetch/$s_!iN90!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a012828-cb49-4485-8782-80d3f4b9049d_748x456.png 848w, https://substackcdn.com/image/fetch/$s_!iN90!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a012828-cb49-4485-8782-80d3f4b9049d_748x456.png 1272w, https://substackcdn.com/image/fetch/$s_!iN90!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a012828-cb49-4485-8782-80d3f4b9049d_748x456.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iN90!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a012828-cb49-4485-8782-80d3f4b9049d_748x456.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8a012828-cb49-4485-8782-80d3f4b9049d_748x456.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iN90!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a012828-cb49-4485-8782-80d3f4b9049d_748x456.png 424w, https://substackcdn.com/image/fetch/$s_!iN90!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a012828-cb49-4485-8782-80d3f4b9049d_748x456.png 848w, https://substackcdn.com/image/fetch/$s_!iN90!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a012828-cb49-4485-8782-80d3f4b9049d_748x456.png 1272w, https://substackcdn.com/image/fetch/$s_!iN90!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a012828-cb49-4485-8782-80d3f4b9049d_748x456.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>To allow this mechanism to be used consistently across our platform, we&#8217;ve packaged it as a proprietary Ruby gem for reuse within Galvanize&#8217;s products.</p><h3>Conclusion</h3><p>We&#8217;ve shown how to improve scalability and reliability for an event-driven application by combining database transactions with events to be sent on a message queue. 
There are many reasons for failure, but by integrating events into the main database transaction, making them atomic and durable, we can be confident they will not be lost. We also have the ability to improve our message queue performance by batching together smaller events.</p><p>Of course, although these solutions address the problem from the source application&#8217;s perspective, there are still plenty of opportunities for failure that we haven&#8217;t discussed. Most notably, if the destination service (or services) fail to process the event correctly, we&#8217;ll still have a problem with inconsistent data. However, that&#8217;s another topic for a different blog post.</p>]]></content:encoded></item><item><title><![CDATA[Apache Spark on Amazon EMR]]></title><description><![CDATA[By Dr Peter Smith, Principal Software Engineer, ACL.]]></description><link>https://www.petersmith.net/p/apache-spark-on-amazon-emr-98f04fd346c9</link><guid isPermaLink="false">https://www.petersmith.net/p/apache-spark-on-amazon-emr-98f04fd346c9</guid><dc:creator><![CDATA[Peter Smith]]></dc:creator><pubDate>Mon, 17 Dec 2018 17:01:00 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/4127e949-6868-4f80-9b0e-3bb45f8e851f_800x613.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>By <a href="https://www.linkedin.com/in/peter-smith-phd-9140441/">Dr Peter Smith</a>, Principal Software Engineer, <a href="https://www.acl.com">ACL</a>.</p><p>I recently had the good fortune of presenting at the <a href="https://www.meetup.com/Vancouver-Amazon-Web-Services-User-Group/">Vancouver Amazon Web Services User Group</a>. 
This monthly event, organized by <a href="https://www.onica.com">Onica</a>, is a great opportunity to network with like-minded people in the community, and to discuss AWS-related topics.</p><p>In my presentation, I provided an introduction to the <a href="https://spark.apache.org/">Apache Spark</a> analytics framework, and gave a quick demo of using <a href="https://aws.amazon.com/emr/">Amazon EMR (Elastic Map Reduce)</a> to perform a few basic queries. Here&#8217;s a summary of what was discussed.</p><h3>Apache Spark&#8202;&#8212;&#8202;Unified Analytics Engine</h3><p>Apache Spark has rapidly become a mainstream solution for big data analytics. Numerous organizations take advantage of Spark&#8202;&#8212;&#8202;processing terabytes of data with the goal of discovering new insights they wouldn&#8217;t otherwise have. This includes processing of financial data, analyzing web click streams, and monitoring and reacting to data from IoT sensors.</p><p>There are many ways to perform analytics with Spark. When Spark is used in a batch-processing environment, input data is placed into cheap storage (such as Amazon S3). At a later time, a Spark cluster reads the data, performs complex analytics (sometimes taking minutes or hours), then writes the final result to the output. In addition to this traditional batch-processing model, Spark also supports machine learning, real-time streaming analytics, and graph-based analytics.</p><p>What makes Spark so powerful is the ability to divide and conquer. Multiple worker nodes are created, with the analytics computation being distributed amongst them. The following diagram illustrates a Spark cluster with four worker nodes (EC2 instances). Input data is stored in S3 files, and then partitioned and shared amongst the workers. 
The result of the analytic computation can later be written back to another S3 bucket.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Up5f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a2daaa4-74a2-4aed-9796-c184b8165028_800x613.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Up5f!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a2daaa4-74a2-4aed-9796-c184b8165028_800x613.png 424w, https://substackcdn.com/image/fetch/$s_!Up5f!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a2daaa4-74a2-4aed-9796-c184b8165028_800x613.png 848w, https://substackcdn.com/image/fetch/$s_!Up5f!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a2daaa4-74a2-4aed-9796-c184b8165028_800x613.png 1272w, https://substackcdn.com/image/fetch/$s_!Up5f!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a2daaa4-74a2-4aed-9796-c184b8165028_800x613.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Up5f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a2daaa4-74a2-4aed-9796-c184b8165028_800x613.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4a2daaa4-74a2-4aed-9796-c184b8165028_800x613.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Up5f!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a2daaa4-74a2-4aed-9796-c184b8165028_800x613.png 424w, https://substackcdn.com/image/fetch/$s_!Up5f!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a2daaa4-74a2-4aed-9796-c184b8165028_800x613.png 848w, https://substackcdn.com/image/fetch/$s_!Up5f!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a2daaa4-74a2-4aed-9796-c184b8165028_800x613.png 1272w, https://substackcdn.com/image/fetch/$s_!Up5f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a2daaa4-74a2-4aed-9796-c184b8165028_800x613.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p>In addition to Apache Spark being a well-supported open source framework, with an active user community, AWS makes it trivial to create and manage Spark clusters as part of their EMR (Elastic Map Reduce) offering. 
More on that later.</p><h3>Spark is Different from a Relational Database</h3><p>Although Spark is often used to analyze tables of &#8220;rectangular&#8221; data (with rows and columns), and it also supports the familiar SQL language, it would be incorrect to refer to Spark as a relational database. In fact, there are numerous key differences between how Spark manipulates data and how the same task is performed in a relational database.</p><p>To help understand the benefits provided by Spark, let&#8217;s discuss these differences.</p><h4>Programming Languages</h4><p>Most relational database systems support the SQL language for querying data. Many of these systems also support the concept of <em>stored procedures</em>, allowing user-defined code to execute inside the database. Although stored procedures provide immense value, they&#8217;re written in the database&#8217;s own programming language, and are limited to the run-time environment provided by the database.</p><p>In the case of Spark, the SQL language is partially supported, but that&#8217;s only the starting point. Spark runs on a JVM (Java Virtual Machine) and therefore analytics code can be written in any JVM-based language, such as Java or Scala, providing compatibility with decades of existing code libraries. Additionally, the Python language is fully supported, allowing access to the great libraries and utilities that data scientists know and love.</p><h4>Scalability</h4><p>Relational databases can utilize multiple CPU cores, providing excellent <em>vertical scalability</em>. However, many of the advanced features (such as concurrency, locking, and failure recovery) are easier to support if those CPU cores are tightly coupled within a single server host. 
That is, all the CPUs must share the same memory space and therefore be inside the same physical host.</p><p>In the case of Spark, support for distributed computation is of primary importance, allowing a Spark cluster to <em>scale horizontally</em> to much larger data sets (running on hundreds or thousands of hosts). Of course, the distributed (multi-server) nature of Spark means that concurrency, locking, and failure recovery must be handled very differently than with a centralized database.</p><h4>Data Storage&nbsp;Formats</h4><p>Because of the tightly-coupled nature of a relational database, the server has complete control over how data is stored on disk. The operations for querying, inserting, and updating data rows are optimized to use data structures such as B-trees and write-ahead logs (WALs). The database user (a human) likely knows nothing about how these data structures work, and will never examine the underlying data files. The complexity of the database is therefore hidden.</p><p>In a Spark environment, the data formats are fully under the control of users. Data is read from disk in a generic format, such as <a href="https://en.wikipedia.org/wiki/Comma-separated_values">CSV</a>, <a href="https://en.wikipedia.org/wiki/JSON">JSON</a>, or <a href="https://en.wikipedia.org/wiki/Apache_Parquet">Parquet</a>, and the final output is written back to disk in a similar user-selected format.</p><h4>Read/Write Versus Read-Only</h4><p>Because Spark allows arbitrary user-chosen disk formats, all reading of input, and writing of output, happens in a user-directed way. Spark doesn&#8217;t have control of how data is placed on disk, and therefore isn&#8217;t able to insert new data rows, or update individual fields, as you&#8217;d often do in a relational database.</p><p>Instead, Spark reads the data from the input file into main memory (as much as will fit at one time), then performs the analytic computation. 
Once the final result is complete, the output is fully written back to disk. The key point is that Spark is <em>not</em> suited for transactional operations where small in-place updates are made to existing data.</p><h4>Resilience</h4><p>In a relational database, it&#8217;s common to use a <em>master-slave</em> arrangement to recover from failures. The slave server functions in a passive state, simply tracking all the changes made to the master&#8217;s data. However, if the master server fails, the slave is promoted to become the new master, with very little downtime.</p><p>Spark uses a very different approach: rather than having a hot backup for each of the worker nodes, any failure results in the failed worker&#8217;s computation being repeated from the beginning (or from the latest checkpoint). More specifically, Spark tracks the data&#8217;s <em>lineage</em>, so it knows how to regenerate the computation by replaying the same analytic tasks on a different server.</p><p>With 1000s of worker nodes, there&#8217;s a good chance that one of them will fail and its work must be replayed. Note, however, that it would be significantly more expensive to have 1000 slave nodes acting as hot backups for the 1000 primary worker nodes!</p><h4>Always-On or On-Demand?</h4><p>Relational databases run on a 24/7 basis. As new data arrives, or existing data is updated, the server is always up-and-running, and available to receive and store the updates. If you have a large database with lots of CPU power and lots of RAM, the infrastructure costs start to add up.</p><p>In a Spark environment, it&#8217;s common to collect data (in CSV or JSON format) and immediately place it into cheap storage (such as Amazon S3). If nothing else is done with the data at that point in time, there&#8217;s no need for Spark workers to be available. 
All you pay for is the low monthly cost of data in S3.</p><p>However, when it&#8217;s time to perform some analytics (for example, at the end of the month, or the fiscal year), we fire up a large Spark cluster with lots of worker nodes. Only at that time is the data read into the cluster, and the intense computation is performed. Once the work is complete, the Spark cluster is shut down to save the infrastructure cost.</p><h3>A Practical Example</h3><p>As mentioned earlier, Apache Spark is an open source package, freely available for download. However, there&#8217;s still plenty of effort required to configure the worker nodes and install the software. Luckily for us, <a href="https://aws.amazon.com/emr/">Amazon EMR</a> makes this trivial, allowing creation of a Spark cluster in a matter of minutes.</p><h4>Starting the&nbsp;Cluster</h4><p>The following screenshot shows how little must be done to start up the Spark cluster we showed earlier, with five EC2 instances (one master and four workers) of size <code>m4.2xlarge</code>:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dAWr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bdb29f-38f3-4f85-8534-4da65333f097_1089x915.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dAWr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bdb29f-38f3-4f85-8534-4da65333f097_1089x915.png 424w, https://substackcdn.com/image/fetch/$s_!dAWr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bdb29f-38f3-4f85-8534-4da65333f097_1089x915.png 848w, 
https://substackcdn.com/image/fetch/$s_!dAWr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bdb29f-38f3-4f85-8534-4da65333f097_1089x915.png 1272w, https://substackcdn.com/image/fetch/$s_!dAWr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bdb29f-38f3-4f85-8534-4da65333f097_1089x915.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dAWr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bdb29f-38f3-4f85-8534-4da65333f097_1089x915.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d2bdb29f-38f3-4f85-8534-4da65333f097_1089x915.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dAWr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bdb29f-38f3-4f85-8534-4da65333f097_1089x915.png 424w, https://substackcdn.com/image/fetch/$s_!dAWr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bdb29f-38f3-4f85-8534-4da65333f097_1089x915.png 848w, 
https://substackcdn.com/image/fetch/$s_!dAWr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bdb29f-38f3-4f85-8534-4da65333f097_1089x915.png 1272w, https://substackcdn.com/image/fetch/$s_!dAWr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bdb29f-38f3-4f85-8534-4da65333f097_1089x915.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Enter this information onto the single page shown above, then press the &#8220;Create cluster&#8221; button. In less than five minutes you&#8217;ll have a Spark cluster up and running!</p><h4>Starting an Interactive Session</h4><p>The next step is to initiate an interactive session using <a href="https://zeppelin.apache.org/">Apache Zeppelin</a>, which is also installed as part of EMR. This configuration is slightly complicated, involving the creation of an SSH tunnel, then running a browser plugin to access the Zeppelin interface via that tunnel. 
Luckily, EMR provides excellent online help to get this working.</p><p>Once connected, you&#8217;ll see an interactive console in your browser:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XLzv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0f824ef-b342-46de-8ae2-2594f6c324b1_671x273.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XLzv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0f824ef-b342-46de-8ae2-2594f6c324b1_671x273.png 424w, https://substackcdn.com/image/fetch/$s_!XLzv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0f824ef-b342-46de-8ae2-2594f6c324b1_671x273.png 848w, https://substackcdn.com/image/fetch/$s_!XLzv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0f824ef-b342-46de-8ae2-2594f6c324b1_671x273.png 1272w, https://substackcdn.com/image/fetch/$s_!XLzv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0f824ef-b342-46de-8ae2-2594f6c324b1_671x273.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XLzv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0f824ef-b342-46de-8ae2-2594f6c324b1_671x273.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f0f824ef-b342-46de-8ae2-2594f6c324b1_671x273.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!XLzv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0f824ef-b342-46de-8ae2-2594f6c324b1_671x273.png 424w, https://substackcdn.com/image/fetch/$s_!XLzv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0f824ef-b342-46de-8ae2-2594f6c324b1_671x273.png 848w, https://substackcdn.com/image/fetch/$s_!XLzv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0f824ef-b342-46de-8ae2-2594f6c324b1_671x273.png 1272w, https://substackcdn.com/image/fetch/$s_!XLzv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0f824ef-b342-46de-8ae2-2594f6c324b1_671x273.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h4>Loading Data into&nbsp;Spark</h4><p>The next step is to tell Spark how to load the data from S3. In this example, we&#8217;re loading a CSV-formatted file that we&#8217;d previously stored in an S3 bucket. 
We&#8217;ve chosen to use the Scala programming language, but other languages will be similar:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bH9b!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c63e9e5-add3-47a1-aadb-c2f7628a63db_800x183.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bH9b!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c63e9e5-add3-47a1-aadb-c2f7628a63db_800x183.png 424w, https://substackcdn.com/image/fetch/$s_!bH9b!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c63e9e5-add3-47a1-aadb-c2f7628a63db_800x183.png 848w, https://substackcdn.com/image/fetch/$s_!bH9b!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c63e9e5-add3-47a1-aadb-c2f7628a63db_800x183.png 1272w, https://substackcdn.com/image/fetch/$s_!bH9b!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c63e9e5-add3-47a1-aadb-c2f7628a63db_800x183.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bH9b!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c63e9e5-add3-47a1-aadb-c2f7628a63db_800x183.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7c63e9e5-add3-47a1-aadb-c2f7628a63db_800x183.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bH9b!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c63e9e5-add3-47a1-aadb-c2f7628a63db_800x183.png 424w, https://substackcdn.com/image/fetch/$s_!bH9b!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c63e9e5-add3-47a1-aadb-c2f7628a63db_800x183.png 848w, https://substackcdn.com/image/fetch/$s_!bH9b!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c63e9e5-add3-47a1-aadb-c2f7628a63db_800x183.png 1272w, https://substackcdn.com/image/fetch/$s_!bH9b!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c63e9e5-add3-47a1-aadb-c2f7628a63db_800x183.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h4>Querying the&nbsp;Data</h4><p>We can now perform a query on the data. In this case, we&#8217;re computing the <em>year</em> portion of the <em>timestamp</em> values from column <code>_c5</code>. 
We then count how many records are associated with each of the years, and will display the result for the first ten years.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tbju!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29b7a166-9365-4fd3-a242-506711105c05_470x448.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tbju!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29b7a166-9365-4fd3-a242-506711105c05_470x448.png 424w, https://substackcdn.com/image/fetch/$s_!tbju!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29b7a166-9365-4fd3-a242-506711105c05_470x448.png 848w, https://substackcdn.com/image/fetch/$s_!tbju!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29b7a166-9365-4fd3-a242-506711105c05_470x448.png 1272w, https://substackcdn.com/image/fetch/$s_!tbju!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29b7a166-9365-4fd3-a242-506711105c05_470x448.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tbju!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29b7a166-9365-4fd3-a242-506711105c05_470x448.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/29b7a166-9365-4fd3-a242-506711105c05_470x448.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tbju!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29b7a166-9365-4fd3-a242-506711105c05_470x448.png 424w, https://substackcdn.com/image/fetch/$s_!tbju!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29b7a166-9365-4fd3-a242-506711105c05_470x448.png 848w, https://substackcdn.com/image/fetch/$s_!tbju!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29b7a166-9365-4fd3-a242-506711105c05_470x448.png 1272w, https://substackcdn.com/image/fetch/$s_!tbju!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29b7a166-9365-4fd3-a242-506711105c05_470x448.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Summary</h3><p>The presentation I gave to the Vancouver Amazon Web Services User Group was well-received, with a lot of people showing interest in Spark, EMR, and the general concepts of distributed computation.</p><p>Spark has quite a different architecture from a typical relational database, and it&#8217;s worth understanding those differences when deciding how to perform 
your analytics computation. In particular, Spark is well-suited for performing large-scale analytics on read-only data. However, it&#8217;s not the correct solution if you need transactional updates made to existing data.</p><p>Finally, Amazon&#8217;s EMR service makes it really easy (and cheap) to perform large-scale analytics. Why not try it for yourself?</p>]]></content:encoded></item><item><title><![CDATA[Streaming Structured JSON]]></title><description><![CDATA[JavaScript Object Notation (JSON) is perhaps the most ubiquitous way of transmitting data between the components of a SaaS application&#8230;]]></description><link>https://www.petersmith.net/p/streaming-structured-json-18da4edd4f20</link><guid isPermaLink="false">https://www.petersmith.net/p/streaming-structured-json-18da4edd4f20</guid><dc:creator><![CDATA[Peter Smith]]></dc:creator><pubDate>Mon, 24 Sep 2018 17:36:01 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/84f966c8-5895-4fea-9740-45c26688af05_800x450.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><a href="https://www.ietf.org/rfc/rfc4627.txt">JavaScript Object Notation</a> (JSON) is perhaps the most ubiquitous way of transmitting data between the components of a SaaS application. It&#8217;s the native data format for web browsers and Node.js, with practically every other programming language providing libraries to serialize data to and from JSON.</p><p>In this article, we&#8217;ll discuss the idea of <em>JSON Streaming</em>&#8202;&#8212;&#8202;that is, how do we process streams of JSON data that are extremely large, or potentially infinite in length. 
In such cases, the JSON messages are too large to be held entirely in a single computer&#8217;s RAM, and must instead be processed incrementally as the data is being read from, or written to, external locations.</p><p>More specifically, in this article <em><strong>we&#8217;ll talk about streaming JSON data that has a non-trivial structure.</strong></em></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qwPh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bc506e9-8f99-4b6f-8642-9bf517d283ae_800x450.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qwPh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bc506e9-8f99-4b6f-8642-9bf517d283ae_800x450.jpeg 424w, https://substackcdn.com/image/fetch/$s_!qwPh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bc506e9-8f99-4b6f-8642-9bf517d283ae_800x450.jpeg 848w, https://substackcdn.com/image/fetch/$s_!qwPh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bc506e9-8f99-4b6f-8642-9bf517d283ae_800x450.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!qwPh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bc506e9-8f99-4b6f-8642-9bf517d283ae_800x450.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qwPh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bc506e9-8f99-4b6f-8642-9bf517d283ae_800x450.jpeg" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3bc506e9-8f99-4b6f-8642-9bf517d283ae_800x450.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qwPh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bc506e9-8f99-4b6f-8642-9bf517d283ae_800x450.jpeg 424w, https://substackcdn.com/image/fetch/$s_!qwPh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bc506e9-8f99-4b6f-8642-9bf517d283ae_800x450.jpeg 848w, https://substackcdn.com/image/fetch/$s_!qwPh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bc506e9-8f99-4b6f-8642-9bf517d283ae_800x450.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!qwPh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3bc506e9-8f99-4b6f-8642-9bf517d283ae_800x450.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><h3>The Problem&#8202;&#8212;&#8202;Not all JSON is Predictable</h3><p>The concept of JSON Streaming isn&#8217;t new, and numerous methods are documented on <a href="https://en.wikipedia.org/wiki/JSON_streaming">Wikipedia</a>. 
However, most streaming techniques assume the elements of the JSON stream are predictable, and don&#8217;t depend on each other.</p><p>For example, a JSON stream that reports data from weather stations may consist of a sequence of JSON objects, separated by newline characters. The application reads each line as a separate record, without any need to load the entire data set into RAM. Using this model, we can process GB or TB of JSON data while only using KB of RAM!</p><pre><code>{ "stationID": 1234, "temperature": 65, "wind": 12 }
{ "stationID": 1362, "temperature": 20, "wind": 23 }
...
{ "stationID": 1362, "temperature": 19, "wind": 24 }</code></pre><p>However, what if the JSON contained multiple sections, with the first section providing meta-data necessary to understand the later sections? We no longer have a repeating pattern, but instead must store and update information in an internal database, as we progress through the JSON stream. In the later sections of the stream, we can refer back to that database to interpret the newly-arriving values.</p><p>For example, what if our weather data includes details about each weather station:</p><pre><code>{
  "stations": [
    { "stationID": 1234, "city": "Seattle", "units": "imperial" },
    { "stationID": 1362, "city": "Vancouver", "units": "metric" },
    ...
  ],
  "reports": [
    { "stationID": 1234, "temperature": 65, "wind": 12 },
    { "stationID": 1362, "temperature": 20, "wind": 23 },
    ...
    { "stationID": 1362, "temperature": 19, "wind": 24 }
  ]
}</code></pre><p>In this example, we must first read the <code>stations</code> array to determine whether each weather station reports in <code>metric</code> or <code>imperial</code> units. When we later process the <code>reports</code> array, the values for <code>temperature</code> and <code>wind</code> will be scaled appropriately.</p><p>The concern here is that the JSON input is no longer <em><strong>trivial</strong></em> or <em><strong>repeating</strong></em>. Instead, some elements of the JSON object depend on values provided in previous parts of the same object. Our example is fairly simple, but imagine a more complicated JSON structure with many more dependencies between its parts. The basic JSON streaming approaches described on Wikipedia are simply not going to help.</p><p>Before we discuss solutions, it&#8217;s worth mentioning an important assumption. Experts will note that JSON objects are an <em><strong>unordered</strong></em> collection of key/value pairs. For our purposes, however, we need to assume the <code>stations</code> key appears earlier in the stream than the <code>reports</code> key. The software generating the JSON stream must abide by this rule.</p><p>Let&#8217;s look at an architecture for solving this.</p><h3>The Big&nbsp;Picture</h3><p>The following diagram illustrates our overall solution for reading a stream of JSON, while maintaining derived information as we progress through the input. 
The two main components we should focus on are the <em><strong>Tokenizer</strong></em> and the <em><strong>State Machine</strong></em>.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!whph!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc5dcf7f-d446-49ff-b987-675571777140_1076x455.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!whph!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc5dcf7f-d446-49ff-b987-675571777140_1076x455.png 424w, https://substackcdn.com/image/fetch/$s_!whph!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc5dcf7f-d446-49ff-b987-675571777140_1076x455.png 848w, https://substackcdn.com/image/fetch/$s_!whph!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc5dcf7f-d446-49ff-b987-675571777140_1076x455.png 1272w, https://substackcdn.com/image/fetch/$s_!whph!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc5dcf7f-d446-49ff-b987-675571777140_1076x455.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!whph!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc5dcf7f-d446-49ff-b987-675571777140_1076x455.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dc5dcf7f-d446-49ff-b987-675571777140_1076x455.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!whph!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc5dcf7f-d446-49ff-b987-675571777140_1076x455.png 424w, https://substackcdn.com/image/fetch/$s_!whph!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc5dcf7f-d446-49ff-b987-675571777140_1076x455.png 848w, https://substackcdn.com/image/fetch/$s_!whph!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc5dcf7f-d446-49ff-b987-675571777140_1076x455.png 1272w, https://substackcdn.com/image/fetch/$s_!whph!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc5dcf7f-d446-49ff-b987-675571777140_1076x455.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>In this model, the input is a sequence of text characters (streamed from a file, or from a network connection), which is tokenized into the basic building blocks of a JSON object (such as <code>StartOfObject</code> or <code>StringValue</code>&#8202;&#8212;&#8202;more on these later). 
We then use a state machine to traverse the JSON object&#8217;s structure and pull out the interesting values. As we progress through the JSON object (by transitioning between states), we update the database accordingly. Finally, the transformed data is sent to the output.</p><p>Let&#8217;s look at those steps in more detail.</p><h3>The Tokenizer</h3><p>This component of our pipeline reads a continuous stream of characters from the input. Its job is to group the input characters into meaningful atomic tokens. To illustrate, let&#8217;s revisit our earlier example:</p><pre><code>{
  "stations": [
    { "stationID": 1234, "city": "Seattle", "units": "imperial" },
    { "stationID": 1362, "city": "Vancouver", "units": "metric" },
    ...
  ],
  ...
}</code></pre><p>In this example, the Tokenizer outputs the following stream of tokens:</p><pre><code>StartOfObject
FieldName("stations")
StartOfArray
StartOfObject
FieldName("stationID")
NumberValue(1234)
FieldName("city")
StringValue("Seattle")
FieldName("units")
StringValue("imperial")
EndOfObject
...
EndOfArray
...
EndOfObject</code></pre><p>If you read carefully through the stream of input characters, you&#8217;ll see that each element of the JSON input (a brace, a bracket, a key, or a value) maps to exactly one token sent to the output. We record the type of each token (such as <code>FieldName</code>), along with an optional data value (such as <code>"units"</code>).</p><p>It&#8217;s important to remember that this stream of tokens could be infinitely long, simply because the stream of input characters might be infinitely long. In reality, any JSON object that&#8217;s too large to fit into RAM is a candidate for this approach.</p><p>Streaming software generally reads input characters in small batches (for example, 4KB-8KB at a time). Although you might intuitively feel that streamed data should be processed one character at a time, that would be highly inefficient. Instead, we read a full disk block or a full network packet each time. When we run out of characters, we ask for the next block. Our memory footprint is therefore proportional to the size of an input block (such as 4KB), rather than the size of the entire JSON object.</p><p>Because of the way the token stream is created, we can also be confident the JSON object is syntactically well-formed. That is, all the open and close braces match, and the keys and values are paired correctly. However, we don&#8217;t yet know whether the JSON object is semantically correct. For example, we must still confirm that the <code>"stations"</code> key exists and refers to a JSON array.</p><p>This is where the State Machine comes into play.</p><h3>The State&nbsp;Machine</h3><p>The purpose of a state machine is to remember which part of the JSON object we&#8217;re currently processing. In our weather station example, we start by scanning through the <code>"stations"</code> section while collecting meta-data about the location and measurement units of each station. 
Once we reach the end of the array, we then switch to a different state for processing the content of the <code>reports</code> array.</p><p>The following diagram shows a (partial) state machine for scanning through the stream of tokens, transitioning from one state to another based on the token&#8217;s type.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Aglt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feef77340-f4fa-4ad2-b6b6-39b9c941bf21_1100x482.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Aglt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feef77340-f4fa-4ad2-b6b6-39b9c941bf21_1100x482.png 424w, https://substackcdn.com/image/fetch/$s_!Aglt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feef77340-f4fa-4ad2-b6b6-39b9c941bf21_1100x482.png 848w, https://substackcdn.com/image/fetch/$s_!Aglt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feef77340-f4fa-4ad2-b6b6-39b9c941bf21_1100x482.png 1272w, https://substackcdn.com/image/fetch/$s_!Aglt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feef77340-f4fa-4ad2-b6b6-39b9c941bf21_1100x482.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Aglt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feef77340-f4fa-4ad2-b6b6-39b9c941bf21_1100x482.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eef77340-f4fa-4ad2-b6b6-39b9c941bf21_1100x482.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Aglt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feef77340-f4fa-4ad2-b6b6-39b9c941bf21_1100x482.png 424w, https://substackcdn.com/image/fetch/$s_!Aglt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feef77340-f4fa-4ad2-b6b6-39b9c941bf21_1100x482.png 848w, https://substackcdn.com/image/fetch/$s_!Aglt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feef77340-f4fa-4ad2-b6b6-39b9c941bf21_1100x482.png 1272w, https://substackcdn.com/image/fetch/$s_!Aglt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feef77340-f4fa-4ad2-b6b6-39b9c941bf21_1100x482.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>As with all state machines, we begin at the start state (top left) and progress from one state to the next, as we consume tokens from the input stream. If a particular state doesn&#8217;t have a transition for the next token in the input, the JSON object is considered invalid (we won&#8217;t discuss this situation). 
Given our example token stream, you should be able to trace through the state machine and imagine all the tokens being successfully consumed.</p><p>As discussed earlier, the state machine also has the ability to record the weather station information, for later use in the same stream. When we transition from one state to another, and that transition is annotated with an action box, the state machine performs the provided action. This allows us to update our internal database with the weather station details.</p><p>Note that the <em>&#8220;Record Field Name&#8221;</em> and <em>&#8220;Record Field Value&#8221;</em> boxes are fairly simple and merely save the values into local RAM. However, the <em>&#8220;Validate and Save Record&#8221;</em> box has the task of ensuring that all required fields (<code>stationId</code>, <code>city</code>, and <code>units</code>) were correctly provided, and they have meaningful values. The entire record can then be written to the database, or some other persistent storage.</p><p>Although we don&#8217;t show the second part of the state machine (where the <code>reports</code> section is consumed), the approach is generally the same. In our particular example, we&#8217;re not planning to store the output from <code>reports</code> in a database, but will instead send it downstream to some other consumer, or will perhaps discard the data after computing a running average. Therefore, the key difference in the state machine is that we only <em>retrieve</em> previous information from the database, not store it.</p><h3>The Final&nbsp;Output</h3><p>In order to do something useful, the state machine must contain an action to generate output. In our weather station example, we&#8217;ll generate a stream of comma-separated values (CSV) data showing the equivalent information, but always using metric units (degrees Celsius, and kilometres per hour).</p><p>Here&#8217;s the corresponding output:</p><pre><code>1234,18.3,19.3
1362,20,23
...
1362,19,24</code></pre><p>The actual data in the output, and the format you choose, is entirely your decision. The important fact is that we&#8217;ve processed a very large amount of JSON data on the input, without requiring that we load the entire JSON object into RAM at one time.</p><h3>But, Isn&#8217;t This All Too Complicated?</h3><p>Probably by now you&#8217;re wondering whether there&#8217;s a simpler solution. It feels like a lot of work to tokenize the input, then build a state machine, so why should we go to such extremes? Let&#8217;s discuss some design considerations:</p><h4>1. Does Your JSON Really Need to Have Dependencies?</h4><p>Our whole discussion has focused on using information from one part of the JSON message to interpret data from later parts of that same message. If you have a choice, simply avoid merging the information into the same stream in the first place. This makes parsing the data much easier.</p><p>If you don&#8217;t have a choice, read on&#8230;</p><h4>2. Does the JSON Stream Have Predictable Structure?</h4><p>If you do a Google search for &#8220;JSON Streaming&#8221; and throw in the name of your favourite programming language, you&#8217;ll find a bunch of libraries that address the problem in their own unique way.</p><p>Some of the advanced libraries support the <a href="http://goessner.net/articles/JsonPath/">JSON Path</a> concept. That is, given a stream of JSON, and one or more path expressions, the library will extract all the JSON elements matching the specified paths. For example, we can extract all the weather station data by <em>listening to</em> the following two paths:</p><pre><code>$.stations[*]   // on match, record the station details.
$.reports[*]    // on match, normalize the data and output to CSV.</code></pre><p>Note that <code>$</code> is the object root, and <code>[*]</code> means all elements in the array.</p><p>In our example, we need a library that can listen to multiple JSON paths for the same stream, performing different actions depending on which path was matched. As an example, for JVM-based languages (Java, Scala, etc.), you could try <a href="https://github.com/jsurfer/JsonSurfer">JsonSurfer</a>.</p><h4>3. What If the JSON Object Has Dynamic Structure?</h4><p>Where our state machine becomes worth the investment is when the JSON schema is defined dynamically. For example, the following JSON message specifies the names and types of the data that will appear later in the stream.</p><pre><code>{
  "types": {
    "name": "string",
    "birth": "date",
    "children": "array[string]"
  },
  "data": [
    { "name": "Fred", "birth": "1966/02/03", "children": [...] },
    { "name": "Mary", "birth": "1976/10/13", "children": [...] },
    ...
  ]
}</code></pre><p>It wouldn&#8217;t be possible to construct a suitable JSON Path if we hadn&#8217;t already read the <code>types</code> element. Sure, we could use <code>[*]</code> to extract each row from <code>data</code>, but we&#8217;ll still need additional logic to traverse and validate the hierarchy of each sub-object, even if it&#8217;s now entirely in RAM.</p><p>Of course, building a state machine to accept a dynamically-structured JSON message isn&#8217;t easy, but that&#8217;s a topic for another day.</p><h4>4. What If It&#8217;s Not&nbsp;JSON?</h4><p>Finally, although we&#8217;ve focused extensively on JSON, the general approach of tokenizing characters, and then passing them through a state machine is just a good concept to be aware of. Many types of streaming data can be processed using this technique&#8202;&#8212;&#8202;it doesn&#8217;t need to be JSON. In fact, this is the exact approach used by the <a href="https://en.wikipedia.org/wiki/Parsing">parsing</a> function contained within most programming language compilers.</p><h3>Summary</h3><p>The use of state machines provides greater flexibility than most naive JSON streaming solutions. Those solutions can provide a stream of <code>stations</code> information, or <code>reports</code> information, but they can&#8217;t mix the two together. In our case, the structure of the JSON object can vary as we progress through the stream, with different actions being taken in each section.</p><p>Although our example was fairly simple, there are very few limits to the complexity of the JSON object we could handle, or the relationships between the various components. The only requirement is that data appears in the necessary order within the stream&#8202;&#8212;&#8202;that is, you can&#8217;t make use of data that hasn&#8217;t yet appeared.</p><p>Finally, this technique is fairly advanced, and you should consider carefully whether you actually need the full power of a state machine. 
Depending on your particular use-case, a simpler solution might be possible.</p>]]></content:encoded></item><item><title><![CDATA[Book Review: Making Sense of Stream Processing]]></title><description><![CDATA[If you&#8217;re a software architect of any kind, I encourage you to read Making Sense of Stream Processing by Martin Kleppmann. It&#8217;s one of&#8230;]]></description><link>https://www.petersmith.net/p/book-review-making-sense-of-stream-processing-1b75042f1a1e</link><guid isPermaLink="false">https://www.petersmith.net/p/book-review-making-sense-of-stream-processing-1b75042f1a1e</guid><dc:creator><![CDATA[Peter Smith]]></dc:creator><pubDate>Sun, 12 Aug 2018 17:34:05 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/06a3fca1-07f1-4724-9614-4de76867ccd1_180x270.gif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you&#8217;re a software architect of any kind, I encourage you to read <em><strong>Making Sense of Stream Processing</strong></em> by <a href="https://martin.kleppmann.com/">Martin Kleppmann</a>. 
It&#8217;s one of those free O&#8217;Reilly books they hand out at conferences, or you can choose to <a href="https://www.oreilly.com/data/free/stream-processing.csp">download a copy</a> after giving your contact information.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YOTJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7513407-a7f5-46e5-abda-1fa2469ff5e6_180x270.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YOTJ!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7513407-a7f5-46e5-abda-1fa2469ff5e6_180x270.gif 424w, https://substackcdn.com/image/fetch/$s_!YOTJ!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7513407-a7f5-46e5-abda-1fa2469ff5e6_180x270.gif 848w, https://substackcdn.com/image/fetch/$s_!YOTJ!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7513407-a7f5-46e5-abda-1fa2469ff5e6_180x270.gif 1272w, https://substackcdn.com/image/fetch/$s_!YOTJ!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7513407-a7f5-46e5-abda-1fa2469ff5e6_180x270.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YOTJ!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7513407-a7f5-46e5-abda-1fa2469ff5e6_180x270.gif" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e7513407-a7f5-46e5-abda-1fa2469ff5e6_180x270.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YOTJ!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7513407-a7f5-46e5-abda-1fa2469ff5e6_180x270.gif 424w, https://substackcdn.com/image/fetch/$s_!YOTJ!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7513407-a7f5-46e5-abda-1fa2469ff5e6_180x270.gif 848w, https://substackcdn.com/image/fetch/$s_!YOTJ!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7513407-a7f5-46e5-abda-1fa2469ff5e6_180x270.gif 1272w, https://substackcdn.com/image/fetch/$s_!YOTJ!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe7513407-a7f5-46e5-abda-1fa2469ff5e6_180x270.gif 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p>Processing of data via streams is something you hear a lot about these days (especially alongside <em>Big Data</em>). However, this is one of the few books that clearly explains <em><strong>why you should stream data around your distributed system, rather than using centralized data stores</strong></em>. 
This includes large-scale SaaS applications that process large amounts of data&#8202;&#8212;&#8202;think of your favourite social media application such as Facebook or LinkedIn.</p><p>Throughout the book, a selection of practical use-cases is discussed, first with the traditional non-streaming solution, followed by a detailed explanation of how streaming-based solutions do a better job. These design trade-offs are what make the book worth reading from cover to cover. They&#8217;ll make you think twice before building an application with a centralized database!</p><p>Although this book is sponsored by <a href="https://www.confluent.io/">Confluent</a>, the creators of open source <a href="https://kafka.apache.org/">Apache Kafka</a>, at no point did it feel like a sales pitch. Instead, this book is about concepts and design patterns that apply to any data streaming solution (including <a href="https://aws.amazon.com/kinesis/data-streams/">Amazon Kinesis</a> or <a href="https://doc.akka.io/docs/akka/current/stream/index.html?language=scala">Akka Streams</a>). Don&#8217;t expect to see much source code though&#8202;&#8212;&#8202;that&#8217;s not what this book is about.</p><h3>Key Lessons</h3><p>Here are the key takeaways that make this book worth reading:</p><h4><strong>There&#8217;s No Single Way to Access&nbsp;Data</strong></h4><p>For me, the most important message of this book is that it&#8217;s unwise to store all of your application&#8217;s data in a single database, using a single schema. 
Instead, different parts of the application should <em><strong>store the same data in different ways</strong></em>, depending on their access patterns.</p><p>For example, if you press the &#8220;Like&#8221; button on a social media post, this may look like a single update to the application&#8217;s state, but actually results in multiple things happening&#8202;&#8212;&#8202;notifications are sent via email, statistics databases are updated to track the number of likes per article, and machine learning algorithms are triggered to recommend similar articles.</p><p>It&#8217;s unreasonable for all parts of the software application to query the same database for all these different purposes, so instead, we send the &#8220;like&#8221; event on an event stream for all those different services to listen to. They can then store and process the data however is best for them.</p><h4>Logging to Sequential Files Is Very&nbsp;Scalable</h4><p>A second key point is that logging data changes to a sequential log stream is more efficient (considering disk access time) than adding, updating, or splitting a B-Tree index, as used inside relational databases. If your application doesn&#8217;t actually need random read/write access to data, then a sequential stream of changes is potentially more useful.</p><p>Also, log streams can benefit from partitioning/sharding to reach higher levels of scale, they provide a global ordering in which the events actually happened, and can provide the ability to replay events from the past (until you decide to purge old data).</p><h4>Integrating Streams into Your Existing Application</h4><p>If you&#8217;re working on an application that currently uses a centralized database, and you&#8217;d instead like to start streaming your data, consider the <em>Change Data Capture</em> approach. 
By listening to incremental changes (<em>inserts </em>or <em>updates</em>) occurring inside your database, it&#8217;s possible to generate a stream of changes, in the order they occurred. We then get all the benefits of streaming data changes around the system, without modifying the original application.</p><h4>Similarity to Unix Pipelines</h4><p>Towards the end of the book, there&#8217;s a great comparison drawn with the Unix shell pipelines that many of us have used for decades. Standard tools such as <code>grep</code>, <code>awk</code>, <code>sort</code>, and <code>uniq</code> resemble micro-services that perform a single function, and the pipe operator <code>|</code> joins them together by passing streams of character data.</p><h3>Summary</h3><p>I recommend that anybody interested in software architecture should understand the concepts from this book. It&#8217;s free to download, and at 170 pages you can easily skip through sections you&#8217;re already familiar with.</p><p>I&#8217;m looking forward to reading Martin Kleppmann&#8217;s latest book, titled <a href="http://dataintensive.net/">Designing Data-Intensive Applications</a>.</p>]]></content:encoded></item><item><title><![CDATA[Hey SparkSQL, What’s the Average Date?]]></title><description><![CDATA[Apache Spark is one of the leading open source analytics frameworks, but it can&#8217;t do everything. 
In this blog post, we&#8217;ll look at a few&#8230;]]></description><link>https://www.petersmith.net/p/hey-sparksql-whats-the-average-date-2f25ee2c5be5</link><guid isPermaLink="false">https://www.petersmith.net/p/hey-sparksql-whats-the-average-date-2f25ee2c5be5</guid><dc:creator><![CDATA[Peter Smith]]></dc:creator><pubDate>Tue, 05 Jun 2018 21:10:24 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/be56a968-d5a0-4a28-8b86-40d023cd7315_600x267.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><a href="http://spark.apache.org/">Apache Spark</a> is one of the leading open source analytics frameworks, but it can&#8217;t do everything. In this blog post, we&#8217;ll look at a few different approaches to computing the <em>average of date-typed data</em>, which isn&#8217;t natively supported in Spark (as of version 2.3.0). Luckily though, Spark is highly customizable, allowing new analytic functions to be added quite easily.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Qar8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f1f2ba2-94cd-4286-92a4-3d885f54167a_600x267.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Qar8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f1f2ba2-94cd-4286-92a4-3d885f54167a_600x267.png 424w, https://substackcdn.com/image/fetch/$s_!Qar8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f1f2ba2-94cd-4286-92a4-3d885f54167a_600x267.png 848w, 
https://substackcdn.com/image/fetch/$s_!Qar8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f1f2ba2-94cd-4286-92a4-3d885f54167a_600x267.png 1272w, https://substackcdn.com/image/fetch/$s_!Qar8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f1f2ba2-94cd-4286-92a4-3d885f54167a_600x267.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Qar8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f1f2ba2-94cd-4286-92a4-3d885f54167a_600x267.png" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0f1f2ba2-94cd-4286-92a4-3d885f54167a_600x267.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Qar8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f1f2ba2-94cd-4286-92a4-3d885f54167a_600x267.png 424w, https://substackcdn.com/image/fetch/$s_!Qar8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f1f2ba2-94cd-4286-92a4-3d885f54167a_600x267.png 848w, 
https://substackcdn.com/image/fetch/$s_!Qar8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f1f2ba2-94cd-4286-92a4-3d885f54167a_600x267.png 1272w, https://substackcdn.com/image/fetch/$s_!Qar8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f1f2ba2-94cd-4286-92a4-3d885f54167a_600x267.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p>You may be asking why you&#8217;d need to find the average of date values, especially since an <em>average</em> (or <em>arithmetic mean</em>) is typically an operation on numbers. For example, the average of <em><strong>January 7th, 1987</strong></em>, <em><strong>June 23rd, 1994</strong></em>, and <em><strong>December 10th, 1992</strong></em>, gives you the central date value of <em><strong>June 24th, 1991,</strong></em> but how is that useful?</p><p>One such case is <em>anomaly detection</em>, where it&#8217;s useful to identify date values falling outside the range of what&#8217;s considered &#8220;normal&#8221;. 
That is, if we determine the <em>average</em> of a series of dates, as well as the <em>standard deviation</em> of those dates, we can then identify the <em>outliers</em> that fall beyond two standard deviations from the average.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_ajY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46da1e21-760b-4264-885a-f0310a01dbcc_597x262.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_ajY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46da1e21-760b-4264-885a-f0310a01dbcc_597x262.png 424w, https://substackcdn.com/image/fetch/$s_!_ajY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46da1e21-760b-4264-885a-f0310a01dbcc_597x262.png 848w, https://substackcdn.com/image/fetch/$s_!_ajY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46da1e21-760b-4264-885a-f0310a01dbcc_597x262.png 1272w, https://substackcdn.com/image/fetch/$s_!_ajY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46da1e21-760b-4264-885a-f0310a01dbcc_597x262.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_ajY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46da1e21-760b-4264-885a-f0310a01dbcc_597x262.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/46da1e21-760b-4264-885a-f0310a01dbcc_597x262.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_ajY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46da1e21-760b-4264-885a-f0310a01dbcc_597x262.png 424w, https://substackcdn.com/image/fetch/$s_!_ajY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46da1e21-760b-4264-885a-f0310a01dbcc_597x262.png 848w, https://substackcdn.com/image/fetch/$s_!_ajY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46da1e21-760b-4264-885a-f0310a01dbcc_597x262.png 1272w, https://substackcdn.com/image/fetch/$s_!_ajY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46da1e21-760b-4264-885a-f0310a01dbcc_597x262.png 1456w" sizes="100vw"></picture><div></div></div></a></figure></div><p>Identifying the outliers gives us an opportunity to remove invalid data from our data set, or perhaps to start investigating the root cause of the anomaly. In this scenario, computing the average date from a large data set is important.</p><p>In this blog post, we&#8217;ll look at two different approaches to computing an average date using the Spark framework. 
We&#8217;ll also discuss some accuracy and performance implications.</p><h3>The Problem</h3><p>If Spark (particularly SparkSQL) already supported this functionality, we&#8217;d be able to compute the average of a series of dates by reading them into a Spark DataFrame (a table with rows and columns), then invoking the <code>avg</code> function on the appropriate column.</p><p>In this example, we&#8217;ll read the dates from the single-column CSV file <code>dates.csv</code>:</p><pre><code>2017-01-02
2017-03-04
2017-05-06
2017-08-01
...
2017-10-12</code></pre><p>To read this file into a Spark DataFrame, ensuring that the single column of data is interpreted as a <code>Date</code>-typed value, we use the following code:</p><pre><code>import org.apache.spark.sql.types._</code></pre><pre><code>// The single column ("datecol") must be interpreted as a Date
val schema = StructType(
  Seq(
    StructField("datecol", DateType)
  )
)</code></pre><pre><code>// Define a new DataFrame, based off the content of the CSV file
val df = spark.read.schema(schema).csv("dates.csv")</code></pre><pre><code>// Compute the average of the datecol column
df.agg(avg('datecol))</code></pre><p>Unfortunately, this simple solution fails with the following error message:</p><pre><code>org.apache.spark.sql.AnalysisException: cannot resolve 'avg(`datecol`)' due to data type mismatch: function average requires numeric types, not DateType;</code></pre><p>It&#8217;s clear that Spark&#8217;s built-in <code>avg</code> function isn&#8217;t designed to support <code>DateType</code> columns, so a workaround is required.</p><h3>Approach 1&#8202;&#8212;&#8202;Convert to Number, and Back&nbsp;Again</h3><p>Given that the <code>avg</code> function is intended to operate on numeric values, our first approach is to translate all the <code>Date</code> values into corresponding <code>Int</code> values, representing the number of days since a fixed point in time. We then perform the <code>avg</code> operation, and convert the result back to a <code>Date</code> value.</p><p>Unix-based systems use <em><strong>January 1st, 1970</strong></em> as the point at which everything started (known as the &#8220;Epoch&#8221;), so we&#8217;ll do the same by converting all <code>Date</code> values to the number of days since the Epoch. For example, <em><strong>January 2nd, 2017</strong></em> equates to <em><strong>17168</strong></em>, and <em><strong>March 4th, 1812</strong></em> equates to <em><strong>-57646</strong></em>.</p><p>This numeric conversion is possible using SparkSQL&#8217;s <a href="https://spark.apache.org/docs/2.3.0/api/java/org/apache/spark/sql/functions.html#datediff-org.apache.spark.sql.Column-org.apache.spark.sql.Column-"><code>datediff</code> function</a>.</p><pre><code>import java.sql.Date</code></pre><pre><code>// Dates are first converted to number of days since this date
val baseDate = lit(Date.valueOf("1970-01-01"))</code></pre><pre><code>// Compute a DataFrame containing the average number of days
val avgDayDataFrame = df.agg(
  avg(
    datediff('datecol, baseDate)
  )
)</code></pre><p>This approach works well, but returns a <code>Double</code> (the average number of days), rather than a <code>Date</code>.</p><pre><code>scala&gt; avgDayDataFrame.show</code></pre><pre><code>+-----------------------------------------+
|avg(datediff(datecol, DATE '1970-01-01'))|
+-----------------------------------------+
|                                  17303.8|
+-----------------------------------------+</code></pre><p>As it turns out, there&#8217;s no native SparkSQL function that does the opposite of <code>datediff</code>, to obtain a <code>Date</code> value from our numeric average. The <a href="https://spark.apache.org/docs/2.3.0/api/java/org/apache/spark/sql/functions.html#date_add-org.apache.spark.sql.Column-int-"><code>date_add</code> function</a> looked like it might work, but instead requires a constant <code>Int</code> number of days, rather than a value computed within a DataFrame.</p><p>Our solution is to extract the average value from the first column of the first row of the DataFrame, and then explicitly create a new <code>Date</code> object (which expects the number of milliseconds since the Epoch). Note that we use <code>Long</code> arithmetic here, to avoid exceeding the 2&#179;&#185; limit of <code>Int</code> values.</p><pre><code>// Trigger query (by running collect), and extract a native Long
// from the 0th column and 0th row of the DataFrame.
val avgDay = avgDayDataFrame.collect()(0).getDouble(0).toLong
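
// Illustrative addition, not from the original post: a hypothetical helper
// that converts a day count to a Date via java.time.LocalDate (this assumes
// Java 8 or later). It avoids the millisecond arithmetic used below, and the
// resulting calendar date does not depend on the JVM's default time zone.
def epochDayToDate(days: Long): java.sql.Date =
  java.sql.Date.valueOf(java.time.LocalDate.ofEpochDay(days))
// e.g. epochDayToDate(avgDay)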

// Now convert the number of days back to a Date type. The Date
// constructor requires the number of milliseconds since 1970-01-01.
val avgDate = new Date(avgDay * 24 * 60 * 60 * 1000)</code></pre><p>This works, but isn&#8217;t very elegant, particularly since the final conversion to <code>Date</code> is done outside the context of Spark DataFrames. We therefore can&#8217;t do additional DataFrame processing in the same Spark query. The solution is to encapsulate those last few lines in a Spark UDF (User Defined Function).</p><pre><code>import org.apache.spark.sql.expressions.UserDefinedFunction</code></pre><pre><code>// define a function that takes an Int, and returns the Date
val daysToDate: Int =&gt; Date = { days =&gt;
  // Widen to Long first: Int arithmetic would overflow for recent dates
  new Date(days.toLong * 24 * 60 * 60 * 1000)
}</code></pre><pre><code>// convert this to a UDF-based function that uses Spark's Column
// data type as the input and output.
val daysToDateUDF: UserDefinedFunction = udf(daysToDate)</code></pre><p>A Spark UDF is essentially a function that accepts a Spark SQL <code>Column</code>-typed value as input, and returns a <code>Column</code>-typed value as output. This allows the function to be used entirely within a Spark query.</p><p>The new Spark query, which returns a <code>Date</code> value, is now:</p><pre><code>val avgDayDataFrame = df.agg(
  daysToDateUDF(
    avg(
      datediff('datecol, baseDate)
    )
  )
)
val avgDay = avgDayDataFrame.collect()(0).getDate(0)</code></pre><p>This gives us exactly what we need. Note that Spark UDFs are <a href="https://medium.com/@mrpowers/spark-user-defined-functions-udfs-6c849e39443b">often reported to be inefficient</a>, particularly because the Spark SQL optimizer is unable to understand them, and therefore unable to optimize them. In our case, we&#8217;re only running the UDF once per DataFrame (not once per row), so the performance impact should be minimal.</p><h3>Approach 2&#8202;&#8212;&#8202;User Defined Aggregate Functions</h3><p>An alternative approach is to define a <a href="https://docs.databricks.com/spark/latest/spark-sql/udaf-scala.html">User Defined Aggregation Function</a> (UDAF). Whereas a regular UDF acts on a single table cell, a UDAF operates on a full column to produce a single aggregated value.</p><p>Here&#8217;s how an <code>avgdate</code> function would be used in a Spark query:</p><pre><code>val avgdate = new AvgDateUDF</code></pre><pre><code>val avgDayDataFrame = df.agg(avgdate('datecol))
val avgDay = avgDayDataFrame.collect()(0).getDate(0)</code></pre><p>This syntax is much more readable than the previous example, given that you&#8217;re calling the <code>avgdate</code> function on a <code>Date</code> column, and getting back a resulting <code>Date</code> value.</p><p>To define a Spark UDAF, we must extend the <code>UserDefinedAggregateFunction</code> class and override the class members. The key members to be overridden are:</p><ul><li><p><code>inputSchema</code>&#8202;&#8212;&#8202;Defines the type of values that the UDAF can operate on (that is, <code>Date</code> values).</p></li><li><p><code>bufferSchema</code>&#8202;&#8212;&#8202;Defines the intermediate counters used during the aggregation. In this example, we track the <code>count</code> of the number of date values we&#8217;ve seen, as well as the running <code>total</code> of the dates (expressed as days since the Epoch).</p></li><li><p><code>dataType</code>&#8202;&#8212;&#8202;Defines the type of the output data, in this case <code>DateType</code>.</p></li><li><p><code>initialize()</code>&#8202;&#8212;&#8202;A method setting the counters to their initial values.</p></li><li><p><code>update()</code>&#8202;&#8212;&#8202;A method called to add each new <code>Date</code> value to our intermediate counter values. Note the special handling for <code>null</code> field values.</p></li><li><p><code>merge()</code>&#8202;&#8212;&#8202;Given that Spark is a distributed analytics framework, this method joins together the counters from different Spark partitions that were potentially executed on different compute nodes.</p></li><li><p><code>evaluate()</code>&#8202;&#8212;&#8202;Converts the intermediate counters into a final <code>Date</code> value. This is done by simply dividing the <code>total</code> by the <code>count</code>, and then converting to a <code>Date</code> type.</p></li></ul><pre><code>class AvgDateUDF extends UserDefinedAggregateFunction {

  val BaseDate = Date.valueOf("1970-01-01")

  // each value being aggregated has this type
  override def inputSchema: StructType =
    StructType(StructField("dateValue", DateType) :: Nil)

  // intermediate values used during aggregation
  override def bufferSchema: StructType = StructType(
    StructField("count", LongType) ::
    StructField("total", LongType) :: Nil
  )

  // output type of the aggregation
  override def dataType: DataType = DateType

  // This aggregation always returns a consistent output, 
  // given a consistent input
  override def deterministic: Boolean = true

  // Initialize our internal counters.
  override 
  def initialize(buffer: MutableAggregationBuffer): Unit = {
    buffer(0) = 0L
    buffer(1) = 0L
  }

  // Update our counters with a new data value.
  override 
  def update(buffer: MutableAggregationBuffer, input: Row): Unit = {    
    val thisDate = input.getAs[Date](0)
    if (thisDate != null) {
      buffer(0) = buffer.getAs[Long](0) + 1
      buffer(1) = buffer.getAs[Long](1) +    
          thisDate.toLocalDate.toEpochDay
    }
  }

  // merge counters from two different Spark partitions
  override 
  def merge(buff1: MutableAggregationBuffer, buff2: Row): Unit = {
    buff1(0) = buff1.getAs[Long](0) + buff2.getAs[Long](0)
    buff1(1) = buff1.getAs[Long](1) + buff2.getAs[Long](1)
  }

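  // Return the final value, as a Date. If no non-null dates were
  // seen, return null rather than dividing by zero.
  override def evaluate(buffer: Row): Any = {
    val count = buffer.getAs[Long](0)
    if (count == 0L) null
    else {
      val avgDays = buffer.getAs[Long](1) / count
      java.sql.Date.valueOf(LocalDate.ofEpochDay(avgDays))
    }
  }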
}</code></pre><p>Note that for performance reasons, we access the intermediate counter variables as <code>buffer(0)</code> and <code>buffer(1)</code>, rather than using their symbolic <code>"count"</code> and <code>"total"</code> names.</p><h3>Arithmetic Overflow</h3><p>Even if you&#8217;re not a Scala expert, you can hopefully get the gist of the previous code. That is, initialize a counter to 0, and a sum to 0, and then for every new date value, add the number of days (since the base date) to the sum, and increment the counter. Finally, divide the total sum by the count of items seen.</p><p>One limitation of this approach is Arithmetic Overflow. That is, the <strong>total</strong> variable has type <strong>Long</strong>, implying it has a maximum value of 2&#8310;&#179;-1 (or 9,223,372,036,854,775,807). That&#8217;s a pretty large number, but it&#8217;s still possible to overflow that data type and have it wrap around to a large negative value. If that were to happen, we&#8217;d get a totally incorrect result.</p><p>In reality though, this is unlikely to happen with the <strong>Long</strong> data type (it would definitely be a problem with <strong>Int</strong>). Given that we&#8217;ll likely be dealing with <em>recent</em> dates (that is, near to the year 2018), most of the numbers we add to the total will be around 17,000 (days since 1970). We&#8217;d therefore need to find the average of roughly 500 trillion date values before overflow would happen. It&#8217;s probably not worth worrying about this case.</p><p>In fact, Spark&#8217;s <code>avg</code> function uses either the <code>Double</code> data type, which can reach roughly 10&#179;&#8304;&#8312;, or the <code>BigDecimal</code> data type, which can be arbitrarily large (depending on your RAM). Clearly this is not a problem for most Spark users.</p><p>If we wanted to be really paranoid, there are ways to avoid overflow, such as dividing the data set into equal-sized subsets and then averaging the averages. 
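</p><p>As a rough illustration of that first idea, the following plain-Scala sketch averages each equal-sized chunk separately and then averages those results. The <code>chunkedAverage</code> helper and its parameters are hypothetical names, not part of Spark or of the UDAF above, and the sketch assumes the values fit in local memory and divide evenly into chunks.</p><pre><code>// Hypothetical helper (an illustration, not a Spark API): average a
// sequence of epoch-day values by averaging equal-sized chunks.
def chunkedAverage(days: Seq[Long], chunkSize: Int): Double = {
  require(chunkSize &gt; 0 &amp;&amp; days.nonEmpty &amp;&amp; days.length % chunkSize == 0,
    "days must be non-empty and divide evenly into chunks")
  // Each chunk total is bounded by chunkSize * (largest value), so a
  // modest chunk size keeps every intermediate Long sum small.
  val chunkAverages = days.grouped(chunkSize)
    .map(chunk =&gt; chunk.sum.toDouble / chunkSize)
    .toVector
  // With equal-sized chunks, the mean of the chunk means equals the
  // overall mean (up to floating-point rounding).
  chunkAverages.sum / chunkAverages.length
}</code></pre><p>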
Another option is to <a href="http://www.heikohoffmann.de/htmlthesis/node134.html">iteratively refine the average</a>. We&#8217;ll leave those solutions for another day.</p><h3>Performance</h3><p>Finally, let&#8217;s get a rough indication of the performance of these different approaches. It&#8217;s interesting to measure performance, since User Defined Functions are <a href="https://issues.apache.org/jira/browse/SPARK-14083">reportedly slower</a> than native Spark functions, which are handled better by Spark&#8217;s Catalyst Optimizer. Although our <code>avgdate</code> function is easier to use in queries, it might just be slower.</p><p>In our tests, we used Amazon EMR-5.13.0 (with Spark 2.3.0) with one master and two core nodes of type m4.2xlarge (8 CPUs and 32GB of RAM). The input data set was 10M rows of randomly-generated data in CSV format, with each row having 100 columns. The test case involved computing the average date for a particular date-typed column.</p><p>For each test case, the result was computed six times, with the data from the first test run being discarded (to ignore the impact of cold caches). 
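</p><p>That procedure could be sketched roughly as follows. The <code>timeRuns</code> helper is a hypothetical name, not the harness actually used for these tests, and the choice of a population standard deviation is an assumption.</p><pre><code>// Hypothetical helper: run `body` the given number of times, discard the
// first (warm-up) run, and return the (mean, standard deviation) of the
// remaining durations in milliseconds.
def timeRuns(runs: Int)(body: =&gt; Unit): (Double, Double) = {
  require(runs &gt;= 2, "need at least one run after the warm-up")
  val durations = (1 to runs).map { _ =&gt;
    val start = System.nanoTime()
    body
    (System.nanoTime() - start) / 1e6
  }.drop(1)
  val mean = durations.sum / durations.length
  // Population standard deviation (an assumption; see the note above)
  val variance =
    durations.map(d =&gt; math.pow(d - mean, 2)).sum / durations.length
  (mean, math.sqrt(variance))
}
// e.g. timeRuns(6) { df.agg(min('datecol)).collect() }</code></pre><p>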
The reported result is the average duration (in milliseconds) of the remaining five test runs.</p><ul><li><p>Base Case (675ms&#8202;&#8212;&#8202;StdDev 41ms)&#8202;&#8212;&#8202;The standard Spark <strong>min</strong> function was used as an indication of how fast native Spark functions could read data, therefore defining a base case scenario to compare against.</p></li><li><p>Approach 1 (947ms&#8202;&#8212;&#8202;StdDev 13ms)&#8202;&#8212;&#8202;This is our first approach of computing the <code>avg</code> of <code>datediff</code>.</p></li><li><p>Approach 2 (1390ms&#8202;&#8212;&#8202;StdDev 34ms)&#8202;&#8212;&#8202;Our second approach, using the <code>avgdate</code> user defined aggregation function.</p></li></ul><p>To eliminate the cost of reading the CSV file into memory (the source data resides in Amazon S3), the&nbsp;<code>.cache()</code> directive was used to pin the data into RAM. Therefore, the duration measurements are purely the time required to scan the date column and perform the averaging operation.</p><p>As you can see, our second approach takes almost 50% longer to compute the average date, compared to the first approach. It also takes twice as long as our base case of computing the minimum date value. Although some amount of code optimization is likely possible, our user-defined aggregation function is clearly not the fastest approach, even though it makes the code more readable.</p><h3>Conclusion</h3><p>Apache Spark is an excellent general-purpose framework for performing data analytics. Even though it comes with a variety of built-in analytic functions, it&#8217;s sometimes necessary to implement your own functions. Spark SQL provides a convenient mechanism for defining both cell-based UDFs and column-based UDAFs, making Spark queries easier to construct. Initial tests indicate that the user-defined aggregation function is noticeably slower than the approach built on native Spark functions.</p>]]></content:encoded></item></channel></rss>