Machine Learning and its consequences

Machine Learning has brought huge benefits in many domains and generated hundreds of billions of dollars in revenue. However, the second-order consequences of machine learning-based approaches can lead to potentially devastating outcomes. 

This article by Kashmir Hill in the New York Times is exceptional reporting on a very sensitive topic – the identification of abusive material or CSAM. 

As the parent of two young children in the COVID age, I rely on telehealth services and friends who are medical professionals to help with anxiety-provoking (yet often trivial) medical situations. I often send photos of weird rashes or bug bites to determine if it is something to worry about.  

In the article, a parent took a photo of their child to send to a medical professional. This photo was uploaded to Google Photos, where it was flagged as being potentially abusive material by a machine learning algorithm. 

Google ended up suspending and then permanently deleting his Gmail account and his Google Fi phone service, and flagging his account to law enforcement.

Just imagine how you might deal with losing your primary email account, your phone number, and your authenticator app all at once.

Finding and reporting abuse is critical. But, as the article illustrates, ML-based approaches often lack context. A photo shared with a medical professional may share features with photos showing abuse.

Before we start devolving more and more of our day-to-day lives and decisions to machine learning-based algorithms, we may want to consider the consequences of removing humans from the loop.

George Saunders on Feedback

Feedback is an integral part of working in a team and managing people.

Code reviews, architectural reviews, 1-1s, and Sprint Retrospectives are all situations that involve giving (and receiving) feedback as software engineers, product managers, and engineering managers. Yet, giving critical feedback can be a difficult and stressful experience. So how best to navigate these potentially adversarial situations?

George Saunders is one of my favorite contemporary writers. He has an excellent Substack called Story Club. In this week’s post, Saunders talks about giving feedback to other writers. While his advice is in the context of a writers workshop, I found it quite applicable to my work.

Saunders advises us to give specific yet kind feedback:

.. as we learn to analyze and diagnose with increased specificity and precision, the potential for hurt feelings diminishes, because we are offering specific, actionable ways (easy ways, often, ways that excite the writer, once she’s made aware of them) to make the story better. And who doesn’t want some of that?

George Saunders

Giving constructive or critical feedback is integral to working as a software engineer. Yet, these conversations can become challenging. 

One might be tempted not to say anything or speak in the most generic and broad terms to avoid offense. Instead, as Saunders suggests, the focus should be on giving thoughtful, specific, precise, and actionable feedback:

In this [giving feedback], we indicate that we are on the writer’s side, we are rooting for her and are glad to have found these small but definite ways to make her story better. There’s no snark, no competition, no dismissiveness, nothing negative or accusatory about it; just the feeling that we, her readers, are coming together with her, the writer, by way of craft. We’re all on the same team, the team of art.

George Saunders

Not much more to add, is there?

Climate Change and Category Errors

Stuart Kirk, a former journalist for the FT and now a former banker at HSBC, got into trouble last week for suggesting that climate change risks are overblown.

Before his suspension, he was the head of Responsible Investing for HSBC asset management.

Here is his presentation. It’s worth a watch.

In his presentation he says:

  • Climate change risks are overblown
  • The “markets”, in all probability, have already priced in climate change risks
  • Climate change adaptation is more pragmatic and likely cheaper than mitigation
  • By the time climate change hits, we will all be dead anyway. So why bother?

I found the presentation interesting and a little horrifying, in the drunk-uncle-holding-forth sense. He makes some good points – about the short-term nature of markets and investing, and about the necessity of climate change mitigation, for example. But the general attitude can be summarised as 🥱🤷‍♂️.

I am still surprised that after 2008, after COVID, Ukraine and all the other shocks, people like Mr Kirk still think in terms of normal distributions. I.e. the probability of events can be modelled as a bell curve – with very bad or very good events having low probabilities, and predictable “average” events being the most common.

Or, to channel mathematician, philosopher and truculent Twitter warrior N. Taleb: the likes of Mr Kirk believe the impact of climate change to be an ergodic process, while it is most definitely not.

Doing a Google search for “Ergodicity” will lead you to baffling mathematical and statistical explanations. But it is, at its core, an intuitive concept. In a non-ergodic system, things that are true for the aggregate may not be true for the individual.

In Mr Kirk’s presentation he plots economic growth from the 1930s to the present day and states, pretty much, that the “line goes up” despite world wars, economic upheaval, recessions etc. He uses this trend to assert that we will be fine despite the risks of climate change. The benefits of a growing economy will overcome the downsides of climate change.

However, the story of aggregate growth over the last 100 years hides tales of individual ruin.

For example, someone who invested all their savings in tech stocks in 2002 probably didn’t have anything left to invest when the market finally moved up. For those unlucky investors, it was game over. Therefore, we are modelling a process that is non-ergodic (individual outcomes can be radically different from aggregate outcomes) as an ergodic process.
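To make the gap between aggregate and individual outcomes concrete, here is a minimal sketch of a repeated bet. The numbers are my own assumption for illustration (a coin flip that grows your wealth by 50% on heads and shrinks it by 40% on tails); they are not taken from Mr Kirk's presentation or from any market data.

```python
import math
import random

# Hypothetical bet (an assumption for illustration): +50% on heads, -40% on tails.
UP, DOWN = 1.5, 0.6


def ensemble_growth_per_round() -> float:
    """Average multiplier across many players in a single round."""
    return 0.5 * UP + 0.5 * DOWN


def time_growth_per_round() -> float:
    """Long-run multiplier per round for one player (the geometric mean)."""
    return math.sqrt(UP * DOWN)


def simulate(players: int = 10_000, rounds: int = 100, seed: int = 0) -> list[float]:
    """Final wealth of each player, all starting from 1.0."""
    rng = random.Random(seed)
    results = []
    for _ in range(players):
        wealth = 1.0
        for _ in range(rounds):
            wealth *= UP if rng.random() < 0.5 else DOWN
        results.append(wealth)
    return results


if __name__ == "__main__":
    final = simulate()
    losers = sum(w < 1.0 for w in final) / len(final)
    print(f"ensemble growth per round: {ensemble_growth_per_round():.3f}")
    print(f"time growth per round:     {time_growth_per_round():.3f}")
    print(f"share of players who lost money: {losers:.0%}")
```

The average across players grows every round (the "line goes up"), yet the typical individual shrinks over time, because a single player has to live through the sequence of bets rather than the average of them.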

So, what does this have to do with climate change?

I believe that the effects of climate change make our economic system even more non-ergodic. It makes it much more likely to have extreme events – heat waves, wild fires, hurricanes, droughts, etc. This makes modelling based on aggregate probabilities a little suspect. Sure, you could increase insurance premiums for coastal communities to account for higher flooding risk. This is what Mr Kirk means by the risk being “priced in”. But what happens when entire communities are wiped out due to an unprecedented storm surge, or heat wave, or forest fire?

Climate change adds more chaos to a complex system. It heightens the likelihood of extreme events that have catastrophic outcomes. Adaptation measures are necessary but they will do little to mitigate the impact of “black swan” events. So it doesn’t matter how complex your modelling is, and how sophisticated your investment strategy is. If you die due to a freak hurricane, you are done.

The likes of Mr Kirk are making a category error. The only way to “win” in a non-ergodic system is to survive. We should be thinking of what can be done to ensure that we don’t face catastrophic loss, so that we can continue to reap the benefits of growth in the future.

Further Reading

• An excellent primer on Ergodicity
• Nassim Nicholas Taleb on Ergodicity

Crypto and Transaction Costs

You live in the up-and-coming suburb of Cryptoville and you want to buy a house. It costs $1m. 

There might be some transaction fees involved, but you won’t actually know how much the fees will be until you complete the transaction. Oh, you are not competing with anyone to buy the house, it’s just a transaction fee. Can’t be too bad right? 

On the day of closing, the transaction goes through. The transaction fees are $250,000! And there was no way to tell until you tried to buy the house. It’s just the way things work in Cryptoville.. 

This is pretty much what happened on Saturday when Yuga Labs, the company behind the Bored Ape Yacht Club, held a much anticipated virtual land / NFT sale on the Ethereum network. Gas fees (i.e. transaction fees on Ethereum) spiked as the network coped with thousands of ApeCoin holders looking to buy some virtual land for their virtual Apes. 

The shocking thing was that it caused the entire Ethereum network to clog up – raising transaction costs for everyone – not just those looking to buy virtual land. Folks looking to buy NFTs valued at under a dollar were seeing transaction fees of $3,500! 

This points to a serious, and well-known, issue with throughput on Ethereum. It does not scale under load. Perhaps the long-delayed migration to Proof of Stake will change this – when it happens.

But – do you know what happened to the “high-performance” blockchain Solana on Saturday? You see where this is going..

Links:
Ethereum Gas Prices Spike
Solana Performance Issues
Introduction to Ethereum Scaling

Footnote
Ethereum can only process about 15 transactions per second. It is just the way it is designed. However, miners can be incentivized to process transactions by increasing gas (transaction) fees. This is what happened on Saturday – as the demand to mint NFTs skyrocketed, so did the transaction fees. Gas fees have since come down, but it shows the big issues that Ethereum continues to face as it remains the de facto standard for blockchain development.
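A rough way to see why fees spike for everyone is to treat each block as a small auction. The sketch below is a deliberate simplification that assumes miners simply pick the highest-paying transactions; it ignores the details of Ethereum's actual fee mechanism, and the capacity and fee numbers are made up for illustration.

```python
def clearing_fee(offered_fees: list[float], block_capacity: int) -> float:
    """Lowest fee that still makes it into the next block.

    Assumes miners simply take the highest-paying transactions,
    which is a simplification of how a fee market behaves under load.
    """
    if block_capacity <= 0:
        raise ValueError("block_capacity must be positive")
    if len(offered_fees) <= block_capacity:
        return 0.0  # no competition for space: any fee gets in
    included = sorted(offered_fees, reverse=True)[:block_capacity]
    return included[-1]


if __name__ == "__main__":
    quiet_day = [1, 2, 3]                       # hypothetical fee offers
    land_sale = [1, 2, 3, 50, 80, 120, 200]     # same users plus a rush of buyers
    print(clearing_fee(quiet_day, block_capacity=3))
    print(clearing_fee(land_sale, block_capacity=3))
```

In the second case the people who offered 1, 2, and 3 are priced out even though they had nothing to do with the land sale, which is the effect described above.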

Elon Musk & The Twitter Algorithm

I have been trying to avoid the whole Elon Musk / Twitter drama, but it has been challenging. I am ambivalent about whether Mr. Musk’s takeover of Twitter is a good or bad thing. My vibe is 🤷🏾‍♂️.

But, I do have an issue with one of Mr. Musk’s ideas: open-sourcing the Twitter algorithm to ensure there is no “bias.”

I think this is disingenuous, and Mr. Musk is playing to his (adoring) audience a little bit. 

It is improbable that there is the “one true algorithm” at Twitter. They probably use a combination of machine learning-based recommendation models with other systems such as entity and intent detection. Take a look at Twitter’s engineering blog to see how much ML drives recommendations on the social network.

So, if the intention is to look at the code and delete any (left-wing | right-wing) bias, things will be.. difficult. 

Now, a discussion should be had about how the ML models are trained and if there are any biases in the labeled datasets that are used to drive recommendations, detect abusive content, etc. This is a complex problem, however! 

An important effect of the pervasive deployment of ML technologies is that it makes computing *probabilistic* instead of *deterministic*. That is, we know what is likely to happen, but it is difficult to predict what *will* happen.
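Here is a toy sketch of that difference. It is not how Twitter ranks anything; the tweet names and scores are invented, and the "model" is just a weighted random pick standing in for a system whose output is driven by learned scores.

```python
import random
from collections import Counter

# Hypothetical model scores: how likely each tweet is to interest a reader.
SCORES = {"cat video": 0.7, "news story": 0.2, "hot take": 0.1}


def deterministic_pick(tweets: list[str]) -> str:
    """A rule-based system: same input, same output, every time."""
    return tweets[0]


def probabilistic_pick(scores: dict[str, float], rng: random.Random) -> str:
    """A score-driven system: higher scores win more often, not always."""
    tweets = list(scores)
    weights = list(scores.values())
    return rng.choices(tweets, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(42)
    picks = Counter(probabilistic_pick(SCORES, rng) for _ in range(1000))
    print(deterministic_pick(list(SCORES)))  # always the first tweet
    print(picks.most_common())               # a likely ordering, never a guaranteed one
```

With the rule-based function you can read the code and know the answer. With the score-driven one, reading the code tells you very little; the behaviour lives in the scores, which in a real system come from training data.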

This paradigm shift makes it very difficult to point the finger at one or more woke/radical/reactionary programmers who decide to censor or advocate for free speech.

Mr. Musk knows all this, of course. The entire Tesla “full self-driving” stack is built on ML. So, perhaps, a little bit of intellectual honesty might lead to a more interesting discourse about bias.

Links:
Why Elon Musk Wants to Open Source Twitter

Elon Musk’s Poll on whether the Twitter “algorithm” should be open-sourced: https://twitter.com/elonmusk/status/1507041396242407424

Twitter Engineering Blog: https://blog.twitter.com/engineering/en_us

MIT Technology Review has a good writeup about this: https://www.technologyreview.com/2022/04/27/1051472/the-problems-with-elon-musks-plan-to-open-source-the-twitter-algorithm/

Between Rock and a.. podcast?

Just because you can do it doesn’t make it a great business model. Take music streaming, for example.


Spotify, the world’s most popular streaming service, has been the target of some Internet ire in the last week or so. Neil Young, the creator of the legendary Pono digital media player (apparently he made some music too?), decided he didn’t want anything to do with Spotify. 

Why all the righteous indignation?

Spotify pays Joe Rogan, a media personality / MMA commentator / master of “doing his own research,” over $100m to have exclusive rights to his wildly popular podcast. 

Apparently, Mr. Rogan has some interesting ideas around COVID, vaccinations, and horse de-worming medication. Not particularly controversial topics 😬. 

Why is this a big deal for Spotify?

Music streaming is a terrible business. Spotify has been bleeding cash for years and only recently turned a meager profit. The company had an operating margin of 1.4% in the first nine months of last year. No hockey sticks in sight.

The reason? It has to pay royalties to music labels for each music stream. The value from streaming accrues to the music companies, not to the streamers or artists.

Spotify makes its money not from streaming but from selling subscriptions and advertising. 

This is where podcasts come in. Spotify pays millions to Joe Rogan because he brings in a massive audience in the highly desirable 18-34 demographic. Spotify offers targeted advertising on podcasts to its most important customers, advertisers. This makes much more economic sense than making tiny margins on each stream of, let’s say, “Rockin’ in the Free World.” 

The risk to Spotify in this slightly ridiculous situation is not losing access to rock & roll; it’s not being able to monetize its investments in podcasting.

Spotify would rather you come for the music and stay for Elon Musk smoking some fine herb with his buddy Joe Rogan.

They have set up expectations for their users that they can stream any song at any time. So they have to double down on more economically viable content like the Joe Rogan Experience. 

I am sure there is a Neil Young song about rocks and hard places..

On crypto outages

What to do when your decentralized, scalable, performant blockchain turns out to be not so scalable, sort-of-centralized and not so performant? 

Crypto’s selling point is robustness that is built on decentralization. No single point of failure should mean no downtime, right? Right?

Turns out, crypto’s weaknesses are the same as those of other, more mundane technologies: bad code, bad actors, and the fact that building scalable, distributed (and decentralized) systems is hard!

Solana, a Layer 1 blockchain, suffered a long outage over the weekend. This happened while the crypto markets were melting down..

Solana is supposed to be the answer to Ethereum’s performance and scalability issues. And yet, Solana has been plagued by performance issues and outages over the last few months. 

This weekend’s issue was caused by “program cache exhaustion” due to “excessive duplicate transactions”. Solana developers released an emergency patch to resolve this issue and begged every validator to upgrade.

Where there is code, there are bugs.. 

Welcome to the brave new world, where the problems are the same as the ones in the old world. They just cost you a lot of funny money.

On Roblox’s Outage

Roblox is one of the world’s biggest game platforms. With over fifty million daily users, it is a wildly popular platform to build and play games. 

In October last year, they had an outage where the entire platform was down for over 72 hours. This was all over the news at the time..

Today, Roblox published a post mortem about the incident. It is fascinating reading for anyone interested in distributed systems, DevOps, and Engineering (link below). I will write up a more detailed note in a couple of days.

Summary
– The outage was due to an issue in their service discovery infrastructure which is implemented in Consul
– Roblox is deployed on-premise(!!) on 18,000 servers which run 170,000 service instances
– These services rely on Consul (from HashiCorp) for service discovery and configuration
– An upgrade to Consul, and the resulting change in the way services interact with Consul, led to a cascading set of failures resulting in the outage
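To see why a problem in service discovery can take everything down, consider this toy sketch. It is an in-memory stand-in that I made up for illustration; it is not Consul's API and says nothing about how Roblox wires things together. The service name and address are hypothetical.

```python
class RegistryUnavailable(Exception):
    """Raised when the discovery layer itself cannot answer."""


class ToyRegistry:
    """In-memory stand-in for a service discovery system (not Consul's API)."""

    def __init__(self) -> None:
        self._addresses: dict[str, str] = {}
        self.healthy = True

    def register(self, name: str, address: str) -> None:
        self._addresses[name] = address

    def lookup(self, name: str) -> str:
        if not self.healthy:
            raise RegistryUnavailable("discovery is down")
        return self._addresses[name]


def call_service(registry: ToyRegistry, name: str) -> str:
    """Every call has to go through discovery before it reaches the service."""
    address = registry.lookup(name)
    return f"called {name} at {address}"


if __name__ == "__main__":
    registry = ToyRegistry()
    registry.register("avatar", "10.0.0.5:8080")  # hypothetical service
    print(call_service(registry, "avatar"))
    registry.healthy = False  # the service itself is fine; discovery is not
    try:
        call_service(registry, "avatar")
    except RegistryUnavailable as err:
        print(f"request failed: {err}")
```

The service never went anywhere, but nobody can find it. Multiply that by thousands of service instances and you get a platform-wide outage.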

Some Initial Thoughts
– Distributed systems are hard, and the use of service-oriented architectures comes with costs of coordination and service discovery
– Microservice architectures do not reduce complexity; they just move it up a layer of abstraction
– The complexity of the modern software stack comes not just from your code, but also from your dependencies. 
– Leader election is one of the hardest problems in Computer Science 🙂 

On Forgiveness in UX Design

As engineers and designers, we need to focus on building products that have empathy and forgiveness for their users. 

Software is eating the world, but as it optimizes for engagement and retention, it leaves behind confused and exhausted users. 

Companies raise millions of dollars at billion-dollar valuations. With those valuations comes a drive to add new features. With the move to SaaS for everything, user interfaces and modes of interaction seem to change overnight.

Perhaps we could take inspiration from the consumer packaged goods industry. 

As a new father, I have changed diapers in various circumstances. In the dark, in the park, trying to mitigate a full-on meltdown and sometimes just trying to stem an avalanche of 💩. 

And yet, the diaper works as intended. Forgiveness is built into the design. I can operate it one-handed if I have to, and it gives some protection even when not used correctly. I can be confident that the design won’t change dramatically in the next iteration.

So, dear UX designer, next time you fire up Figma, think of the humble diaper, and a poor sleep-deprived dad dealing with a poop 🌋 at 3am. 

Think of the mistakes a user may make and design your application to forgive them and not punish them when they make those mistakes when addled, distracted, or simply exhausted.

Explaining Bitcoin (through terrible science fiction)

On a Planet far away..

The planet of Isthar, in the outer reaches of the Delta Quadrant, was a dusty, arid, and sparsely populated world. Over the millennia, the original colonists split into tightly knit tribes and spread across the deserts and rocky canyons of the planet. What technology they had brought from their homeworld was soon forgotten in the face of the ongoing struggle to live in Isthar’s hostile environment.

When Isthar’s largest moon was full, the tribes would meet at an oasis to trade. A long history of bad blood, warfare, and mistrust had meant that these were often tense occasions. A fiercely independent set of people, they had little need for a central government or any sort of written laws. Trade meant barter without contracts, banks, or currency, with plenty of misunderstandings, violence, and theft.

During one of these gatherings, a group of monks who called themselves the Order of the Great Lizard emerged out of the western desert. They claimed to have a way that would allow the tribes to trade peacefully without sacrificing their independence or indeed having to trust each other.

The monks produced a tablet made of stone. They called the tablet the Ledger. They claimed the Ledger would provide a tamper-proof way of recording what was traded and by whom. This would help reduce the sort of misunderstandings that often led to bloodshed.

The Order of the Great Lizard had placed a spell on the Ledger that gave it some unique properties.

  1. Only a monk of their Order could write on the Ledger.
  2. Things once written could never be erased.
  3. While every monk could write to the Ledger, only one chosen monk could do so at a time.
  4. The chosen monk must have had a unique vision of the Great Lizard in a pose that a majority of other monks agreed was magnificent and one befitting the splendor of the Great Lizard.
  5. Once the monk wrote on the Ledger, the spell would cause a coin to appear. It was a token of benevolence from the Great Lizard, and out of respect, these were called Lizard Coins. The monk could do what they wished with their Lizard Coin.

The tribespeople were a little bemused. But the Order seemed harmless and willing to help, so they agreed to give this new way of trading a try.

The monks all sat in a large circle around the Ledger. Tribespeople would approach and shout out the trade they just had done.

“I, Troopz of the tribe Gooners, bartered my sand goat for a sack of salt and a bottle cap with Bubbles of the Hammers.”

Each monk would then contemplate the Great Lizard until the first one had a vision.

“I, Brother Chili, see the Great Lizard lying down with its right arm raised up.”

As long as more than half of the other monks agreed that it was indeed a splendid and unique vision, Brother Chili could record Troopz and Bubbles’ trade on the Ledger. He would then be rewarded with a Lizard Coin by the most magnificent and benevolent Great Lizard.

Once there was a record on the Ledger of Troopz bartering his sand goat to Bubbles, he would not be able to trade it to anyone else. Bubbles’ ownership of the said sand goat was now backed by the Ledger.

And so, the Order of the Great Lizard made trade possible for the people of Isthar. There were some interesting consequences:

  • The visions could only include the Great Lizard and nothing else. Hence it would become more challenging to develop new and unique visions as time passed. This meant the total number of Lizard Coins would also be limited.
  • Some folks tried bribing one or more monks to see if they could write false transactions on the Ledger, but this would not be possible without convincing a majority of the monks.
  • The Order of the Great Lizard was a genuinely open group. Anyone could join and become a monk provided they committed to contemplating the magnificence of the Great Lizard.
  • The monks used their coins to buy goods and services from the tribes. As the number of Lizard Coins increased, they became a useful and widely accepted currency on Isthar.
  • Very soon most transactions recorded on the Ledger were transfers of Lizard Coin from one person to another.

Through their stone tablet and their magic spell, the Order of the Great Lizard had given the tribes a way to trade without having to trust each other. The Order was also open, so anyone could join provided they would spend their time and energy contemplating the magnificence of the Great Lizard.

The End.


Bitcoin Explained

Bitcoin was created to solve the problem of coordination in a trustless environment. This problem is also known as the Byzantine Generals problem.

In our story, the Order of the Great Lizard attempts to solve a similar problem for the tribes that did not have reason to trust one another.

The Ledger plays the same role as the blockchain in Bitcoin. It is an immutable (cannot be modified), append-only data structure. In a given period of time, only the first monk to come up with a unique, verified vision of the Great Lizard could write a new line to the Ledger.
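For the curious, here is a minimal sketch of what "append-only and tamper-evident" can look like in code. It is a toy of my own and not Bitcoin's actual data format: each entry simply stores the hash of the entry before it, so changing an earlier entry breaks the link to everything that follows. The `Entry`, `append`, and `is_valid` names are made up for this example.

```python
import hashlib
from dataclasses import dataclass

GENESIS = "0" * 64  # placeholder "previous hash" for the very first entry


@dataclass(frozen=True)
class Entry:
    trade: str
    prev_hash: str

    @property
    def hash(self) -> str:
        data = f"{self.prev_hash}|{self.trade}".encode()
        return hashlib.sha256(data).hexdigest()


def append(ledger: list[Entry], trade: str) -> list[Entry]:
    """Return a new ledger with one more entry linked to the last one."""
    prev = ledger[-1].hash if ledger else GENESIS
    return ledger + [Entry(trade, prev)]


def is_valid(ledger: list[Entry]) -> bool:
    """Check that every entry still points at the hash of its predecessor."""
    expected = GENESIS
    for entry in ledger:
        if entry.prev_hash != expected:
            return False
        expected = entry.hash
    return True


if __name__ == "__main__":
    ledger = append([], "Troopz gives Bubbles a sand goat")
    ledger = append(ledger, "Bubbles gives Troopz a sack of salt")
    print(is_valid(ledger))
    # Rewrite history: swap the first entry for a different trade.
    forged = [Entry("Troopz keeps the goat", GENESIS)] + ledger[1:]
    print(is_valid(forged))
```

No magic spell required, just a hash function and a rule that every new line must reference the one before it.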

Miners play the same role in Bitcoin as the monks of the Order. The first miner to solve a complex mathematical puzzle that can be verified by a majority of other miners can write a new block to the Bitcoin chain. The monks are rewarded with a Lizard Coin and the miners are rewarded with newly minted Bitcoin. This concept is known as Proof of Work. The idea is to incentivize the miners to compete to solve puzzles in order to generate and write a block to the blockchain. The intrinsic value of Bitcoin comes from the energy expended in solving mathematical puzzles.
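The puzzle itself can be sketched in a few lines. This is a simplified toy, assuming a puzzle of the form "find a number that makes the hash start with some zeros". Real Bitcoin mining hashes a block header with double SHA-256 and compares it against a numeric target, so treat the function names and the difficulty setting here as illustrative only.

```python
import hashlib


def block_hash(data: str, nonce: int) -> str:
    return hashlib.sha256(f"{data}|{nonce}".encode()).hexdigest()


def mine(data: str, difficulty: int) -> int:
    """Find the first nonce whose hash starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while not block_hash(data, nonce).startswith(prefix):
        nonce += 1
    return nonce


def verify(data: str, nonce: int, difficulty: int) -> bool:
    """Checking a claimed solution takes a single hash."""
    return block_hash(data, nonce).startswith("0" * difficulty)


if __name__ == "__main__":
    trade = "Troopz -> Bubbles: 1 sand goat"
    nonce = mine(trade, difficulty=3)
    print(f"found nonce {nonce}")
    print(verify(trade, nonce, difficulty=3))
```

The asymmetry is the point: finding the nonce takes many attempts, while checking it takes one. That is the monks contemplating for hours and the rest of the circle nodding in seconds.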

Transactions on the Bitcoin blockchain are limited to moving Bitcoin, or a fraction of one, from one address to another. On Isthar, the transactions could be anything but eventually became mostly transfers of Lizard Coin from one person to another.

The Bitcoin protocol is represented by the spell cast on the Ledger. It provides the rules for recording transactions as well as rewards for the monks (miners) who are the fastest to have a unique vision of the Great Lizard. In Bitcoin, the magic spell is written in publicly available and vetted code.

Our story also doesn’t cover the important roles that cryptography plays in Bitcoin. While we watched Troopz and Bubbles exchange a goat for salt and bottle caps, participants on the Bitcoin network can only be identified by the public key corresponding to their wallets. Hence, participants are pseudonymous.


But.. why bother?

Bitcoin and cryptocurrencies are complex. Skeptics often call them technologies looking for problems. My terrible science fiction is an attempt to explain how the technology works at a high level to folks who are not in the crypto ecosystem 24/7. I hope you were amused and somewhat enlightened. Drop me a line if you have any comments or questions!


Further Reading