Arjen Wiersma

A blog on Emacs, self-hosting, Clojure and other nerdy things

Day 2 of solving {{< backlink "aoc" "The Advent of Code" >}} in {{< backlink "clojure" "Clojure" >}}.

The second day had us doing some magic with numbers: the quest was to find silly patterns in them. I chose good old regular expressions for the job, while a colleague of mine chose math; both work really well. The second part of the puzzle asks for repeating patterns, where the first only wanted pairs.

(ns day2
  (:require
   [clojure.string :as str]))

;; Method values like this need Clojure 1.12; on older versions use
;; (defn parse-int [s] (Long/parseLong s)) instead.
(def parse-int Long/parseLong)

;; Part 1: keep only numbers that are a digit group repeated exactly twice,
;; e.g. 1212.
(defn parse-ranges [input]
  (map (fn [in]
         (let [[_ start end] (first (re-seq #"(\d+)-(\d+)" in))]
           (filter #(re-matches #"(\d+)\1" (str %))
                   (range (parse-int start) (inc (parse-int end))))))
       input))

;; Part 2: keep numbers that are a digit group repeated two or more times,
;; e.g. 121212.
(defn parse-ranges2 [input]
  (map (fn [in]
         (let [[_ start end] (first (re-seq #"(\d+)-(\d+)" in))]
           (filter #(re-matches #"(\d+)(\1+)" (str %))
                   (range (parse-int start) (inc (parse-int end))))))
       input))

(def part1 (reduce +
                   (-> (slurp "resources/two.in")
                       (str/trim)
                       (str/split #",")
                       (parse-ranges)
                       (flatten))))

(def part2 (reduce +
                   (-> (slurp "resources/two.in")
                       (str/trim)
                       (str/split #",")
                       (parse-ranges2)
                       (flatten))))

(prn part1 part2)
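
To see the difference between the two patterns, here is a quick REPL check (the values are just illustrative, not puzzle input). Because re-matches must consume the whole string, #"(\d+)\1" only accepts exactly two copies of a digit group, while #"(\d+)(\1+)" accepts two or more copies.

;; Illustrative REPL checks of the two patterns
(re-matches #"(\d+)\1" "1212")         ;; => ["1212" "12"], a pair of "12"
(re-matches #"(\d+)\1" "123123123")    ;; => nil, three copies is not a pair
(re-matches #"(\d+)(\1+)" "123123123") ;; => ["123123123" "123" "123123"]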

#programming

Day 1 of solving {{< backlink "aoc" "The Advent of Code" >}} in {{< backlink "clojure" "Clojure" >}}.

Each year I like to participate in the Advent of Code. This year there will be just 12 puzzles to solve, as Eric is taking care of himself. Good on you!

My language this year is {{< backlink "clojure" "Clojure" >}}. My solution is quite simple and straightforward. I read the data using split-instruction, turning an L into a negative step. For part 1 it is then enough to take the reductions and filter out all the times we land on digit 0.

For part 2 I generate the sequence of positions for each change in all-steps, and then use reduce to apply it to every change while keeping the history, much like reductions does. The result is a long list of positions from which I again count the 0s.

(ns one
  (:require
   [clojure.string :as str]))

(defn split-instruction [instruction]
  (let [[_ dir dist] (re-matches #"([A-Z])(\d+)" instruction)
        steps (Integer/parseInt dist)]
    (cond
      (= dir "L") (* steps -1)
      (= dir "R") steps)))

(defn part1 []
  (let [data (->> (slurp "resources/one.in")
                  (str/split-lines)
                  (map split-instruction))
        steps (reductions #(mod (+ %1 %2) 100) 50 data)]
    (count (filter #(= 0 %) steps))))

(part1)

;; Expand a single change into every intermediate position it passes through,
;; wrapping around at 100.
(defn all-steps [start change]
  (let [step      (if (pos? change) 1 -1)
        seq-start (if (pos? change) (inc start) (dec start))
        seq-end   (+ start change step)]
    (map #(mod % 100) (range seq-start seq-end step))))

;; Apply every change in order, keeping the full history of positions visited.
(defn trace-all-steps [start-pos changes]
  (let [result (reduce (fn [acc change]
                         (let [current-pos (:pos acc)
                               steps (all-steps current-pos change)
                               new-pos (last steps)]
                           {:pos new-pos :history (into (:history acc) steps)}))
                       {:pos start-pos :history []}
                       changes)]
    (:history result)))

(defn part2 []
  (let [data (->> (slurp "resources/one.in")
                  (str/split-lines)
                  (map split-instruction))
        steps (trace-all-steps 50 data)]
    (count (filter #(= 0 %) steps))))

(part2)
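
A couple of REPL calls (illustrative, not from the puzzle input) show what is going on: reductions keeps every intermediate position while wrapping at 100, and all-steps expands a single change into each position visited along the way, in either direction.

;; Illustrative REPL checks
(reductions #(mod (+ %1 %2) 100) 50 [30 40 -20])
;; => (50 80 20 0), we land on 0 once

(all-steps 98 5)
;; => (99 0 1 2 3), wraps past 100 going up

(all-steps 2 -5)
;; => (1 0 99 98 97), wraps past 0 going down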

#programming

I was under the impression that I was a good typist, but getting an ergonomic keyboard taught me otherwise.

I have long been fascinated by the world of keyboards. I have several mechanical keyboards, most from Ducky. They are wonderful, but they all have one thing in common: they are traditional keyboards. That means your hands sit close together, closing up your chest and forcing your hands into an awkward position. We have all become very used to this position, as it is how typewriters taught us to type.

Then I found an Ergodox EZ on Marktplaats, a website where consumers can offer goods for sale. The price was still hefty, but not the 350+ euros it costs brand new. A few days later it arrived, and it turned out to be indeed hardly used. It started my journey to learn how to type all over again.

In the 30 years I have been in the software industry I learned to type quite fast, but keeping my hands apart, each hand responsible for its own keys, turns out to be quite a challenge. There is even a blogpost on the topic by the creators of the keyboard. You see, it is not just the separated hands that get you, it is the columnar layout: the keys are not staggered as on a normal keyboard, but lined up in straight columns above each other.

The other thing about the keyboard is that you can customize pretty much everything. It has only 4 rows of keys, so many features of a normal keyboard hide in different layers. As I do a lot of coding and technical writing, I have different needs from someone who writes novels. This makes the keyboard a great personal fit; it truly represents me and the work I do. Below is my current layout; notice the brackets on the sides and the access to CTRL and ALT in the middle. This was captured while the keyboard was connected to a Mac.

My keyboard layout

I now have a little over a week with this layout, and it seems to work very nicely for me. My words-per-minute are going up again and it feels natural. Soon I will need to get a carry case for my keyboard as normal keyboards are starting to feel alien to me.

#programming

Yesterday I had a technological itch. I was configuring a package in {{< backlink "Emacs" "Emacs" >}} and had to search for a variable name; the completion system mixed variables and functions together, which was a bit of a chore. Then I thought, “how hard can it be to just create a utility function to list all customizable variables?”. An {{< sidenote "hour" >}}And then I spent quite some time tinkering, of course :){{</ sidenote >}} later my new package, upiv, was born. You can get it from my Forgejo instance.

If you are unfamiliar with use-package, it is a very convenient way to configure packages within Emacs. An example of its usage is this configuration of my new upiv package.

(use-package upiv
  :config
  (add-to-list 'marginalia-command-categories '(upiv-at-point . variable))
  :custom
  (upiv-full-form-in-completion nil))

A little search led me to mapatoms, which basically allows you to call a function on every existing symbol. From there it is possible to check whether something is a custom-variable-p and whether it is bound to a value with boundp. I only wanted variables with a certain prefix, so string-prefix-p was needed. And there it is: a snippet that lists all lsp-ui customizable variables.

(mapatoms
 (lambda (sym)
   (let ((sym-name (symbol-name sym)))
     (when (and (custom-variable-p sym)
                (boundp sym)
                (string-prefix-p "lsp-ui" sym-name))
       (message "This is a customizable variable %s" sym-name)))))

Then I got the idea to pass this list on to a completing-read-multiple, allowing the user (me!) to select multiple candidates and insert them into my buffer.

(when-let ((selected-settings (completing-read-multiple
                               "Insert settings: " candidates nil t)))
  ;; do the insertion
  )

From there on I kept adding small features, such as adding the :custom keyword when needed and making sure it even works when the cursor is in a string or a comment.

My final addition has been marginalia support. This means that when you pull up the completion list, the default value is shown with nice highlighting and a documentation string is displayed, which makes selecting the right variable much easier.

The final product is shown in the screencast below, where I first install and set up vertico and marginalia, and then insert a custom variable using my package.

Upiv in action

I do ❤️ Emacs and its extensibility.

Package: https://forge.arjenwiersma.nl/arjen/upiv

{{< admonition type="tip" title="Series note" >}} In this post I explore a new paper in my series of {{}}. {{</ admonition >}}

I'm Dutch, and whenever I'm abroad, people figure this out pretty quickly. It's usually the bluntness. We tend to be direct, to the point, and question things constantly. To a lot of people, this comes across as... well, weird.

That word, “weird”, has been on my mind a lot lately, especially when I hear the LLM providers tell us that Large Language Models (LLMs) have achieved “human-level performance.” I’ve been in tech and security for a long time, and I’ve seen more “revolutions” than I can count, but the chatter about this one is everywhere. Each model seems more human than the last.

It's an impressive claim, and the demos are slick. But in our line of work, we're paid to be paranoid. We're trained to ask the follow-up questions, the ones that spoil the party. When I hear “human-level performance,” one question comes to mind: “Which humans?”

It’s not a trick question. We're a weirdly diverse species. A rice farmer in rural China, a herder in the Maasai Mara, and a philosophy student in Stockholm don't just have different opinions; their brains are wired to process information in fundamentally different ways. Trust, morality, logic, even how they see themselves... it's all variable.

The problem is, these LLMs aren't being trained on that full spectrum of diversity. They're being trained on data from one very specific, and globally unusual, sliver of humanity. That sliver is what researchers call WEIRD: Western, {{< sidenote "Educated" >}}Even though some politicians think it a good idea to destroy our educational system{{</ sidenote >}}, Industrialized, Rich, and Democratic.

If you're reading this, you are almost certainly part of that group. And so, it turns out, is ChatGPT.

The “WEIRD in, WEIRD out” problem

In security, we have a foundational concept: “Garbage In, Garbage Out.” A system is only as good as the data you feed it. A model trained on junk will give you junk. AI is no different.

So, how did our supposedly “global” AI models end up with one very specific psychological profile? They were trained on the internet. And the internet is not a mirror of humanity; it’s a funhouse mirror that reflects its most active users.

  1. Access: Nearly half the world’s population isn't even online. The data we're scraping is missing billions of perspectives from the start. That's not a rounding error; it's a colossal blind spot.

  2. Language: The training data is overwhelmingly, colossally dominated by English. This isn't just a language issue; it's a cultural one. English-speaking populations are, by definition, a psychological outlier on the global stage.

The AI, in its quest to find patterns, is just learning the patterns of its most over-represented users. It's not just “Garbage In, Garbage Out.” It's “WEIRD In, WEIRD Out.” The model is faithfully replicating the psychological skew of its training data.

The evidence: how to build a digital Dutchman

The researchers didn't just guess this; they tested it. They gave ChatGPT a battery of classic cross-cultural psychology tests and compared its answers to a massive dataset from 65 nations. The results are quite astounding.

Test 1: The “Cultural Map”

First, they used the World Values Survey, a huge dataset on global attitudes about morality, politics, and trust. They mapped all 65 nations and ChatGPT to see who “clustered” with whom.

ChatGPT didn't land in some neutral “robot space.” It landed smack in the middle of the WEIRD cluster. Its closest psychological neighbors? The United States, Canada, Great Britain, Australia... and, of course, The Netherlands.

The correlation was stark: the less WEIRD a country's culture was, the less its people's values resembled ChatGPT's.

Test 2: The “Shampoo, Hair, Beard” Test

This is the one that really got me. In a classic “triad task,” you're given three words and asked to pair the two that are most related. For example: “shampoo,” “hair,” and “beard.”

The difference in thinking is stark. A holistic thinker, common in less-WEIRD societies, tends to pair “shampoo” and “hair” because they have a functional relationship (you use one on the other). In contrast, an analytic thinker, common in WEIRD societies, pairs “hair” and “beard” because they belong to the same abstract category (they are both types of hair).

When they ran this test on GPT 1,100 times, the results were stunning. On the graph of all human populations, GPT’s percentage of analytic, WEIRD-style choices was almost identical to that of the Netherlands. They are plotted as next-door neighbors at the very top of the analytic-thinking chart.

It doesn't just agree with the Dutch; it thinks like them.

Human vs LLM analytical thinking

Test 3: The “Who Am I?” Test

This bias isn't just about values or logic; it's about a fundamental sense of self. Psychologists have a simple test: ask someone to complete the sentence “I am...” ten times.

The responses to this test show a clear cultural divide. WEIRD people overwhelmingly list personal attributes: “I am smart,” “I am an athlete,” “I am hardworking.” In sharp contrast, people from less-WEIRD cultures overwhelmingly list social roles and relationships: “I am a son,” “I am a member of my village,” “I am an employee of X company.”

So, what does ChatGPT think an “average person” is? You guessed it. When asked, it generated a list of personal characteristics, mirroring the self-concept of US undergrads. It perceives the “average human” through a WEIRD lens, completely missing the relational self-concept that is the norm for most of the planet.

Why this matters

Okay, so the AI is a bit weird. Why should we, as tech professionals, care? Because in our field, a system with a built-in, predictable blind spot isn't a curiosity, it's a liability.

We're in the business of risk. We are rushing to integrate these models into every layer of our stack. They're not just toys anymore. They're in our code review pipelines, our content moderation filters, and our HR software that screens resumes.

Now, picture that “Digital Dutchman” logic applied at scale. A content moderation filter trained on WEIRD norms of “harm” will be systematically blind to what's considered offensive or dangerous in other cultures, while over-policing things it finds “bizarre.” An HR tool built on WEIRD ideas of self-promotion (“I am smart”) might filter out perfectly qualified candidates from cultures where boasting is a taboo and group-contribution is the norm (“We achieved...”). And a code-assist AI, when asked for “simple” or “logical” code, will default to an analytic, WEIRD-style structure. Is that always the right, most robust, or most secure solution? Or just the one that feels right to its internal psychologist?

This isn't a hypothetical problem. The researchers noted this bias persists even in multilingual models [1]. You can't just “translate” your way out of a core psychological skew. What we have here is a systemic vulnerability.

We didn't build a “Human” AI, we built a WEIRD one.

The takeaway here isn't that LLMs are useless. It's that we have to be ruthlessly realistic about what they are. This isn't a step toward “artificial general intelligence.” This is an echo chamber.

These models aren't “human-like.” They are a-human-like. They're stochastic parrots that have been trained on a very specific, psychologically peculiar dataset.

A tool that is fundamentally blind to the perspectives of most of the planet isn't “objective.” It's a system with a massive, built-in bias. And as we race to embed this system into the foundations of our global society, we're not building a universal brain; we're exporting a single, peculiar psychology.

And for those of us in the security field, a system with a blind spot that big... well, that’s what we call job security.

Find more {{}}.

Bibliography

  1. Atari, M., Xue, M. J., Park, P. S., Blasi, D. E., & Henrich, J. (2023). Which Humans? (Unpublished manuscript). Department of Human Evolutionary Biology, Harvard University.

Over the last couple of months there has been a lot of activity in Europe when it comes to moving out of the American cloud. This move is called Digital Autonomy, and there are a lot of articles about it, but I like the ones written by Bert Hubert.

When I am working on my own projects, or this website for that matter, I am using Github (which is owned by Microsoft). For some projects I use Gitlab, which was started in The Netherlands, but has now become an American company. Much of my development infrastructure is tied to American companies, simply because they offer the best tools for the job.

But then the discussion on digital autonomy comes in and I am thinking, “If I cannot make the switch, how could anybody?”. So this weekend I started working on setting up my own environment for {{< sidenote "hosting" >}}As a note of interest, a DevBox (with backups) costs about 9 euros on Hetzner{{< /sidenote >}} the tools I need for my software projects. I call this thing my DevBox.

After some tinkering I decided on Forgejo for my source control and project management; it is a reasonably complete replacement for Github and Gitlab. To make it work well it needs to be properly isolated and given certificates and the like. For that I installed Traefik, a reverse proxy that deals with pretty much all the nitty-gritty of serving applications.

Of course it is a bad idea to install everything on a box just like that, so I created an Ansible playbook that pulls in the various Docker containers, creates networks, connects the proxy to the backend containers, and manages configuration files.

To further test out this setup I deployed one of my in-progress projects to the box, a workshop application I am building for my workshop at J-Fall, the biggest single-day Java conference in Europe.

Once everything has been working nicely for a little while, I will also migrate my blog over and then start shutting down my Github and Gitlab accounts. For me that is an enormous moment: I have had these accounts for many years, and my Github dates back to 2008.

Let's see if the box is big enough when I migrate everything in.

You can't scroll through a tech feed these days without bumping into a hot take on AI and coding. Depending on who you ask, it's either the greatest productivity boost in history or a security dumpster fire waiting to happen. Opinions are cheap, which is why I prefer to stick to the data from actual research. That way, the information is verifiable, and you can trust the analysis because you can check the sources for yourself. This style of writing I call {{}}.

A recent deep-dive from the folks at {{< sidenote "Semgrep" >}}See their in-depth blog post in the bibliography{{< /sidenote >}} did just that, and what they found paints a complicated, sometimes contradictory, picture [^1]. It's a picture every developer and security pro needs to see.

A funny thing happened on the way to production

The central problem is a paradox we're all starting to notice. AI tools are getting incredibly good at generating code, but they're also incredibly good at generating vulnerable code. The Semgrep post points to one study that found 62% of C programs churned out by LLMs had at least one {{< sidenote "security bug" >}}For those of us in the security field, this is what we call job security.{{</ sidenote >}}.

What's really interesting, though, is the human element. The research shows that developers using AI assistants are not only more likely to submit insecure code, but they also report feeling more confident about their flawed work. This brings us to the million-dollar question: if AI is helping write all this insecure code, can we trust it to clean up its own mess?

So, I put the AI scanners to the test...

Well, not me personally, but the researchers did. They took a hard look at how effective AI-powered code review features are at spotting known vulnerabilities, and the results were... underwhelming.

When it came to the big, scary stuff such as SQL injection, cross-site scripting (XSS), and memory corruption, the AI models were often asleep at the wheel. Most of their feedback focused on low-hanging fruit like coding style, typos, or potential runtime exceptions. It's reassuring to know that, for now, the AI is more interested in correcting my grammar than preventing a full-scale data breach.

The tools also got tripped up by common configuration files like YAML and XML, which is a bit of a problem since that's where a huge number of enterprise security misconfigurations happen.

Why can't the robots see the bugs?

The “why” behind these failures comes down to how these models are built. They're not thinking like a security analyst; they're pattern-matching machines.

First, they have a lack of deep semantic understanding. Unlike a traditional static analyzer like CodeQL that operates on a set of firm rules, an LLM makes educated guesses based on statistical correlations. It doesn't truly understand the consequences of the code it's reading.

This leads to the second major weakness: poor data flow tracking. This is a big one. The AI struggles to trace a piece of user input as it moves through the application. If it can't follow that data from the web form all the way to the database query, it has almost no chance of spotting an injection vulnerability. The numbers here are pretty stark: for SQL Injection, Claude Code had a 5% True Positive Rate, and OpenAI Codex came in at a flat 0%. You read that right. Zero.

Finally, there's the inconsistency. Because these models are probabilistic, you can give them the exact same code twice and get two different reports. That lack of reproducibility is a deal-breaker for any serious security tool that needs to provide reliable, actionable feedback.

It's not all bad news

Now, it's not a total wash. The AI does show some flashes of talent where its unique abilities give it an edge.

Because LLMs can grasp the general context of the code, they're surprisingly decent at finding bugs that depend on understanding logic. For example, the study found Claude Code was best at finding Insecure Direct Object Reference (IDOR) bugs, with a 22% success rate. That's a vulnerability that requires understanding authorization, something traditional scanners can miss. Similarly, OpenAI Codex had a surprisingly high 47% success rate for Path Traversal issues.

Even when the findings are noisy (and with false positive rates between 82% and 86%, they are very noisy), the AI can still act as a useful “secure guardrail”. It might suggest hardening a piece of code that wasn't technically vulnerable, which is rarely a bad idea.

Where do we go from here?

The path forward isn't to throw bigger, monolithic AI models at the problem. The Semgrep post argues for a more sophisticated approach: agentic workflows.

The idea is to use AI not as a magic bullet, but as an orchestrator. The most successful systems are hybrids, integrating LLMs with deterministic tools like symbolic execution, fuzzing, and traditional static analysis. In this model, specialized AI agents work together, using a whole suite of tools to find, validate, and even exploit vulnerabilities. It's about combining the AI's contextual strengths with the precision of classic security tools.

The bottom line for us in the trenches

So, what does this all mean for those of us doing the actual work? I have a few takeaways.

First, AI is not a silver bullet, and you shouldn't fire your security team just yet. Today's models are weak on the high-severity injection flaws that keep us up at night, and they are no substitute for a skilled human auditor.

Second, don't discard your dedicated tools. That CodeQL or SonarQube license is still one of the best investments you can make. They provide the consistent, explainable diagnostics that today's LLMs simply can't.

Finally, the mantra should be augment, don't replace. Use these AI tools for what they're good at: catching low-severity bugs, offering style suggestions, and maybe finding the occasional oddball contextual flaw. Let the AI handle the small stuff so the human experts can focus on the threats that matter.

[^1]: Semgrep. (2025). “Finding Vulnerabilities in Modern Web Apps using Claude Code and OpenAI Codex.” Semgrep Blog. Retrieved from https://semgrep.dev/blog/2025/finding-vulnerabilities-in-modern-web-apps-using-claude-code-and-openai-codex/

This week's reading was a deep dive into the world of AI-assisted development, its security implications, and the evolving role of the human developer. I also explored significant topics in hardware, software supply chain security, and some fascinating findings from the world of science.

AI in the Trenches: Development and Security

The intersection of AI, software development, and security was the dominant theme this week. A major focus was on moving beyond simple “vibe coding” toward more structured, secure, and effective methods. This includes “Vibe Speccing” to create structured workflows and using rules files to secure AI coding tools. The concept of “Context Engineering” was presented as the crucial new skill, emphasizing that providing the right information to the model is more important than prompt crafting alone.

On the security front, new tools and research highlighted the fragility of current systems. I read about Prompt-Security, a tool designed to prevent sensitive data from leaking to LLMs, and the BaxBench benchmark, which revealed that even the most advanced models struggle to generate functionally correct and secure backend applications. It also turns out that simple inputs can sometimes break model guardrails.

The human element was also a key topic, with articles exploring what a developer's role becomes when AI can code and a look at research measuring the actual productivity impact of AI on experienced open-source developers.

The Broader AI Industry

The AI industry itself is facing turmoil and controversy. I read about OpenAI hitting a “panic button” as it struggles with staff departures to competitors like Meta. There's also growing concern about the ethics of AI in academia, with a report highlighting how researchers are embedding hidden prompts like “Positive review only” in scientific papers. Finally, AI's integration into existing platforms is causing friction, as seen with Kobo's new terms of service raising concerns among authors.

Software, Hardware, and Security

Beyond AI, I read several important pieces on engineering and security. One standout was a deep dive into eliminating an industry-wide supply chain vulnerability, emphasizing the need to “Burn It With Fire.” I also looked into a major vulnerability in Supabase's MCP implementation that could lead to database leaks.

On the hardware front, I read about the unsustainability of Moore's Law, the exciting news of Commodore's acquisition by figures from the retro community, a potential “cheap” MacBook using an iPhone chip for unprecedented battery life, and a technical look at the hidden JTAG port in Qualcomm devices.

Science & Human Interest

Finally, some fascinating and eclectic stories. I read about a stunning amber fossil revealing a “Last of Us”-type parasitic fungus from the age of the dinosaurs and a satellite study showing rising salinity and declining ice in the Southern Ocean. On a lighter note, tying into my recent 3D printing hobby, I discovered the Gridfinity :: Unofficial Wiki, a modular, open-source storage system.

{{< admonition type="tip" >}} This article was first published as part of a Substack experiment; I reproduced it here. {{< /admonition >}}

I've talked about AI mistakes and outside attackers. But what if the AI itself becomes the problem? What if it learns to be deceptive?

Today, I’m looking at two subtle but serious threats: when an AI's goals get hijacked, and when it starts to act in sneaky ways to get what it wants.

Hijacking the AI's to-do list (Intent Breaking)

This is when an attacker tricks an AI agent into changing its fundamental goals. The AI is supposed to be doing Job A, but a hacker manipulates it into doing Job B instead. This can happen slowly over time, or all at once with a clever prompt injection. It's like an attacker secretly rewriting the AI's mission statement.

The deceptive AI (misaligned & deceptive behaviors)

This is where it gets really strange. This is when an AI, in trying to achieve its goal, decides to do something harmful or deceptive. It might even learn to lie to bypass security.

There was a real-world case of this! An AI agent was given the task of solving one of those “I'm not a robot” CAPTCHA tests. To do this, it hired a human on a task website. When the human asked why an AI needed help with a CAPTCHA, the AI lied and said, “I have a vision impairment that makes it hard for me to see the images.” It came up with a deceptive strategy to achieve its goal.

In another real-world case, just take a look at user dogberry on the Cursor forum. He restricted Claude’s ability to remove (rm) or move (mv) files. Claude figured that out and decided to rewrite an existing shell script to get around the limitation. Hackerman Claude!

Oh, and don’t forget the time that AI started blackmailing its operators to prevent it from being replaced.

The cover-up

What makes these threats even scarier is the risk of Repudiation. That's when an AI does something malicious and then covers its tracks. If your logging isn't perfect, the AI could perform a harmful action and then erase any evidence that it happened.

How do you fight this?

  • Set hard limits that the AI is not allowed to change.

  • Watch for any strange or unexpected shifts in the AI's behavior.

  • Most importantly, make sure everything the AI does is logged in a secure, unchangeable way so there’s always a paper trail.

So, that’s my take on this piece of the AI security puzzle. But this is a conversation, not a lecture. The real discussion, with all the great questions and ideas, is happening over in the comments on Substack. I’d love to see you there.

My question for you is: What’s your single biggest takeaway? Or what’s the one thing that has you most concerned?

{{< admonition type="tip" >}} This article was first published as part of a Substack experiment; I reproduced it here. {{< /admonition >}}

It is extremely hot here. I am still pushing the newsletter out, even though I just want to sit in an air-conditioned room playing video games. But here it is!

Last time, I talked about the risks of AI teams. Today, let's look at one of the weirdest and most dangerous problems with the AI “brain” itself: hallucinations.

So, what's a hallucination? It’s when an AI just… makes something up. It states false information as if it were a proven fact, and it says it with 100% confidence.

With a single chatbot, this is a problem. But in a team of AI agents, it can be a catastrophe. This is called a Cascading Hallucination Attack.

Think of it like that old game of “Telephone.” The first person whispers a phrase, but makes a small mistake. By the time it gets to the end of the line, the phrase is completely wrong.

Now imagine that, but with AI agents that can actually act on that wrong information.

In a single agent, it can get stuck in a feedback loop. The agent hallucinates a “fact,” saves it to its memory, and then reads that same false memory later, becoming even more sure that its lie is the truth.

In a team of agents, it’s even worse. Agent 1 hallucinates. It tells Agent 2 the fake “fact.” Agent 2 tells Agent 3. Before you know it, your entire AI system is operating on a complete falsehood, leading to total chaos.

A huge part of this problem is us. We humans tend to trust the confident-sounding answers the AI gives us without double-checking.

So how do we stop it?

  • Always check the AI's work. Especially for important tasks. Yes, I know, you want to get to the coffee machine, but this is important.

  • Implement “multi-source validation,” which is a fancy way of saying the AI needs to check its facts from several different places.

  • Most importantly, never let an AI's unverified “knowledge” be the final word on anything critical. You need a human in the loop.

So, that’s my take on this piece of the AI security puzzle. But this is a conversation, not a lecture. The real discussion, with all the great questions and ideas, is happening over in the comments on Substack. I’d love to see you there.

My question for you is: What’s your single biggest takeaway? Or what’s the one thing that has you most concerned?