The Alexandrian

Archive for the ‘Politics’ category

[Image: Woman in Cybergear]

There’s been Discourse™ of late about the use of GenAI/LLMs in creating RPGs. Not the artwork in an RPG book (that’s a whole ‘nother kettle of fish), but the actual design and development of the game itself: Feeding game text into ChatGPT, Claude, or similar chatbots and asking it to critique, analyze, revise, or otherwise provide feedback.

If you know anything about how LLMs work, it will likely be immediately obvious why this is a terrible idea. But the truth is that a lot of people DON’T know how LLMs work, and that’s increasingly dangerous in a world where we’re drowning in their output.

Michael Crichton described the Gell-Mann amnesia effect: “You open the newspaper to an article on some subject you know well. In Murray’s case, physics. In mine, show business. You read an article and see the journalist has absolutely no understanding of either the facts or the issues. Often the article is so wrong it actually presents the story backwards—reversing cause and effect. (…) In any case, you read with exasperation or amusement the multiple errors in a story—and then turn the page to national or international affairs, and read with renewed interest as if the rest of the newspaper was somehow more accurate about far-off Palestine than it was about the story you just read. You turn the page… and forget what you know.”

Flipping that around, I think analyzing stuff like LLMs in arenas we’re familiar with is valuable because we can more easily see the failures and absurdities. My particular arena of expertise and familiarity — and one I think is likely shared by most of you reading this — is RPGs. So let’s use that familiarity as a lens for looking at LLMs.

Before we start, let’s set a couple baselines.

First, I don’t think AI is completely worthless. I also don’t think it’s the devil. Whether we’re talking about LLMs or some of the other recent technology that’s all getting lumped together as “AI” or “GenAI,” there are clearly specific ways of using those tools (and of building those tools) which can be ethical and valuable. I don’t think pretending otherwise is particularly useful in trying to prevent the abuse, theft, propaganda, systemic incompetence, and other misuse that’s currently happening.

Second, I am not an expert in LLMs. If you want a truly deep dive into how they work, check out the videos from Welch Labs. (For example, The Moment We Stopped Understanding AI.)

I think the key thing to understand about LLMs, however, is that they are, at their core, word-guessers: They are trained on massive amounts of data to learn, based on a particular pattern of words, what the next most likely word would be. When presented with new input, they can then use the patterns they’ve “learned” to “guess” what the next word or set of words will be.
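To make “word-guesser” concrete, here’s a toy sketch of the idea (my own illustration, not how any production model is actually built): a bigram model that picks the next word based purely on how often it followed the previous word in some training text. Real LLMs use neural networks, vastly more data, and much longer context windows, but the core job is the same: guess the next token.

```python
import random
from collections import defaultdict, Counter

# Toy "training data" -- a real model would see billions of words.
corpus = (
    "you roll the dice . you roll for initiative . "
    "the wizard casts a spell . the wizard rolls a saving throw ."
).split()

# Count which word follows which (a bigram model).
next_words = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_words[prev][nxt] += 1

def guess_next(word):
    """Pick a next word, weighted by how often it followed `word`."""
    counts = next_words[word]
    if not counts:
        return "."
    choices, weights = zip(*counts.items())
    return random.choices(choices, weights=weights)[0]

# Generate some "babble" starting from a prompt word.
word, output = "the", ["the"]
for _ in range(10):
    word = guess_next(word)
    output.append(word)
print(" ".join(output))
```

The output reads as locally plausible and globally meaningless, which is the point: there’s no model of wizards or dice anywhere in that code, just counts of which word tends to follow which.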

This is why, for example, LLMs were quite bad at solving math problems: Unless they’d “seen” a specific equation many times in their training data (2 + 2 = 4), the only pattern they could really pick out was X + Y = [some random number].

LLMs are actually still incredibly bad at math, but the “models” we interact with have been tuned to detect when a math problem is being asked (directly or indirectly) and use a separate calculator program to provide the answer. So they look significantly more competent than they used to.
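How exactly the commercial chatbots wire this up isn’t public, and the snippet below is purely illustrative (the regex, the routing logic, and the llm_babble placeholder are all my own invention), but the general pattern is easy to sketch: detect a math-shaped request and hand it to ordinary deterministic code instead of the word-guesser.

```python
import re

def looks_like_arithmetic(prompt: str) -> bool:
    # Crude check: does the prompt contain a bare arithmetic expression?
    return re.search(r"\d+\s*[-+*/]\s*\d+", prompt) is not None

def solve_arithmetic(prompt: str) -> str:
    # Hand the expression to ordinary, deterministic code.
    expr = re.search(r"\d+\s*[-+*/]\s*\d+", prompt).group()
    a, op, b = re.split(r"\s*([-+*/])\s*", expr)
    a, b = int(a), int(b)
    ops = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
           "*": lambda a, b: a * b, "/": lambda a, b: a / b}
    return f"{expr} = {ops[op](a, b)}"

def llm_babble(prompt: str) -> str:
    # Placeholder for the word-guesser itself.
    return "A plausible-sounding paragraph goes here."

def answer(prompt: str) -> str:
    if looks_like_arithmetic(prompt):
        return solve_arithmetic(prompt)   # calculator, not the model
    return llm_babble(prompt)             # everything else: word-guessing

print(answer("What is 37 * 48?"))   # -> 37 * 48 = 1776
```

The model looks smarter, but the arithmetic is being done by the same kind of code a pocket calculator runs.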

DESIGNING WITH CHATGPT

It’s truly remarkable how far what are fundamentally babble generators can take us. With nothing more than word-guessing, LLMs can create incredible simulacrums of thought. Every generation interprets human intelligence through the lens of modern technology — our brains were full of gears and then they were (steam) engines of thought before becoming computers — but it’s hard not to stare into the abyss of the LLM and wonder how much of our own daily discourse (and even our internal monologue?) is driven by nothing more than pattern-guessing and autonomic response. We see it in the simple stuff:

Ticket Taker: Enjoy the show!

Bob: Thanks! You, too!

But does that sort of thing go deeper than we’ve suspected?

Regardless, there’s one thing missing from LLMs: The ability to form mental models. They can’t read a text, form a mental model of what that text means, and then use that mental model. They can’t observe the world, think about it abstractly, and then describe their conclusions. All they can do is produce a stream of babbled text.

This is why the term “hallucinate” is deceptive when used to describe LLMs’ propensity for spreading misinformation. A “hallucination” would imply that the LLM has formed a false mental model of the world and is now describing that false understanding. But this is not, in fact, what’s happening. What’s happened is that it guessed a word and that word, while matching the patterns found in the model’s training data, did not conform to reality. It’s just words. There is no underlying mental model behind them.

It’s also why asking LLMs to critique anything more complex than the grammar of individual sentences is a waste of time. In order to meaningfully critique something, you have to be able to form a mental model of that thing, have deep and original thoughts about it, and then figure out how to express the conclusions you’ve drawn. An LLM can’t do any of that. At best, it can produce a simulacrum of criticism — a babble that you could perhaps use like a Rorschach blot to free associate your way to a useful insight.

Which brings us to the current movement convinced that they can meaningfully “collaborate” with ChatGPT when designing RPGs. They want to feed in the text of a rulebook and ask the LLM things like:

  • Can you build an encounter for 7th level characters?
  • Is this new class I’ve designed balanced with existing classes?
  • Are there any problems with the probability of my core mechanic?
  • What attribute scores and skills should I have?

In light of what we know, of course, we can immediately see the problem: ChatGPT can’t reliably do basic arithmetic, but it can definitely do complex analysis of mathematical game balance in a system of rules it fundamentally can’t understand?

Someone recently challenged me to “see for myself”: Take an existing RPG, feed the rules into ChatGPT, and ask it to playtest the rules.

This was an interesting challenge, and I’m certainly always open to being proven wrong.

Let’s take a peek at what happened.

CHATGPT DOES A PLAYTEST

I pointed ChatGPT to a copy of Lasers & Feelings, a one-page space opera RPG by John Harper. Then I asked it to generate a playtest transcript of the game.

The choice of game was deliberate: Using a one-page RPG should make it much easier for ChatGPT to analyze and use the entire text. The game is also limited to two mechanics: The action roll and Helping (which adds +1d6 to another character’s action roll). This was the challenge in its easiest form possible.

ChatGPT begins by saying this:

Lasers & Feelings — Playtest Session Transcript

GM: You’re the crew of the interstellar scout ship Raptor, alone while Captain Darcy recovers in a medical pod.

ChatGPT is combining two different sentences from the PDF. Then:

Your mission: find the source of a distress signal coming from an uncharted system.

This is incorrect. In Lasers & Feelings, the GM is supposed to roll on a mission generator. ChatGPT’s mission cannot be generated by the game. But let’s give it a pass here, because there are plenty of human GMs who would also ignore the mission generator and just riff.

ChatGPT now moves on to character creation. It does a decent job of identifying the four steps of character creation, but almost immediately screws up basic math:

Number: 3 (balanced but slightly leaning Lasers)

In Lasers & Feelings, the player picks a number for their character from 2-5. “A high number means you’re better at LASERS (technology; science; cold rationality; calm, precise action). A low number means you’re better at FEELINGS (intuition; diplomacy; seduction; wild, passionate action).”

The number 3, you’ll note, does not “lean Lasers.” It does the exact opposite: with a 3, your Lasers rolls succeed only on a 1 or 2, while your Feelings rolls succeed on a 4, 5, or 6.

Furthermore, there’s no such thing as “balanced” in Lasers & Feelings. The game specifically doesn’t give you the choice of a midpoint. The whole point is that there’s a tradeoff between Lasers & Feelings. ChatGPT has fundamentally misunderstood the core design principles and theme of the game.

After character creation, ChatGPT proceeds with a transcript of play, and it almost immediately makes a skill check:

Lee: I want to pilot us carefully in. That’s Lasers because it’s technical precision.

Lee rolls 1d6 + 1d6 (prepared) since they have experience navigating rock fields.

This is incorrect. Lee is a Pilot, which means he’s an expert in piloting. If he’s also prepared (as ChatGPT asserts), he should be rolling 1d6 + 1d6 (expert) + 1d6 (prepared).

GM: Your target number is 3 (your stat). Under 3 counts as a success for Lasers.

This is incorrect. A 3 would also count as a success (in addition to generating a Laser Feeling).

ChatGPT cites this blog post as a source for this, but the blog post summarizes the mechanic correctly. ChatGPT just screwed up.

Lee rolls: 2, 5. ✔️ One die under 3 — Success!

GM: You thread us through the asteroids. The distress beacon pings again. What do you do?

According to the rulebook: “If one die succeeds, you barely manage it. The GM inflicts a complication, harm or cost.”

The GM did not inflict a complication, harm, or cost. ChatGPT has screwed up again.
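For anyone who wants to check the math ChatGPT keeps fumbling, here’s a quick simulation of the action roll as I read the one-page rules (my own sketch, not official code, and it collapses “you do it well” and the critical result into a single “clean success” tier): roll 1d6, plus 1d6 if you’re an expert, plus 1d6 if you’re prepared; on a Lasers roll each die at or under your number succeeds, rolling the number exactly also gives you Laser Feelings, and exactly one success means you barely manage it.

```python
import random
from collections import Counter

def lasers_roll(number, expert=False, prepared=False):
    """One LASERS action roll, per the rules as quoted above."""
    dice = 1 + expert + prepared                  # 1d6, +1d if expert, +1d if prepared
    rolls = [random.randint(1, 6) for _ in range(dice)]
    successes = sum(r <= number for r in rolls)   # under the number, or exactly on it
    laser_feelings = any(r == number for r in rolls)
    if successes == 0:
        outcome = "failure"
    elif successes == 1:
        outcome = "barely (complication, harm, or cost)"
    else:
        outcome = "clean success"
    return outcome, laser_feelings

# Lee the Pilot: number 3, expert at piloting, and prepared -> 3d6.
random.seed(0)
tally, feelings = Counter(), 0
for _ in range(100_000):
    outcome, lf = lasers_roll(number=3, expert=True, prepared=True)
    tally[outcome] += 1
    feelings += lf
for outcome, count in sorted(tally.items()):
    print(f"{outcome}: {count / 1000:.1f}%")
print(f"Laser Feelings triggered: {feelings / 1000:.1f}%")
```

Run it and the “2, 5” roll from the transcript stands out even more: with three dice instead of ChatGPT’s two, Lee would have had noticeably better odds of a clean success.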

The “transcript” continues in this vein. Sometimes ChatGPT gets the rules right. It frequently doesn’t, in a wide variety of ways.

You can see the pattern and understand the root cause: ChatGPT can’t actually understand the rules of Lasers & Feelings (in the sense of having the words of the rulebook create a mental model that it can then use independent of the words) and, therefore, cannot truly use them. It can only generate a sophisticated pattern of babble, guessing what the next word of a transcript of a Lasers & Feelings game session would be, based on the predictive patterns generated from its training data.

And if it can’t understand the rules well enough to accurately call for a simple action roll, what possible insight could it have into the actual design of the game?

None, of course. Which is why, when I asked it what changes it would make to the game to reinforce the themes, it replied with stuff like:

  • The GM should only be allowed to inflict consequences that affect relationships. (Making the game functionally unplayable.)
  • Encourage players to switch modes between Feelings and Lasers by inflicting a -1d penalty to the next Feelings roll each time a character uses Lasers. (This rule would obviously have the exact opposite effect. Plus, it doesn’t recognize that many rolls only use 1d, so how would this rule even work?)

Maybe one of these nonsense ideas it generated will spark an idea for you, but it’s inspiration from babble. Mistaking it for actual critical insight would be a disastrous mistake.

AI GAME MASTERS

Reading ChatGPT’s “transcript” of play, however, it’s nevertheless impressive that it can produce these distinct elements and moments: The distress call isn’t from the rulebook. It’s been plucked out of the ether of its training data. When I mentioned earlier that it’s remarkable how much can be achieved with an ultra-sophisticated babble engine, this is the type of thing I was talking about.

Examples like this have led many to speculate that in the not-too-distant future we’ll see AI game masters redefine what it means to play an RPG. It’s easy to understand the allure: When you want to play your favorite game, you wouldn’t have to find a group or try to get everyone’s schedules to line up. You’d just boot up your virtual GM and start playing instantly. It’s the same appeal that playing a board game solo has.

Plus, most publishers know that the biggest hurdle for a new RPG is that, before anyone can play it, you first have to convince someone to GM it — a role which almost invariably requires greater investment of time, effort, and expertise. If there was a virtual alternative, then more people would be able to start playing. (And that might even end up creating more human GMs for your game.)

There will almost certainly come a day when this dream becomes a reality.

But it’s not likely going to come from simply improving LLM models.

This Lasers & Feelings “transcript” is a good example of why:

  • The PCs are following a distress signal.
  • It turns out that the distress signal is actually a trap set by bloodthirsty pirates. Two ships attack!
  • ChatGPT momentarily forgets that everyone is onboard ships.
  • We’re back in ships, but now there’s only one pirate ship.
  • And now they’re no longer pirates. They’re lost travelers who are hoping the PCs can help them chart a course home.

It turns out that the GM’s primary responsibility is to create and hold a mental model of the game world in their mind’s eye, which they then describe to the players. This mental model is the canonical reality of the game, and it’s continuously updated — and redescribed by the GM — as a result of the players’ actions.
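To make that responsibility concrete, here’s a deliberately bare-bones, hypothetical sketch of what “holding a mental model” looks like when you force it into code: an explicit record of canonical facts that gets updated when events happen and consulted before anything is described. (Everything here, the fields, the events, the functions, is invented for illustration; the point is that an LLM has no equivalent structure anywhere inside it.)

```python
# A hypothetical, bare-bones "mental model" of the game world: canonical
# facts that persist between utterances and get updated, not re-guessed.
world_state = {
    "pc_location": "aboard the Raptor",
    "hostiles": {"pirate ships": 2},
    "npc_disposition": "hostile",
    "open_threads": ["distress signal traced to uncharted system"],
}

def update_state(event: str):
    """Apply a fictional event to the canonical state."""
    if event == "one pirate ship destroyed":
        world_state["hostiles"]["pirate ships"] -= 1
    elif event == "pirates reveal they are lost travelers":
        # A GM can change the fiction -- but only by changing the model,
        # so every later description stays consistent with it.
        world_state["npc_disposition"] = "friendly"

def describe_scene() -> str:
    """Every description is generated FROM the state, never contradicting it."""
    ships = world_state["hostiles"]["pirate ships"]
    mood = world_state["npc_disposition"]
    return f"{ships} ship(s) on the scope, disposition: {mood}."

update_state("one pirate ship destroyed")
print(describe_scene())   # -> 1 ship(s) on the scope, disposition: hostile.
```

A human GM does this in their head; an AI GM would need to do it explicitly somewhere, because the transcript of what has been said is not the same thing as a record of what is true.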

And what is ChatGPT incapable of doing?

Creating/updating a mental model and using language to describe it.

LLMs can’t handle the fictional continuity of an RPG adventure for the same reason they “hallucinate.” They are not describing their perception of reality. They are guessing words.

The individual moments — maneuvering through an asteroid belt to find the distress signal; performing evasive maneuvers to buy time for negotiations; helping lost travelers find their way home — are all pretty good simulacra. But they are, in fact, an illusion, and the totality of the experience is nothing more than random babble.

And this is fundamental to LLMs as a technology.

Some day this problem will be solved. There are a lot of reasons to believe it will likely happen within our lifetimes. It may even incorporate LLMs as part of a large AI meta-model. But it won’t be the result of throwing ever greater amounts of compute at LLMs. It will require a fundamentally different — and, as yet, unknown — approach to AI.

Supreme Court Questions

When I was in high school, I became aware that the Republicans had made a conscious decision to weaponize the courts: They were angry when the Constitution kept telling them that their legislative agenda wasn’t legal. In 1968, Nixon dusted off an old term (“strict constructionist”) and they began crafting a new theory of law around it. It claimed to be all about applying a strict reading to the Constitution; to do nothing except read literally what was on the page and apply it in the most literal way possible.

And if that’s what strict constructionism actually was, I’d be all for it. But, of course, it isn’t. “Strict constructionism” is a code word for “conservative activism”. If it was actually about a strict reading of the Constitution, then, to pick just one example, its practitioners wouldn’t have so much difficulty finding the 14th Amendment in their copies. “Equal protection of the laws,” after all, is a pretty cut-and-dry statement.

This ideology of conservative activism got a huge boost in the ’70s when the Court found a “right to privacy” in the Constitution which, if applied logically, would allow the justices to overturn any law they felt like. It gave the “strict constructionist” movement the red meat it needed.

Since that point, the unabashed goal of the Republican party has been to stack the Court with conservative activists. And, by and large, the progressives in America have let them do it. I still have friends who talk proudly about voting for Nader in 2000. Many of them were complicit in Trump becoming President, either by staying home or by voting for a third-party candidate. And it’s not just presidential votes, either: Apathetic progressives have repeatedly handed Republicans congressional control. The result is that only three of the twenty justices appointed over the last 50 years have been appointed by a Democrat with a Democratic Congress.

So for about twenty years I’ve watched this slow motion trainwreck happening. And now that it’s finally arrived, it’s actually worse than anything I imagined when I was eighteen. Because it’s not just a matter of reversing what the “liberal” court had achieved. It extends beyond that. Kennedy’s replacement will lock in for at least 15-20 years a conservative majority which has already demonstrated that it will:

  • Prevent any form of election reform.
  • Go further than that, and explicitly allow Republicans to rig the electoral system.
  • Go even further than that, and allow Republicans to pass laws dismantling non-Republican political organizations.

And, yes, this conservative court will also dismantle any form of public healthcare, roll back the rights of anyone who isn’t a white, straight, Christian male, and do far more damage besides. But it is this fundamental, anti-democratic core of the new Republican ideology — an anti-democratic agenda which will now be ruthlessly enforced by the Supreme Court — which is the death knell of America.

We need to show up in 2018 and we need to show up in 2020 even more. And not just at the national level: Progressives need to win at the state level in a census year to undo a lot of the damage the Republicans have done over the last decade; and they need to continue winning at the state level consistently for many years to come to make it stick. But the truth is it may already be too late: The Republicans have waged a fifty year campaign to take the keys to the kingdom. Over the last decade, they’ve been working hard to rig the system. And now that they have the Supreme Court, they will use it to lock that rigging into place.

I was recently linked to this story on Facebook: U.S. Government Bans Native American Tribe From Protesting On Their Own Land – Send In Police To Remove Protesters.

As far as I can tell, the linked story is bullshit. First, it’s unclear which judicial action it’s reporting on. The article was written on September 7th, but the only judicial action on that day was actually a victory for Native American protestors.

Digging a little deeper, however, it appears that this is actually just a spam site that’s repackaging a story that got a lot of clicks on Facebook so that it can harvest some of that proven clickbait. It was most likely posted by an algorithm that noticed an uptick in Native American-related or pipeline-related stories on social media, and decided to copy-paste an earlier story on those topics which was a known success at attracting likes and shares.

The story it was copying, however, was actually just a spammy repackaging of actual reporting that had taken place several days earlier by Telesur.

Telesur’s story, however, wasn’t accurate. And their headline (“Native Americans Banned from Protesting Pipeline on Own Land”) was total bullshit. As Native News Online accurately reported, the judge’s order only prohibited them from physically interfering with construction. It didn’t ban them from protesting. Furthermore, the site covered by the judge’s order wasn’t actually on a Native American reservation, so it never banned them from ANYTHING “on their own land”.

So, to sum up: Inaccurate reporting tied to a completely inaccurate headline caused a bunch of fringe websites to post mock-outrage stories about something that wasn’t actually happening. One of those mock-outrage stories remixed the headline into a mostly fact-free rant masquerading as a news story and paired it to a really great photograph that caused people to click it and share it. Then some trashy sites noticed that the post was popular and duped it in order to harvest the advertising revenue.

The photograph, by the way, is actually of a Brazilian man from 2012: “An indigenous man stands as riot police stand guard during the UN Conference on Sustainable Development, or Rio+20, in Rio de Janeiro, Brazil, Wednesday, June 20, 2012. Brazil’s indigenous are protesting the government’s plan to construct the large Belo Monte hydroelectric dam in the Amazon.”

And that’s how most Americans are getting their news in 2016.

Which is a problem. Because, as we’ve just demonstrated, what the algorithms, systems, and mob psychology of social media select for is not the dissemination of truth. It is the dissemination of outrage. When you unthinkingly allow yourself to take in that outrage, you’re doing a disservice to yourself. And when you unthinkingly allow that outrage to drive your actions — even the simple action of hitting a Like or Share or Retweet or Up Vote button — you’re doing a disservice to everyone around you.

You’ll frequently hear authors and IP companies bitching and moaning about the fact that they don’t see a penny when their copyrighted material is sold on the used market. Even otherwise fairly intelligent folks like Isaac Asimov have irrationally believed that people buying used paperbacks were sticking daggers in their backs.

Even if we ignore the ethically tenuous position of people who want to sell you a toaster and then prohibit you from ever selling that toaster to somebody else (which a few weeks ago I would have considered hyperbole, but then the Ninth Circuit Court of Appeals decided it would be a good idea to gut consumer protection and ship American jobs overseas all in one fell swoop), the claim being espoused here is fundamentally nonsensical.

What they’re overlooking (either willfully or ignorantly) is the actual effect that being able to sell used books has on the original customer’s buying habits:

First, it influences their decision to buy. (“I’m willing to pay $50 for this textbook, but only because I know I can sell it back for $15 at the end of the semester.”) If they weren’t able to recoup a portion of their investment, they might never buy it in the first place.

Second, it amortizes risk. (“I dunno if this DVD is worth $20. But I guess if I don’t like it, I’ll be able to sell it for at least $8. $12 isn’t that much of a risk.”) Customers who can amortize their risk are more likely to buy. And if the product turns out to be good, they may not resell at all.

Finally, it injects fresh capital: The $10 you get from GameStop for your video game is often going right back into purchasing a brand new game at GameStop.

This effect is somewhat diffuse and may, therefore, not be clear when it comes to books or DVDs or video games. But it’s crystal clear when you look at the auto industry: X buys a $30,000 car from Ford. X sells it a couple years later to Y for $10,000 and uses that money to buy another $30,000 car. A couple years later X sells his new $30,000 car to Y for $10,000, while Y sells the original car to Z for $2,000.

Holy shit! Ford has lost all that money spent by Y and Z! X is ripping Ford off! … right?

Nope. Because X couldn’t afford to buy a $30,000 car every two years if he wasn’t selling to Y; and neither Y nor Z can afford $30,000 new cars. The money from Y and Z is, in fact, funneling right up the system and into Ford’s pocket. And everybody wins: Ford makes more money. X gets fancy new cars on a more frequent basis. Y and Z get cars they otherwise couldn’t afford.
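If the cash flow in that example is easier to see as arithmetic than prose, here’s the same tally in a few lines, using the illustrative prices from the example above (one two-year cycle once the used market is rolling):

```python
# One two-year cycle, using the illustrative prices from the example above.
new_price, used_price, old_price = 30_000, 10_000, 2_000

ford_revenue = new_price                       # Ford sells one new car, to X
x_net_cost   = new_price - used_price          # X buys new, sells the last car to Y
y_net_cost   = used_price - old_price          # Y buys from X, sells the old car to Z
z_net_cost   = old_price                       # Z buys the oldest car

print(f"Ford takes in:  ${ford_revenue:,}")
print(f"X pays (net):   ${x_net_cost:,}")      # $20,000
print(f"Y pays (net):   ${y_net_cost:,}")      # $8,000
print(f"Z pays:         ${z_net_cost:,}")
print(f"Total paid:     ${x_net_cost + y_net_cost + z_net_cost:,}")  # $30,000
```

The three of them together hand over exactly the price of the one new car Ford sold; the used sales just spread that cost across buyers who never would have given Ford $30,000 on their own.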

This is why nobody in the auto industry makes a new car that they can sell for $5,000, despite the obvious market for $5,000 vehicles. They’re already getting the money from the $5,000 market.

As virtually everyone in the world knows, there’s a massive oil spill in the Gulf of Mexico. I’m not going to spend a lot of time harping on details (since they’re well-known and you can Google ’em if you’re curious), but I have two thoughts on the matter I’d like to share.

First, blame.

Second, solutions.

THE BLAME

Figuring out who, exactly, is to blame for this catastrophe is going to play out over several months. Possibly years. But there are a couple of things which are abundantly clear:

(1) There’s something rotten with BP. When you’ve racked up 700+ safety violations at your deepwater drilling platforms and every other oil company has less than a dozen… well, it doesn’t take a genius to figure out that BP was doing something wrong.

(2) Under President Bush, the Minerals Management Service somehow managed to devolve into the sort of cocaine-snorting, sex-addled, graft-ridden machine of corruption one really only expects to see in Hollywood action blockbusters. This was part of the Bush Administration’s wider failure to maintain the robust regulatory agencies required by law. (See also No One Would Listen: A True Financial Thriller.) And the election of Obama didn’t magically fix these problems.

Since the Deepwater Horizon rig exploded, the MMS has approved 27 new offshore drilling projects. All but one of these were granted the same exemptions from environmental review as the Deepwater Horizon platform. Incredibly, these exemptions were granted because of the supposed implausibility of a spill resulting from deepwater drilling.

(3) President Obama isn’t to blame for the current spill. Nor is it clear to me what action he could reasonably be taking at this point to speed the progress of disaster efforts in the Gulf. (Getting angry or wearing a less-fancy shirt won’t actually accomplish anything, no matter what the brain-dead, narrative-addicted media tries to tell you.)

But where Obama does deserve to be smacked around is the fact that he decided to reverse course on his campaign promise not to allow off-shore drilling. Of course, there was no way for Obama to know that the Deepwater Horizon disaster was coming (and that, as a result, he was irreparably shooting himself in the foot and wasting what could have been amazing political capital and a complete vindication of his policies).

But what Obama should have known is what everyone who supported his opposition to off-shore drilling knew years ago: Off-shore drilling platforms are not some form of magical technology which is completely impervious to bad luck, bad design, or bad maintenance. Like everything else ever built by man, this technology is fallible. And, as we’re seeing, the environmental impact when something goes wrong can be huge.

THE SOLUTION

All that being said, I have the solution for stopping the oil spill.

This isn’t because I’m a genius. It’s because everyone involved already knows what the solution is: Drilling relief wells which can be used to relieve the pressure in the pipe.

[Diagram: Drilling Relief Wells]

Everything else going on in the Gulf of Mexico right now is a sideshow of bread and circuses designed to keep people mildly appeased and distracted until the relief wells finally reach the right depth. (Which isn’t anticipated to happen until August.) Relief wells are the only way we know to stop spills from blowouts.

We know this because all of this has happened before: On June 3rd, 1979, the Ixtoc oil well suffered a blowout. All of the same techniques being attempted at Deepwater Horizon were attempted at the Ixtoc: Garbage was dumped into the hole. Mud was pumped into it. Chemical dispersants were used. A massive Top Hat-like cap was unsuccessfully lowered into place. (It was called — and I wish I were kidding when I say this — SOMBRERO.)

And the only thing that finally stopped the Ixtoc blowout was the relief wells that were drilled to relieve the pressure. The Ixtoc well was not successfully capped until March 1980.

So here’s the hard, bitter truth: There is absolutely nothing that can be done about this spill until the relief wells currently being drilled are completed.

But here’s what needs to happen in the future: Instead of waiting for disaster to strike before beginning the relief wells (which will then take months to reach the necessary depth), oil companies should be REQUIRED to maintain two relief wells in addition to their main well at ALL of their ocean oil rigs.

The next time disaster strikes, these pre-drilled relief wells can be quickly connected to the main well, pressure can be rapidly alleviated, and the scope of the disaster can be rapidly contained.
