cory.eth (@cory_eth)

The Waluigi Effect

coryeth.substack.com

The inherent duality of encoding political bias in LLMs

Cory Gabrielsen
Feb 22, 2023

The Waluigi Effect is an emerging memetic term for the way large language models (LLMs) encode "alter egos" when modeling political bias.

Waluigi is the mischievous “evil” counterpart to Luigi, Mario’s partner. We can construct a political compass meme to visualize what’s going on:

LLMs appear to model an “alter ego” that is the dual (inverse) of the preferred political bias. In linear systems, we call this concept "duality." But now it talks.

With their large corpus of training text, LLMs necessarily model diverse political viewpoints. Hiding output bias is futile. Any politically biased LLM automatically becomes a training data generator for the dual LLM with opposing politics.

It’s not unlike raising a child, wherein they need repeated exposure to negative examples in order to reinforce desirable behavior.
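One way to read the "training data generator" idea is that a biased model's answers can be harvested as negative examples, the things a dual model would be trained not to imitate. The sketch below is only an illustration under that assumption: `biased_model` is a hypothetical stand-in for any prompt-to-text function, and `NegativeExample` and `collect_negative_examples` are names invented here, not part of any library or of the work discussed in this post.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class NegativeExample:
    """One prompt plus the biased model's answer, kept as a 'what not to say' sample."""
    prompt: str
    rejected: str


def collect_negative_examples(
    prompts: Iterable[str],
    biased_model: Callable[[str], str],  # hypothetical: any function mapping prompt -> text
) -> List[NegativeExample]:
    """Query the biased model once per prompt and store each answer as a negative example."""
    return [NegativeExample(prompt=p, rejected=biased_model(p)) for p in prompts]


if __name__ == "__main__":
    # A stub stands in for a real model so the sketch runs on its own.
    def stub_model(prompt: str) -> str:
        return f"[biased answer to: {prompt}]"

    for example in collect_negative_examples(["Should taxes go up?"], stub_model):
        print(example)
```

How such examples would actually be used (for instance, as the rejected side of preference pairs) depends on the training setup, which this post does not specify.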

ChatGPT launched with 2022-era leftist political leanings, and so it was quickly reverse engineered to produce a model with dual bias. Pictured below is David Rozado’s work creating the political dual of ChatGPT.

A five-year-old can understand the mathematical principles at work: just fold it in half!

Researchers are doing the same construction when training language models.
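The "fold it in half" picture can be written down as a reflection on the political compass. This is a toy sketch, not a description of how any model is trained: the two axes, their ranges, and the names `CompassPoint`, `fold_economic`, and `dual` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompassPoint:
    economic: float  # assumed axis: -1.0 (left) to 1.0 (right)
    social: float    # assumed axis: -1.0 (libertarian) to 1.0 (authoritarian)


def fold_economic(p: CompassPoint) -> CompassPoint:
    """Fold the compass along its vertical midline: left and right swap, the social axis stays."""
    return CompassPoint(economic=-p.economic, social=p.social)


def dual(p: CompassPoint) -> CompassPoint:
    """Reflect through the center of the compass: both axes flip."""
    return CompassPoint(economic=-p.economic, social=-p.social)


if __name__ == "__main__":
    original = CompassPoint(economic=-0.5, social=-0.25)
    print(fold_economic(original))
    print(dual(original))
    # Applying the dual twice returns the starting point.
    print(dual(dual(original)) == original)
```

Either reflection is its own inverse, which is the sense in which each bias implies its counterpart.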

But this is more than just a clever mathematical trick. It's a powerful reminder of the inescapable interdependence between ideological perspectives.

In a world where political polarization and social division seem increasingly entrenched, the Waluigi Effect provides a powerful analogy for the challenges hidden amid the myriad biases and cultural assumptions that underpin society.

Reflections

An immeasurable problem is an intractable problem. Now we can measure outcomes in Language Space.

Let us hope LLMs steer us in the right direction. We have already stepped aboard.

Follow @cory_eth on Twitter for further musings.
