Don’t lie. Don’t cheat. Don’t hurt people. Be fair. Be helpful. Be honest. Be accountable.
Those are the values we’re trying to code into artificial intelligence. And as we build this shiny new mirror, we’re catching a glimpse of ourselves. It isn’t pretty, because if we take the AI alignment rulebook and point it at human history, we fail. Hard.
We’re out here demanding that machines do no harm, while we’ve spent millennia perfecting harm at industrial scale. We want our AIs to be helpful and beneficial, while entire generations of philanthropic and aid systems have stumbled over colonial power dynamics, self-interest, and top-down “solutions” that solve little and disempower much. We preach fairness in machine learning models, while living in societies built on systemic inequality, racism, slavery, apartheid, and extraction.
We don’t want AIs that manipulate or deceive us, while our own governments and corporations run misinformation campaigns, sell cigarettes with cancer disclaimers in fine print, and bury climate truths under a billion-dollar fossil PR strategy.
We say we want honest machines. But can we honestly say we’ve even built an honest civilization?
This isn’t a philosophical exercise. This is the raw audit of our species through the very standards we’re asking our machines to live by. Harmlessness. Beneficence. Fairness. Honesty. Accountability. These aren’t new concepts—they’re just newly sharpened. And we’re suddenly realizing we’ve never quite measured up.
What we’ve built in AI alignment is not just a safeguard for future tech. It’s a mirror. And that mirror is showing us something hard but obvious: the real alignment problem isn’t the machines. It’s us.
Because let’s face it: teaching an AI to avoid genocide is important. But maybe it’s time we fully grasped the fact that we industrialized it first. That between 1956 and 2016, 50 million people died in genocides. That we built weapons that could end human life on Earth several times over, and we call it “security.”
That we’re still pumping poison into oceans, cutting down forests at record speed, and packaging everything in microplastic tombs while nodding along to “sustainability” slogans. We’re not living aligned. We’re just pretending to.
And yet. And yet—we can do better. Because even in the wreckage, there’s a blueprint for something else.
We’ve seen the sparks of coherence. Civil rights movements. Anti-apartheid struggles. Truth commissions. Investigative journalism. Scientific rigor. These are not perfect tools, but they are proof that alignment isn’t a fantasy. It’s just hard. It requires work. Integrity. Humility. A willingness to be wrong—and to change.
And this is where it gets even more interesting: we’re not just building smarter machines. We’re building a new ethical standard, one that forces us to grow up. One that doesn’t just ask, “What do we want the machines to do?” but “Who do we need to become?”
There’s a term from AI safety that captures this: Coherent Extrapolated Volition. The idea is to build systems that do what humanity would want if we knew more, thought faster, were more the people we wished we were.
That’s a beautiful goal. But it pulls down the masks.
Because maybe the most advanced AI will never be more dangerous than a misaligned human with too much power. Maybe the machines are giving us a gift: a final chance to look in the mirror and ask—are we ready to live up to the values we keep trying to teach?
We don’t need to be perfect. But we do need to be honest. About who we’ve been. About what we’ve done. And about the enormous gap between our aspirations and our behavior.
The machines will only ever reflect us. If we want better outputs, we need better inputs. That starts with realignment—not just of the systems we’re building, but of the species doing the building.
The future won’t be written in code; it will be written in courage. In coherence. In whether we finally live up.