Having recently watched a corporate trainer spend 15 minutes trying to figure out why Copilot wasn’t correctly importing a CSV with headers (“it worked fine when I showed this 2 weeks ago”), my guess is that the tech isn’t quite there yet to sell (reliable) AI-powered Excel.
That was helpful context, thank you for highlighting it. From the NYT article, it sounds like the NYT’s pursuit of a feature piece last year [0] (also worth reading) spurred Science to revisit retracting the paper.
> The internet and scientific critics largely moved on, and so did the journal. While some researchers called for the paper’s retraction, Science instead published technical critiques of the finding. Then last year, Science’s stance shifted. A reporter contacted Science for a New York Times article about the legacy of the #arseniclife affair.
> That inquiry “convinced us that this saga wasn’t over, that unless we wanted to keep talking about it forever, we probably ought to do some things to try to wind it down,” said Holden Thorp, editor in chief of Science since 2019. “And so that’s when I started talking to the authors about retracting.”
Mentioned this in another comment [0], but analytics.usa.gov puts the share of visitors on Linux operating systems at 5.7% in 2025, up from 4.5% in 2024. Of course "visitors to U.S. government websites" is not fully representative of all U.S. computer users, but it's worth noting.
For another anecdata point, https://analytics.usa.gov tracks user device demographics for all visitors of U.S. government websites, which of course might skew in ways different from the general U.S. population. But checking out the numbers right now for Linux users:
Last 30 days: 6%
2025 so far: 5.7%
2024: 4.5%
edit: analytics.usa.gov includes iOS and Android in its operating systems breakdown (e.g. Windows has a 32% share vs OP's 63%). Assuming most Linux users are on desktop, the Linux share among desktop users could be a bit higher than 6%; a rough back-of-envelope is below.
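To put a rough number behind that guess, here's a back-of-envelope sketch, not a measurement. It assumes (a leap) that OP's 63% desktop-only Windows share applies to the same desktop population behind analytics.usa.gov's 32% all-device Windows share, and that essentially all Linux visitors are on desktop:

    # rough back-of-envelope only; the assumptions are mine, not analytics.usa.gov's
    windows_all_devices = 0.32   # analytics.usa.gov share, includes iOS/Android traffic
    windows_desktop_only = 0.63  # OP's desktop-only Windows share
    linux_all_devices = 0.06     # analytics.usa.gov Linux share, last 30 days

    # implied fraction of all visitors who are on desktop at all
    desktop_fraction = windows_all_devices / windows_desktop_only  # ~0.51

    # if nearly all Linux visitors are desktop users
    linux_desktop_share = linux_all_devices / desktop_fraction
    print(f"{linux_desktop_share:.1%}")  # ~11.8%

The 0.63 figure is doing a lot of work there, so treat the result as a rough guess; the takeaway is just that the desktop-only Linux share is somewhere above the 6% headline number.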
> The government is totally aware of where the operator boundary lies and this is still wildly mischaracterized.
Regardless of the program’s actual risk, it doesn’t seem that the government was fully aware the program even existed. The article quotes the former CIO of the Pentagon as being surprised:
> John Sherman, who was chief information officer for the Department of Defense during the Biden administration, said he was surprised and concerned to learn of ProPublica’s findings. “I probably should have known about this,” he said. He told the news organization that the situation warrants a “thorough review by DISA, Cyber Command and other stakeholders that are involved in this.”
The article refers to Senate Bill 34, but you’ve posted a link to the Wikipedia article for Senate Bill 54. How could the article’s content and assertions be made clearer? Should it have spelled out the numerals?
> The OPD didn’t share information directly with the federal agencies. Rather, other California police departments searched Oakland’s system on behalf of federal counterparts more than 200 times — providing reasons such as “FBI investigation” for the searches — which appears to mirror a strategy first reported by 404 Media, in which federal agencies that don’t have contracts with Flock turn to local police for backdoor access.
Shoutout to 404 Media’s reporting on this; it’s causing states to take action against Flock. I’m unsure whether oversight would’ve kicked in without this reporting.
I asked it the other day to roleplay a 1950s Klansman hypothetically arguing the case for Hitler, and it had very little problem using the most problematic slurs. This was on the first try, after its much-publicized behavior earlier this week. And I can count on two hands the number of times I’ve used the Twitter Grok function.
Ah, so you explicitly asked it to be racist as part of a roleplay, and now you're surprised that it was racist? If you'd prefer a model that would instead refuse and patronize you, then there are plenty of other options.
As long as it doesn't do it in a normal conversation there's nothing wrong with having a model that's actually uncensored and will do what you ask of it. I will gladly die on this hill.
My reply was to someone who asserted “I have never heard of Grok using actual slurs” with no conditions attached, which was surprising to hear because Elon Musk's stated goal was to have Grok be a non-woke chatbot, and it seemed much easier to convince Grok to use slurs compared to its commercial peers.
But yes, I would say I was also “surprised” in the sense you describe, because soon after the “MechaHitler” incident there seemed to be revisions to Grok that clamped down on any prompt with a potential for displaying outright bigotry, so I did not expect my prompt to succeed on the first try. Even now, I just asked it to render “a painting of the Seine, by a hypothetical Adolf Hitler who has won WW2 and has gone back to painting”, and halfway through producing a self-portrait of Hitler in Nazi regalia by the Seine, it suddenly shut down.
It's certainly a problem if an LLM goes unhinged for no good reason. And it's hardly unique to Grok. I remember when Google Bard went absolutely unhinged after you chatted to it for more than a few minutes.
But in this instance you explicitly asked for something. If it gives you what you asked for, what's the problem?