

That is cool! I’ve wanted to use a model like this but haven’t really looked into it.
Are you self-hosting the long-context LLM, or what are you using?
Context lengths are what kill a lot of my local LLM experiments.
Had no idea they were doing that, but that’s plausible
And yes, it does shock me that they can build a model this well and still fuck this up.
I just hold little sympathy for the employees.
XAI was founded in 2023, six months after Elon acquired Twitter and did his layoffs. Four months later, when XAI was publicly announced, Musk stated that a politically correct AI would be dangerous.
Anyone working at XAI already knew the game by then; they weren’t visa holders who got legacied in.
During a launch event Friday afternoon, the mogul argued that politically correct AI is “incredibly dangerous” because it requires the technology to provide misleading outputs, citing the lies told by HAL 9000, the murderous AI in Stanley Kubrick’s 1968 film, “2001: A Space Odyssey.”
I thought she was already a rubber stamp
In April, Nigeria asked Google, Microsoft, and Amazon to set concrete deadlines for opening data centers in the country. Nigeria has been making this demand for about four years, but the companies have so far failed to fulfill their promises. Now, Nigeria has set up a working group with the companies to ensure that data is stored within its shores.
Just onshoring the data center does not solve the problems.
You can’t be sure no data travels to US servers, some data does need to travel to US servers, and the entire DC is still subject to US software and certificate trust chains. It’s better, but not good or safe.
I need to channel my inner Mike Ehrmantraut at the US tech companies and government: you had a good thing going, you stupid son of a bitch. You had everything you needed and it all ran like clockwork. You could have shut your mouth, cooked, and made as much money as you needed, but you just had to blow it up, you and your pride and your ego.
Seriously, this is a massive own goal by the US government. This is a massive loss to US hegemony and influence around the world that’s never coming back.
It has never been easier to build sovereign clouds with off-the-shelf and open source tooling. The best practices are largely documented, the software is commoditized, there are plenty of qualified people out there these days, and governments staring down the barrel of existential risk finally have the incentive to fund these efforts.
In February, Grok 3 was briefly first on every chatbot arena metric. That was just a bit after DeepSeek R1 and o3, and the space has evolved a lot since then.
However, checking Wikipedia, it seems they juiced the metrics with unfair comparisons, letting Grok attempt each problem 64 times and reporting the best answer.
https://en.wikipedia.org/wiki/Grok_(chatbot)#Grok-3
I’m a bit surprised the Grok staff are capable enough to make Grok briefly the top-rated model, and yet incompetent enough not to know that putting things like this in the prompt poisons the model into always trying to be politically incorrect.
LLMs are like Ron Burgundy: if it’s in the prompt, they read it. Go fuck yourself, XAI.
It’s actually wild that it was the #1 LLM for a while in terms of accuracy and usefulness.
But they seemingly keep fucking around with it, trying to make it agree with Elon’s stupid politics.
Definitely, I’m just trying to share a foot gun I’ve accidentally triggered myself!
For your database test data, I usually write a helper that defaults those columns to base values so I can pass in lists of dictionaries; then the test cases are easier to modify and read.
It’s also nice because you’re only including the fields your unit test actually uses; the rest get valid defaults you don’t need to care about.
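Roughly this pattern (a sketch with a made-up `users` table; the column names and defaults would be whatever your schema needs):

    from typing import Any

    # Hypothetical defaults for a "users" table; every field is a valid base value.
    USER_DEFAULTS: dict[str, Any] = {
        "name": "Test User",
        "email": "[email protected]",
        "is_active": True,
        "plan": "free",
    }

    def make_users(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
        """Merge each partial row over the defaults, keeping only what the test overrides."""
        return [{**USER_DEFAULTS, **row} for row in rows]

    # A test then only spells out the columns it cares about:
    users = make_users([
        {"plan": "pro"},
        {"is_active": False, "email": "[email protected]"},
    ])
    # Insert `users` with whatever fixture/ORM helper the test suite already has.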
I don’t know of basic solutions that are super good, but Whisper and the Whisper derivatives are, I hear, decent for dictation these days.
I have no idea how to run them though.
One word of caution with AI search is that it’s weirdly vulnerable to SEO.
If you search for “best X for Y” and a company has a blog post about how their product solves that problem, the AI can happily summarize it into “users don’t like foolib because of …”. At least that’s been my experience looking for software vendors.
It’s a bit frustrating that finding these tools useful is so often met with “it can’t be useful for that”, when it definitely is.
More than any other tool in history, LLMs involve a huge dose of luck and a learning curve for how to ask the right things the right way. And those methods change and differ between models too.
It is truly terrible marketing. It’s been obvious to me for years the value is in giving it to people and enabling them to do more with less, not outright replacing humans, especially not expert humans.
I use AI/LLMs pretty much every day now. I write MCP servers and automate things with them, and it’s mind-blowing how productive it makes me.
Just today I used these tools in a highly supervised way to complete a task that would have been a full day of tedious work, all done in an hour. That is fucking fantastic; it means I get to spend that time on more important things.
It’s like giving an accountant excel. Excel isn’t replacing them, but it’s taking care of specific tasks so they can focus on better things.
On the reliability and accuracy front there is still a lot to be desired, sure. But for supervised chats where it’s calling my tools it’s pretty damn good.
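For the “calling my tools” part, this is roughly what a minimal MCP tool server looks like with the official Python SDK’s FastMCP helper. It’s just a sketch; the server name and tool below are made up, and a real server would wrap whatever internal APIs you want the chat to reach.

    from mcp.server.fastmcp import FastMCP  # pip install "mcp[cli]"

    mcp = FastMCP("ops-helpers")  # hypothetical server name

    @mcp.tool()
    def lookup_ticket(ticket_id: str) -> str:
        """Return a short summary for a ticket ID (stubbed here for illustration)."""
        return f"Ticket {ticket_id}: open, assigned to on-call"

    if __name__ == "__main__":
        # Runs over stdio so an MCP-aware chat client can discover and call the tool.
        mcp.run()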
Agents do that loop pretty well now, and Claude now uses your IDE’s LSP to help it code and catch errors in flow. I think Windsurf and Cursor do that too.
The tooling has improved a ton in the last 3 months.
If possible, convert those files to compressed parquet, and apply sorting and partitioning to them.
I’ve gotten 10-100 GB CSV files down to 300 MB-5 GB just by doing that.
That makes searching and scanning so much faster, and you can do it all with free, open source software like Polars and Ibis.
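Something like this with Polars (a sketch with made-up file and column names; Ibis or pyarrow work similarly, and partitioning by a low-cardinality column helps if you usually filter on it):

    import polars as pl

    SRC = "big_export.csv"        # hypothetical 10-100 GB CSV
    DST = "big_export.parquet"    # typically far smaller after compression

    # Stream the CSV lazily so it never has to fit in memory, sort by the column
    # you usually filter on (helps row-group pruning), and write zstd-compressed parquet.
    pl.scan_csv(SRC).sort("event_date").sink_parquet(DST, compression="zstd")

    # Later queries only touch the columns and row groups they need:
    recent = (
        pl.scan_parquet(DST)
        .filter(pl.col("event_date") >= "2024-01-01")
        .select(["event_date", "user_id", "amount"])
        .collect()
    )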
Definitely agree with you
From my experience most companies enshittify before the IPO to juice the metrics and boost their valuations (i.e., their payout).
The fact that they aren’t doing that yet is a positive sign.
But founders aren’t immune to billionaire brain rot, and years of exposure to constant sycophancy and wealth seem to turn nearly everyone into a greed-driven, soulless money vampire.
I once watched a 2 part DVD movie in the wrong order and I thought it was a bold director’s choice to not introduce any of the characters or explain the background.
I hate that every product has added “for AI” to their name and homepage.
I guess the investors are asking for it, but as a customer I can’t tell which products actually work for my use case anymore.
What?