Generalized learning is usually just the first step. Coding LLMs typically go through more rounds of specialized training afterwards to tune and focus them toward solving those types of problems. Then there’s RAG, MCP, and simulated reasoning, which are technically not training methods but do further improve the relevance of the outputs. There’s still a lot of ongoing work in this space; the standard approach hasn’t even settled yet.
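To make the RAG part concrete, here’s a minimal sketch of retrieval-augmented generation using nothing but bag-of-words cosine similarity. The function names and documents are hypothetical; real systems use learned embeddings and a vector store, but the shape of the idea is the same: retrieve relevant context at query time and prepend it to the prompt.

```python
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = Counter(query.lower().split())
    scored = sorted(docs, key=lambda d: cosine(q, Counter(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Prepend retrieved context to the user's question."""
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}"

docs = [
    "RAG retrieves relevant documents at query time.",
    "Fine-tuning adjusts model weights on specialized data.",
]
print(build_prompt("how does fine-tuning work", docs))
```

The point of the sketch is that RAG changes the model’s *input*, not its weights, which is why it sits outside the training pipeline.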
An AI crawler is both. It uses LLMs to extract useful information from websites in order to produce higher-quality training data. They’re also used for RAG.
Doesn’t work either
The text you provided translates to:
“But what about typing like this?” This style of writing involves replacing standard Latin letters with similar-looking characters from other alphabets, or adding diacritical marks (accents, tildes, umlauts) available in the Unicode standard.
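Here’s a small sketch of what that substitution looks like in practice, and why it’s shallow: Unicode normalization plus a look-alike table largely reverses it. The `HOMOGLYPHS` map is an illustrative sample I made up, not a real confusables database.

```python
import unicodedata

# A few Latin letters mapped to look-alikes: the first three are Cyrillic,
# the last is t-with-cedilla (illustrative sample, not an exhaustive table).
HOMOGLYPHS = {"a": "а", "e": "е", "o": "о", "t": "ţ"}

def obfuscate(text: str) -> str:
    """Swap in visually similar non-Latin characters."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)

def deobfuscate(text: str) -> str:
    """Undo diacritics via NFKD, then map look-alikes back."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Cyrillic look-alikes don't decompose, so map them back explicitly.
    reverse = {v: k for k, v in HOMOGLYPHS.items()}
    return "".join(reverse.get(ch, ch) for ch in stripped)

masked = obfuscate("but what about typing like this")
print(masked)               # looks the same, but the code points differ
print(deobfuscate(masked))  # round-trips back to plain Latin text
```

If a few lines of stdlib Python can normalize it away, it’s not much of an obstacle for a tokenizer-level preprocessing pass either.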
Yeah, much like with the thorn, LLMs are more than capable of recognizing when they’re being fed Markov gibberish. Try it yourself. I asked one to summarize a bunch of keyboard auto-complete junk:
> The provided text appears to be incoherent, resembling a string of predictive-text auto-complete suggestions or a corrupted speech-to-text transcription. Because it lacks a logical grammatical structure or a clear narrative, it cannot be summarized in the traditional sense.
I’ve tried the same with posts with the thorn in them, and it’ll explain that the person writing the post is being cheeky, and still successfully summarize the information. These aren’t real techniques for LLM poisoning.
VoterFrog@lemmy.world to science@lemmy.world • How seeing the new color 'olo' opens the realm of vision science

So olo because it’s the middle of color?
Programming@programming.dev • The Efficiency Paradox: Why Making Software Easier to Write Means We'll Write Exponentially More

Maybe “fallacy” is a better word than “paradox”? Take a look at any AI-related thread and it’s filled to the brim with people lamenting the coming collapse of software development jobs. You might believe that this is obvious, but to many, many people it’s anything but.
Depends on whether you consider the fastening of a surface to itself to be a real change to the topology.
Science Memes@mander.xyz • How about the digestive system?

A straw has zero holes. It’s just a flat piece of plastic wrapped around and attached to itself. Ain’t nobody drilling holes through plastic to make straws.
What the hell is call ID?
No, really, if you know what call ID is, you’re old.
Programming@programming.dev • AI’s Unpaid Debt: How LLM Scrapers Destroy the Social Contract of Open Source

I can read your code, learn from it, and create my own code with the knowledge gained from your code, without violating an OSS license. So can an LLM.
Not even just an OSS license. No license backed by law is any stronger than copyright. And you are allowed to learn from or statistically analyze even fully copyrighted work.
Copyright is just a lot more permissive than I think many people realize, and a lot of good comes from that. It’s enabled things like API emulation and reverse engineering, and it’s why we can leave a programming job to go work somewhere else without getting sued.
Programming@programming.dev • AI-authored code needs more attention, contains worse bugs

Yeah, I don’t think we should be pushing to have LLMs generate code unsupervised. It’s an unrealistic standard; it’s not even a standard most companies would entrust their most capable programmers with. Everything needs to be reviewed.
But just because it’s not working alone doesn’t mean it’s useless. I wrote like 5 lines of code this week by hand. But I committed thousands of lines. And I reviewed and tweaked and tended to every one of them. That’s how it should be.
Programming@programming.dev • AI-authored code needs more attention, contains worse bugs

I’m not sure I get your analogy. To me this is more like two people got into a bath, and one went “Ooh, that’s a bit too warm” while the other screamed “REEEEEEE HOOOOOT”. The degree is the same; the response is not.
Programming@programming.dev • AI-authored code needs more attention, contains worse bugs

Kinda funny, the juxtaposition between the programmers’ reaction to this and the “techies’” reaction on the crosspost.
Maybe we’re still early yet, so I’ll record the difference right now for posterity: the Programming post is generally critical of the article and has several suggestions on how to improve the quality of agent-assisted code.

The Technology post is pretty much just “REEEEEEEEEEE AI BAD”.
Programming@programming.dev • LLM's hallucinating or taking our jobs?

You’re not going to find me advocating for letting the code go into production without review.
Still, that’s a different class of problem than the LLM hallucinating a fake API. That’s a largely outdated criticism of the tools we have today.
Programming@programming.dev • LLM's hallucinating or taking our jobs?

> I’ve thought about this many times, and I’m just not seeing a path for juniors. Given this new perspective, I’m interested to hear if you can envision something different than I can. I’m honestly looking for alternate views here; I’ve got nothing.
I think it’ll just mean that they start their careers involved in higher-level concerns. It’s not like this is the first time that’s happened. Programming, even just prior to the release of LLM agents, was completely different from programming 30 years ago. Programmers have been automating junior jobs away for decades, and the industry has only grown, because the fact of the matter is that cheaper software, at least so far, has just created more demand for it. Maybe it’ll be saturated one day, but I don’t think today’s that day.
Programming@programming.dev • LLM's hallucinating or taking our jobs?

Agents can now run compilation and testing on their own, so the hallucination problem is largely irrelevant. An LLM that hallucinates an API quickly finds out that it fails to work and is forced to retrieve the real API and fix the errors. So it really doesn’t matter anymore; the code you wind up with will ultimately work.
The only real question you need to answer yourself is whether or not the tests it generates are appropriate. Then maybe spend some time refactoring for clarity and extensibility.
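The generate-compile-test loop described above can be sketched in a few lines. Everything here is hypothetical: `fake_model` stands in for the LLM call, and real agents use richer tooling than "run the file and read stderr", but the feedback mechanism is the same.

```python
import pathlib
import subprocess
import sys
import tempfile

def run_candidate(code: str) -> tuple[bool, str]:
    """Write candidate code to a temp file and execute it; errors come back as text."""
    path = pathlib.Path(tempfile.mkdtemp()) / "candidate.py"
    path.write_text(code)
    result = subprocess.run([sys.executable, str(path)],
                            capture_output=True, text=True)
    return result.returncode == 0, result.stderr

def agent_loop(generate, max_rounds: int = 3) -> str:
    """Feed failures back to the model until a candidate passes or we give up."""
    feedback = ""
    for _ in range(max_rounds):
        code = generate(feedback)        # stand-in for the LLM call
        ok, errors = run_candidate(code)
        if ok:
            return code
        feedback = errors                # hallucinated APIs surface here as errors
    raise RuntimeError("no passing candidate")

# Stub "model": first emits a hallucinated function call,
# then corrects itself after seeing the NameError in the feedback.
def fake_model(feedback: str) -> str:
    if "NameError" in feedback:
        return "print(abs(-1))"
    return "print(not_a_real_api(-1))"

print(agent_loop(fake_model))
```

The first round fails with a `NameError` because the API doesn’t exist; that error text becomes the next prompt, and the second round produces working code. That’s the loop doing the hallucination cleanup for you.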
Programming@programming.dev • LLM's hallucinating or taking our jobs?

There are bad coders, and then there are bad coders. I was a teaching assistant through grad school, and in industry I’ve interviewed the gamut of juniors.
There are tons of new grads who can’t code their way out of a paper bag. Then there’s a whole spectrum up to and including people who are as good at the mechanics of programming as most seniors.
The former are absolutely going to have a hard time. But if you’re beyond that, you should have the skills necessary to critically evaluate an agent’s output. And any extra time juniors get to spend on the higher-level discussions going on around them is a win in my book.
I don’t think it’s wrong, just simplified. You don’t really have to touch the photon, just affect the wave function: the statistical description of the photon’s movement through space and time. Detectors, polarizers, anything that can be used to tell exactly which path the photon took through the slits will do this. Quantum eraser experiments just show that you can “undo the damage” to the wave function, so to speak: you can get the wave function back into an unaltered state, but by doing so you lose the which-way information.
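The standard textbook way to make the “affect the wave function” point concrete: with no which-path information the two slit amplitudes add before squaring, so a cross term (the interference pattern) survives; marking the path entangles each amplitude with an orthogonal detector state, killing the cross term.

```latex
% No which-path info: amplitudes add, interference term survives
P(x) = \left|\psi_1(x) + \psi_2(x)\right|^2
     = |\psi_1|^2 + |\psi_2|^2 + 2\,\mathrm{Re}\!\left[\psi_1^*(x)\,\psi_2(x)\right]

% Path marked: each amplitude is tagged with an orthogonal detector state
% |d_1\rangle, |d_2\rangle with \langle d_1|d_2\rangle = 0, so the cross
% term vanishes and only the sum of two single-slit patterns remains
P(x) = |\psi_1|^2 + |\psi_2|^2
```

“Erasing” means projecting the detector states back onto a basis where they’re indistinguishable, which restores the cross term at the price of the which-way record.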
I think it’s reasonable if you consider the kind of physical situation it might represent.
You visit a farm and there are 2 unpackaged apples. There are also 5 packages that each hold 8 apples, but 5 have been removed from each. How many apples are there?
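Worked out, the physical setup reads as 2 loose apples plus 5 packages each holding 8 − 5 apples:

```python
loose = 2
packages = 5
per_package = 8 - 5            # 8 apples per package, 5 removed from each
total = loose + packages * per_package
print(total)                   # 2 + 5 * 3 = 17
```

Which is exactly the "a + b × (c − d)" structure the problem is presumably drilling.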
It’s worth noting that good IDE-integrated agents also have access to these deterministic tools. In my experience, they use them quite often, even for minor parts of their tasks that I would typically just type out.