deleted by creator
Who downvotes this? Chris Sawyer is the GOAT.
Thank you. I almost forgot
In this thread: Programmers disassembling the joke to try and figure out why it’s funny.
Cute. It would be funnier if it was correct.
For people interested in the difference between decompiled machine code and source code I would recommend looking at the Mario 64 Decomp project. They are attempting to turn a Mario 64 ROM into source code and then back into that same ROM. It’s really hard and they’ve been working on it for a long time. It’s come a long way but still isn’t done.
I thought they were done already?
There is still some stuff that needs documenting, but the original goal of recompiling the created source code into the ROMs has been achieved. People are still actively working on it, so in that sense it’s maybe never done.
Okay, boomer here, be gentle.
So back in the ‘70s I dabbled in programming (now called “coding”, I hear). I only did higher-level languages like Fortran, Cobol, IBM Basic, but a friend had a job (at age 13!) programming in assembler. Is assembler now called assembly, or are they different?
It’s still called programming, coding is the same thing. Assembler more commonly refers to the utility program that converts the assembly code to machine code while assembly refers to the code itself, but the term assembler code is also valid. It’s uncommon to simply call the code assembler because it would be easily confused with the utility program.
Yep, some call it assembly, others call it assembler
(at age 13!)
c/suddenlyfactorial
Easier to say than “at age 6227020800”
I thought that the assembler is a specific program that translates mnemonics into the corresponding machine code. Perhaps in early computing this was done by hand so a person was the assembler (and worked in assembler), but now that is handled by software (and supports various macros). So programming in assembly would generate a stream of text that must be assembled by an assembler. (Although I have heard people refer to programming in assembler as well, just not often.)
I hear people say “program in assembler” but IMO that’s wrong. I’d say you write the code in “assembly language” (or better yet, the actual architecture you’re using like “x86 assembly”) but you “assemble” it with an “assembler”. Kind of like how you could write a program in the “C language” and “compile” it with a “compiler”
A compiler and an assembler do wildly different things though. An assembler simply replaces mnemonics with their machine code, while a compiler translates instructions into a whole other language.
Depends on the language, really… C maps pretty closely to assembly language, it’s not as simple as one mnemonic to one machine code byte, more like tokens get mapped to sequences of machine code, a function call translates to some code that sets up a stack frame, a return tears it down…
I was too young/poor to afford an assembler for my 6502 so I wrote out the assembly longhand on a legal pad and then manually converted each operation to machine code.
Needless to say my programs done this way were exceptionally simple, but it’s interesting to understand the underlying code.
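For anyone curious what that hand conversion looks like, here is a minimal sketch of a toy assembler in Python. It assumes a tiny subset of 6502 opcodes and only three operand forms (none, `#$nn`, `$nnnn`); the function names are made up for illustration, and a real assembler also handles labels, more addressing modes and macros.

```python
# Toy illustration only: a handful of 6502 opcodes, not a real assembler.
OPCODES = {
    ("LDA", "imm"): 0xA9,  # LDA #$nn   load accumulator with an immediate byte
    ("STA", "abs"): 0x8D,  # STA $nnnn  store accumulator at a 16-bit address
    ("JMP", "abs"): 0x4C,  # JMP $nnnn  jump to a 16-bit address
    ("INX", "imp"): 0xE8,  # INX        increment the X register
    ("NOP", "imp"): 0xEA,  # NOP        do nothing
    ("RTS", "imp"): 0x60,  # RTS        return from subroutine
}


def assemble_line(line: str) -> bytes:
    """Translate one 'MNEMONIC [operand]' line into machine code bytes."""
    parts = line.split()
    mnemonic = parts[0].upper()
    if len(parts) == 1:
        return bytes([OPCODES[(mnemonic, "imp")]])
    operand = parts[1]
    if operand.startswith("#$"):
        # Immediate operand: opcode followed by one byte.
        return bytes([OPCODES[(mnemonic, "imm")], int(operand[2:], 16)])
    if operand.startswith("$"):
        # Absolute operand: the 6502 stores addresses low byte first.
        address = int(operand[1:], 16)
        return bytes([OPCODES[(mnemonic, "abs")], address & 0xFF, address >> 8])
    raise ValueError(f"unsupported operand: {operand}")


def assemble(source: str) -> bytes:
    """Assemble a multi-line program, ignoring ';' comments and blank lines."""
    out = bytearray()
    for raw in source.splitlines():
        line = raw.split(";")[0].strip()
        if line:
            out += assemble_line(line)
    return bytes(out)


program = """
LDA #$01    ; load the value 1
STA $0200   ; store it at address $0200
RTS
"""
print(assemble(program).hex(" "))  # a9 01 8d 00 02 60
```

Doing this on a legal pad is the same table lookup, just slower and with more erasing.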
It just occurred to me that AI in the nearish future will probably/almost certainly be able to do this.
I can’t wait for AI to make a PC port of every console game ever so that we can finally stop using emulators.
This won’t happen in our lifetime, and not only because this is more complex than rambling vaguely correlated human speech while hallucinating half the time.
deleted by creator
That doesn’t really translate to neural nets though. There is nothing inherent about matrix multiplication that would make it good at reading code. And also, computers aren’t reading code, they are executing it. The hardware just reads instruction by instruction and performs each one; it has no idea what the high-level purpose of what it is doing actually is.
Half of programming is writing code, the other half is thinking about the problem. As I learn more about programming, I feel that it is even more about solving problems.
It’s the other way round. Code is being written to fit how a specific machine works. This is what makes Assembly so hard.
Also, by design there is no understanding required. A machine doesn’t “get” what you are trying to do, it just does what is there.
If you want a machine to understand what specific code does and modify that for another machine that is extremely hard because the machine would need to understand the semantics of the operation. It would need to “get” what you were doing which isn’t happening.
I think it’ll be in our lifetime just not anytime soon. I feel like AI is gonna boom like the internet did. Didn’t happen overnight and not even in a year but over 35ish years
Off the shelf models do this, yes.
Sophisticated locally trained models on expensive private hardware are already dunking on publicly available versions. The problem of hallucination is generally resolved in those contexts.
Sure, but until I see such a thing I choose not to believe in fairy tales.
Decompiling arbitrary architecture machine code is quite a few levels above everything I’ve seen so far which is generally pretty basic pattern recognition paired with statistics and training reinforcement.
I’d argue decompiling arbitrary machine code into either another machine code or legible higher-level code is in a whole other league than what AI has proven to be capable of.
Especially because with this being 90% accurate is useless.
Again you aren’t seeing this because these models are being developed for private enterprise purposes.
Regarding deep machine code analysis, sure, that’s gonna take work but the whole hallucination thing is an off the shelf, rookie problem these days
It’s not, though. Hallucinations are inherent to the technology, it’s not a matter of training. Good training can greatly reduce the likelihood, but cannot solve it.
Training doesn’t solve hallucination. I didn’t say that
Why does a pre-trained model need expensive private hardware after it was trained, other than to handle API requests faster? Is OpenAI training ChatGPT on inferior hardware compared to these sophisticated private versions you mentioned?
The fine tuning, while much more efficient than starting fresh, can still be a large amount of work.
Then consider that your target corpus of data may also be large.
Then consider that running your reasoning tasks across that corpus also takes strong hardware to get production-ready response times.
No, OpenAI isn’t using inferior hardware, but their model goals, token chunking strategies and overall corpus are generalist in nature.
There are also processing strategies teams are using to go beyond the “memory” limitations GPT-4 has, which provide massive benefits to coherency: essentially anti-hallucination and better overall reasoning.
Idk the specifics, but what you say makes it sound like it would be easier to create an AI that recreates a game based on gameplay visuals (and the relevant controls)
That game would still not work because there is a ton of hidden state in all but the simplest computer games that you cannot tell from just playing through the game normally.
An AI could probably reinvent Flappy Bird because there is no more depth than what is currently on screen but that’s about it.
AI prompt: make me a program that will convert PS5 games to PC
AI: Use Convert-PS5GameToPC
End of line
AI can literally read minds. I don’t think it’s that great of a step to say it should be able to decompile a few games.
About half the time, the text closely – and sometimes precisely – matched the intended meanings of the original words.
Don’t be surprised but about half of the time I can predict the result of a coin flip.
I’m not saying it’s not interesting but needing custom training and an fMRI is not “an AI can read minds”
It can see if patterns it saw previously reappear in a heavily time-delayed fMRI. Looking for patterns you already know isn’t such an impressive feat. Computers have done this for ages now.
It literally can’t read minds.
It was a staple of Asimov’s books that while trying to predict decisions of the robot brain, nobody in that world ever understood how they fundamentally worked.
He said that while the first few generations were programmed by humans, everything since that was programmed by the previous generation of programs.
This leads us to Asimov’s world in which nobody is even remotely capable of creating programs that violate the assumptions built into the first iteration of these systems - are we at that point now?
No. Programs cannot reprogram themselves in a useful way and are very very far from it.
Eh, I’d say continuous training models are pretty close to this. Adapting to changing conditions and new input is kinda what they’re for.
Very far from reprogramming though. The general shape of the NN doesn’t change, you won’t get a NN made to process images to suddenly process code just by training it.
Then how does polymorphic/self-modifying code work?
It doesn’t or do you have serious applications for self-modifying code?
are we at that point now?
Nope, but we’re getting there.
Do what?
It’s honestly remarkable how few people in the comments here seem to get the joke.
Never stop dissecting things, y’all.
As above so below, the microscopic and the macroscopic
Joke aside, that’s kind of like claiming that any web frontend is open source because you can access the built, minified and often obfuscated source of it.
So true! I have been “hacking” some chrome extensions recently, do you know of a tool for reverse engineering JS?
IDA Pro (a disassembler) is closed source but came with a license that allowed disassembly and binary modification. Unfortunately, that’s no longer the case.
Why not use that NSA tool they released
Ghidra is open source even before you run the disassembler 🤯 great anecdote
I feel old watching this meme template
deleted by creator
What about server-side executed code?
You moved the goal post! No fair!
Metasploit becomes your “decompiler”.
If you wanna skip a few inconvenient instructions in x86 assembly, throw a few No Operation instructions in the right places.
NOP = 0x90
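A minimal sketch of that patching trick in Python, assuming you already know the offset of the instruction you want gone. The byte string is a made-up snippet (`85 C0` is `test eax, eax`, `75 05` is a short `jnz`, `C3` is `ret`), and `nop_out` is just a name picked for this example.

```python
NOP = 0x90  # single-byte x86 NOP


def nop_out(code: bytes, offset: int, length: int) -> bytes:
    """Return a copy of `code` with `length` bytes at `offset` replaced by NOPs."""
    if offset < 0 or length < 0 or offset + length > len(code):
        raise ValueError("patch range is outside the code")
    patched = bytearray(code)
    patched[offset:offset + length] = bytes([NOP]) * length
    return bytes(patched)


# Hypothetical snippet: test eax, eax / jnz +5 / ret
original = bytes.fromhex("85c07505c3")

# Overwrite the two-byte conditional jump so execution always falls through.
patched = nop_out(original, offset=2, length=2)
print(patched.hex(" "))  # 85 c0 90 90 c3
```

The patch keeps the file the same length, which is the whole point: every other offset in the binary stays valid.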
And so you add a hashing check. But then that can be removed.
So you need one in the OS but that can be removed.
So you need one in hardware.
In other words no matter how clever you are there’s always a way to monkey with something unless you have absolute control from silicon on up.
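To make the cat-and-mouse concrete, here is a rough sketch of the software-only hashing check in Python. The byte strings and function names are invented for illustration; the only real APIs used are `hashlib.sha256` and `hmac.compare_digest`.

```python
import hashlib
import hmac


def digest(code: bytes) -> str:
    """SHA-256 of the code, as a hex string."""
    return hashlib.sha256(code).hexdigest()


def is_untampered(code: bytes, expected_digest: str) -> bool:
    """Compare the current digest against the one recorded at build time."""
    return hmac.compare_digest(digest(code), expected_digest)


# Hypothetical original code and the digest the vendor shipped with it.
original = bytes.fromhex("85c07505c3")
shipped_digest = digest(original)

# Someone NOPs out the conditional jump.
patched = bytes.fromhex("85c09090c3")

print(is_untampered(original, shipped_digest))  # True
print(is_untampered(patched, shipped_digest))   # False
```

The check catches the patch, but `is_untampered` is itself just code sitting in the same binary, so the next patch simply makes it always return True. That is why the chain ends up rooted in hardware.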
Here’s a really interesting video the Xbox team did on the challenges of trying to make sure that the content running wasn’t pirated.
While DRM is the bane of everybody there are cases where trust and integrity is important and it’s an intriguing look into how hard it is to manage.
While DRM is the bane of everybody there are cases where trust and integrity is important and it’s an intriguing look into how hard it is to manage.
Nah, when the user wants to ensure trust and integrity in his own system, it works just fine. The problem comes when the user who needs to be able to access the data is simultaneously the adversary who needs to be stopped from accessing the data.
In other words, it’s one of those situations where the fact that it’s hard to manage is a gigantic clue that it’s wrongheaded to try to do so in the first place.
I agree. I mean when doing secure channel communications or weapons systems or health biometrics.
There are cases where you need to be sure of the integrity of the data and environment
It’s called _soft_ware for a reason 😹
Meanwhile, I’ve been archiving terabytes of software with no DRM, with no account.
OS - obfuscated source
Open source ≠ Source available
Example of non open source programs with source code https://en.m.wikipedia.org/wiki/List_of_proprietary_source-available_software
Open source ≠ free software
Open source inherently means you can compile the code locally, for free. You can’t necessarily redistribute it, depending on the license, but I’m not aware of a “you can compile this source for testing and code changes only but if you use it as your actual copy you are infringing” license.
I am very much open to correction here.
Open source inherently means you can compile the code locally,
Open Source means more than that. It is defined here:
If you use the phrase “open source” for things that don’t meet those criteria, then without some clarifying context, you are misleading people.
for free.
Free Software is not the same as “software for free”. It, too, has a specific meaning, defined here:
https://www.gnu.org/philosophy/free-sw.html
When the person to whom you replied wrote “free software”, they were not using it in some casual sense to mean free-of-charge.
Most free software is also open source and vice versa, but not all. The difference usually lies in the licence; this Stack Exchange answer gets it pretty well
Free as in free speech, not as in free beer
Where is all that free beer I always hear about?!
Source available