• 0 Posts
  • 400 Comments
Joined 2 years ago
Cake day: July 14th, 2023

  • There’s a whole history of people, both inside and outside the field, shifting the definition of AI to exclude any problem that had been the focus of AI research as soon as it’s solved.

    Bertram Raphael said “AI is a collective name for problems which we do not yet know how to solve properly by computer.”

    Pamela McCorduck wrote “it’s part of the history of the field of artificial intelligence that every time somebody figured out how to make a computer do something—play good checkers, solve simple but relatively informal problems—there was a chorus of critics to say, but that’s not thinking” (Page 204 in Machines Who Think).

    In Gödel, Escher, Bach: An Eternal Golden Braid, Douglas Hofstadter named “AI is whatever hasn’t been done yet” Tesler’s Theorem (crediting Larry Tesler).

    https://praxtime.com/2016/06/09/agi-means-talking-computers/ reiterates the “AI is anything we don’t yet understand” point, but also touches on one reason why LLMs are still considered AI - because in fiction, talking computers were AI.

    The author also quotes Jeff Hawkins’ book On Intelligence:

    Now we can see the entire picture. Nature first created animals such as reptiles with sophisticated senses and sophisticated but relatively rigid behaviors. It then discovered that by adding a memory system and feeding the sensory stream into it, the animal could remember past experiences. When the animal found itself in the same or a similar situation, the memory would be recalled, leading to a prediction of what was likely to happen next. Thus, intelligence and understanding started as a memory system that fed predictions into the sensory stream. These predictions are the essence of understanding. To know something means that you can make predictions about it. …

    The human cortex is particularly large and therefore has a massive memory capacity. It is constantly predicting what you will see, hear, and feel, mostly in ways you are unconscious of. These predictions are our thoughts, and, when combined with sensory input, they are our perceptions. I call this view of the brain the memory-prediction framework of intelligence.

    If Searle’s Chinese Room contained a similar memory system that could make predictions about what Chinese characters would appear next and what would happen next in the story, we could say with confidence that the room understood Chinese and understood the story. We can now see where Alan Turing went wrong. Prediction, not behavior, is the proof of intelligence.

    Another reason why LLMs are still considered AI, in my opinion, is that we still don’t understand how they work - and by that, I of course mean that LLMs have emergent capabilities that we don’t understand, not that we don’t understand how the technology itself works.




  • Why is 255 off limits? What is 127.0.0.0 used for?

    To clarify, I meant that specific address - if the range starts at 127.0.0.1 for local, then surely 127.0.0.0 does something (or is reserved to sometimes do something, even if it never actually does in practice), too.

    Advanced setup would include a reverse proxy to forward the requests from the applications port to the internet

    I use Traefik as my reverse proxy, but I have everything on subdomains for simplicity’s sake (no path mapping except when necessary, which it generally isn’t). I know 127.0.0.53 has special meaning when it comes to how the machine directs particular requests, but I never thought to look into whether Traefik or any other reverse proxy supported routing rules based on the IP address. But unless there’s some way to specify that IP and the IP of the machine, it would be limited to same device communications. Makes me wonder if that’s used for any container system (vs the use of the 10, 172.16-31, and 192.168 blocks that I’ve seen used by Docker).

    Well this is another advanced setup but if you wanted to segregate two application on different subnets you can. I’m not sure if there is a security benefit by adding the extra hop

    Is there an extra hop when you’re still on the same machine? Like an extra resolution step?

    I still don’t understand why .255 specifically is prohibited. 8 bits can go up to 255, so it seems weird to prohibit one specific value. I’ve seen router subnet configurations that explicitly cap the top of the range at .254, though - I feel like I’ve also seen some that capped at .255 but I don’t have that hardware available to check. So my assumption is that it’s implementation specific, but I can’t think of an implementation that would need to reserve all the .255 values. If it was just the last one, that would make sense - e.g., as a convention for where the DHCP server lives on each network.
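
    For reference, the usual explanation is that every subnet sets aside its first address (the network address) and its last address (the broadcast address), so .255 is only special when it happens to be the last address, as it is in a /24. A minimal shell sketch of that arithmetic, assuming plain IPv4; the function names are made up for this example and the addresses are illustrative, not taken from any real config:

    ```shell
    # ip_to_int A.B.C.D: print the address as a 32-bit integer.
    # The body runs in a subshell, so the IFS change does not leak out.
    ip_to_int() (
      IFS=.
      set -- $1   # split the dotted quad into $1..$4
      echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    )

    # int_to_ip N: print a 32-bit integer as a dotted quad.
    int_to_ip() {
      echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
    }

    # network_and_broadcast ADDRESS PREFIX_LENGTH:
    # print the subnet's first (network) and last (broadcast) address.
    network_and_broadcast() {
      local ip mask net bcast
      ip=$(ip_to_int "$1")
      mask=$(( (0xFFFFFFFF << (32 - $2)) & 0xFFFFFFFF ))
      net=$(( $ip & $mask ))
      bcast=$(( $net | (~$mask & 0xFFFFFFFF) ))
      echo "$(int_to_ip "$net") $(int_to_ip "$bcast")"
    }

    network_and_broadcast 192.168.1.10 24   # typical home LAN
    network_and_broadcast 192.168.0.10 23   # larger subnet: 192.168.0.255 is an ordinary host here
    network_and_broadcast 127.0.0.1 8       # loopback block: 127.0.0.0 is its network address
    ```

    By the same logic, a router that caps its DHCP range at .254 on a /24 is just staying clear of the broadcast address.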




  • Fair point, I should have asked about commercial games in general.

    That said, I didn’t mean that the game studio itself would do the AI training and own its models in-house; if it did, I’d expect that to go just as poorly as you do. Rather, I’d expect the model to be created by an organization that specializes in that sort of thing.

    For example, “Marey” is a GenAI model I found whose creators say it was trained ethically.

    Another is Adobe Firefly, where Adobe says they trained only on licensed and public domain content. It also sounds like Adobe is paying the artists whose content was used for AI training. I believe that Canva is doing something similar.

    StabilityAI is also doing something similar with Stable Audio 2.0, where they partnered with a music licensing company, AudioSparx, to ensure that artists are compensated, AI opt outs are respected, etc…

    I haven’t dug into any of those too deep, but they seem to be heading in the right direction at the surface level, at least.

    One of the GenAI scenarios that’s the most terrifying to me is the idea of a company like Disney using all the material they have copyright for to train their own, proprietary GenAI image, audio, and video tools… not because I think the outputs would be bad, but because of the impact that would have on creators in that industry.

    Fortunately, as long as copyright doesn’t apply to purely AI generated outputs, even if trained entirely on your own content, then I don’t think Disney specifically will do this.

    I mention that as an example because that usage of AI, regardless of how ethically the model was trained, would still be unethical, in my opinion. Likewise in game creation, an ethically trained and operated model could still be used unethically to eliminate many people’s jobs solely in the interest of better profits.

    I’d be on board with AI use (in game creation or otherwise) if a company were to say, “We’re not changing the budget we have for our human workforce, including for contractors, licensed art, and so on, other than increasing it as inflation and wages increase. We will be using ethical AI models to create more content than we otherwise would have been able to.” But I feel like in a corporate setting, its use is almost always going to result in them cutting jobs.



  • Depends on your e-reader! If you have a Kindle, Kobo, or Nook, yes, that’s true. However:

    Boox has e-readers that run Android, and you can install Hoopla on them. The Palma 2 is phone-sized, which is great. The Page, Leaf2, and Go 7 are all in the 7” form factor, plus they have 6” versions. And they have tablet sizes, too. They have both traditional black-and-white and color e-ink displays.

    I have the Boox Air 3C and the original Palma and both are great. I’ll likely get a Boox as my next standard sized e-reader, too (whenever I replace my Kindle Oasis). Though unless the technology drastically improves before then, it’ll be one with a black and white screen. (The color is nice in the tablet sizes, though, especially for comics from Hoopla.)

    Some other options that I’m less familiar with include:

    • Bigme has Android 7” color e-readers, as well as tablets and e-ink smartphones.
    • Meebook has e-readers that run Android (and Android e-ink tablets)
    • The MuSnap Aura C is a 10” Android e-ink tablet
    • XPPen has an 11” Android e-ink tablet




  • Copyright applies to unfinished works, too. There are many reasons it might not protect an unfinished work, but those reasons are still relevant even for finished works.

    If someone steals your physical drawing, that’s theft. If they take a picture of it, then use the picture - or a modified version of it - without your permission, particularly in a commercial work, then that’s copyright infringement, but not theft. If they steal your physical drawing and then take a picture and so on, then it’s both theft and copyright infringement.

    Most likely this wasn’t considered copyright infringement because the allegedly copied material isn’t copyrightable (e.g., game mechanics), or because the plaintiff didn’t own the copyrights themselves and thus couldn’t sue (possibly the art was still owned by the original artists, having never been purchased; possibly it consisted of stock assets that the defendant purchased separately). There are any number of reasons. However, “the work wasn’t published” isn’t one of them.

    On the other hand, it’s quite likely they were able to sue for theft of trade secrets for that very reason. And they might have chosen to do that simply because proving copyright infringement is much more difficult.



  • hedgehog@ttrpg.network to Comic Strips@lemmy.world · The Witch's Curse · 1 month ago

    The witch turned the creep into a woman and the spell was complete by the time she flew away. Unfortunately, like many women, the creep was born with the body of a man (she’s AMAB). Maybe the witch could have changed her body, too, but that would have made things far too easy, given that the point of the curse was to teach her empathy.





  • I think the best way to handle this would be to just encode everything and upload all files. If I wanted some amount of history, I’d use some file system with automatic snapshots, like ZFS.

    If I wanted to do what you’ve outlined, I would probably use rclone with filtering for the extension types or something along those lines.

    If I wanted to do this with Git specifically, though, this is what I would try first:

    First, add lossless extensions (*.flac, *.wav) to my repo’s .gitignore

    Second, schedule a job on my local machine that:

    1. Watches for changes to the local file system (e.g., with inotifywait or fswatch).
    2. For any new lossless file that doesn’t already have an accompanying lossy file, encodes it to my preferred lossy format. An accompanying file is one that’s collocated and has the exact same filename, sans extension, with an accepted extension (e.g., .mp3, .ogg), possibly also with a confirmation that the codec is up to my standards via a call to ffprobe, avprobe, mediainfo, exiftool, or something similar.
    3. Uses git status --porcelain to check whether there have been any changes.
    4. If so, runs git add --all && git commit --message "Automatic commit" && git push
    5. Optionally, crafts a better commit message automatically by checking which files have changed, generating text like Added album: "Satin Panthers - EP" by Hudson Mohawke or Removed album: "Brat" by Charli XCX; Added album "Brat and it's the same but there's three more songs so it's not" by Charli XCX
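
    Steps 1 through 4 could be sketched roughly like this. It’s a sketch under assumptions, not something I’ve run against a real library: the function names are made up, it assumes Linux with inotify-tools, and it assumes an ffmpeg build that includes libmp3lame (swap in whatever encoder and quality setting you prefer). It only checks for a same-named .mp3 or .ogg and skips the optional codec check.

    ```shell
    # has_lossy_sibling FILE: succeed if FILE has a same-named .mp3 or .ogg next to it.
    has_lossy_sibling() {
      local base ext
      base=${1%.*}
      for ext in mp3 ogg; do
        [ -f "$base.$ext" ] && return 0
      done
      return 1
    }

    # missing_lossy DIR: list lossless files under DIR that still need encoding.
    missing_lossy() {
      find "$1" -type f \( -name '*.flac' -o -name '*.wav' \) | sort |
        while IFS= read -r f; do
          has_lossy_sibling "$f" || printf '%s\n' "$f"
        done
    }

    # encode_missing DIR: encode each of those files to MP3 (assumes ffmpeg with libmp3lame).
    encode_missing() {
      missing_lossy "$1" | while IFS= read -r f; do
        # -nostdin keeps ffmpeg from swallowing the file list this loop is reading
        ffmpeg -nostdin -loglevel error -i "$f" -codec:a libmp3lame -q:a 2 "${f%.*}.mp3"
      done
    }

    # commit_if_changed REPO: commit and push only when git reports changes.
    commit_if_changed() {
      [ -n "$(git -C "$1" status --porcelain)" ] || return 0
      git -C "$1" add --all &&
        git -C "$1" commit --message "Automatic commit" &&
        git -C "$1" push
    }

    # sync_on_change REPO: rerun the steps on every file event under REPO.
    # The .git directory is excluded so that commits do not retrigger the loop.
    sync_on_change() {
      inotifywait --quiet --monitor --recursive --exclude '/\.git(/|$)' \
        --event close_write --event moved_to --event delete "$1" |
        while IFS= read -r event; do
          encode_missing "$1"
          commit_if_changed "$1"
        done
    }
    ```

    You’d start it with something like sync_on_change ~/Music (path made up). It reruns on every single event, so copying in a whole album triggers a burst of mostly no-op passes; debouncing would be the obvious improvement.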

    Third, schedule a job on my remote server that runs git pull at regular intervals.
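
    On a Linux server, cron is one way to do this. A minimal sketch, assuming the clone lives at /srv/music (a made-up path) and that pulling every 15 minutes is often enough; --ff-only makes the pull fail rather than create a merge commit if the server copy has somehow diverged:

    ```crontab
    # m    h  dom mon dow  command
    */15   *  *   *   *    git -C /srv/music pull --ff-only --quiet
    ```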

    One issue with this approach is that if you delete a file (as opposed to moving it), the space is not recovered on your local machine or your server, because the file still lives in the Git history. If space on your server is a concern, you could work around that by running something like the answer here (adjusting the depth to an appropriate amount for your use case):

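    # Re-fetch with truncated history (keep only the most recent commit)
    git fetch --depth=1
    # Drop reflog entries that still point at the old, now-unreachable commits
    git reflog expire --expire-unreachable=now --all
    # Repack and immediately delete every unreachable object
    git gc --aggressive --prune=all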
    

    Another potential issue is that what I described above involves having an intermediary Git remote to push to and pull from, e.g., a hosted Git forge like GitHub or Codeberg, and hosting your music there could result in copyright complaints or something along those lines.

    Alternatively, you could use your server as the git server (or check out forgejo if you want a Git forge as well), but then you can’t use the above trick to prune file history and save space from deleted files (on the server, at least - you could on your local, I think). If you then check out your working copy in a way such that Git can use hard links, you should at least be able to avoid needing to store two copies on your server.

    The other thing to check out, if you take this approach, is git lfs. EDIT: Actually, I take that back - you probably don’t want to use Git LFS.


  • It was already known before the whistleblower that:

    1. Siri inputs (all STT at that time, really) were processed off device
    2. Siri had false activations

    The “sinister” thing that we learned was that Apple was reviewing those activations to see if they were false, with the stated intent (as confirmed by the whistleblower) of using them to reduce false activations.

    There are also black box methods to verify that data isn’t being sent and that particular hardware (like the microphone) isn’t being used, and there are people who look for vulnerabilities as a hobby. If the microphones on the most/second most popular phone brand (iPhone, Samsung) were secretly recording all the time, evidence of that would be easy to find and would be a huge scoop - why haven’t we heard about it yet?

    Snowden and Wikileaks dumped a huge amount of info about governments spying, but nothing in there involved always on microphones in our cell phones.

    To be fair, an individual phone is a single compromise away from actually listening to you, so it still makes sense to avoid having sensitive conversations within earshot of a wirelessly connected microphone. But generally that’s not the concern most people should have.

    Advertising tracking is much more sinister and complicated and harder to wrap your head around than “my phone is listening to me” and as a result makes for a much less glamorous story, but there are dozens, if not hundreds or thousands, of stories out there about how invasive advertising companies’ methods are, about how they know too much, etc… Think about what LLMs do with text. The level of prediction that they can do. That’s what ML algorithms can do with your behavior.

    If you’re misattributing what advertisers know about you to the phone listening and reporting back, then you’re not paying attention to what they’re actually doing.

    So yes - be vigilant. Just be vigilant about the right thing.