• 0 Posts
  • 183 Comments
Joined 1 year ago
Cake day: June 23rd, 2023


  • You say “Not even close.” in response to the suggestion that Apple’s research can be used to improve benchmarks for AI performance, but then later say the article talks about how we might need different approaches to achieve reasoning.

    Now, mind you - achieving reasoning can only happen if the model is accurate and works well. And to have a good model, you must have good benchmarks.

    Not to belabor the point, but here’s what the article and study say:

    The article talks at length about the reliance on a standardized set of questions - GSM8K, and how the questions themselves may have made their way into the training data. It notes that modifying the questions dynamically leads to decreases in performance of the tested models, even if the complexity of the problem to be solved has not gone up.

    The third sentence of the paper (Abstract section) says this: “While the performance of LLMs on GSM8K has significantly improved in recent years, it remains unclear whether their mathematical reasoning capabilities have genuinely advanced, raising questions about the reliability of the reported metrics.” The rest of the abstract goes on to discuss (paraphrased in layman’s terms) that LLMs are ‘studying for the test’ and not generally achieving real reasoning capabilities.

    By presenting their methodology - dynamically changing the evaluation criteria to reduce data pollution and require models be capable of eliminating red herrings - the Apple researchers are offering a possible way benchmarking can be improved.
    Which is what the person you replied to stated.

    The commenter is fairly close, it seems.


  • Maybe you can.
    All the money I’d earmarked for kung fu lessons and a collection of random lethal weapons wound up going into pet care and hobbies. Besides, I definitely don’t have plot armor. I’d get popped by some junior security mall cop. They probably wouldn’t even have to shoot me. They’d run me over with their Segway, I’d fall, crack my head open, and they’d put a little skull and crossbones sticker on their scooter, like a WWII fighter pilot.


  • So are the old, children, and anyone not within the “best for capitalism” bell curve (which I just made up, but you get the sentiment).

    That’s basically a pro-capitalist argument used to justify a system which should not exist in the first place.
    If we’re discussing our thoughts of what should be, then I believe healthcare should be a function of government and a human right.


  • that happiness should not come at the others’ expense.

    It’s probably a failure of imagination, but I’ve never really understood this part of anti-acceptance sentiment.
    Any valid criticisms I’ve seen largely come down to regular old accessibility failures or capitalism making people believe airline seats should be miserably cramped.

    What is the trade off, in a real, not hypothetical sense?









  • Monument@lemmy.sdf.org to 196@lemmy.blahaj.zone: That was quick rule
    12 days ago

    No shit. I posted this on one of the first articles, where a commenter pointed out the headline was a lie and the first ruling just found that Georgia had no standing, and that the judge had transferred the case to Missouri.

    Which is more or less what happened the last time Biden tried to forgive student loans. Eventually Missouri was found to have standing, and all his efforts were thrown out.
    Aside from a nagging feeling that it was known this was going to happen, and this was all for political talking points, I wanted to info dump.
    A few tidbits from that prior lawsuit:

    • MOHELA supported loan forgiveness, although I can’t recall why. (I think it was about simplifying administration in the face of a bunch of loans that had already paid for themselves in terms of the interest collected. At this point the cost of maintaining the loans on their books and/or chasing accounts they can’t write off exceeds what they could expect to recover.)
    • MOHELA refused to be a plaintiff, and it was the state of Missouri claiming standing.
    • The state of Missouri only had standing due to a voluntary agreement where MOHELA would pay a certain percentage of revenue back to the state of Missouri - something it had not done for nearly a decade. Missouri’s standing was merely technical, and more or less unrealized.
    • Yet it still was used to fuck over millions of people, because Misery loves company.

    Someone indicated that a court of appeals would take this up - that just means it goes to the Supreme Court eventually, where they come up with some dumbfuck ruling.








  • Monument@lemmy.sdf.org to Microblog Memes@lemmy.world: Earbuds
    1 month ago

    My old car was a Kia. (Don’t hate me. It was 2009, and I was earning $19,000 a year.)

    I got a used model one trim level above base, which included the deluxe audio package. Basically, it included an aux input, and the crappy speakers had metal grilles instead of plastic ones.
    I spent years trying to figure out why the aux jack never worked, until in 2014 I took apart the insides, and then took it to a dealership to confirm that the factory had installed the standard wiring harness, which didn’t include connectors for the aux jack. They said it would be cheaper to buy a new car than it would be to have them fix the wiring.
    I wound up missing the aux roadtrip experience entirely, and replaced the radio with one that did Bluetooth.
    Bastards.