• BlackEco@lemmy.blackeco.com
    79 upvotes · 1 month ago

    Also, it doesn’t respect robots.txt (the file that tells bots which pages they may access), unlike most AI scraping bots.
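    For anyone unfamiliar with how that check works on the crawler side, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `ExampleBot` name, and the `is_allowed` helper are all made up for illustration; a polite crawler would fetch the site's real robots.txt and run a check like this before every request.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed("OtherBot", "https://example.com/index.html"))
    print(is_allowed("OtherBot", "https://example.com/private/notes"))
    print(is_allowed("ExampleBot", "https://example.com/index.html"))
```

    Nothing enforces this; the file is only a request, which is why a scraper can simply skip the check.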

    • kboy101222@sh.itjust.works
      52 upvotes · 1 month ago

      My personal website, which primarily functions as a front end to my home server, has been getting BEAT by these stupid web scrapers. Every couple of days the server is unusable because some web scraper demanded every single possible page and crashed the damn thing.

    • object [Object]@lemmy.blahaj.zone
      1 upvote · 1 month ago

      Out of the 60 GB/month of traffic my website gets, 20 GB is because of ByteDance’s web scraper. I haven’t gotten around to blocking them, as bandwidth isn’t an issue, but damn do they send a lot of requests.
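      If blocking ever becomes worth the effort, one rough way to do it is to reject requests by User-Agent. The sketch below is a small WSGI middleware in Python; it assumes the site runs as a WSGI app and that the scraper identifies itself with a token like `Bytespider` (an assumption here, so check your own access logs first). `BLOCKED_TOKENS`, `is_blocked`, and `block_scrapers` are hypothetical names, not part of any library.

```python
# Assumption: the scraper's User-Agent contains this token; verify in your logs.
BLOCKED_TOKENS = ("bytespider",)


def is_blocked(user_agent: str, tokens=BLOCKED_TOKENS) -> bool:
    """Case-insensitive substring match against the blocked tokens."""
    ua = user_agent.lower()
    return any(token in ua for token in tokens)


def block_scrapers(app):
    """Wrap a WSGI app so that blocked user agents get a 403 response."""
    def middleware(environ, start_response):
        if is_blocked(environ.get("HTTP_USER_AGENT", "")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Everything else is passed through to the wrapped app unchanged.
        return app(environ, start_response)
    return middleware
```

      This only stops bots that send an honest User-Agent; a scraper that spoofs a browser string would need IP-based or rate-based blocking instead.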