(skeletor is leading by example by adding that unnecessary apostrophe…)

        • AggressivelyPassive@feddit.de
          arrow-up
          43
          ·
          9 months ago

          I’m currently in a project where the client has a custom, but not entirely consistent or known, subset of UTF-8.

          They want us to keep the form content as it is, but remove the “bad” characters. Our current approach is to just forward everything as it is and wait for someone to complain. How TF am I supposed to remove a character without changing the message?

            • dan@upvote.au
              arrow-up
              8
              ·
              edit-2
              9 months ago

              Yeah I had a backend with poor support for anything that wasn’t ASCII

              PHP is like this. Poor Unicode support, but it treats strings as raw bytes so it usually works well enough. It turns out a programming language can take data from a form, save it to a database, then later load and render it, without having to know what those bytes actually mean, as long as the app or browser knows it’s UTF-8, for example through a Content-Type header or meta tag.

              The tricky thing is that all the standard string manipulation functions (strlen, substr, etc.) don’t handle Unicode properly at all: they deal with the number of bytes rather than the number of characters. You need to use the “multibyte” (Unicode-ready) equivalents like mb_substr, but a lot of PHP developers forget to do this and end up with string truncation code that cuts UTF-8 characters in half (e.g. if it’s truncating a long title with an emoji in it, it might cut the title off in the middle of the three or four bytes that represent the emoji and leave only some of them).
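
To make the byte-versus-character point concrete, here is a minimal sketch in Python rather than PHP. The function names and the sample title are made up for illustration; the naive version only mimics what a byte-oriented substr does to a UTF-8 string.

```python
def truncate_bytes_naive(text: str, max_bytes: int) -> bytes:
    """Byte-oriented cut, analogous to a byte-based substr on UTF-8 data."""
    return text.encode("utf-8")[:max_bytes]


def truncate_bytes_safe(text: str, max_bytes: int) -> str:
    """Keep at most max_bytes of UTF-8 without splitting a character."""
    cut = text.encode("utf-8")[:max_bytes]
    # The input was valid UTF-8, so the only thing errors="ignore" can drop
    # is the incomplete sequence left dangling at the end of the cut.
    return cut.decode("utf-8", errors="ignore")


def truncate_chars(text: str, max_chars: int) -> str:
    """Code-point-oriented cut, roughly what a multibyte-aware substr does."""
    return text[:max_chars]


title = "Hi 😀!"  # the emoji alone takes 4 bytes in UTF-8

broken = truncate_bytes_naive(title, 5)   # ends with half of the emoji
safe = truncate_bytes_safe(title, 5)      # emoji dropped entirely
by_chars = truncate_chars(title, 4)       # emoji kept whole
```

Note that slicing a Python string works on code points, not on what a user sees as one character, so an emoji built from several code points can still be split by truncate_chars.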

        • dan@upvote.au
          arrow-up
          5
          ·
          edit-2
          9 months ago

          You just need to ensure you validate character by character (NOT byte by byte) and allow characters in the Emoji Unicode ranges (which are well-defined in the Unicode standard). Using a library is a great idea though.
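
A rough sketch of what character-by-character validation could look like, in Python. The allow-list below is an assumption for illustration only: it covers printable ASCII plus a handful of emoji-heavy Unicode blocks and is nowhere near a complete definition of "emoji". The function names are hypothetical too.

```python
from typing import List

# Illustrative allow-list (an assumption, not a complete emoji definition).
ALLOWED_RANGES = [
    (0x20, 0x7E),        # printable ASCII
    (0x2600, 0x26FF),    # Miscellaneous Symbols
    (0x1F300, 0x1F5FF),  # Miscellaneous Symbols and Pictographs
    (0x1F600, 0x1F64F),  # Emoticons
    (0x1F680, 0x1F6FF),  # Transport and Map Symbols
    (0x1F900, 0x1F9FF),  # Supplemental Symbols and Pictographs
]


def is_allowed(ch: str) -> bool:
    """Check a single code point against the allow-list."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ALLOWED_RANGES)


def disallowed_chars(text: str) -> List[str]:
    """Return every code point in text that is not on the allow-list."""
    # Iterating a str yields code points, never individual bytes.
    return [ch for ch in text if not is_allowed(ch)]
```

This sketch would reject emoji that rely on joiners or variation selectors (U+200D, U+FE0F), which is one reason a maintained library is the safer route.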

        • poppy@lemm.ee
          arrow-up
          1
          ·
          9 months ago

          I had the same issue. (Or rather, cause of issues.) Some devices couldn’t identify it.

      • lud@lemm.ee
        arrow-up
        2
        ·
        9 months ago

        I had an emoji in my phone hotspot a while ago. Unfortunately I had to remove it after a while because some devices refused to connect.

    • Ottomateeverything@lemmy.world
      arrow-up
      75
      arrow-down
      2
      ·
      9 months ago

      To make sure millenials can’t read your password, 𝔀𝓻𝓲𝓽𝓮 𝓹𝓪𝓻𝓽 𝓸𝓯 𝓲𝓽 𝓲𝓷 𝓬𝓾𝓻𝓼𝓲𝓿𝓮.

      How would this mess with millennials? I think you mean gen z.

    • The Picard Maneuver@lemmy.worldOP
      arrow-up
      32
      arrow-down
      1
      ·
      9 months ago

      To make sure millenials can’t read your password, 𝔀𝓻𝓲𝓽𝓮 𝓹𝓪𝓻𝓽 𝓸𝓯 𝓲𝓽 𝓲𝓷 𝓬𝓾𝓻𝓼𝓲𝓿𝓮.

      Hey, millennials know cursive!

    • nezbyte@lemmy.world
      arrow-up
      15
      ·
      9 months ago

      CSVs are supposed to be comma-separated files. Microsoft deviated from the specification and decided some languages would use semicolons for CSVs.

      Source: StackOverflow

      • dan@upvote.au
        arrow-up
        6
        ·
        edit-2
        9 months ago

        Microsoft deviated from the specification

        There is no specification for CSV, which is why it’s such a mess and why different parsers and renderers have wildly different features. The closest thing to a spec is RFC 4180, but that RFC simply describes the most common features across several CSV implementations and is not actually a spec.

        I agree that it should be comma separated though. My understanding is that it caused issues in countries that use a comma as a decimal point.

        Also, Excel sometimes uses tabs rather than commas or semicolons.
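
For anyone who has to read these files, a minimal Python sketch of handling a semicolon-delimited export with decimal commas. The sample data is made up, and passing the delimiter explicitly assumes you already know which dialect you were handed.

```python
import csv
import io

# Hypothetical export from a locale that uses ',' as the decimal separator,
# which is why ';' ends up as the field separator.
data = "name;price\nwidget;1,50\ngadget;2,75\n"

rows = list(csv.DictReader(io.StringIO(data), delimiter=";"))

# The csv module only splits fields; turning "1,50" into a number is up to us.
prices = {row["name"]: float(row["price"].replace(",", ".")) for row in rows}

# Tab-separated input is the same parser with a different delimiter.
tabbed = list(csv.reader(io.StringIO("a\tb\n"), delimiter="\t"))
```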

      • nom345@sopuli.xyz
        arrow-up
        5
        ·
        9 months ago

        Using a comma would probably have caused more problems, as it is the decimal separator for those languages. My Excel also uses a semicolon instead of a comma in formulas when separating parameters. Some VBA scripts break under different language settings, and some formulas don’t translate automatically to a different locale, so they just give an error. Overall, using Excel in different locale setups is annoying.

        The best separator I have used is |, as I have never seen it in the input data. Commas and semicolons have both caused issues for me in the past, as they might pop up in the wrong places.
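
As a side note, a separator showing up inside the data only breaks things when fields are not quoted. A small Python sketch with made-up rows, assuming both the writer and the reader follow the usual double-quote convention:

```python
import csv
import io

# Made-up rows where fields contain the delimiter, commas, semicolons and quotes.
rows = [["id", "comment"], ["1", "a|b, c; d"], ["2", 'said "hi"']]

buf = io.StringIO()
# The default QUOTE_MINIMAL setting quotes only the fields that need it.
csv.writer(buf, delimiter="|").writerows(rows)
encoded = buf.getvalue()

# Reading back with the same delimiter recovers the original fields.
decoded = list(csv.reader(io.StringIO(encoded), delimiter="|"))
```

This only helps when you control both ends. If the consumer splits lines on the separator without honoring quotes, a rare character like | is still the pragmatic choice.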

    • jawa21@lemmy.sdf.org
      arrow-up
      13
      ·
      9 months ago

      Here’s my confusion: as soon as it is no longer separated by commas, it is by definition no longer a CSV. Is it an SCSV now?

    • rtxn@lemmy.world
      English
      arrow-up
      12
      ·
      9 months ago

      Z̵̫̖͚̳̖̖̰̩̀̆͐͒͝ä̸̛̻́̈́̌͂̽̈́l̷̤̥̖̝͙̅g̵̱̤͙͕̥̮͌̽o̸̡̦̙̬̘͎̪̥̔ ̴͔̙̞̱̗͒͊͊̽̀̑͌ẏ̵̛̻̾o̸̡͍̤͔͌ų̶̠͔̯̲̖͇̯̅̒̓̃̏̓͊r̷͎̪̗̤̄̊̃̚͝ ̵̢̰͔̀t̵̡̘̤̙͕͎̅͂͛̀̚ȩ̷͙̙̖̲̟͍̉̎͝x̷͇̦̝̼͗͋̊t̶̫̹̳̩͇̼̠͚̿͆̅̋̔̃͐͗!̶̧̛͕̮̻̞͎͇̹͆͛͘̕̚͠

        • Xyre@lemmus.org
          English
          arrow-up
          14
          ·
          9 months ago

          I emailed my bank about this a few years ago. Never heard back but to my surprise they actually updated the password restrictions! I should send another email asking for MFA and virtual cards…

          • veroxii@aussie.zone
            arrow-up
            3
            ·
            9 months ago

            Jeez mate you gotta get on that! You have the magic powers and you’re holding back civilization’s progress with your procrastination!

          • Gestrid@lemmy.ca
            English
            arrow-up
            1
            ·
            edit-2
            9 months ago

            virtual cards

            Do you mean tap-to-pay, or do you mean card numbers you can use for online purchases?

            • Xyre@lemmus.org
              English
              arrow-up
              1
              ·
              9 months ago

              I think a more apt description would be proxy cards. It’s relatively new, but it lets you create cards that are linked to your primary without ever issuing a plastic card. This way if fraud happens you only need to replace it for the services it was used on. Or if you happen to lose your physical card, you can have it replaced without affecting the others.

              • Gestrid@lemmy.ca
                English
                arrow-up
                1
                ·
                9 months ago

                I think this is the same thing as when I said

                card numbers you can use for online purchases

                I admittedly didn’t describe it very well, though.

        • Xin_shill@lemm.ee
          arrow-up
          9
          ·
          9 months ago

          Truly ancient COBOL running in the back is my only guess. Why they wouldn’t keep their authentication systems completely separate, with better security features and some sort of token-based access to the backend, is beyond my understanding of their backend.

        • NaibofTabr@infosec.pub
          English
          arrow-up
          4
          arrow-down
          1
          ·
          9 months ago

          This isn’t really true. If it were, the financial world would be incredibly unstable and untrustworthy, and nobody would keep their money in banks.

          Banks do tend to be behind the leading edge because their systems are thoroughly tested and have to be stable. They have to be regularly audited and there’s a lot of oversight. Change control processes are inherently slow. Given a choice between rapid and flexible or deliberate and reliable, banks will take the cautious route.

        • theneverfox@pawb.social
          English
          arrow-up
          2
          ·
          9 months ago

          Why is our money based on debt? Why do banks keep getting away with nearly collapsing the global economy? Why do private institutions have the right to coin currency?

          Because banks put themselves in extremely risky situations, and civilization is based on the idea that money has value and the law is enforced. So laws get passed whenever they’re in danger (usually self-inflicted).

          Banks have security through legislation. It’s extra illegal to hack them. And since that’s the case, what’s a little more risk for a little higher profit? -_-

    • Thomas@discuss.tchncs.de
      arrow-up
      9
      ·
      edit-2
      9 months ago

      To fuck with computers that don’t know how to do UTF-8, add a few emoji.

      Even better, add some byte sequences that are invalid UTF-8.
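
For the curious, a minimal Python sketch of how a strict decoder treats a few malformed sequences. The sample byte strings and the helper name are chosen for illustration; other languages and libraries may be more lenient than Python's decoder.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes as strict UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


samples = {
    "emoji": "😀".encode("utf-8"),        # valid 4-byte sequence
    "lone continuation byte": b"\x80",    # continuation with no lead byte
    "truncated sequence": b"\xf0\x9f\x98",  # emoji missing its last byte
    "overlong slash": b"\xc0\xaf",        # overlong encoding of '/'
}

results = {name: is_valid_utf8(raw) for name, raw in samples.items()}

# A lenient consumer can substitute U+FFFD instead of rejecting the input.
lenient = b"ab\xffcd".decode("utf-8", errors="replace")
```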