So today I discovered that a cron job holding non-reproducible state has died, and now our system is fucked.

The cron job doesn’t live inside any source control. This morning it entered a terminal state, and because it overwrites its state there’s no way to revert it.

I’m currently waiting for the database rollback, and in the meantime I have rewritten the job in a reproducible/idempotent way.
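For anyone wondering what "reproducible/idempotent" can look like for a job that overwrites its own state, here is a minimal sketch. It is not the actual rewrite; `save_state` and the `.prev` naming are made up for illustration. The idea is to skip the write when nothing changed, keep the previous state before replacing it, and swap the new file in with a rename.

```shell
#!/bin/sh
# Hypothetical sketch: function name and file layout are illustrative only.

# save_state STATE_FILE: read the new state from stdin and install it safely.
save_state() {
    state_file=$1
    # Temp file lives next to the target so the final mv is a plain rename.
    tmp=$(mktemp "${state_file}.tmp.XXXXXX") || return 1
    cat > "$tmp"

    # Idempotent: if the content is unchanged, leave the file and backup alone.
    if [ -f "$state_file" ] && cmp -s "$tmp" "$state_file"; then
        rm -f "$tmp"
        return 0
    fi

    # Keep the previous state so a bad run can be reverted.
    if [ -f "$state_file" ]; then
        cp -p "$state_file" "${state_file}.prev" || { rm -f "$tmp"; return 1; }
    fi

    # Replace the old state in one step instead of writing over it in place.
    mv -f "$tmp" "$state_file"
}
```

Running it twice with the same input leaves both the state file and the backup untouched, so a retry after a crash should be harmless.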

  • wise_pancake@lemmy.caOP
    2 months ago

    What’s extra frustrating is the previous guy did create a git repo of these types of hacks, but this one doesn’t live in it for no discernible reason.

      • wise_pancake@lemmy.caOP
        2 months ago

        He does charge a consulting fee to “fix” these issues

        Almost all of them are dumb shit like this, where something is built in super hacky and dumbass ways.

          • Agent641@lemmy.world
            2 months ago

            Judgement day postponed indefinitely due to “Object reference not set to an instance of an object”

            • Sherry@programming.dev
              2 months ago

              that might be a stupid question, but why would running all services in tmux be a bad idea? a co-worker of mine is doing exactly that right now, which is why I’m asking.

              • qaz@lemmy.world
                2 months ago
                1. They’re all gone when you restart
                2. It doesn’t properly deal with logging
                3. You can’t set up dependencies between services but that doesn’t matter due to point 1

                I recommend using systemd services and/or docker compose instead. systemd services are files that describe which program / script to run and when (like after networking is active or after a certain other service is loaded).
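As a rough illustration of the systemd route, the sketch below prints a minimal service unit. `render_unit`, `myapp`, and the paths are placeholders I made up; the directives themselves (`After=`, `Wants=`, `ExecStart=`, `Restart=`, `WantedBy=`) are standard unit options.

```shell
#!/bin/sh
# Hypothetical helper: prints a minimal systemd service unit to stdout.
# render_unit DESCRIPTION COMMAND
render_unit() {
    cat <<EOF
[Unit]
Description=$1
# Dependency ordering: start only after the network is up.
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=$2
# Restart the process if it exits with an error.
Restart=on-failure

[Install]
# Pulled in at normal boot once the unit is enabled.
WantedBy=multi-user.target
EOF
}

# Example usage (placeholder names, left commented out on purpose):
#   render_unit "My app" /usr/local/bin/myapp | sudo tee /etc/systemd/system/myapp.service
#   sudo systemctl daemon-reload
#   sudo systemctl enable --now myapp.service
#   journalctl -u myapp.service
```

Enabling the unit covers point 1 (it comes back after a reboot), the journal covers point 2, and `After=` covers point 3.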

              • swab148@lemm.ee
                2 months ago

                It’s not horrible, like it’ll do the job just fine, it’s just probably a better idea to use systemd and like, containers and whatnot, but I couldn’t be arsed to fiddle with all that for Jellyfin, caddy reverse proxy, and two modded Minecraft servers, so shell scripts and tmux won the day. It takes a little extra time to restart everything after an update, and maybe I’ll get the motivation to do things “correctly™” one day, but today is not that day.

  • arotrios@lemmy.world
    2 months ago

    This is almost exactly what happened to me on Monday, resulting in a fifteen hour day.

    My particular jenga piece was an Access query that none of my predecessors had deigned to document or even tell me about… but was critical to run monthly or you had obsolete data embedded deep within multi-million dollar reports.

    Thank god I don’t work on salary anymore, or I’d be really upset.

  • grrgyle@slrpnk.net
    1 month ago

    I have also mixed up crontab -l with crontab -r. 😔

    Let this be a lesson to start versioning your crontabs.

    • tiredofsametab@fedia.io
      1 month ago

      We never had our crons in source control, but I always saved it somewhere (usually on my machine and the target machine) so we had some history just in case of typing r instead of l for some reason. You can also create an alias called backupCrontab or something that runs the command for you and puts the output somewhere safe.
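Something like the following could serve as that backup helper. It is only a sketch: the function name, the default directory, and the `CRONTAB_CMD` override (there so it can be tested without touching a real crontab) are my own assumptions.

```shell
#!/bin/sh
# Hypothetical helper: save the current user's crontab to a timestamped file.
# backup_crontab [DIR]
backup_crontab() {
    dir=${1:-"$HOME/crontab-backups"}
    out="$dir/crontab-$(date +%Y%m%d-%H%M%S).txt"
    mkdir -p "$dir" || return 1

    # Write to a temp file first so a failed `crontab -l` leaves no empty backup.
    if "${CRONTAB_CMD:-crontab}" -l > "$out.tmp"; then
        mv "$out.tmp" "$out" && printf '%s\n' "$out"
    else
        rm -f "$out.tmp"
        return 1
    fi
}
```

It prints the path of the backup it wrote, so it can be run before any `crontab -e`, and pointing the directory at a git checkout gives you versioning on top.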

  • LainTrain@lemmy.dbzer0.com
    2 months ago

    Cron job that evals some base64 encoded string which is actually downloading a script from a personal GitHub repo of an IT guy who left…

  • MonkderVierte@lemmy.ml
    2 months ago

    Only tangentially related, but “What an elegant house of cards” is an insult I’m going to use someday.

  • JackbyDev@programming.dev
    2 months ago

    Had a similar thing once. Somehow, some way, the DBA copied and pasted something wrong. Oracle DB had some odd extra syntax for left and right joins that other DBs didn’t (or at least that I’d never seen). My best guess is that he auto-formatted out of habit and maybe it took those symbols out.

    It took a long time to find, because the only evidence something was wrong was that ONE of our customers wasn’t being billed for ONE product. Everyone else was fine. Basically they were using it in a very atypical way. The left joins made sure to include them in the billing even though they didn’t have whatever was on the right of that join. Everyone else did.

  • Godort@lemm.ee
    2 months ago

    Time to restore a whole machine backup to a VM with no network connectivity, and manually pull the command?

    • wise_pancake@lemmy.caOP
      2 months ago

      I was able to do that

      Turns out there was a second bug which triggered this one, and a bug I found in this script that I thought was responsible had been happening silently for months.

      Now three bugs are squashed