SQL: Incorrect by Construction

(chreke.com)

24 points | by ingve 4 hours ago

11 comments

  • hilariously 11 minutes ago
    Generally I would recommend an append only data structure here, not a bunch of updates

      BEGIN TRANSACTION;
    
      IF EXISTS (
        SELECT 1
        FROM account_balances WITH (UPDLOCK, HOLDLOCK)
        WHERE owner = 'alice'
          AND balance >= 10
      )
      BEGIN
        INSERT INTO account_ledger (owner, amount, memo)
        VALUES
            ('alice', -10, 'transfer to bob'),
            ('bob',    10, 'transfer from alice');
      END
    
      COMMIT TRANSACTION;
    
    or if I wanted to use the update pattern since I am taking the lock anyway

      BEGIN TRANSACTION;
    
      UPDATE accounts WITH (UPDLOCK)
      SET balance = balance - 10
      WHERE owner = 'alice'
      AND balance >= 10;
    
      IF @@ROWCOUNT = 1
      BEGIN
        UPDATE accounts
        SET balance = balance + 10
        WHERE owner = 'bob';
      END
    
      COMMIT TRANSACTION;
  • crazygringo 30 minutes ago
    This is just like... SQL 101 for transactions and locking. These are basic, elementary concepts in databases.

    There's nothing "incorrect by construction".

    The author claims the original snippet "looks completely reasonable". It absolutely does not, if you know anything about client-server databases.

    • Bjartr 25 minutes ago
      Not going past the end of a string or array is C 101 and yet buffer overruns abound. Rust's memory model doesn't do anything you shouldn't already be doing in C, but it provides real value because in practice people are demonstrably bad at doing those things. So would a SQL variant or successor that made these "beginner" mistakes a lot harder to make.
  • traderj0e 1 hour ago
    I've encountered this dozens of times. It's not intuitive, but this implicitly locks the row from concurrent reads, where as SELECTing first won't:

      UPDATE accounts  
      SET balance = balance - 10  
      WHERE owner = 'alice' AND balance >= 10;
    
    Another possible surprise, say two xacts do this at the same time:

      INSERT INTO foo(num) (
        SELECT 1 WHERE NOT EXISTS (
          SELECT * FROM foo WHERE num = 1
        )
      );
    
    Without a UNIQUE on num, you get num=1 twice. Of course adding UNIQUE would prevent this, but what you might not expect is UNIQUE implicitly adds a lock too. So not only do you only get num=1 once, but also both xacts are guaranteed to succeed, which in some situations is an important distinction.

    Schools teach that databases are ACID, but in most cases they aren't by default, and enabling full ACID comes with other caveats and also a large performance hit.

    • mrits 56 minutes ago
      One issue is there were a lot of database enhancements and known side effects introduced at a time where not only was SQL a full time job, it was often paid a lot more and was the most senior engineer on the team.

      It has since become a tool of even front end engineers.

      • traderj0e 46 minutes ago
        Yeah I keep running into situations where everyone else is a backend engineer but only has cursory knowledge of databases. Maybe they think the DB is just some modular add-on you can swap out, or they understand it's the foundation of all their backend code but don't really know how to do it right.
  • chasil 1 hour ago
    This document relies strongly upon Transact-SQL:

    https://en.wikipedia.org/wiki/Transact-SQL

    A more universal industry standard is SQL/PSM, which originated from Oracle PL/SQL:

    https://en.wikipedia.org/wiki/SQL/PSM

    Demonstrating the flaws in question in the PSM standard would be more useful.

    • traderj0e 1 hour ago
      You can in fact show the same flaws in PL/SQL or whatever else
  • mrkeen 26 minutes ago
    > I want an alternative to SQL

    I don't think it's SQL itself - it's the DB vendors ship weak isolation so people aren't hit by deadlocks, isn't it?

    > Make transactions atomic by default

    Not the issue, right? It's the weak isolation.

  • taeric 1 hour ago
    SQL is intended as a declarative query language. That it is not the correct tool for imperative processing of updates feels expected? And mostly fine?

    Fair that things often grow beyond their original intent.

    • traderj0e 1 hour ago
      You run into the same issues without using the weird imperative syntax in this article.
      • taeric 19 minutes ago
        Fair. It would help if the examples didn't always fallback to processes that should have reconciliation procedures to cover when they differ. Double entry accounting combined with the necessary follow on processes can't be "coded away."

        Which is to say that I am definitely indexing on the idea that you would try and get a query language to encode processes being the problem, here.

  • bena 50 minutes ago
    You don't update balances, you enter transactions. Then you derive balances from transactions.

    You can even insert them in an unvalidated state, then validate them later. That way if you have two transactions that come one after another, it doesn't matter because you can process them sequentially anyway.

  • giancarlostoro 1 hour ago
    This assumes you don't do any sort of caching or use distributed systems that can cache the data and choose to hold off to write it all to the DB. The cached system can show both users the in-process transactions as well.
    • traderj0e 1 hour ago
      That introduces more questions, like are cache reads fully consistent
  • gonzalohm 1 hour ago
    If you need concurrency design your system for concurrency.

    Have a transactions table with the payer and receiver and calculate the current balance using the transactions.

    Each transaction must have a unique Id (pk)

    • traderj0e 1 hour ago
      That is actually worse, I've been there. It's good to keep logs like that, but you can't use that for locking, you need a separate balances table.

      Edit: Well another option is to add a "pending" col and do three separate db xacts: 1. insert pending=true row 2. select balance with pending debits deducted (which ages out pending rows older than 1min) 3. update row to pending=false if successful. This is a useful pattern if you're waiting on an external system too, but not good in this case where you're just trying to update in one DB.

      • cozzyd 1 hour ago
        your goal is to find if there is any combination of plausible transaction orders that results in a balance less than 0, so you can issue an overdraft fee.
        • traderj0e 1 hour ago
          The original stated goal is that we want to disallow overdraft. If you want to allow it instead, then there are some followup questions like do you want to limit how much they could overdraft. But this is meant to be an example of race conditions, not a real world bank.
    • grebc 1 hour ago
      This is the answer for any serious banking/accounting software.

      Balance is calculated & stored after the fact from a known correct value.

  • selimthegrim 2 hours ago
    Wonder what the [check-constraints] part meant or if it's a placeholder.
  • analyticsfs 1 hour ago
    [dead]