Writing Composable SQL Using Knex and Pipelines

(lackofimagination.org)

58 points | by tie-in 235 days ago

11 comments

larodi 235 days ago
I really fail to understand why "writing HTML mixed with JS" is okay for React, and mixing regex with what-not is okay for...like...everyone, but it would not be okay for so many ORM people to actually have properly bound SQL in-place. Really boggles me down.
Besides - I've seen one of the best possible schema-on-the-fly-on-top-of-RDBMS systems that can ever exist, is in production, and even there it is much better every so often to fallback to either the Virtual SQL layer or to hardcore SQL when performance is needed. In all other occasions - well okay, but unless you actually have a proper MOP (meta-object-protocol or entity/relationship schema, name it what you like) in-place/available, the DB schema becomes the MOP and you are back to square one...
[-]
- 9rx 235 days ago
  > but it would not be okay for so many ORM people to actually have properly bound SQL in-place.
  It's okay... sometimes. But, as what the article is about, more often than not you need to start composing that SQL. SQL does not deal well with composability natively. Which means you need to bring your own solution to work around its limitations.
  In theory, you could parse the SQL and build up some kind of AST on which composition could be built. Or, you could write a set of functions in the application language which somewhat resemble SQL, as demonstrated in the article, that build the same AST. It turns out the latter is considerably easier to implement.
  [-]
  - larodi 230 days ago
    > SQL does not deal well with composability natively.
    not sure what you really mean by this one. I've seen quite complex SQL code, pages long, not PL/SQL , but SQL with intersects, CTEs, recursion and all... and it composes.
    it actually composes OK and on a very high level where reading skills similar to those needed to read math crypto-text are needed. is not easy, but the power you get from the declarative writing can never be matched by functional language in terms of readability and concise length. is the same with regexp - once you master it you never go back to writing for-loops slicing commas.
    i don't really take this argument that you want a little ORM here and there to save time implementing CRUDs, but then you invest heavily into describing the AST of the SQL by means of .dot.dot.dot OOP composition, which eventually is "translated" (transpiled if u want) into SQL again. So basically you need to understand structure on the ORM API level, but then still understand it all (for debugging) on SQL level. Seems doing the same two times.
    Besides not all queries are equally used statistically, and it is apparent you are going to want to optimize the critical stuff, while the trivial stuff - well you can compose CRUDs with simple join+binding and it's the same thing timewise.
    [-]
    - 9rx 230 days ago
      > not sure what you really mean by this one.
      The same as what everyone has meant, including the article. The naive solution is to simply write out every nearly the same query you need, but that starts to become difficult to maintain once you have more than a small handful of queries. More realistically you are going to want to introduce composition in a practical setting.
      The SQL solution is views, but they are complicated by being dependent on runtime state. There is a lot of complexity involved in ensuring that everything is in the right state. While that may be a decent tradeoff for some applications, the types of applications that lean on these query builders tend to be of the type where they are trying to prove a business hypothesis and whatever gets there the fastest is the tradeoff that needs to be made. A set of functions in the application language that somewhat resemble SQL, which output a SQL query, are a lot easier to implement.
      The alternative is to compose queries at "compile time". In the simplest case you might get away with string concatenation, but this too quickly becomes error prone at scale. While many languages of this nature are designed such that all concatenations are valid for this very reason, SQL is not. An innocuous and perfectly valid modification to a query can easily invalidate what was a previously valid concatenation. More realistically you are going to want to build an AST so that properties can be modified without reliance on exacting syntax. You can parse SQL to give you that AST, or you can go back to the aforementioned set of functions to generate the AST. The latter is considerably easier to implement.
      Each approach has pluses and minus. As with anything, you have to pick your tradeoffs. But it is no surprise that a lot of developers choose the easy tradeoff, especially when a lot of those developers are developing systems where what is easy/fast is the most important tradeoff, not knowing if what they are writing will ever get used. Theoretical engineering perfection is moot if there is no customer to use the product of that engineering.
  - MathMonkeyMan 235 days ago
    > In theory, you could parse the SQL and build up some kind of AST on which compatibility could be built.
    This is an interesting problem that I'd like to learn more about. Have you read anything about it?
    [-]
    - ajfriend 235 days ago
      You can compose SQL with https://ibis-project.org/tutorials/ibis-for-sql-users, which is using https://github.com/tobymao/sqlglot to parse the SQL under the hood.
      As an alternative to parsing the SQL yourself, DuckDB's https://duckdb.org/docs/api/python/relational_api allows you to compose SQL expressions efficiently and lazily, which I've used when playing around with things like https://gist.github.com/ajfriend/eea0795546c7c44f1c24ab0560a...
- Izkata 235 days ago
  > I really fail to understand why "writing HTML mixed with JS" is okay for React, and mixing regex with what-not is okay for...like...everyone, but it would not be okay for so many ORM people to actually have properly bound SQL in-place. Really boggles me down.
  There isn't anything that can change remotely that would break the local code.
  ORMs being in charge of table structure and queries give them an overhead view that helps deal with changes like that. Not perfect by any means, since you have to change the ORM queries anyway, but it's a little bit better since the model definition provides an abstraction for it, and since the queries are in the local language there are refactoring tools to help find and do the changes.
  > "writing HTML mixed with JS" is okay for React
  There's a third piece to frontend: CSS. There are additional abstractions on top of it in React-world to help limit its scope, to help prevent similar "break something somewhere else" problems, such as CSS Modules.
- swyx 235 days ago
  > it would not be okay for so many ORM people to actually have properly bound SQL in-place.
  do you have an actual security/perf/DX issue w the knex "fluent api" demonstrated in TFA or are you just voicing a stylistic preference?
bearjaws 235 days ago
Used knex before typescript was a thing, it was such a life saver.
Everytime an ORM conversation would come up, I would bat it down, especially at the time where Bookshelf (now 100% dead) and Sails.js were popular.
We were all in on knex and it was such a life saver, especially migrations.
The only custom thing we did was an extension that would run explain on queries and flag anything that had a query cost over some value (I forget what it was).
Made finding new queries that were unperformant really easy.
The composability was nice, but we didn't need to use it all that much. It made one of our services that was essentially a query builder very easy to build. Something like "Find me patients that need X work Y weeks from now, now add a filter for medication type..." was easy to script out.
[-]
- hu3 235 days ago
  Thanks for sharing your experience.
  Have you tried https://kysely.dev ?
  I'm interested in using a SQL builder in a new project and I'm undecided between knex and kysely.
  [-]
  - williamdclt 235 days ago
    I'm finding all these SQL builders so painful to use. They closely match the syntax of SQL so you do need to know SQL, but now you also need to know the SQL builder's syntax. It's friction for juniorer devs to learn SQL. It's also now much more difficult to trace a SQL query back to the originating code.
    They have two advantages: building SQL statements programmatically is cleaner than string concatenation, and they allow typing inference. The former is not so often necessary (and even when it is, it's often simple enough that string concatenation is _alright_). The latter is what makes me hesitate as a typing ayatollah, but even then I don't think it's worth it (I'll take the explicit typing, with the risk that we get it wrong, which would likely-although-not-surely be caught in tests).
    I still need something for migrations and query execution, but I would only reach for query building in like 1% of cases.
    [-]
    - ajfriend 235 days ago
      One approach I've been enjoying recently in my personal use is to write a light wrapper around DuckDB to enable composable SQL snippets. Essentially like what I have here https://gist.github.com/ajfriend/eea0795546c7c44f1c24ab0560a..., but without the `|` syntax.
      You're still writing SQL, so you don't need to learn a new syntax, but I find it more ergonomic for quick data exploration. I also have an easier time writing SQL from memory than I do writing the equivalent Pandas code.
    - bearjaws 235 days ago
      > They closely match the syntax of SQL so you do need to know SQL, but now you also need to know the SQL builder's syntax.
      I tend to agree, I recently tried Slonik and found it fun to use, but when I tried to incroporate it into an existing project I ran into ESM / TS issues (even with interop on, annoyingly). I also want to give pg-typed a try.
      The only reason I would recommend it is I feel many systems tend to have some sort of dynamic query builder "UI" over time, especially in enterprise spaces, and a query builder solves these problems very well.
    - thrw42A8N 235 days ago
      I have never had any issues like this, and the type safety provided by Kysely has saved me a lot of bugs and potentially very serious issues.
      [-]
      - plopz 235 days ago
        I've had issues when needing to use sql functions.
        [-]
        thrw42A8N 235 days ago
        What issue? Kysely has an example to do this.
        [-]
        plopz 235 days ago
        This was one of the problems I ran into https://github.com/kysely-org/kysely/issues/664, not being able to use functions inside onDuplicateKeyUpdate
        [-]
        thrw42A8N 235 days ago
        Basically no ORM supports such specific functionality of a specific DBMS. In any case you'd have to use raw SQL. I don't see the problem, this is expected.
        I don't see how not using Kysely would make it better overall - you simply write the raw SQL yourself as you would without it; but you gain so much type safety by using it.
  - naranha 235 days ago
    Not OP, but I have converted a couple of projects from knex to kysely recently. With Typescript kysely is much better. With kysely-codegen you can generate typescript types for a pre existing schema too.
    [-]
    - hu3 235 days ago
      That was my impression too. Glad to have it confirmed. Thank you.
  - bearjaws 235 days ago
    I have used it in personal projects, it definitely feels like the succesor to it if you use TypeScript. The type safety at the schema level is fantastic.
    I haven't used it in a professional setting, but if I was evaluating between Knex and Kysely I would use Kysely simply for the additional type safety, and it appears maintained.
    [-]
    - hu3 235 days ago
      The project will be TypeScript so it seems like a no-brainer. Thank you.
  - weeksie 235 days ago
    I'm using kysely with a database that has quite a few complex functions, etc. Lots of queries that require CTEs and so on. It's fantastic. Type safety with the codegen is amazing and catches a tremendous amount of simple bugs and typos which are super easy to make in raw SQL strings.
    People talking about writing raw SQL have never maintained a large project. Once you start string concatting queries and trying to remember which table aliases are which you're halfway to writing a custom, buggy query builder without type safety. It's a trap for young players.
    [-]
    - hn_throwaway_99 235 days ago
      > It's a trap for young players.
      As someone with 25 years of experience using lots of ORMs and query builders in both Java and JavaScript/Typescript, I believe the exact opposite to be true. There are libraries now that let you write composable, safe SQL with type guarantees (I am a big fan of Slonik, but there are other libraries).
      Every single time I've seen an ORM or query builder used for application software, eventually I see it becoming a big headache for operations. It usually makes debugging and and performance investigations more difficult, and I so often see developers "fighting with the tool" when they're just trying to get it to output the SQL they want it to.
      In my experience, people who are fans of ORMs and query builders never have had to deal with significant scale and the operational difficulties that come with it (and I'm not talking about "Google scale", either, I'm just talking about a reasonably well known consumer site with moderately heavy traffic at times).
      [-]
      - weeksie 235 days ago
        I've been at it for a bit over 25 years as well and have worked on everything from telco billing projects to complex consumer applications. Query builders and ORMs are not the same thing, and it's odd that you conflate them.
        The project that I happen to be on now has a mid sized db (~70 ish) tables with quite a few custom functions and complex queries. I have been untangling a mess of slonik queries and there's an objective difference in the quality of the code and the maintainability between that and the kysely code that's replacing it.
        The people I see drift toward things like slonik tend to be journeyman level developers and they often end up over their skis. Smart folks a lot of the time, but inexperienced.
  - phpnode 235 days ago
    kysely is an excellent successor to Knex, strongly recommended. Pair it with something like https://github.com/kristiandupont/kanel to generate types from your schema.
- tengbretson 235 days ago
  The decision not to use sails.js may have saved your company millions of dollars.
  [-]
  - evantbyrne 235 days ago
    Ignoring mainstream js trends seems to be a safe bet in general.
anonzzzies 235 days ago
We use Knex a lot because it can be used dynamically. All newer systems are based on typescript, which is great, but not dynamic unless you do code gen. And that's quite a lot slower because of the compilation steps in our experience. With Knex we can have users building a massive query dynamically while seeing intermediate results immediately.
maciejgryka 235 days ago
I’d encourage everyone, who finds this appealing to check out how Ecto works in Elixir. It’s all functional & immutable goodies and pipelines are built into the language and idiomatic.
Definitely looks weird at first glance (and Ecto is kinda weird even if you’re familiar with Elixir), but it’s such a joy to use once you grok it.
mythz 235 days ago
I've also just released a type safe, SQL-like parameterized TypeScript/JS ORM for SQLite, MySQL and PostgreSQL. I wrote it because I couldn't find any other ORMs with type-safe query builders where joined tables and column references are type checked against my data models to enable design-time type checking and safe refactoring.
It's also highly composable as SQL expressions can be embedded within SQL Builders which can themselves be embedded within other Query Builders.
https://litdb.dev
[-]
- damidekronik 235 days ago
  What's missing from https://kysely.dev/?
  [-]
  - mythz 235 days ago
    Looks pretty good, but APIs look more stringly typed and APIs don't look natural, I prefer using SQL expressions with typed references.
    The query builder also looks coupled to execution, litdb query builders and aren't coupled to a driver implementation, i.e. they're just used to generate SQL with parameters that could then be executed with any driver.
poxrud 235 days ago
I've used Knex, ActiveRecord and many other ORM's and query builders. At some point, beyond basic queries you start wasting time coming up with SQL statements and then having to convert them to your ORM/builder's syntax. I've reached the point now where it's just easier to stick with writing SQL and having a library that removes the possibility of sql injections. My current stack is postgres.js and dbmate for migrations.
[-]
- neilk 235 days ago
  Knex isn’t an ORM, it’s a query builder. (Though the closely related library Bookshelf.js is an ORM).
  I know your pain of translating SQL back to the ORM language, but what similar difficulty do you have for Knex? You have a query, and it’s isomorphic to the Knex query usually. There are a few things Knex doesn’t do, but it has the “raw” get-out-of-jail-free card for making part of the query just a string without sacrificing anything else.
  Every project I’ve ever worked on has many closely related complex queries and writing them out as strings (even with placeholders) becomes either perilous as you write complex string building logic, or tediously repetitive.
- hn_throwaway_99 235 days ago
  You're being downvoted, but I completely agree. In the long run I believe most ORMs end up being a time sink. I like to put it that "they make the easy thing easy, and they make the hard stuff much harder". There are tons of operational benefits to using raw SQL.
  The author of the slonik library for Postgres wrote this a couple years ago (note the slonik library also makes it trivial to write composable SQL statements): https://gajus.medium.com/stop-using-knex-js-and-earn-30-bf41.... I was working on a project where I had originally started using Knex, read this article and loved it, and ripped out everything to switch to slonik over a weekend. I'm so glad I did, it ended up being a huge benefit.
  [-]
  - weeksie 235 days ago
    Slonik does _not_ make it possible (let alone trivial) to write composable queries. You're just doing string concatenation and string concatenation will bite you when things get complex enough.
    There is also a big difference between an ORM and a query builder. Knex (and kysely, which is the only thing I'd use these days) allows you to write SQL that's just as complex as anything you'd write raw, complete with escape hatches if you need them. The criticisms of ORMs tend to be spot on, they are nice until they run into a wall, but that same thing simply does not apply to a robust query builder.
    [-]
    - hn_throwaway_99 235 days ago
      > Slonik does _not_ make it possible (let alone trivial) to write composable queries. You're just doing string concatenation and string concatenation will bite you when things get complex enough
      In practice, I find this to be potayto/potahto. At the end you need the thing to output SQL that is going to run, and I've had a lot more problems trying to get Knex to output the SQL that I know works vs dealing with any string composition issues in Slonik.
      More importantly, at the end of the day, when it comes to DB operations, all of the information you get from the DB is going to be output/logged in SQL. There is no way getting around the fact that you need to know SQL well to manage a DB. Knex just becomes another layer that you need to translate to and from ("the leakiest of abstractions" as I like to say). I'd rather focus my effort on becoming an expert in SQL than yet another popular-library-of-the-month.
      [-]
      - weeksie 235 days ago
        1. You certainly need to know SQL 2. Composition is very different than concatenation and the differences become apparent as a project increases in complexity 3. I'd recommend checking out the current SOTA, especially with kysely (or Ecto if you ever get into Elixir projects)
        [-]
        hn_throwaway_99 235 days ago
        I've used kysely extensively on a project built using sst.dev. I can't paste proprietary code, but I remember my head spinning over a particularly gnarly query (e.g. a bunch of window functions, aggregations, etc.) It was a million times more difficult to grok reading this in kysely's syntax than SQL.
h1fra 235 days ago
Unrelated but started using Kysely after struggling too much with knex and prisma, that's a breath of fresh air. Excellent native typing, good tooling, and no missing features.
fbn79 235 days ago
I like knex, but would love it more if all methods does not mutates the original data structure. A pure version of knex would be great
natpalmer1776 235 days ago
Here I was wondering how you managed to use K’nex (toy) to write code lol
[-]
- MrMcCall 235 days ago
  I was fully ready to have my mind blown, but am kindof perversely happy that no one's that damn smart :-)
  [-]
  - nom 235 days ago
    well, there is this:
    http://knexcomputer.blogspot.com
- btown 235 days ago
  Well, now that you mention it... https://www.youtube.com/watch?v=rdT1YT9AOPA
pictur 235 days ago
[dead]