19 comments

  • metadat 4 days ago
    This is incredibly high-quality BASH programming. As a fellow bash freak I am studying this code, and even I am learning some new techniques.

    https://github.com/h4l/json.bash/blob/main/json.bash

    You've boiled it down to a set of very elegant constructs. Respect. Thank you @h4l, this is badass.

    I hope you follow up with a golang or rust implementation, that would really be something else.

    p.s. I noticed the following odd behaviors with escaping delimiters (e.g. "="), is there a way to get an un-escaped equal sign as the trailing part of a key or leading part of a value?

      $ docker container run --rm ghcr.io/h4l/json.bash/jb msg=Hi
      {"msg":"Hi"}
      $ docker container run --rm ghcr.io/h4l/json.bash/jb msg=\=Hi
      {"msg=Hi":"msg=Hi"}
      $ docker container run --rm ghcr.io/h4l/json.bash/jb "msg=\=Hi"
      {"msg":"\\=Hi"}
      $ docker container run --rm ghcr.io/h4l/json.bash/jb "msg\==\=Hi"
      {"msg\\=\\":"Hi"}
      $ docker container run --rm ghcr.io/h4l/json.bash/jb "msg\\==\=Hi"
      {"msg\\=\\":"Hi"}
      $ docker container run --rm ghcr.io/h4l/json.bash/jb "msg\\===Hi"
      {"msg\\=":"Hi"}
    • h4l 4 days ago
      Thank you, that's high praise! I learnt a lot about bash writing this, but I've also not looked at the code in a few months, and it's already starting to look quite intimidating!

      I definitely like the idea of a golang/rust implementation, there are certainly things I could improve.

      So the argument syntax escapes by repeating a character rather than using a backslash. I chose this because with backslash escapes it would be unclear whether a backslash was in the shell syntax or the jb syntax, and users may end up needing to double escape backslashes, which is no fun! Whereas a shell will always ignore two copies of a character like =:@.

      The downside of double-escaping is that the syntax can be ambiguous, so sometimes you need to include the middle type marker to disambiguate the key from the value. But the type can be empty, so just : works:

          $ jb ===msg==:==hi=
          {"=msg=":"=hi="}
      
      In the key part, the first = begins the key, the == following are an escaped =. The first = following the : marks the value, and everything after is not parsed, so =hi= is literal.

      When you have reserved characters in keys/values (especially if they're dynamic), it's easiest to store the values in variables and reference them with @var syntax:

          $ k='=msg=' v='=hi=' jb @k@v
          {"=msg=":"=hi="}
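
      The doubling rule described above can be sketched in plain bash. This is a simplified illustration, not json.bash's actual parser: it assumes a bare key=value grammar with no type marker, and split_kv, KEY and VALUE are made-up names.

      ```shell
      # Hypothetical helper, not json.bash's parser: split "key=value" where "=="
      # inside the key stands for one literal "=". Results go in KEY and VALUE.
      split_kv() {
        local arg=$1 key='' c i=0
        while (( i < ${#arg} )); do
          c=${arg:i:1}
          if [[ $c != '=' ]]; then
            key+=$c
            i=$((i + 1))
          elif [[ ${arg:i+1:1} == '=' ]]; then
            key+='='            # a doubled "=" is an escaped literal
            i=$((i + 2))
          else
            KEY=$key            # a lone "=" ends the key
            VALUE=${arg:i+1}    # the rest is taken literally
            return 0
          fi
        done
        return 1                # no separator found
      }

      split_kv 'a==b=c=d' && printf '%s -> %s\n' "$KEY" "$VALUE"   # a=b -> c=d
      ```

      Under this simplified grammar, a===b always parses as key "a=" with value "b", so there is no way to write key "a" with value "=b". That is the kind of ambiguity a separate marker between key and value resolves.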
      • zikohh 4 days ago
        How is this different to this https://github.com/kellyjonbrazil/jc
        • h4l 3 days ago
          jc has many parsers for the specific output format of various programs, it can automatically create a JSON object structure using its knowledge of each format.

          jb doesn't have high-level knowledge of other formats, it can read from common shell data sources, like command-line arguments, environment variables and files. It gives you ways to pull several of these sources into a single JSON object.

          jb understands some simple/general formats commonly used in shell environments:

          - key=value pairs (e.g. environment variable declarations, like the `env` program prints)

          - delimited lists, like a,b,c,d; (but any character can be the delimiter) including null-delimited (commonly used to separate lists of file paths)

          - JSON itself — jb can validate and merge together arrays and objects

          You can use these simple sources to build up a more complex structure, e.g. using a pipeline of other command line tools to generate null-delimited data, or envar declarations, then consuming the program's output via process substitution <(...). (See the section in the README that explains process substitution if you're not familiar, it's really powerful.)

          So jb is more suited to creating ad-hoc JSON for specific tasks. If you had both jc and jb available and jc could read the source you need, you'd prefer jc.

  • h4l 4 days ago
    As well as anyone's general thoughts/experiences, I'd appreciate opinions on the error handling mechanism jb uses to detect errors in upstream jb processes that jb is reading from.

    Normally, detecting errors on the other end of a pipe requires care in a shell environment (e.g. retrospectively checking PIPESTATUS). I used an approach I've called Stream Poisoning. It takes advantage of the fact that control characters are never present in valid JSON. When jb fails to encode JSON, it emits a Cancel control character[1] on stdout. When jb encounters such a character in an input, it can tell the input it's reading from is truncated/erroneous. This avoids the typical problem of a pipe silently being read as an empty file.

    I've got a page explaining this with some examples here: https://github.com/h4l/json.bash/blob/main/docs/stream-poiso... I can imagine using control characters in a text stream being rather controversial, but I feel it works quite well in practice.

    [1]: https://en.wikipedia.org/wiki/Cancel_character
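
    A minimal sketch of the idea, assuming a toy producer and consumer rather than jb itself (emit_size and read_checked are hypothetical names, and the number check accepts integers only):

    ```shell
    CAN=$'\x18'   # ASCII Cancel control character

    # Hypothetical producer: prints a JSON object, or CAN if the input is bad.
    emit_size() {
      if [[ $1 =~ ^-?(0|[1-9][0-9]*)$ ]]; then
        printf '{"size":%s}\n' "$1"
      else
        printf '%s\n' "$CAN"
        return 1
      fi
    }

    # Hypothetical consumer: refuses any input that contains CAN.
    read_checked() {
      local input
      input=$(cat)
      if [[ $input == *"$CAN"* ]]; then
        echo 'read_checked: upstream signalled an error' >&2
        return 1
      fi
      printf '%s\n' "$input"
    }

    emit_size 42 | read_checked                                 # {"size":42}
    emit_size oops | read_checked || echo "rejected (status $?)"
    ```

    The consumer never needs to inspect the producer's exit status: the marker travels in the data, so a failed upstream cannot be mistaken for empty input.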

    • throwawaynorway 4 days ago
      What happens if the next program in the pipe is not jb? Does jb also exit with a code?

      For example `jb | jq`, where jq or a similar program discards the cancel character.

      (Away from pc, unable to check right now.)

      • h4l 4 days ago
        Good question! Yep, jb exits with non-zero:

            $ jb size:number=oops; echo $?
            json.encode_number(): not all inputs are numbers: 'oops'
            json(): Could not encode the value of argument 'size:number=oops' as a 'number' value. Read from inline value.
            ␘
            1
        
        If you pipe the jb error into jq, jq fails to parse the JSON (because of the Cancel ctrl char) and also errors:

            $ jb size:number=oops | jq
            json.encode_number(): not all inputs are numbers: 'oops'
            json(): Could not encode the value of argument 'size:number=oops' as a 'number' value. Read from inline value.
            parse error: Invalid numeric literal at line 2, column 0
        
            $ declare -p PIPESTATUS
            declare -a PIPESTATUS=([0]="1" [1]="4")
        
        So jq exits with status 4 here.
  • ensocode 3 days ago
    Appreciated. Slightly related: https://github.com/bashtools/JSONPath.sh JSONPath handling in Bash, very usable for huge files
  • boomskats 4 days ago
    I find writing bash really gratifying - almost relaxing - but you're right, it also makes me feel kind of 'guilty', especially when I start getting carried away and reading tput docs.

    However, I think in your case the rationale in the performance section of your Readme totally makes sense, and every single use case I can think of for this would prioritise minimal latency over increased throughput. I've seen init containers that would execute probably 100x faster with this for the exact reasons you point out. I'm quite curious as to what you would choose instead of bash if you were starting from scratch now?

    FYI Shellcheck has a couple of superficial nits that you might wanna address (happy to send a PR). And your Readme is great.

    • h4l 4 days ago
      I did find it quite satisfying to coerce bash into doing this while maintaining decent performance. I definitely came to appreciate some aspects of bash more from this, but it's so easy to shoot yourself in the foot!

      If I started from scratch now I'd use a compiled language that could produce a single static binary and start with really low latency. I'm pretty sure jo must not be tuned for startup time, if they optimised that they must be able to get it way faster than bash can start and parse json.bash. I was pretty surprised that bash can start up faster!

      The codebase is basically at the limit of what I'd want to do with bash, but there are features I could add if it was in a proper programming language. e.g. validating :int number types, pretty-printing output, not needing the :raw type to stream JSON input.

      Thanks for the heads up on Shellcheck, I'd be happy to take a PR if you'd like to.

  • mg 4 days ago
    I like the syntax to send typed values from the terminal:

        jb id=42 size:number=42 surname=null data:null
    
        => {"id":"42","size":42,"surname":"null","data":null}
    
    I never had the need to use typed arguments in bash, but if I ever have it, this might be the syntax I'd use.

    In fact, I was thinking about such a syntax recently. I am writing a tool which lets you call functions in Python modules from the command line. At first, I thought I need to define the argument types on the command line. But then I decided it is more convenient to use inspection and auto-convert the values to the needed types.

    • h4l 4 days ago
      Glad to hear, this was something I wanted to make reliable, ergonomic and intuitive. I figured a lot of languages use `: type` to declare types.

      The same using jo would be like this, which I find harder to type and remember:

        jo -- -s id=42 -n size=42 -s surname=null data=null
        {"id":"42","size":42,"surname":"","data":null}
      
      Notice that surname comes out as the empty string though, I think this must be a bug in jo!
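
      For illustration, the key:type=value idea can be sketched in a few lines of bash. This is an assumption-laden toy, not json.bash's implementation: quote_json and typed_object are made-up names, only the string, number and null types are handled, and string escaping covers just backslash and double quote.

      ```shell
      # Hypothetical sketch, not json.bash's code. Escapes only \ and ".
      quote_json() {
        local s=$1
        s=${s//\\/\\\\}
        s=${s//\"/\\\"}
        printf '"%s"' "$s"
      }

      # Build one JSON object from key[:type][=value] arguments.
      typed_object() {
        local arg key type value out='' sep=''
        for arg in "$@"; do
          [[ $arg =~ ^([^:=]+)(:([a-z]*))?(=(.*))?$ ]] || return 1
          key=${BASH_REMATCH[1]}
          type=${BASH_REMATCH[3]:-string}   # the type defaults to string
          value=${BASH_REMATCH[5]}
          case $type in
            string) value=$(quote_json "$value") ;;
            number) [[ $value =~ ^-?(0|[1-9][0-9]*)(\.[0-9]+)?$ ]] || return 1 ;;
            null)   value=null ;;
            *)      return 1 ;;
          esac
          out+="$sep$(quote_json "$key"):$value"
          sep=,
        done
        printf '{%s}\n' "$out"
      }

      typed_object id=42 size:number=42 surname=null data:null
      # {"id":"42","size":42,"surname":"null","data":null}
      ```

      Because the type sits in the argument itself, an untyped surname=null stays the string "null" and only data:null becomes a JSON null.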
    • enriquto 4 days ago
      > I like the syntax to send typed values from the terminal:

      Incidentally, this syntax shows a notation that is clearly superior to json (at least for non-nested stuff). If all you need is this, you'd be better off by avoiding json altogether.

      [Rant: if json is so unergonomic that people keep inventing alternatives like this syntax and stuff like "gron" to de-jsonise their lives, maybe using json was always a bad idea, after all... I guess in a decade everybody will look at json with the same disdain as we do XML today.]

      • sureIy 4 days ago
        > shows a notation that is clearly superior to json

        I don’t see that at all? Why is `n:number=1` superior to `{n:1}`? If anything, CLI commands are awful for anything other than strings.

        • enriquto 4 days ago
          But strings are often the most common case (or even, the only case that is needed). And they need much less punctuation. Compare:

              a=1 b=2 c=3
          
          with

              {"a":"1", "b":"2", "c":"3"}
          
          the json version needs 19 punctuation characters just to define three variables, against the bash version that only has 3. Which one would you prefer to type with your keyboard?
        • hnlmorg 4 days ago
          Depends on the shell. The following parses as a number in Murex:

              %{n:1}
          
          https://murex.rocks/parser/create-object.html

          I’m sure you can do similar things in other modern shells too. So the real problem is that people are stuck on the constraints of 1970s command lines.

  • jpgvm 4 days ago
    {"password":"hunter2"}

    A man of culture I see.

    This looks really useful where you don't want to introduce another scripting VM just to spit out some JSON, e.g. I have used Ruby a lot for this in the past.

    I can see myself using this in container init scripts and other very low dep environments to format config files from env vars etc.

    • everforward 4 days ago
      I wouldn’t do that to my users; the happy path here is nice, but the unhappy path seems very likely to end in garbled bash errors that are impossible to track down.

      As a user, I’m fine with embedding a reasonably small VM to handle the configs; disk space is cheap. Better yet would be a compiled binary that handles it, but that feels like asking a lot of maintainers.

      There’s a lot of surface area for someone to mis-quote stuff in their environment and generate unintelligible bash errors.

      Or that may be just me; I hate bash in general, so maybe it’s just that bleeding over.

    • h4l 4 days ago
      How did you guess my password?!?!

      This is just the kind of use case I had in mind. Something I've considered is publishing a mini version with only the json.encode_string function, as that's enough to create an array of JSON-encoded strings and use a hard-coded template with printf to insert the JSON string values.

      That would be a fraction of the overall json.bash file size.
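
      A rough sketch of that pattern, assuming a stand-in encoder rather than the real json.encode_string (encode_string and login_json are hypothetical names, and only backslash, double quote, newline, tab and carriage return are escaped):

      ```shell
      # Hypothetical stand-in for a string encoder. It handles \, " and the
      # common whitespace escapes only; other control characters are not covered.
      encode_string() {
        local s=$1
        s=${s//\\/\\\\}
        s=${s//\"/\\\"}
        s=${s//$'\n'/\\n}
        s=${s//$'\t'/\\t}
        s=${s//$'\r'/\\r}
        printf '"%s"' "$s"
      }

      # The hard-coded template: only the %s slots vary.
      login_json() {
        printf '{"user":%s,"password":%s}\n' \
          "$(encode_string "$1")" "$(encode_string "$2")"
      }

      login_json alice 'hunter"2'   # {"user":"alice","password":"hunter\"2"}
      ```

      The structure of the JSON is fixed in the printf format, so only the values need encoding.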

    • ukuina 4 days ago
      RIP bash.org!
  • simonw 4 days ago
    I found the JSON array syntax a little unintuitive:

        $ jb dependencies:[,]=Bash,Grep
        {"dependencies":["Bash","Grep"]}
    
    One possible alternative would be to accept JSON literal snippets, like this:

        $ jb dependencies='["Bash", "Grep"]'
    
    This should support all forms of nested JSON objects. You could have a rule that if an argument does NOT parse as a valid JSON value it is treated as a raw string, so this would work:

        $ jb foo=bar bar='"this is a well formed string"'
        {"foo": "bar", "bar": "this is a well formed string"}
    
    You could even then nest jb calls like this:

        $ jb foo=$(jb bar=baz)
        {"foo": {"bar": "baz"}}
    • h4l 4 days ago
      Thanks for giving it a try and your feedback. I agree, the array splitting is a bit fiddly. It is actually possible to pass JSON directly, you use the :json type on the argument:

          $ jb dependencies:json='["Bash","Grep"]'
          {"dependencies":["Bash","Grep"]}
      
          $ jb foo=bar bar:json='"this is a well formed string"'
          {"foo":"bar","bar":"this is a well formed string"}
      
      And then you can indeed use command substitution to nest calls:

          $ jb foo:json=$(jb bar=baz)
          {"foo":{"bar":"baz"}}
      
      It works even better to use process substitution, this way the shell gives jb a file path to a file to read, and so you don't need to quote the $() to avoid whitespace breaking things:

          $ jb foo:json@<(jb msg=$'no need\nto quote this!')        
          {"foo":{"msg":"no need\nto quote this!"}}
      
      Another option is to use jb-array to generate arrays. (jb-array is best for tuple-like arrays with varying types):

          $ jb dependencies:json@<(jb-array Bash Grep)
          {"dependencies":["Bash","Grep"]}
      
      And if you use it from bash as a function, you can put values into a bash array and reference it:

          $ source json.bash
          $ dependencies=(Bash Grep)
          $ json @dependencies:[]   
          {"dependencies":["Bash","Grep"]}
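
      To show why process substitution is convenient here, a small sketch of a function that takes a file path, the way a tool receiving <(...) would. numbers_from_file is a hypothetical helper, not part of json.bash, and it accepts integers only.

      ```shell
      # Hypothetical helper: read one integer per line from a file path and
      # print a JSON array. The caller can pass any readable path, including
      # the one that <(...) expands to (often /dev/fd/N).
      numbers_from_file() {
        local path=$1 line out='' sep=''
        while IFS= read -r line; do
          [[ $line =~ ^-?(0|[1-9][0-9]*)$ ]] || return 1
          out+="$sep$line"
          sep=,
        done < "$path"
        printf '[%s]\n' "$out"
      }

      numbers_from_file <(printf '%s\n' 1 2 3)   # [1,2,3]
      ```

      The function only ever sees a path, so the producer's output is never word-split by the shell and needs no quoting.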
  • abdellah123 4 days ago
    Amazing tool and syntax. Hat down!
    • h4l 4 days ago
      Thanks!
  • lsferreira42 4 days ago
    Amazing bash programming skills, this is so cool that I want to find a problem to solve using it right now!!
  • fieu 4 days ago
    I wonder if I could use this on my project which uses multiple glue functions to piece together JSON strings. https://github.com/fieu/discord.sh
    • h4l 4 days ago
      If it helps, there's a little example of using the bash API with bash variables/arrays, should give you an idea of how it could be to use: https://github.com/h4l/json.bash/blob/main/examples/notify.s...

      This example uses the pattern of setting an out=varname when calling a json function, the encoded JSON goes into $varname variable. This pattern avoids the overhead of forking processes (e.g. subshells) when generating JSON.

      Otherwise you can use the more normal approach of jb writing to stdout, and capturing the output stream.
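
      The out=varname pattern can be sketched generically like this. greeting_json is a hypothetical function, not the json.bash API, and it assumes its argument needs no JSON escaping.

      ```shell
      # Hypothetical function: if the caller sets out=NAME, the result is stored
      # in the variable NAME; otherwise it is printed to stdout.
      greeting_json() {
        local _json="{\"greeting\":\"Hello $1\"}"
        if [[ -n ${out:-} ]]; then
          printf -v "$out" '%s' "$_json"   # assign in this shell, no fork
        else
          printf '%s\n' "$_json"
        fi
      }

      out=result greeting_json World   # stores the JSON in $result
      echo "$result"                   # {"greeting":"Hello World"}
      captured=$(greeting_json World)  # same JSON, but forks a subshell
      ```

      The out=result prefix only applies to that one call, so later calls without it print to stdout as usual.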

  • gkfasdfasdf 4 days ago
    Is there a minimum bash version required? I.e. will it work with bash 3 or whatever ships with macos by default?
    • h4l 4 days ago
      There is, the earliest version I've tested with is 4.4.19, but ideally a 5.x version. 3 certainly won't work I'm afraid. If you use homebrew on Mac it's a good way to get the latest bash.
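
      A script that depends on this could guard against older shells with a check along these lines. bash_at_least is a hypothetical helper, and the 4.4 threshold is simply the earliest tested version mentioned above.

      ```shell
      # Hypothetical guard: succeed if the running bash is at least major.minor.
      bash_at_least() {
        local major=$1 minor=$2
        (( BASH_VERSINFO[0] > major || (BASH_VERSINFO[0] == major && BASH_VERSINFO[1] >= minor) ))
      }

      if ! bash_at_least 4 4; then
        echo "bash ${BASH_VERSION} is too old, 4.4 or later is needed" >&2
      fi
      ```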
  • lttlrck 4 days ago
    This is great! no doubt I'll be reaching for it very soon.
  • pmarreck 4 days ago
    BATS was a little heavy for me as a testing dependency for my own use (I ended up writing what I intended to be "the most minimalist shell testing library possible", see below, I think it still needs work though!), but I at least want to commend you for having what looks like a great test suite to begin with!

    https://github.com/pmarreck/tinytestlib

    • h4l 4 days ago
      It's nice to have compact single-file dependencies like this! I like the look of your assertions (checking out, err & status). I definitely found myself writing my own assertions to get understandable errors.
  • westurner 4 days ago
    jshn.sh: https://openwrt.org/docs/guide-developer/jshn src: https://git.openwrt.org/?p=project/libubox.git;a=blob;f=sh/j... :

    > jshn (JSON SHell Notation), a small utility and shell library for parsing and generating JSON data

  • IshKebab 4 days ago
    Yeah if you need this it's definitely a sign you shouldn't be using Bash.

    Can you give a concrete example of when this is the sanest option?

    • h4l 4 days ago
      Two main situations I think. The first is just interactive use in any shell to encode ad-hoc JSON. If you have a next-gen shell which can handle structured data directly, then you probably don't need it.

      Second is situations where you'd rather not add an additional dependency, but bash is pretty much a given. For example, CI environments, scripts in dev environments, container entrypoints. Or things that are already written in bash.

      I don't advocate writing massive programs in bash, for sure it's better to turn to a proper language before things get hairy. But bash is just really ubiquitous, and most people who do any UNIX work will be able to deal with a bit of shell script.

      • wavemode 4 days ago
        I agree with the interactive usecase.

        But for when you don't want an extra dependency, awk and perl are better than bash and just about as ubiquitous. (I might dare to say more ubiquitous, since MacOS in particular ships with an ancient version of bash that can't even use this jb tool. But the versions of awk and perl it comes with are fine.)

      • IshKebab 4 days ago
        > Second is situations where you'd rather not add an additional dependency, but bash is pretty much a given. For example, CI environments, scripts in dev environments, container entrypoints. Or things that are already written in bash.

        Is this tool not an additional dependency?

        > But bash is just really ubiquitous

        Biggest crime of the Unix world probably.

        • h4l 3 days ago
          > Is this tool not an additional dependency?

          It is, but if you already have bash, adding another shell script isn't much of a jump. e.g. I'd feel OK about committing jb to another repo for use from a .envrc file to set up an environment, whereas committing a binary would not feel good.

          > Biggest crime of the Unix world probably.

          Sorry if I'm perpetuating this! :) My take is that the problem is not with bash, the problem is that it's hard for more advanced tools to replace it.

  • vips7L 4 days ago
    Built into Powershell:

        > @{ hello = 'world' } | ConvertTo-Json
        > { "hello": "world" }
    • majkinetor 4 days ago
      Not only is it built in, but the syntax is on another level, i.e. you don't need to learn a special syntax if you know PowerShell. This alone makes pwsh worth it instead of using a number of other tools.

          @{ Hello = 'world'; array = 1..10; object = @{ date = Get-Date } } | ConvertTo-Json
       
          {
            "array": [
              1,
              2,
              3,
              4,
              5,
              6,
              7,
              8,
              9,
              10
            ],
            "object": {
              "date": "2024-07-03T21:07:21.6562053+02:00"
            },
            "Hello": "world"
          }
      • h4l 4 days ago
        That is pretty cool, and I wish such features were common in regular UNIX shells.

        For good measure, this is how you might do the same with jb:

            $ jb Hello=world array:number[]@<(seq 10) object:json@<(date=$(date -Iseconds) jb @date)
            {"Hello":"world","array":[1,2,3,4,5,6,7,8,9,10],"object":{"date":"2024-07-03T19:26:36+00:00"}}
        
        Alternatively, using the :{} object entry syntax:

            $ jb Hello=world array:number[]@<(seq 10) object:{}=date=$(date -Iseconds)
            {"Hello":"world","array":[1,2,3,4,5,6,7,8,9,10],"object":{"date":"2024-07-03T19:30:26+00:00"}}
    • h4l 4 days ago
      Powershell has the upper hand here!

      Still, bash can try to keep up using json.bash. :)

          $ source json.bash
          $ declare -A greeting=([Hello]=World)
          $ json ...@greeting:{}
          {"Hello":"World"}
      
      ... is splatting the greeting associative array entries into the object created by the json call.

      Without the ... the greeting would be a nested object. Probably more clear with multiple entries:

          $ declare -A greeting=([Hello]=World [How]="are you?")
          $ json @greeting:{}   
          {"greeting":{"Hello":"World","How":"are you?"}}
      
      Vs:

          $ json ...@greeting:{}                                
          {"Hello":"World","How":"are you?"}
      • majkinetor 4 days ago

            $h=@{x=1; y=2}; $h + @{z=3} | ConvertTo-Json
        
            {        
              "y": 2,
              "z": 3,
              "x": 1 
            }
        
        You can even use [ordered]$h to stop the keys from ending up in a random order.
  • altruios 4 days ago
    Windows: what if everything was an (command) object?

    Linux: what if everything was a file?

    Soon we might have...

    Mong/Os: what if everything was JSON?

    YiAM/OS: YiAM/OS is ANOTHER MARKUP OPERATING SYSTEM... would come out shortly thereafter...

    I like JSON, and getting it in the terminal is a challenge - GOOD JOB!

  • 2f0ja 4 days ago
    Similar to jo, which is written in C [1]

    [1] https://github.com/jpmens/jo

  • Aerbil313 4 days ago
    I'll wait right here using Nushell while you guys spend the next 10 years re-inventing it.