A small bit of historical context: when I was participating in the PKP meetings at RSADSI, I believe it was Ron who insisted that DER was the only reasonable choice if we were going to encode things with ASN.1. (Which we were, because both DEC and RSA had already insisted that it had to be OSI-compatible or they wouldn't support it; my suggestion that we use Sun's XDR was soundly rebuffed, but hey, I had to offer.)
Generally it was presumed that because these were 'handshake' type steps (which is to say the prelude to establishing a cryptographic context for what would happen next) performance wasn't as important as determinism.
Oh, did I meet you there? I was contracting at RSADSI at the time and argued with Burt K. about how easy it was to mess up a general DER parser, much less an ASN.1 compiler. I remember we found about two bugs per week in ICL's compiler. Burt and Ron were BIG ASN.1 fans at the time and I could never figure out why. Ron kept pushing Burt and Bob Baldwin to include more generic ASN.1 features in BSAFE. Part of my misery during SET development can be directly traced to ICL's crappy ASN.1 compiler, yet it was probably the best one on the market at the time.
Anywho... XDR isn't my favourite, but I would have definitely preferred it to DER/BER/ASN.1.
Probably :-). Ron was a huge fan of Roger Needham's (and, ngl, I was too), and Roger, along with Andy Birrell and others, was on a kick to make RPCs "seamless" so that you could reason about them the way you did about programs that were all local. Roger and I debated whether that was achievable (vs. desirable) at Cambridge when we had the PKI meeting there. We both agreed that computers would get fast and cheap enough that the value of having a canonical form on the wire vastly outweighed whatever conversion cost "some" clients would pay to put things into a native format they understood. (Andy wasn't convinced of that, at least at the time.) But I believe that was the principle behind the insistence on ASN.1, determinism, and canonical formats. Once you built the marshalling/unmarshalling libraries, you could treat them as a constant tax on latency, which made it easier to analyze state machines and debug race conditions. Plus, when the libraries improved, you could just update the latency constants.
I wonder how much Needham had to do with Sun's AUTH_DH. It must have been Whit Diffie's baby, but if Needham was pushing RPC then I imagine there must have been interactions with Diffie.
It turns out that one should not design protocols to require canonical encoding for things like signature verification. Just verify the signature over the blob being signed as it is, and only then decode. Much like nowadays we understand that encrypt-then-MAC is better than MAC-then-encrypt. (Kerberos gets away with MAC-then-encrypt because nowadays its cryptosystems use AES in ciphertext stealing mode and with confounders, so it never needs padding, so there's no padding oracle in that MAC-then-encrypt construction. Speaking of Kerberos, it's based on Needham-Schroeder... Sun must have been a fun place back then. It still was when I was there much much later.)
XDR is like a four-octet aligned version of PER for a cut-down version of ASN.1. It's really neat.
XDR would not need much work to be a full-blown ER for ASN.1... But XDR is extremely inefficient as to booleans (4 bytes per!) and optional fields (since they are encoded as a 4-byte boolean followed by the value if the field is present).
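Concretely, a sketch of those RFC 4506 rules in Python (the values are arbitrary):

    import struct

    # XDR encodes a boolean as a full 4-byte big-endian word...
    xdr_true = struct.pack(">I", 1)    # b'\x00\x00\x00\x01'

    # ...and an optional field as a 4-byte presence flag plus the value.
    present = struct.pack(">I", 1) + struct.pack(">i", 42)  # 8 bytes for one optional int
    absent  = struct.pack(">I", 0)                          # 4 bytes just to say "nothing here"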
I was writing a cryptographically-inclined system with serialization in msgpack. At one point, I upgraded the libraries I was using and all my signatures started breaking because the msgpack library started using a different representation under the hood for some of my data structures. That's when I did some research and found ASN.1 DER and haven't really looked back since switching over to it. If you plan on signing your data structures and don't want to implement your own serialization format, give ASN.1 DER a look.
If you are planning to sign your data structures, IMO your first choice should be to sign byte strings: be explicit that the thing that is signed is a specific string of bytes (which cryptographic protocol people love to call octets). Anything interpreting the signed data needs to start with those bytes and interpret them — do NOT assume that, just because you have some data structure that you think serializes to those bytes, then that data structure is authentic.
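A minimal sketch of that discipline (Python stdlib only; an HMAC stands in for a public-key signature and JSON for whatever serializer you use, and the names are illustrative). The authenticated thing is an exact byte string, and decoding happens only after verification:

    import hashlib
    import hmac
    import json

    def seal(key: bytes, payload: dict) -> bytes:
        # Whatever bytes the serializer produced ARE the message; keep them.
        body = json.dumps(payload, separators=(",", ":")).encode()
        tag = hmac.new(key, body, hashlib.sha256).digest()
        return tag + body

    def open_sealed(key: bytes, blob: bytes) -> dict:
        tag, body = blob[:32], blob[32:]
        # Verify over the received bytes as-is...
        if not hmac.compare_digest(tag, hmac.new(key, body, hashlib.sha256).digest()):
            raise ValueError("authentication failed")
        # ...and only then interpret them.
        return json.loads(body)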
Many, many cryptographic disasters would have been avoided by following the advice above.
That matches the advice from Latacora[1]. That advice makes a lot of sense from a security correctness and surface-area perspective.
There's a potential developer experience and efficiency concern, though. This likely forces two deserialization operations, and therefore two big memory copies: once for deserializing the envelope and once for deserializing the inner message. If we assume that most of the outer message is the inner message, and relatively little of it is the signature or MAC, then our extra memory copy is for almost the full length of the full message.
[1]: https://www.latacora.com/blog/2019/07/24/how-not-to/
There are a few serialization/deserialization systems that are close enough to zero-copy that this has no overhead. Cap’n Proto and FlatBuffers were designed around roughly this idea. Even some protobuf implementations allow in-place reads of bytes.
Does this work with nesting? I guess, if reading the envelope gets you a pointer to the inner buffer, you can pass that to another read operation. If that can be done safely (with the library ensuring the appropriate checks before it casts/transmutes), that would be very powerful.
It should work fine. In C and C++, it's straightforward to YOLO it -- all you need is a pointer and a length. Rust can do more or less the same thing but with compiler-enforced safety. Many GC/ARC languages can efficiently handle slicing buffers, and it mostly comes down to library design. (Even Python can do this, although you generally pay for it in the rather large cost of every other operation...)
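To illustrate the Python case: a hypothetical envelope check using memoryview, where the inner message travels onward without a copy (the trailing 32-byte tag layout is an assumption for the example):

    import hashlib
    import hmac

    def inner_message(key: bytes, blob: bytes) -> memoryview:
        view = memoryview(blob)
        body, tag = view[:-32], view[-32:]
        # hmac accepts any bytes-like object, so no copy of the body is made here.
        if not hmac.compare_digest(bytes(tag), hmac.new(key, body, hashlib.sha256).digest()):
            raise ValueError("authentication failed")
        return body  # a zero-copy slice; hand it to the next parser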
It took me a few tries to convince myself, but I think I agree with you that Rust can do this with lifetimes and unsafe. Importantly, the unsafe is self-contained, can be reliably generated by macros or codegen, and the end user doesn't have to muck with it.
There is also the rasn library[1] for Rust, which now supports most of the codecs (BER/CER/DER/PER/APER/OER/COER/JER/XER).
[1]: https://github.com/librasn/rasn
Disclaimer: I have contributed a lot recently. The OER codec (a modern flavor of ASN.1) is heavily optimized (almost as much as it can be with safe Rust and without CPU-specific tricks). I am still working on the benchmarking results, which I plan to share in the near future, but it is starting to look like the fastest one in the open-source world. It is also faster than Google's Protobuf libraries or any Protobuf library in Rust (a naive comparison, no reflection support). Hopefully the other codecs can be optimized too.
I do object to the idea that one should manually map ASN.1 to Rust (or any other language) type definitions because that conversion is going to be error-prone. I'd rather have a compiler that generates everything. It seems that rasn has that, yes? https://github.com/librasn/compiler
Correct, the compiler allows you to generate the Rust bindings automatically. Worth noting that the compiler is at an earlier stage of development (the library was started six years ago, the compiler roughly two years ago), so there are features available in the library that aren't yet used or supported by the compiler.
Yes, writing the definitions by hand can be time-consuming and error-prone, but I designed the library around a declarative API to make it easy both to write manually and to generate. I also personally prefer writing Rust whenever possible, so nowadays I would sooner write an ASN.1 module in Rust and then, if needed, build a generator for the ASN.1 textual representation than write ASN.1 directly, since I get access to much better and stronger tooling.
Also, in my research when designing the crate, I found frequent requests in other ASN.1 or X.509 libraries to allow decoding semantically invalid messages, because in the wild services often send incorrect data. So I designed rasn to let you mix and match and easily build your own ASN.1 types from definitions, so that when you do need something bespoke, it's easy and safe.
> Yes, writing the definitions by hand can be time-consuming and error-prone
With the proper environment, this isn't that time-consuming or error-prone anymore, based on my recent experiments. When I initially started exploring the library, it was a bit difficult at first (mostly because back then there was no reference documentation and you had to rely on tests), but after some time the API gets easy to understand. The actual type definitions remain very small because of the derive macros.
Nowadays LLMs are pretty good at handling ASN.1 definitions. If you provide the context correctly, e.g. by giving the reference from rasn's README and maybe even some type definitions from the library itself, and then just give the ASN.1 schemas, LLMs can generate the bindings very accurately. The workflow switches from being the manual writer to being just the reviewer. The compiler is also already good at generating the basic definitions in most cases.
I really like the idea that a standard can be published as a crate, and nobody ever needs to touch it again unless there is a bug or the standard gets an update. Crates can then be used by different open-source projects with the same types. From what I know about commercial ASN.1 usage, different companies buy these products to compile the same standards over and over again.
I bet there are not many companies that define their internal APIs with ASN.1 and buy a commercial tool just to support ASN.1 in their own business, without any international standards involved.
> > Yes, writing the definitions by hand can be time-consuming and error-prone
> With the proper environment, this isn't that time-consuming or error-prone anymore, based on my recent experiments.
If you have modules that mix automatic tagging with manual IMPLICIT and EXPLICIT tagging, and you fail to translate those correctly, you can trivially make mistakes that cause your implementation not to interoperate.
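To make the failure mode concrete, here are hand-assembled DER bytes for an INTEGER 5 under a context tag [0]:

    universal = bytes([0x02, 0x01, 0x05])        # INTEGER 5: tag 02, length 01, value 05
    implicit  = bytes([0x80, 0x01, 0x05])        # [0] IMPLICIT: the INTEGER tag is replaced
    explicit  = bytes([0xA0, 0x03]) + universal  # [0] EXPLICIT: a constructed wrapper is added

Translate an IMPLICIT tag as EXPLICIT (or vice versa) and the two ends simply disagree about the bytes on the wire.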
This one looks interesting. A few years ago I looked at all of the Rust ASN.1 libraries I could find and they all had various issues. I'm a little surprised I didn't find this one.
The Python library proposed in TFA is to be based on a different, DER-only Rust ASN.1 library[1]. So ASN.1 in Rust is more than tangentially relevant here.
[1]: https://github.com/alex/rust-asn1
The discussion of Rust and ASN.1 libraries and interoperability with other languages might be. The relative differences between the two libraries might be. The efforts of the author to get their library into Python might be.
Simply advertising a different project, with all the standard "rust tropes," is not what most people would consider relevant. It's hamfisted and weird.
Related: if you ever want to create your own serialization format, please at least have a cursory look at the basics of ASN.1. It's very complete both in terms of textual descriptions (how it started) and breadth of encoding rules (because it's practical.)
(You can skip the classes and macros, though they are indeed cool...)
Would you rather they reinvent the wheel badly? That's what ProtocolBuffers is: badly reinvented ASN.1/DER!
PB is:
- TLV (tag-length-value), like DER
- you have to explicitly list the tags in the IDL as if it were ASN.1 in 1984 (but actually worse, because even back then tags were not always required in ASN.1, only for disambiguation)
- it's super similar to DER, yet not the same
- PB was created in part because ASN.1 had so little open source tooling, but PB had none either until they wrote it, so they could just as well have written the ASN.1 tooling they'd wished they had
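For a concrete taste of how close they are, here is field/tag number 1 carrying the integer 5 in each (hand-assembled bytes):

    pb  = bytes([(1 << 3) | 0, 0x05])  # protobuf: key byte (field 1, varint wire type) + value -> 08 05
    der = bytes([0x02, 0x01, 0x05])    # DER: tag (INTEGER) + length + value -> 02 01 05

Both are key-driven; DER carries an explicit length where protobuf's wire type tells the reader how to find the end of the value.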
In complete fairness to PBs, PBs have a heck of a lot less surface area than ASN.1. You could argue: why not use a subset of ASN.1? But it seems people have trouble agreeing on which subset to use.
There are two notions of surface area here: that exposed to the external input, which must be secured, and that exposed to the programmer, which must be understood. As far as the latter is concerned, one can’t really disassociate the encoding of DER from the, well, notation of ASN.1, which, while definitely not as foreign as it may first appear, is still very rich compared to the one Protobufs use. (I do think a good tutorial and a cheat-sheet comparison to more widely used IDLs would help—for certain, obscure dusty corners and jargon-laden specs have never stopped anyone from writing the COM dialect of DCE IDL.)
Even if we restrict ourselves to the former notion, the simple first stage of parsing that handles DER proper is not the only one we have to contend with: we also have to translate things like strings, dates, and times to ones the embedding environment commonly uses. Like, I’m the kind of weird pervert that would find it fun to implement transcoding between T.61 and Unicode faithfully, but has anyone ever actually put T.61 in an ASN.1 T61String? As far as I know, not as far as PKIX is concerned—seemingly every T61String in a certificate just has ISO 8859-1 or *shudder* even Windows-1252 inside it (and that’s part of the reason T61Strings are flat out prohibited in today’s Web PKI, but who can tell about private PKIs?). And I’ll have to answer questions like this about every one of a dozen obscure and/or antiquated data types that core ASN.1 has (EMBEDDED PDV anyone?..).
Why wouldn't you want to explicitly number fields? Protocols evolve and get extended over time; making the numbering explicit ensures that there's no accidental backwards-compat breakage from re-ordering fields. Implicit field numbers sound like an excellent reason not to use ASN.1.
This shilling for an over-engineered 80s encoding ecosystem that nobody uses is really putting me off.
> Why wouldn't you want to explicitly number fields? Protocols evolve and get extended over time; making the numbering explicit ensures that there's no accidental backwards-compat breakage from re-ordering fields.
ASN.1 went through this whole evolution, and ended up developing extensive support for extensibility and "automatic tagging" so you don't have to manually tag. That happened because the tagging was a) annoying, b) led to inconsistent use, c) led to mistakes, d) was almost completely unnecessary in encoding rules that aren't tag-length-value, like PER and OER.
The fact that you are not yet able to imagine that evolution, and that you are not cognizant of ASN.1's history, proves the point that one should study what came before, before reinventing the wheel [badly].
I have to admit that I could not make heads or tails of the extension marker stuff in the ASN.1 standards I’ve read (so the essential ones like basic ASN.1 and BER, not the really fun stuff like object classes, macros, or ECN). This is rather unlike the rest of those standards. So, could you elaborate on what those actually do and/or why they’re the right way to do things?
The one thing that grinds my gears about BER/CER/DER is that they managed to come up with two different varint encoding schemes for the tag and length.
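For the curious, a sketch of the two flavors side by side (hand-rolled Python, not any particular library):

    def ber_len(n: int) -> bytes:
        """Definite length: short form below 128, else 0x80|count then big-endian bytes."""
        if n < 0x80:
            return bytes([n])
        body = n.to_bytes((n.bit_length() + 7) // 8, "big")
        return bytes([0x80 | len(body)]) + body

    def ber_high_tag(num: int) -> bytes:
        """Tag numbers >= 31: the octets after the 0x1F lead byte are base-128,
        with a continuation bit on every byte except the last."""
        out = [num & 0x7F]
        num >>= 7
        while num:
            out.append((num & 0x7F) | 0x80)
            num >>= 7
        return bytes(reversed(out))

    assert ber_len(300) == bytes([0x82, 0x01, 0x2C])       # count-prefixed
    assert ber_high_tag(300) == bytes([0x82, 0x2C])        # continuation-bit base-128

Same integer, two different variable-length encodings.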
Yeah, but if you're writing a parser for use by others, you have to implement both, even if it's "rarely" used. Or some intern somewhere will have a bad day after getting tasked with "just add this value here, it'll be an easy starter project." :)
How much of it do you need in that representation? Usually I see that need in one of two places: X.509, where you're already using DER, or tiny fragments where a custom tag-length-value would cover almost every usage without having to touch ASN.1.
First of all you should never need a canonical representation. If you think you do, you're almost certainly wrong. In particular you should not design protocols so that you have to re-encode things in order to validate signatures.
So then you don't need DER or anything like it.
Second, ASN.1 is fantastic. You should at least study it a bit before you pick something else.
Third, pick something you have good tooling for. I don't care if it's ASN.1, XDR, DCE RPC / MSRPC, JSON, CBOR, etc. Just make sure you have good tooling. And don't pick XML unless you really need it to interop with things that are already using XML.
EDIT: I generally don't care about downvotes, but in this case I do. Which part of the above was objectionable? Point 1, 2, or 3? My negativity as to XML for protocols? XML for docs is alright.
Some of the bug reports here are not actually about ASN.1 even if they are in programs that also use ASN.1. However, some are actually about ASN.1.
But, there are bugs in many computer programs, whether or not they use ASN.1, anyways.
CVE-2022-0778 does not seem to be about ASN.1 (although ASN.1 is mentioned in the description); it seems to be a bug in computing a modular square root for non-prime moduli, and those numbers can come from any source and do not necessarily have anything to do with ASN.1.
CVE-2021-3712 does have to do with an ASN.1 implementation, but the flaw is a bad assumption in other parts of the program that use the ASN.1 structure. (My own implementation also does not require the string stored in the ASN1_Value structure to be null-terminated, and none of the functions implicitly null-terminate it or expect it to be. One reason for this is to avoid memory allocations when they are not needed.)
Many programs dealing with OIDs have problems with them because the programs are badly designed; a properly designed program (which is not that difficult to write) will not have these problems with OIDs. It is rarely necessary to decode OIDs, except for display (my own implementation limits display to 160 digits per part, which is probably much more than is needed, but should avoid the problem described in CVE-2023-2650 anyways).

When comparing OIDs, you can compare them in binary format directly (if one is in text format, you should convert that one to binary to compare them, instead of the other way around).

If you only want to validate OIDs, that can be done without decoding the numbers: check that the value is at least one byte long, that the first byte is not 0x80, that the last byte does not have the high bit set, and that any byte that does not have the high bit set is not immediately followed by a byte 0x80. (The same validation can apply to relative OIDs, although some applications may allow relative OIDs to be empty, which is OK; absolute OIDs are never allowed to be empty.) (Some other reports listed there also relate to OIDs; if the program is well designed, it will not have these problems, as described above.)
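Those rules translate almost line for line into code; a sketch over the content octets of an encoded OID:

    def oid_content_is_valid(value: bytes, relative: bool = False) -> bool:
        if not value:
            return relative              # absolute OIDs must be non-empty
        if value[0] == 0x80:
            return False                 # non-minimal start of first subidentifier
        if value[-1] & 0x80:
            return False                 # last byte must terminate a subidentifier
        for prev, cur in zip(value, value[1:]):
            if not (prev & 0x80) and cur == 0x80:
                return False             # non-minimal start after a terminator
        return True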
My own implementation never decodes ASN.1 values until you explicitly tell it to do so, with a function specific to the type being decoded, and it returns an error condition if the type is incorrect. All values are stored in an ASN1_Value structure which works the same way.
Some of the CVE reports are about things which can occur just as easily in other programs not related to ASN.1. Things such as buffer overflows, improperly determining the length, integer overflows, etc, can potentially occur in any other program too.
None of the things listed in the CVE reports seems to be an inherent security issue with ASN.1 itself.
I was using the asn1bean Java library yesterday, funnily enough. I'm sure it's fine for X.509 stuff; however, lucky me got to use it with the more obscure parts of X.400. It's lacking support for COMPONENTS OF and a bunch of other things that were likely deprecated from the ASN.1 spec a few decades ago.
No, it is not the same. Sadly the author probably made it private; I might reach out to him, I just don't know how. :( He did not put his e-mail anywhere from what I have found so far.
IBM did a partial implementation of ASN.1 in Java, and released it via the IBM AlphaWorks open-source repository. I used it in a telecommunications system in the 90s. Luckily the GSM protocol we were interfacing with only used a small subset of ASN.1, which was covered by the IBM software.
IBM AlphaWorks is still online : https://www.ibm.com/support/pages/aix-toolbox-open-source-so...
but only lists libtasn1, which is in C.
I used Peculiar Ventures ASN1.js to build an in-browser PKI platform years ago. It can sit on top of webcrypto and do everything you need in terms of managing TLS certs.
I do not use Python, but I wrote my own library in C for reading/writing DER. (I have made a variant, which adds a few new types such as: key/value list, BCD, TRON character code, etc. The program works even if you do not use these new types.)
DER does have the advantages they mention in that article, and other advantages.
Some people mention that DER is not compact or not efficient; but often what is used instead is formats that are even less compact or less efficient than DER, and/or that are significantly more complicated to handle.
DER is still easy; UPER (unaligned packed encoding rules) is so much harder, yet it's prevalent in the telecom industry.
Last I checked, there was no freely available tool that can handle UPER 100%.
Not only is UPER hard to parse, but (I believe) 3GPP ASN1 definitions are provided only in .docx files which aren’t exactly the easiest to work with. It’s just really not a fun domain.
The ASN.1 format itself isn't too bad. It shows its age and has some very weird decisions behind it in places, but it's not that difficult to encode and is quite efficient.
Unfortunately, the protocols themselves can be confusing, badly (or worse: inconsistently) documented, and the binary formats often lack sensible backwards compatibility (or, even worse: optional backwards compatibility). Definitions are spread across different protocols (and versions thereof) and vendors within the space like to make their own custom protocols that are like the official standardised protocols, but slightly different in infuriating ways.
If your parser works (something open source rarely cares about, so good luck finding one for your platform), the definitions extracted from those DOCX files are probably the least of your challenges.
First, you can download the specifications in either PDF or doc(x). Second, doc(x) is simple enough that a basic doc(x)-to-text conversion is good enough to produce a working ASN.1 definition. Copy & paste is also an option.
There are many tools that can handle UPER up to a certain level (some rare ASN.1 types might not be supported). I think the main issue is not the codec, but rather the lack of compilers that can create a correct language-level representation of the ASN.1 definitions. 3GPP specifications are enormous and you don't want to write them out by hand. ASN.1 has some very difficult notations, e.g. inner subtype constraints and information object classes. Subtype constraints may affect the encoding output in UPER, and if you do not represent them correctly, you are not compatible.
When every bit passing through the network gets charged (if not to the customer, then it's taking up capacity that could otherwise be charged to the customer), and the software in the endpoints needs to be as low-power as possible, zlib is additional overhead you definitely don't want.
UPER is an extremely compact encoding format. It still makes sense to use UPER because, after all, it is an international standard, and telecommunication protocols themselves are supposed to add as little overhead on top of the actual payload as possible.
For example, if you have an ASN.1 UTF-8 string that is constrained to 52 specific characters, UPER encoding can represent every character with 6 bits (not bytes).
These days you can, however, apply zlib on top of the UPER encoding or the inner payload, depending on the use case.
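The arithmetic behind that example, spelled out:

    import math

    bits_per_char = math.ceil(math.log2(52))   # 6 bits, versus 8 for one-byte characters
    # A 100-character constrained string packs into 600 bits = 75 bytes,
    # a 25% saving before any zlib is applied on top.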
I think it's context-dependent: I don't have insight into OSS Nokalva's use inside big companies, but in the Open Source world it certainly isn't dominant.
In Open Source, I think the biggest ASN.1 implementations I come across are OpenSSL's, libtasn1, asn1c, and then per-language implementations like pyasn1.
Most of the open-source tools need patching to properly support certain scenarios (been there, done that). They also lack support for parsing the ASN.1 Value Notation format (textual), which is used everywhere in specifications. OSS Nokalva offers a full set of tools to handle this, even with a playground and an ASN.1 editor; this is non-existent in open source right now. For now, the open-source tools focus only on the crypto aspect and don't really dive into telco, banking, biometrics, and other domains.
> In the ASN.1 space everyone hopes that someone can dethrone OSS Nokalva's proprietary solutions
You're buying more than a compiler and runtime, though: you're also getting an SLA and a stricter guarantee about interoperability and bugs and so forth. I have no idea how good their support is (maybe it's atrocious?), but these are important. I had a client who relied on the open-sourced asn1c once who complained about some of the bugs they found in it; they got pushed into buying commercial when the cost-benefit outweighed the software licensing issues.
> Meh. After all, if you're not using ASN.1 you're using something like ProtocolBuffers or FlatBuffers or whatever and all open source tooling.
Oh sure--there are plenty of alternatives to ASN.1. My guess is that most people who have the choice don't use ASN.1 precisely because open-source alternatives exist and can feasibly work for most use cases.
But if you happen to have one of the use cases that require ASN.1, open sourced tooling can be problematic precisely because of the need for a robust SLA.
> But if you happen to have one of the use cases that require ASN.1, open sourced tooling can be problematic precisely because of the need for a robust SLA.
Why would you need a support SLA for ASN.1 and not for PB/FB? That makes no sense. And there's plenty of open source ASN.1 tooling now -- just look around this thread!
The difference is the quality of the OSS implementation: most OSS ASN.1 tools choke on the enormous 3GPP specs and others used in the telco industry, and thus cannot generate 100% valid code.
For some use cases, you can get by with manually adjusting the generated code. That works until the hardware vendors release a new device that uses a more modern 3GPP spec and your code starts breaking again.
Commercial ASN.1 vendors often update their compilers to support the latest 3GPP specs even before the hardware vendors do, so supporting a new device is way simpler.
> Why would you need a support SLA for ASN.1 and not for PB/FB? That makes no sense. And there's plenty of open source ASN.1 tooling now -- just look around this thread!
If your business depends on five-nines-plus reliability in your 5G communications stack, you might be willing to fork over the price for it. Or if you need a bug fix made in a timely fashion to the compiler or runtime, likewise. As I've noted above, a client of mine moved to a commercial suite of tools for this reason.
Protobuf and flatbuffers have different use cases in my experience, although that's somewhat limited. Protobuf at least also introduced breaking changes between versions 2 and 3. ASN.1 isn't perfect in this regard, but these days incompatibilities have to go through ISO or ITU, etc.
Your experience may be different of course. I'm just pointing out that there are reasons people will opt for a commercial product.
I don't have the time, though I do have the inclination, to finish Heimdal's ASN.1 compiler, which is already quite awesome. u/lukeh used Heimdal's ASN.1 compiler's JSON transformation of modules output to build an ASN.1 compiler and runtime for Swift.
Parser differential exploits are an understated problem, especially with ASN.1, which I didn't expect to see anyone thinking about. Kudos on this initiative!
I understand that it is a problem but I'm more used to seeing arguments that monocultures and single implementations are bad: WebSQL for example didn't become a standard because there was only a single implementation.
If there were only one implementation for ASN.1 people would decry that whatever that implementation does effectively becomes the standard, and people would be clamoring to write a second implementation.
20+ years ago I used ASN.1 for talking between microservices (HTTP services, as they were called then). Very performant.
Had to buy an OSS tools licence, but other than that it was quite nice.
Oh right, the asn1 crate, which supports CHOICE but only up to 3 alternatives, which means it can't even be used to implement X.509 certificate decoding. Makes me wonder what they're going to do when they get that far.
The asn1 crate provides three builtin Choice enums itself - asn1::{Choice1, Choice2, Choice3} - which support 1, 2 and 3 choices respectively. I assume that is what lilyball is referring to. But as you correctly point out the custom derive supports mapping enums with more variants to CHOICE just fine, so the builtin enums are not a relevant limitation.
Interesting. Python's a big community, and there's some disagreement here over whether this would be better done in pure python. I think it's good that there's a rust/cloud contingent in python land but hope pure python remains popular.
For the record: I shopped this project to them; it didn’t originate as an idea from LF or any major company. The idea itself came from PyCA’s maintainers, who have wanted this for a while.
(The native code desirability question is also a red herring here, since PyCA has and will always need to have native code for cryptographic operations. So having a first class ASN.1 API in PyCA was always going to involve native code, with the only variable being how.)
Does anyone miss when "pure python" was a selling point of your library? I understand the need for speed, but I wish it were more common to ship a compiled binary that does the thing fast, as well as the same algorithm in python to fall back on if it's not available.
Pure Python was a huge selling point back when using a compiled library involved downloading and running random .exe files from someone's personal page on a university website. It is much less of a selling point now that we have binary wheels and uv/Poetry/etc. that create cross-platform lock files.
I feel nostalgic seeing (a mirror of) that download page again, but that era was such a pain.
Mirror: http://qiniucdn.star-lai.cn/portal/2016/09/05/tu3p2vd4znn
I always thought the selling point of Pure Python was that you might be running on some esoteric implementation of python, or hardware that the library maintainer didn't include binaries for.
I mean, I am glad wheels exist, they make things a lot easier for a lot of people. I just believed in the dream of targeting the language for maximum compatibility and hoping a better implementation would eventually run it faster.
I find it rather tragic that, contrary to other dynamic languages, Python seems to fall under the curse of rewriting bindings in C and C++, or nowadays the more fashionable Rust.
And yes, Smalltalk, Self and various Lisp variants are just as dynamic.
Why is it tragic? It's more or less idiomatic in Python to put the hot or performance-sensitive paths of a package in native code; Rust has arguably made that into a much safer practice.
You don’t have to master Rust to use this, the same way you don’t have to master C to use all of the critical extensions written in it.
(Besides, no language has this regardless of native extensions: a huge part of Python’s success comes from the fact that there isn’t a perfect graph of competencies in the community, and instead that community members provide high quality abstractions over their respective competencies.)
Sure, but that’s the general maintenance risk. I don’t think native code changes that dynamic, particularly in ecosystems where it’s the norm. And doubly so for cryptographic code, where native is the norm for performance and certification reasons.
It’s my impression as a maintainer of many projects that native compilation hasn’t been a driving issue in Python packaging for a while now. Most users download wheels and sidestep the entire problem. Whether or not they should trust random binaries from the internet is of course a bigger question.
IME, and I may be off base, the new generation of Rust/Go binaries have a more "batteries-included" philosophy, i.e. developers don't assume that they can piggyback off existing user system libraries, which generally makes it a nicer install UX.
Stop me before I make a CORBA reference.
Yes, that is right; but the byte sequence can itself be the canonical form of the data structure, and DER is a canonical form.
Not doing so is like inventing a new programming language after having learned just one.
Edited to add: If they need something with a canonical byte representation, for example for hashing or MAC purposes?
For Java I used yafred's asn1-tool, which is apparently not available anymore. Other than that, it worked well.
Originally it was available here: https://github.com/yafred/asn1-tool (archived: https://web.archive.org/web/20240416031004/https://github.co...)
Any recommendations?
Check the README of: https://web.archive.org/web/20240416031004/https://github.co...
I need something like this.
https://github.com/zhonghuihuo/asn1-tool is still available, but it is very old, it is probably a VERY old fork.
I need something like this :(. I need it for Java / Kotlin. I do not have the repository cloned, so I am kind of in the dark.[1]
Found the archived page of the previously mentioned project that is probably private now: https://web.archive.org/web/20240416031004/https://github.co...
[1] Never mind, I found the newest (probably) asn1-compiler.jar! I still need an actively maintained alternative, however, for Java / Kotlin. For ASN.1 between Erlang / Elixir <> Java / Kotlin.
Here's a post about AlphaWorks : https://www.cnet.com/tech/services-and-software/ibm-alphawor...
Searching for that, I found this post, https://stackoverflow.com/questions/37056554/opensource-java... which mentions https://www.beanit.com/asn1/ and https://sourceforge.net/projects/jac-asn1/, which are more recent Java ASN.1 implementations.
https://asn1js.org/
https://www.npmjs.com/package/asn1js
Does one pay for an SLA for every piece of hardware, firmware, and software? The codecs are the least likely cause of downtime.
Related: you can also create Wireshark dissectors from ASN.1 files:
https://www.wireshark.org/docs/wsdg_html_chunked/ASN1StepByS...
Edit: Here's an example of a CHOICE implemented with rust-asn1 that has more than three variants[3].
[1]: https://cryptography.io/en/latest/x509/reference/
[2]: https://cryptography.io/en/latest/x509/verification/
[3]: https://github.com/pyca/cryptography/blob/be6c53dd03172dde6a...
> with the help of funding from Alpha-Omega
From the site:
> funded by Microsoft, Google, and Amazon
Also it's a Linux Foundation project.
It became idiomatic as there was no other alternative.
One of the pain points of Python is exactly native libraries and getting them compiled.
Other than being a better Perl for UNIX scripts and the Zope CMS, there was no other reason to use Python in 2000.
It is supposed to fix everything that previous ones never achieved.