This talk seems to have set out to prove that "XML is Bad". Yes, XML-DSig isn't great with XPaths, but most of these attack vectors have been known for 10 years. There is probably a reason why the vulnerabilities found were in software that is not commonly used, e.g. SAP. Many of the things possible with XML and UBL simply aren't available in protobuf or JSON. How would you digitally sign a Json document and embed the signature in the document?
Most of these attack vectors have been known for 10 years, and yet researchers keep finding bugs in major implementations to this day. Here's one from last week: https://portswigger.net/research/the-fragile-lock
> How would you digitally sign a Json document and embed the signature in the document?
You would not, because that's exactly how you get these bugs. Fortunately serialization mechanisms, whether JSON or Protobuf or XML or anything else, turn structured data into strings of bytes, and signature schemes operate on strings of bytes, so you'll have a great time signing data _after_ serializing it.
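A minimal sketch of that "serialize first, then sign the bytes" idea, in Python. HMAC-SHA256 from the standard library stands in for a real public-key signature scheme here, and the key and the function names (`sign_payload`, `verify_then_parse`) are made up for illustration.

```python
import hashlib
import hmac
import json

# Assumption: a shared-secret HMAC stands in for a real signature scheme
# (e.g. Ed25519) so that the sketch needs only the standard library.
SECRET_KEY = b"demo-key-not-for-production"


def sign_payload(obj):
    """Serialize once, then sign exactly those bytes."""
    payload = json.dumps(obj).encode("utf-8")
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return payload, tag


def verify_then_parse(payload: bytes, tag: bytes):
    """Check the tag over the raw bytes before any parsing happens."""
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("signature mismatch")
    return json.loads(payload)
```

The receiver never has to re-serialize anything: the bytes that were signed are the bytes that get verified and then parsed, so no canonicalization step is involved.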
If one has a reproducible JSON serializer, then one can add a signature to any JSON object via serializing the object, signing that and then adding the resulting signature to the original object.
This avoids JSON-inside-JSON and allows pretty-printing the original object together with the signature.
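Roughly, that could look like the Python sketch below. It assumes "reproducible" means sorted keys and no optional whitespace, which is not a full canonicalization scheme such as RFC 8785; the `signature` field name and the HMAC stand-in for a real signature are also assumptions.

```python
import hashlib
import hmac
import json

SECRET_KEY = b"demo-key-not-for-production"  # assumption: HMAC stand-in
SIG_FIELD = "signature"                      # hypothetical field name


def canonical_bytes(obj: dict) -> bytes:
    """Reproducible form: sorted keys, no optional whitespace, UTF-8."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")


def add_signature(obj: dict) -> dict:
    """Return a copy of obj with the signature stored as one more field."""
    if SIG_FIELD in obj:
        raise ValueError("object already has a signature field")
    tag = hmac.new(SECRET_KEY, canonical_bytes(obj), hashlib.sha256).hexdigest()
    return {**obj, SIG_FIELD: tag}


def verify_signature(signed: dict) -> bool:
    """Strip the signature field, re-serialize, and compare tags."""
    tag = signed.get(SIG_FIELD)
    if not isinstance(tag, str):
        return False
    unsigned = {k: v for k, v in signed.items() if k != SIG_FIELD}
    expected = hmac.new(SECRET_KEY, canonical_bytes(unsigned),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode("utf-8"), tag.encode("utf-8"))
```

Because the verifier re-serializes the parsed object, the signed document can be pretty-printed or re-indented in transit. The flip side is that both ends must produce byte-identical output from the serializer.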
Pretty significant catch if interoperability is a concern at all. Whitespace is easy enough to handle but how do dict keys get ordered? Are unquoted numbers with high precision output as-is or truncated to floats/JS Numbers? Is scientific notation ever used and if so when?
Just so people this far down can look it up: the term is canonicalization, and its cousin is collation.
These are non-trivial issues that, thankfully, some very smart and/or experienced people have usually handled for us. However, they still frequently lead to all sorts of vulnerabilities. "Stuffing" attacks sometimes rely on these issues, as have several major crypto incidents.
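To make those questions concrete, here is a small Python sketch of what one particular serializer (the standard-library `json` module with default settings) does. Other languages and libraries make different choices, which is exactly the interoperability problem; `roundtrip` is just a helper name for this example.

```python
import json


def roundtrip(text: str) -> str:
    """Parse JSON text and serialize it again with Python's defaults."""
    return json.dumps(json.loads(text))


# Key order follows insertion order unless you ask for sorting.
unsorted_form = json.dumps({"b": 1, "a": 2})                # '{"b": 1, "a": 2}'
sorted_form = json.dumps({"b": 1, "a": 2}, sort_keys=True)  # '{"a": 2, "b": 1}'

# High-precision decimals are parsed into 64-bit floats, so digits are lost.
precise = "0.12345678901234567890"
precision_survives = roundtrip(precise) == precise          # False

# Scientific notation appears or disappears depending on magnitude.
small_exponent = roundtrip("1e2")    # '100.0'
large_exponent = roundtrip("1E16")   # '1e+16'
```

A signature computed over the original text would not verify against any of these re-serialized forms unless both sides agree on a canonical form first.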
Other answers are good. One more thing you could do is put the JSON document inside a container (a zip archive, for example). Then your document can effectively be
Unless you are worried about something like a gzip bomb, I don't see why this is an issue. A lot of formats are effectively just zips: xlsx, odf, etc., for example. It's a pretty common format style.
It helps to have a well-defined expected structure in the archive.
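A rough Python sketch of the container approach with a fixed expected layout. The member names `document.json` and `signature.sig`, the size cap, and the HMAC stand-in for a real signature are all assumptions made for the example.

```python
import hashlib
import hmac
import io
import json
import zipfile

SECRET_KEY = b"demo-key-not-for-production"             # assumption: HMAC stand-in
EXPECTED_MEMBERS = ["document.json", "signature.sig"]   # hypothetical layout
MAX_MEMBER_SIZE = 1_000_000                             # arbitrary cap for the sketch


def pack(obj) -> bytes:
    """Write the serialized document and its tag as two archive members."""
    payload = json.dumps(obj).encode("utf-8")
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("document.json", payload)
        zf.writestr("signature.sig", tag)
    return buf.getvalue()


def unpack(archive: bytes):
    """Reject unexpected layouts, then verify the bytes before parsing them."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        infos = zf.infolist()
        if sorted(i.filename for i in infos) != sorted(EXPECTED_MEMBERS):
            raise ValueError("unexpected archive layout")
        # Declared uncompressed sizes; a crude guard against oversized members.
        if any(i.file_size > MAX_MEMBER_SIZE for i in infos):
            raise ValueError("archive member too large")
        payload = zf.read("document.json")
        tag = zf.read("signature.sig")
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("signature mismatch")
    return json.loads(payload)
```

Comparing the full member list, rather than just looking up the two expected names, also rejects archives that carry extra or duplicated entries.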
Aside from the security issue, it seems like an awful idea for a government (or governments, in this case) to say 'hey, you need to follow this standard for invoicing. But also, you have to pay to see the entire standard'. It almost feels a bit like extortion.
But yes, for commercial offers, presumption of conformity means you have to pay for norms to adhere to law. Big fail.
Especially since non-commercial but persistent and public, not "for profit", is still surmised in e.g. warranty laws. (E.g. geschäftsmäßige Nutzung / usage with said two terms, even for F/LOSS)
To be clear: the ones who need to follow the standard (companies that create invoices) do not need access to the standard, only some supplier does. And there are a lot of things that the government requires that cost money - you could see it as another tax.
That said, I actually agree with you - it's crazy that we need to pay for a stupid standard document.
What was unclear in that article is that the XML is usually embedded in the invoice. For instance, Factur-X is the mandatory format in Germany, and it's a PDF which contains a metadata block with an XML EN16931 content.
This XML will usually not be read by the companies that pay the invoice. For instance, in France by the end of 2027, every business will have to send e-invoices, but never directly to the real recipient. The business sends the invoice to a registered go-between, which will ask a national platform for the address of the recipient's go-between, etc. So, only those official go-between companies will have to securely parse the XML.
BTW, in 2022 when the French government decided to make e-invoicing mandatory, it announced that it would develop a single national go-between platform. Two years later, it dropped that part of the project and announced that there would be an official list of private platforms. So, by the end of 2026 or 2027, every French business will have to select one of the 112 platforms and buy a subscription. It gives the State more control, but for small businesses it means higher costs and complexity.
A standard for invoices seems like something that an accounting body should create that is optional for businesses, not something mandatory created by the government. People will generally follow an optional standard to make their own lives easier, but a mandatory one introduces a compliance middleman into the invoicing process.
In the EU there is the "reverse charge" mechanism for VAT when commerce crosses country borders, and it is often used for defrauding EU countries / governments.
The invoicing standard is an attempt to mitigate reverse charge fraud by gathering more machine-readable data. Some countries even demand that b2b invoices are sent to the country, which then dispatches a copy to the recipient.
Knowing this background, it's pretty clear why the EU is making it mandatory.
Personally, in the abstract I like the idea to mandate the use of an open standard, I think we have way too many inefficiencies from treating many things as text documents that could be data structures. I don't like this particular standard though, it's bloated and the result of a typical top-down process.
I much prefer it when there are competing standards for a while, and one or a couple of winners emerge on technical merits. THEN I have no objections to a regulatory body picking a standard and mandating it.
As far as I understand there are multiple XML invoice formats and EN 16931 accepts at least two: UBL and CII. At least in theory. I have no idea how it is going to work out in practice, but I will learn the hard way :-) I have invoicing software as a side project and I have decided to make it usable in the EU.
> People will generally follow an optional standard to make their own lives easier
People invent their own standard to make their own lives easier at the cost of making everyone else's lives miserable, which is exactly what the European Committee for Standardization was intended to prevent.
Having worked with accounting body standards (NAIC), I can tell you that it really does nothing to improve quality. Especially when parts of the standard encode things like COBOL PIC number symbols. [1]
Other major countries that have adopted eInvoicing have kept it optional for a few reasons:
- It's a barrier for small businesses, or those which seldom invoice, such as craft and hobby businesses (particularly remote online businesses).
- Large companies see eInvoicing as a cost-saving method and force it upon their vendors. This reduces the need to make it mandatory and provides a financial incentive for companies to adopt eInvoicing (i.e. more carrot, less stick).
The EU has a solid trend of finding ways to self-harm when introducing reforms. This self-harm story segues into how the EU is considering implementing an Australian-style social media restriction for children:
Quote from abc.net.au below:
European Commission president Ursula von der Leyen told the audience she had been "inspired" by Australia's "bold" move to introduce the ban.
"As a mother of seven children and grandmother of five, I share their view," she said.
The European Parliament has since passed a non-legislative report that would set a minimum age of 16 for social media, while allowing those aged 13 to 15 with parental consent.
-- end quote --
Here the EU is walking down the path of another bad implementation.
Limiting the age for social media only works if it's mandatory for all children, otherwise kids will just pester their parents for access. In the EU's plan the parents become the "bad guy" in that arrangement, the home becomes the battleground for obtaining access to social media.
The EU's plan also means that social media remains relevant for young people, where access may be needed for arranging social activities and sports, and those who don't have it are either inconvenienced or miss out. Meanwhile the Australian implementation removes that purpose as no kids are allowed on the platform, thus there are no "haves" and "have nots" among kids.
Finally, and probably most importantly, advertisers, data brokers, and bad actors will still continue to target children through social media networks, since they will still be there in useful numbers.
I think there’s a difference between _wanting_ something to work and _needing_ something to work. Enforced standardized invoicing might be a very tidy and neat solution, but tidiness and neatness are not a good enough argument to mandate it in my opinion. There’s no end to the areas of our lives that could be regulated if that’s the standard we’re aiming for, and I don’t particularly want to live in such a uniform, straightjacketed environment.
Would you rather governments insist on everyone using the same format when invoices are passed around, or would you rather have massive amounts of taxpayer money wasted on managing countless conflicting standards, any number of which may also include their own security issues? At a certain scale it just makes sense to say "Okay everyone, we have to pick one way to do this".
If tidiness and neatness are not a good enough argument to mandate this, then taxpayer savings, time efficiency, and better software should be.
Companies who insist on being precious about their favored invoice format can invest their own time and money on conversion tools that let them convert invoices they get into whatever format they like for their own internal records and convert them to meet the standard again when sending invoices out. That leaves them free to use what they want without making everyone else deal with their mess.
Have you ever actually dealt with invoices? I have hired many many contractors in construction and tech, and I’ve never thought it to be that bad. Definitely not enough of a mess to justify another rule for how I’m supposed to run a business.
> People will generally follow an optional standard to make their own lives easier
You must be new to the internet /s
A company does not gain anything by sending "better" invoices that follow a standard. It only gains if it receives standardized invoices, and usually not enough to pay extra for it. The fact that standardized invoices haven't happened yet without legislation should be proof of that.
Having implemented EDIFACT parsers and translation layers, I find Universal Business Language (OASIS UBL) a bliss to work with. Yes, it's a big standard and looks scary when starting out with it, but it is very well designed for a complicated world.
In some member states, like Germany, the EDIFACT format, when compliant with the EN 16931 data model, is accepted as a valid e-invoice format.
EN 16931 defines what information needs to be in an invoice (the data model), while EDIFACT INVOIC defines how that information is structured and formatted for electronic transmission (the syntax).
OK, it’s been a long time since I worked in this space. Seems like it’s an XML version of the INVOIC message, but is it required to support the XML syntax, or does the plain old EDI format suffice?
How can there be security issues with a public document? Can't you just sign it with a cert like any other piece of data that needs a proven source?
But also let me get this straight, there is an actual EU standard for invoices? Why does nobody follow this, and why do I have to keep asking people to put the fucking VAT ID onto it like I'm a broken record?
Actually, it's not universal and depends on whether it is B2G, B2B or B2C. The last one (B2C) is basically not enforced in the EU (except in Romania, but there is no EU requirement for B2C).
The concern is that a malicious vendor could send you an evil invoice where the XML either references external entities that get downloaded and allow potential RCE, or where the document contains references to the local execution environment which allow data exfiltration (or both). In theory a properly-secured XML parser shouldn't allow this, but history has shown that's harder than you might think.
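One common mitigation for the external-entity part of that concern is to refuse any document that carries a DOCTYPE at all, since entities cannot be declared without one. A minimal Python sketch of that idea using only the standard library; the function and exception names are made up, and it does not cover other risks such as external references inside XML signatures.

```python
import xml.etree.ElementTree as ET
import xml.parsers.expat


class ForbiddenXML(ValueError):
    """Raised when a document uses a feature this sketch refuses to handle."""


def parse_untrusted_invoice(xml_bytes: bytes) -> ET.Element:
    """Reject documents with a DOCTYPE, then parse the rest normally."""

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # No DOCTYPE means no entity declarations, internal or external.
        raise ForbiddenXML("DOCTYPE declarations are not accepted")

    checker = xml.parsers.expat.ParserCreate()
    checker.StartDoctypeDeclHandler = on_doctype
    try:
        checker.Parse(xml_bytes, True)
    except xml.parsers.expat.ExpatError as exc:
        raise ForbiddenXML(f"not well-formed: {exc}") from exc
    return ET.fromstring(xml_bytes)
```

This is stricter than the XML specification requires, which seems acceptable for invoices: a legitimate invoice document has little reason to declare its own entities.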
Neither the article nor the talk appears to reference the XML standard that EN 16931 is built upon: Universal Business Language, https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=... - which is freely available. Examples can be found here: https://github.com/Tradeshift/tradeshift-ubl-examples/tree/m... . It is a good standard and yes, it's complex, but it is not complicated by accident. I would any day recommend UBL over IDOC, Tradacoms, EDIFACT and the like.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
> How would you digitally sign a Json document and embed the signature in the document?
Embedding a signature into the same file is easy enough.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v0.9.7 (GNU/Linux)
iEYEARECAAYFAjdYCQoACgkQJ9S6ULt1dqz6IwCfQ7wP6i/i8HhbcOSKF4ELyQB1
oCoAoOuqpRqEzr4kOkQqHRLE/b8/Rw2k
=y6kj
-----END PGP SIGNATURE-----
It's effectively what the Java jar is.
Presumably the same way you accomplish the thing in xml:
For example, in the USA https://www.rcfp.org/briefs-comments/astm-v-upcodes-inc/
This is an especially hot topic in the EU in medical device regulations: https://www.bsigroup.com/en-GB/insights-and-media/insights/b...
Github: https://github.com/VladSez/easy-invoice-pdf
App: https://easyinvoicepdf.com/?template=stripe
I’m planning to use this package to generate e-invoice: https://github.com/gflohr/e-invoice-eu
UPD: issue to follow the progress https://github.com/VladSez/easy-invoice-pdf/issues/121
If you have any feedback or suggestions please feel free to reach out to me :)
Besides, many standards have been created over the past 20 years, yet most invoices are still only sent as PDF.
[1] https://www.ibm.com/docs/en/cobol-zos/6.4.0?topic=arithmetic...