Signing an XML document using XMLDSIG (Part 1)
This page demonstrates how to create a digital signature in XML. This is a simple [sic] example of an enveloping signature where we sign a straightforward text string inside an XML document. If you want information on encryption in XML documents, see Encryption in XML documents using XMLENC.

2017-06-28: See Canonicalization of an XML document for a more detailed how-to guide for canonicalization (C14N) of an XML document prior to signing, and
2017-07-11: SC14N, a straightforward XML canonicalization utility.
See Using SC14N to compute the digest of the input text string directly and Using SC14N to compute the digest of the SignedInfo directly.
To make a digital signature, you need a private key. Our example uses the 1024-bit RSA private key for Alice from RFC 4134 [SMIME-EX]. We use our CryptoSys PKI Toolkit to carry out the necessary computations. We treat an XML document as a simple text file and avoid using any of those frightful, unwieldy XML "DOM" packages.
We give full details of the exact data to be processed at each stage in order to produce the final signed XML document. We hope this is in sufficient detail to help you implement your own version.
For advanced users:
If this is too simple for you, see our page on XML-Dsig and the Chile SII, where we look in detail at creating digital signatures in XML documents using the standards for electronic invoices set by the Servicio de Impuestos Internos (SII) of Chile. There are some useful hints and generic functions in VB6 and C# to create <SignedInfo> elements for XML-Dsig.
2012-10-01: See How to create a SAT Cancelacion document, an enveloped XML-DSIG document with the namespace http://cancelacfd.sat.gob.mx issued by the Servicio de Administración Tributaria (SAT) in Mexico. Updated 2022-01-29.
See also Accented characters and UTF-8 in XML-DSIG signatures, where we look at a simple example to create an XML-DSIG signature of an XML document containing accented characters like áéíóúñ.
Contents
Foreword | Download | Testing | Input | Output | Procedure | Message Digests | Canonicalization | References
Foreword
>>I have some questions related to XML-Dsig:
>
>Argghh!! Run away!
A near-universal reaction.
- from Why XML Security is Broken by Peter Gutmann. For another rant, see our page XML is xhite.
Download
Here is the VB6 code, the output XML file, Alice's PKCS#8 encrypted private key (password: "password"), her corresponding X.509 certificate, and all these files collected in a zip file.
Input
In this example we create the digital signature for the text
some text
  with spaces and CR-LF.
That is, the 35 bytes beginning with 's', 'o', 'm',... and ending with ...,'L', 'F', '.'. There is exactly one CR-LF newline (the two-byte sequence (0x)0D 0A) in the text, between the two lines. There are two spaces before the word "with". There is no newline at the end.
In hexadecimal format, the text is
73 6F 6D 65 20 74 65 78 74 0D 0A 20 20 77 69 74 68 20 73 70 61 63 65 73 20 61 6E 64 20 43 52 2D 4C 46 2E
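As a quick sanity check, the byte count and hex string above can be reproduced in a few lines of Python. This is only an illustrative sketch: the variable name T is borrowed from the Procedure section, and Python is not used elsewhere on this page.

```python
# The text-to-be-signed: one CR-LF between the two lines, two spaces
# before "with", and no newline at the end.
T = b"some text\r\n  with spaces and CR-LF."

print(len(T))              # number of bytes in the text
print(T.hex(" ").upper())  # space-separated hex bytes (needs Python 3.8+)
```

If your own input does not give the same length and hex string, the digest computed later will not match either.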
Output
Output XML file (1 kB).
<?xml version="1.0" encoding="UTF-8"?>
<Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
<SignedInfo>
  <CanonicalizationMethod Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315" />
  <SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1" />
  <Reference URI="#object">
    <DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />
    <DigestValue>OPnpF/ZNLDxJ/I+1F3iHhlmSwgo=</DigestValue>
  </Reference>
</SignedInfo>
<SignatureValue>nihUFQg4mDhLgecvhIcKb9Gz8VRTOlw+adiZOBBXgK4JodEe5aFfCqm8WcRIT8GL
LXSk8PsUP4//SsKqUBQkpotcAqQAhtz2v9kCWdoUDnAOtFZkd/CnsZ1sge0ndha4
0wWDV+nOWyJxkYgicvB8POYtSmldLLepPGMz+J7/Uws=</SignatureValue>
<KeyInfo>
  <KeyValue>
    <RSAKeyValue>
      <Modulus>4IlzOY3Y9fXoh3Y5f06wBbtTg94Pt6vcfcd1KQ0FLm0S36aGJtTSb6pYKfyX7PqC
UQ8wgL6xUJ5GRPEsu9gyz8ZobwfZsGCsvu40CWoT9fcFBZPfXro1Vtlh/xl/yYHm
+Gzqh0Bw76xtLHSfLfpVOrmZdwKmSFKMTvNXOFd0V18=</Modulus>
      <Exponent>AQAB</Exponent>
    </RSAKeyValue>
  </KeyValue>
</KeyInfo>
<Object Id="object">some text
  with spaces and CR-LF.</Object>
</Signature>
Note that the whitespace inside the <SignedInfo> and <Object> elements is important and should not be changed.
Testing
Test with the Online XML Digital Signature Verifier.
2022-03-20: See Troubleshooting problems on the 'Online XML Digital Signature Verifier' site.
Procedure
Algorithm: XMLDSIG of simple text string.
INPUT:
T, text-to-be-signed, a byte string;
Ks, RSA private key;
OUTPUT: XML file, xml
- Canonicalize* the text-to-be-signed, C = C14n(T).
- Compute the message digest of the canonicalized text, m = Hash(C).
- Encapsulate the message digest in an XML <SignedInfo> element, SI, in canonicalized form.
- Compute the RSA signatureValue of the canonicalized <SignedInfo> element, SV = RsaSign(Ks, SI).
- Compose the final XML document including the signatureValue, this time in non-canonicalized form.
* Strictly, what we are doing here is encapsulating the text string T inside an <Object> element, then canonicalizing that element.
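The first four steps can be sketched in Python as follows. This is a minimal sketch, not the VB6 code offered for download: the function names are made up for illustration, rsa_sign is a hypothetical callback standing in for whatever RSA-SHA1 signing routine you have available, and the text is assumed to contain no characters that need escaping. Step 5, composing the final document, is plain string assembly and is left out.

```python
import base64
import hashlib
from typing import Callable

DSIG_NS = "http://www.w3.org/2000/09/xmldsig#"
C14N_ALG = "http://www.w3.org/TR/2001/REC-xml-c14n-20010315"


def canonical_object(text: str) -> bytes:
    """Step 1: wrap T in <Object> with the inherited xmlns; CR-LF becomes LF."""
    body = text.replace("\r\n", "\n")
    return f'<Object xmlns="{DSIG_NS}" Id="object">{body}</Object>'.encode("utf-8")


def sha1_base64(data: bytes) -> str:
    """Step 2: SHA-1 digest, base64-encoded as it appears in <DigestValue>."""
    return base64.b64encode(hashlib.sha1(data).digest()).decode("ascii")


def canonical_signed_info(digest_value: str) -> bytes:
    """Step 3: compose <SignedInfo> directly in canonical form."""
    lines = [
        f'<SignedInfo xmlns="{DSIG_NS}">',
        f'  <CanonicalizationMethod Algorithm="{C14N_ALG}"></CanonicalizationMethod>',
        f'  <SignatureMethod Algorithm="{DSIG_NS}rsa-sha1"></SignatureMethod>',
        '  <Reference URI="#object">',
        f'    <DigestMethod Algorithm="{DSIG_NS}sha1"></DigestMethod>',
        f'    <DigestValue>{digest_value}</DigestValue>',
        '  </Reference>',
        '</SignedInfo>',
    ]
    return "\n".join(lines).encode("utf-8")


def sign_text(text: str, rsa_sign: Callable[[bytes], bytes]) -> dict:
    """Run steps 1 to 4; rsa_sign is a hypothetical RSA-SHA1 signing callback."""
    c = canonical_object(text)                            # C = C14n(T)
    m = sha1_base64(c)                                    # m = Hash(C)
    si = canonical_signed_info(m)                         # SI
    sv = base64.b64encode(rsa_sign(si)).decode("ascii")   # SV = RsaSign(Ks, SI)
    return {"DigestValue": m, "SignedInfo": si, "SignatureValue": sv}
```

Passing the signer in as a callback keeps the sketch independent of any particular cryptographic library.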
Message Digests
There are two message digests to compute. The input to these two computations has to be exactly correct or you will get the wrong result. We use the SHA-1 message digest function, which outputs a hash value 20 bytes long.
Digest of the input text string
Form the canonicalized <Object> element with all CR-LF pairs ((0x)0D 0A) in the text converted to single LF characters (0x0A). In this case there is no newline after the text, so the closing tag comes directly after the '.' character in the text string. Note we have added the xmlns attribute, which exists here but not in the original or final document. This attribute is propagated from the parent <Signature> element.
<Object xmlns="http://www.w3.org/2000/09/xmldsig#" Id="object">some text
  with spaces and CR-LF.</Object>
and compute the message digest of the byte string beginning '<', 'O', 'b',... and ending ...,'e','c', 't', '>'
000000  3c 4f 62 6a 65 63 74 20 78 6d 6c 6e 73 3d 22 68  <Object xmlns="h
000010  74 74 70 3a 2f 2f 77 77 77 2e 77 33 2e 6f 72 67  ttp://www.w3.org
000020  2f 32 30 30 30 2f 30 39 2f 78 6d 6c 64 73 69 67  /2000/09/xmldsig
000030  23 22 20 49 64 3d 22 6f 62 6a 65 63 74 22 3e 73  #" Id="object">s
000040  6f 6d 65 20 74 65 78 74 0a 20 20 77 69 74 68 20  ome text.  with 
000050  73 70 61 63 65 73 20 61 6e 64 20 43 52 2d 4c 46  spaces and CR-LF
000060  2e 3c 2f 4f 62 6a 65 63 74 3e                    .</Object>
The exact byte string in this case to be digested is (in hex format)
DATA=
3C4F626A65637420786D6C6E733D22687474703A2F2F7777
772E77332E6F72672F323030302F30392F786D6C64736967
23222049643D226F626A656374223E736F6D652074657874
0A2020776974682073706163657320616E642043522D4C46
2E3C2F4F626A6563743E
Hash(DATA)=38F9E917F64D2C3C49FC8FB5177887865992C20A
Base64(Hash(DATA))=OPnpF/ZNLDxJ/I+1F3iHhlmSwgo=
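The same values can be cross-checked with Python's standard hashlib and base64 modules, working directly from the hex string above. This is a sketch for checking your own implementation, nothing more.

```python
import base64
import hashlib

# The canonicalized <Object> element, copied from the hex string above.
DATA = bytes.fromhex(
    "3C4F626A65637420786D6C6E733D22687474703A2F2F7777"
    "772E77332E6F72672F323030302F30392F786D6C64736967"
    "23222049643D226F626A656374223E736F6D652074657874"
    "0A2020776974682073706163657320616E642043522D4C46"
    "2E3C2F4F626A6563743E"
)

digest = hashlib.sha1(DATA).digest()            # 20-byte SHA-1 hash value
print(DATA.decode("ascii"))                     # the element as text
print(digest.hex().upper())                     # Hash(DATA)
print(base64.b64encode(digest).decode("ascii")) # Base64(Hash(DATA))
```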
Using SC14N to compute the digest of the input text string directly
Using SC14N on the XML file: Transform the subset for element with Id="object" and compute digest value of this using default SHA-1.
> sc14n -d -S object XmlAliceSig-base.xml
OPnpF/ZNLDxJ/I+1F3iHhlmSwgo=
In C#:
string digval = Sc14n.C14n.ToDigest("XmlAliceSig-base.xml", "object", Tran.SubsetById, DigAlg.Sha1);
Digest of the SignedInfo
Form the canonicalized <SignedInfo> element. Note the xmlns attribute which we include here, but not in the final document. This is propagated down from the parent <Signature> element.
<SignedInfo xmlns="http://www.w3.org/2000/09/xmldsig#">
  <CanonicalizationMethod Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"></CanonicalizationMethod>
  <SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"></SignatureMethod>
  <Reference URI="#object">
    <DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"></DigestMethod>
    <DigestValue>OPnpF/ZNLDxJ/I+1F3iHhlmSwgo=</DigestValue>
  </Reference>
</SignedInfo>
In hex format, the byte string is
3C5369676E6564496E666F20786D6C6E733D22687474703A2F2F7777772E7733 2E6F72672F323030302F30392F786D6C6473696723223E0A20203C43616E6F6E 6963616C697A6174696F6E4D6574686F6420416C676F726974686D3D22687474 703A2F2F7777772E77332E6F72672F54522F323030312F5245432D786D6C2D63 31346E2D3230303130333135223E3C2F43616E6F6E6963616C697A6174696F6E 4D6574686F643E0A20203C5369676E61747572654D6574686F6420416C676F72 6974686D3D22687474703A2F2F7777772E77332E6F72672F323030302F30392F 786D6C64736967237273612D73686131223E3C2F5369676E61747572654D6574 686F643E0A20203C5265666572656E6365205552493D22236F626A656374223E 0A202020203C4469676573744D6574686F6420416C676F726974686D3D226874 74703A2F2F7777772E77332E6F72672F323030302F30392F786D6C6473696723 73686131223E3C2F4469676573744D6574686F643E0A202020203C4469676573 7456616C75653E4F506E70462F5A4E4C44784A2F492B3146336948686C6D5377 676F3D3C2F44696765737456616C75653E0A20203C2F5265666572656E63653E 0A3C2F5369676E6564496E666F3E
The message digest of this is 5AC8EFAB045A9A46FE001AC58C253646FF88DC6A in hex, or WsjvqwRamkb+ABrFjCU2Rv+I3Go= in base64.
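For this particular document the canonical form can be derived from the <SignedInfo> as it appears in the output file by two text substitutions. The Python sketch below assumes LF line endings, the exact indentation given by the hex string above, and no attributes or content that would trigger any other canonicalization rule. The regular expression and the function name are purely illustrative and are no substitute for a real C14N implementation such as SC14N.

```python
import hashlib
import re

DSIG_NS = "http://www.w3.org/2000/09/xmldsig#"

# <SignedInfo> as it appears in the final document (LF line endings assumed).
signed_info_in_doc = "\n".join([
    "<SignedInfo>",
    '  <CanonicalizationMethod Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315" />',
    '  <SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1" />',
    '  <Reference URI="#object">',
    '    <DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />',
    "    <DigestValue>OPnpF/ZNLDxJ/I+1F3iHhlmSwgo=</DigestValue>",
    "  </Reference>",
    "</SignedInfo>",
])


def canonicalize_signed_info(fragment: str) -> bytes:
    """Hand-rolled canonical form, valid only for this simple fragment."""
    # Expand empty-element tags: <X a="b" /> becomes <X a="b"></X>.
    expanded = re.sub(r"<(\w+)([^<>]*?)\s*/>", r"<\1\2></\1>", fragment)
    # Add the default namespace inherited from the omitted <Signature> parent.
    with_ns = expanded.replace("<SignedInfo>", f'<SignedInfo xmlns="{DSIG_NS}">', 1)
    return with_ns.encode("utf-8")


canonical = canonicalize_signed_info(signed_info_in_doc)
print(hashlib.sha1(canonical).hexdigest().upper())
```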
Using SC14N to compute the digest of the SignedInfo directly
Using SC14N on the XML file: Transform the subset for element with tag name SignedInfo and compute digest value of this using default SHA-1.
> sc14n -d -s SignedInfo XmlAliceSig.xml
WsjvqwRamkb+ABrFjCU2Rv+I3Go=
In C#:
string digval = Sc14n.C14n.ToDigest("XmlAliceSig.xml", "SignedInfo", Tran.SubsetByTag, DigAlg.Sha1);
Actually, this digest value is not output directly. It is computed and then encrypted as part of the signature value calculation. But to verify the signature you need to be able to re-create it. (Thanks to Marcos Paulo Pereira Brito Garcia for pointing out an error in an early version of this.)
The byte string of the <SignedInfo> element is input to the sha1WithRSAEncryption signature algorithm and signed with Alice's private RSA key to produce the 1024-bit RSA signatureValue, in hex format
9E285415083898384B81E72F84870A6FD1B3F154533A5C3E69D89938105780AE 09A1D11EE5A15F0AA9BC59C4484FC18B2D74A4F0FB143F8FFF4AC2AA501424A6 8B5C02A40086DCF6BFD90259DA140E700EB4566477F0A7B19D6C81ED277616B8 D3058357E9CE5B227191882272F07C3CE62D4A695D2CB7A93C6333F89EFF530B
In base64 this is
nihUFQg4mDhLgecvhIcKb9Gz8VRTOlw+adiZOBBXgK4JodEe5aFfCqm8WcRIT8GL
LXSk8PsUP4//SsKqUBQkpotcAqQAhtz2v9kCWdoUDnAOtFZkd/CnsZ1sge0ndha4
0wWDV+nOWyJxkYgicvB8POYtSmldLLepPGMz+J7/Uws=
Update 2017-08-13: See some code to compute this signature value.
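To check the signature value independently, you only need the public key from the <KeyInfo> element. The following Python sketch uses nothing but the standard library: it raises the signature to the public exponent and compares the result with the PKCS#1 v1.5 encoding of the SignedInfo digest given above. It assumes that rsa-sha1 means RSASSA-PKCS1-v1_5 with the standard SHA-1 DigestInfo prefix; the function names are illustrative, and this is not a description of how CryptoSys PKI works internally.

```python
import base64

# Public key and signature value copied from the output XML file.
MODULUS_B64 = (
    "4IlzOY3Y9fXoh3Y5f06wBbtTg94Pt6vcfcd1KQ0FLm0S36aGJtTSb6pYKfyX7PqC"
    "UQ8wgL6xUJ5GRPEsu9gyz8ZobwfZsGCsvu40CWoT9fcFBZPfXro1Vtlh/xl/yYHm"
    "+Gzqh0Bw76xtLHSfLfpVOrmZdwKmSFKMTvNXOFd0V18="
)
EXPONENT_B64 = "AQAB"
SIGNATURE_B64 = (
    "nihUFQg4mDhLgecvhIcKb9Gz8VRTOlw+adiZOBBXgK4JodEe5aFfCqm8WcRIT8GL"
    "LXSk8PsUP4//SsKqUBQkpotcAqQAhtz2v9kCWdoUDnAOtFZkd/CnsZ1sge0ndha4"
    "0wWDV+nOWyJxkYgicvB8POYtSmldLLepPGMz+J7/Uws="
)
# SHA-1 digest of the canonicalized <SignedInfo>, as given in the text.
SIGNED_INFO_SHA1 = bytes.fromhex("5AC8EFAB045A9A46FE001AC58C253646FF88DC6A")
# DER-encoded DigestInfo prefix for SHA-1 used by PKCS#1 v1.5 signatures.
SHA1_DIGEST_INFO = bytes.fromhex("3021300906052b0e03021a05000414")


def b64_to_int(text: str) -> int:
    """Decode base64 and interpret the bytes as a big-endian integer."""
    return int.from_bytes(base64.b64decode(text), "big")


def verify_rsa_sha1(sig_b64: str, mod_b64: str, exp_b64: str, digest: bytes) -> bool:
    """Return True if the signature matches the given SHA-1 digest."""
    n, e, s = b64_to_int(mod_b64), b64_to_int(exp_b64), b64_to_int(sig_b64)
    k = (n.bit_length() + 7) // 8              # modulus length in bytes
    if s >= n:
        return False
    recovered = pow(s, e, n).to_bytes(k, "big")
    t = SHA1_DIGEST_INFO + digest
    # Expected block: 00 01 FF..FF 00 || DigestInfo || digest
    expected = b"\x00\x01" + b"\xff" * (k - len(t) - 3) + b"\x00" + t
    return recovered == expected


print(verify_rsa_sha1(SIGNATURE_B64, MODULUS_B64, EXPONENT_B64, SIGNED_INFO_SHA1))
```

Verification needs no private key, so this is a convenient way to test a signature you have produced yourself before submitting it to an online verifier.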
Comment on SignedInfo
In the composition of the <SignedInfo> element above, we added some space characters before the lines, to add to readability. These space characters must be preserved in both the canonicalized version and the final XML document. It gets even messier if you use tab characters (0x09) because, if they get changed later into space characters, you will fail to get the correct signature value.
It is better practice to form the <SignedInfo> element with no whitespace before the elements and just a single newline after each line, as follows:
<SignedInfo xmlns="http://www.w3.org/2000/09/xmldsig#">
<CanonicalizationMethod Algorithm="..."></CanonicalizationMethod>
<SignatureMethod Algorithm="..."></SignatureMethod>
<Reference URI="...">
<DigestMethod Algorithm="..."></DigestMethod>
<DigestValue>...</DigestValue>
</Reference>
</SignedInfo>
Note, though, that this will give a different signature value than our example above. If, at this stage, you are thinking, "But isn't that a rather stupid procedure if it can be messed up so easily?", you would not be wrong...
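The sensitivity to whitespace is easy to demonstrate. The Python sketch below hashes three variants of an abbreviated fragment (not the full <SignedInfo> of our example) that differ only in their indentation; the fragment and variable names are invented for illustration.

```python
import hashlib

# An abbreviated fragment indented with two spaces.
indented = b'<SignedInfo>\n  <Reference URI="#object">\n  </Reference>\n</SignedInfo>'

variants = {
    "two spaces": indented,
    "tabs": indented.replace(b"  ", b"\t"),    # each indent becomes one tab
    "no indent": indented.replace(b"  ", b""), # indentation removed
}

# One SHA-1 digest per variant; a single changed byte changes the digest.
digests = {name: hashlib.sha1(data).hexdigest() for name, data in variants.items()}
for name, value in digests.items():
    print(f"{name:10}  {value}")
```

All three digests differ, so a signature computed over one variant will not verify against either of the others.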
Canonicalization (c14n)
Canonicalization is a method for generating a physical representation, the canonical form, of an XML document that accounts for permissible syntactic changes.
In other words, no matter what (legal) changes you could make to a given XML document, the canonical form will always be identical, byte-for-byte.
The cute abbreviation for canonicalization is c14n denoting that there are 14 characters between the "c" and the "n" in a word that is obviously too long to begin with.
Note that the canonicalized data does not appear in the original or final XML document. It is composed in memory and a message digest or RSA signature value is computed from it.
This is the official (2001) outline of the procedure for c14n, taken from [XML-C14N]:
- The document is encoded in UTF-8
- Line breaks normalized to #xA on input, before parsing
- Attribute values are normalized, as if by a validating processor
- Character and parsed entity references are replaced
- CDATA sections are replaced with their character content
- The XML declaration and document type declaration (DTD) are removed
- Empty elements are converted to start-end tag pairs
- Whitespace outside of the document element and within start and end tags is normalized
- All whitespace in character content is retained (excluding characters removed during line feed normalization)
- Attribute value delimiters are set to quotation marks (double quotes)
- Special characters in attribute values and character content are replaced by character references
- Superfluous namespace declarations are removed from each element
- Default attributes are added to each element
- Lexicographic order is imposed on the namespace declarations and attributes of each element
Simple, eh?
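Several of these rules can be seen in action with the canonicalize function in Python's standard xml.etree.ElementTree module (Python 3.8 or later). Be aware that this function implements the later Canonical XML 2.0, not the version 1.0 algorithm used on this page. For a small document without namespaces, like the made-up one below, the two should give the same result, but do not rely on it for real XML-DSIG work.

```python
import xml.etree.ElementTree as ET

# A made-up document with an XML declaration, CR-LF line breaks,
# unordered attributes with mixed quoting, an empty element and CDATA.
raw = (
    '<?xml version="1.0" encoding="UTF-8"?>\r\n'
    "<doc  b='2'   a=\"1\"><empty/><![CDATA[x < y]]>\r\n</doc>"
)

canonical = ET.canonicalize(raw)
print(repr(canonical))
```

In the output the declaration is gone, the attributes appear in the order a, b with double quotes, the empty element has become a start-end pair, the CDATA content is ordinary escaped text and the CR-LF is a single LF.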
To make it even worse, the rules above are for a complete XML document. When you are canonicalizing a subset of a document, as we are doing here, you have to propagate the XML namespaces from the parent elements that have been omitted (unless you are using Exclusive XML Canonicalization (xml-exc-c14n), which we are not!). The merged xmlns attributes then have to be sorted in a certain order. In this example, the <Object> and <SignedInfo> elements inherit the xmlns attribute from their omitted parent <Signature>.
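A simplified Python sketch of how such a start tag is put together is shown below. It assumes unqualified attribute names and values that need no escaping; the function name and its arguments are hypothetical. Namespace declarations come first (the default namespace, then the others by prefix), followed by the ordinary attributes in sorted order.

```python
def start_tag(name: str, namespaces: dict, attributes: dict) -> str:
    """Render a start tag with inherited namespaces merged in.

    namespaces maps prefix to URI, with "" standing for the default namespace.
    """
    parts = [name]
    # The empty prefix sorts first, so the default namespace leads.
    for prefix, uri in sorted(namespaces.items()):
        decl = f"xmlns:{prefix}" if prefix else "xmlns"
        parts.append(f'{decl}="{uri}"')
    # Ordinary (unqualified) attributes follow, sorted by name.
    for attr, value in sorted(attributes.items()):
        parts.append(f'{attr}="{value}"')
    return "<" + " ".join(parts) + ">"


# Namespace declared on the omitted parent <Signature> element.
inherited = {"": "http://www.w3.org/2000/09/xmldsig#"}

print(start_tag("Object", inherited, {"Id": "object"}))
print(start_tag("SignedInfo", inherited, {}))
```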
In our example here, it was sufficient just to replace any CR-LF (0x0D 0A) line break with a single LF (0x0A) character (point 2 above). All other issues were dealt with by simply hardcoding the necessary XML tags and attributes in our variable strings.
Other c14n issues
Given a simple text string input, and the fact that we are composing our own XML document instead of dealing with an existing one, the two other issues that we are most likely to have to deal with are UTF-8 encoding (point 1 above) and entity references (point 4):
- UTF-8 encoding
- If our text-to-be-signed string, T, contains any non-ASCII characters, make sure these are converted to UTF-8 encoding. For example, the character á (small letter a with acute accent) is encoded in the ISO-8859-1 character set (Latin-1) as the single byte value 225 (0xE1). This is not an ASCII character, as it has a value greater than 127. Such characters need to be converted to UTF-8 encoding. In this case, the byte 0xE1 must be represented as the two-byte UTF-8 sequence (0x)C3 A1.
  In CryptoSys PKI, use the CNV_UTF8BytesFromLatin1 function to convert a string containing Latin-1 characters to proper UTF-8.
  With Notepad++, use menu options
  - Edit > EOL Conversion > Unix (LF)
  - Encoding > Convert to UTF-8
- Entity references
- All occurrences of the following characters in element content:
  - the ampersand (&),
  - the less than symbol (<),
  - the greater than symbol (>),
  - the quotation mark or double quote ("), and
  - the apostrophe or single quote (')
  should be replaced by the entity references
  &amp; &lt; &gt; &quot; &apos;
  respectively. This only applies to characters inside an element's content, not the tags themselves.
  So, for example, the 8-byte string <x>&</x> ((0x)3C783E263C2F783E) is transformed to the 12-byte string <x>&amp;</x> ((0x)3C783E26616D703B3C2F783E).
These two issues should cover almost all instances for a simple text string.
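Both conversions are short in Python. The sketch below is illustrative only and the function name is made up. Note that it escapes just &, < and >: according to [XML-C14N], these are the characters escaped in the text content of the canonical form, while the quotation mark is escaped only inside attribute values and the apostrophe is left alone.

```python
def escape_text(text: str) -> str:
    """Escape element content for the canonical form (text nodes only)."""
    # The ampersand must be handled first, otherwise the "&" introduced
    # by the other replacements would be escaped a second time.
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


# Latin-1 byte 0xE1 (a with acute accent) converted to UTF-8.
latin1_bytes = b"\xe1"
utf8_bytes = latin1_bytes.decode("latin-1").encode("utf-8")
print(utf8_bytes.hex(" ").upper())

# The element <x> containing a single ampersand, before and after escaping.
element = ("<x>" + escape_text("&") + "</x>").encode("utf-8")
print(len(element), element.hex().upper())
```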
2017-06-28: See Canonicalization of an XML document for a more detailed how-to guide for canonicalization (C14N) of an XML document prior to signing.
Re-released 2018-08-09: Our new program SC14N, a straightforward XML canonicalization utility, performs the canonicalization (C14N) transformation you need to do when creating signed XML documents using XML-DSIG.
For some more examples, see the section Canonicalizing the SII elements on our XML-Dsig and the Chile SII page.
References
- [XML-C14N] RFC 3076 Canonical XML Version 1.0, March 2001, <http://www.ietf.org/rfc/rfc3076.txt>.
- [XML-DSIG] RFC 3275 XML-Signature Syntax and Processing, March 2002, <http://www.ietf.org/rfc/rfc3275.txt>.
- [SMIME-EX] RFC 4134 Examples of S/MIME Messages, July 2005, <http://www.ietf.org/rfc/rfc4134.txt>.
- XML Signature WG
<http://www.w3.org/Signature/>:
- XML-Signature Syntax and Processing <http://www.w3.org/TR/xmldsig-core/>
- Canonical XML Version 1.0, <http://www.w3.org/TR/2001/REC-xml-c14n-20010315/>
- Exclusive XML Canonicalization Version 1.0, <http://www.w3.org/TR/2002/REC-xml-exc-c14n-20020718/>
- [GUTM04] Peter Gutmann Why XML Security is Broken, October 2004, <http://www.cs.auckland.ac.nz/~pgut001/pubs/xmlsec.txt>.
This page last updated 15 November 2022