[ietf-dkim] Introducing myself

Discussion:

Charles Lindsey

2006-10-30 21:42:21 UTC

Firstly, let me apologise for not appearing on this list earlier, but I
only became aware of this project a little over a week ago, and I have
been studying the documents carefully since then, as time permitted.

I am familiar with the two existing schemes for signing headers of
messages, namely
PGPVERIFY, for authenticating control messages in Netnews
PGPMOOSE, for authenticating articles posted to moderated newsgroups
and I have experience of both sending and acting upon PGPVERIFY messages,
and of hacking code to process them.

Moreover, at a time when the ietf-usefor WG was considering a replacement
for PGPVERIFY (which has some technical problems, and it not in a fit
state for standardization as it stands) I wrote a draft for a complete
header signing scheme, although the Usefor WG decided at that time not to
proceed with it as it was having trouble enough dealing with more pressing
issues. It is, in principle, still on the list of future work for that WG.

My draft has long since expired as an ID, but it may still be seen at
http://www.imc.org/ietf-usefor/drafts/draft-lindsey-usefor-signed-01.txt
and it may be of interest to members of this list. It has many similarities
with the DKIM-base, but also many differences, in particular a somewhat
more
aggressive canonicalization.

At that time, I tried to interest the ietf-822 mailing list in it, but the
Grandees on that group informed me, in no uncertain terms, that signing of
email headers was a totally unnecessary concept that would never be of any
practical use :-( . Nevertheless, I still took care to ensure that my
draft was workable both for Email and Netnews.

On studying the DKIM-base document, I find many features that are
excellent, a few that are perplexing, and a couple that I consider
downright harmful. But as a newcomer to this list, and particularly AIUI
you are trying to get this proposal finalized as soon as possible, it
would be inappropriate for me to barge in with a long series of problems
and counter-proposals.

So I will, instead, stick to asking questions which I hope some member of
this list will be kind enough to answer, and then some of my perplexities
will hopefully be reduced.

Note that these comments are in terms of the dkim-base-05 draft. I have
looked briefly at draft-06, particularly at its list of changes from
draft-05, but it seems most of what I wanted to say still applies.

3.1 Selectors

Is a <selector> case-insensitive as domain-names are? And is it to be
rendered in IDNA if a Non-ASCII charset is involved?

3.2 Tag=Value Lists
INFORMATIVE IMPLEMENTATION NOTE: Although the "plain text"
defined below (as "tag-value") only includes 7-bit characters, an
implementation that wished to anticipate future standards would be
advised to not preclude the use of UTF8-encoded text in tag=value
lists.

Those future standards are nearer than you think. The currently active
ietf-eai WG is charged with producing an experimental protocol for writing
headers in UTF-8. Would it not be wiser to make support for arbitrary
octets (except those essential for parsing such as ";", CR, LF, etc) a
MUST accept right from the start?

As a matter of interest, why don't <tag>s use the same syntax as <token>s,
which appear in similar contexts in RFC 2045 and other places (but without
any hint of CFWS around them, of course). And are <tag-name>s case
insensitive?

3.3.3 Other algorithms

Presumably there is nothing to prevent allowing PGP as the signing
algorithm in the future, if someone makes out a good case for it.

3.4 Canonicalization

In what circumstances is the 'simple' canonicalization inappropriate, and
why is it the default?

Is it not the case that the "meaning" of a message is, according to RFC
2822 etc., unaffected by changes of folding, or of case of header-names,
or of CTE, or of encodings or re-encodings using RFC 2047 or RFC 2231? And
hence any canonicalization that preserves "meaning" cannot do any harm?
Anyway, I shall return to this when I come to section 5.3.

3.4.2 The "relaxed" Header Field Canonicalization Algorithm

Is it possible that re-folding a structured header en route will introduce
a WSP that was not there beforehand, and thus break the signature? I
avoided this in my own draft by ignoring all WSP in headers, except when
inside comments and quoted-strings and in structured headers such as
Subject.

3.5 The DKIM-Signature header field

Although your charter forbids you from discussing non-repudiation,
authorization, and other matters not strictly relevant for DKIM, it is to
be envisaged that other applications will arise from time to time
requiring signatures over headers, and it would be unfortunate if each
such application had to invent Yet-Another-Signing-Protocol when a simple
adaptation of what you have written would have sufficed. There are already
too many only-slightly-different-wheels in existence for us to be
inventing any more. Surely, a facility for signing headers should be
described as a tool which can then be used for various applications in
future, of which DKIM would be just the first? So why was this approach
not taken?

In fact, you almost made it. The only features which might make it hard
for future applications that I can see are the appearance of "DKIM" in
your newly invented "DKIM-Signature" header (it rather needs an
'application' tag in the signature to indicate why the signature was
made), and the insistence that the d= and s= tags, which together identify
the owner of the key, should be syntactically of the form of domain-names
(which might be totally inappropriate for those other applications, though
it should clearly be required when the application is DKIM).

Can the various tags appear in this header in any order? OTOH, why is
there not an insistence that the b= tag should come last (since it has to
be easily joined to and separated from the rest)?

v= Version (MUST be included).

Does the version relate to the version of the algorithm identified by the
a= tag, or to the version of dkim-base as a whole? IOW, if someone invents
a new tag, or a new tag-value, that can be safely ignored by existing
implementations, is it necessary to invent a new version?

bh= The hash of the canonicalized body part of the message

Yes, I like this, since it enables some useful information to be recovered
if the header hash succeeds but the body hash fails.

d= The domain of the signing entity

Is this case-insensitive?

h= Signed header fields

Why MUST NOT this list be empty? Suppose you want to sign the body, but
not any headers? Unusual, but perhaps sensible for some application. No
interoperability arises.

I don't understand the remark about "message/rfc822 content types". How
can this problem arise?

i= Identity of the user or agent

Is this case-insensitive (I might expect a different answer there for the
<local-part> and the <domain-name>)?

Why MUST the <domain-name> be a subdomain of the d= tag (and why not of
the s= tag, and what interoperability arises anyway)?

Must this tag, if a <local-part> is present, be a valid working email
address?

l= Body length count

I am very suspicious of the propriety of suggesting, in any IETF standard,
that it is legitimate to remove text from a message being conveyed
(certainly without the consent of the recipient). Surely marking it with
blood-red ink, or warnings in 32pt characters is as far as one should go?

q= A colon-separated list of query methods used to retrieve the
public key

Clearly, the use of DNS or some similar global database is the only
sensible PKI that is workable for DKIM. But am I right in saying that this
tag does not preclude the use of other PKIs for other applications (e.g.
attached certificates, web-of-trust, private agreements between the
communicating parties, etc.)?

Why MUST signers support "dns/ext" (clearly, verifiers MUST)? Surely a
signer who, as a matter of policy, always chooses to use some other query
method, is not obliged to implement something he is never going to use.

s= The Selector subdividing the namespace

Case-insensitive?

t= Signature Timestamp
... The format is the number of seconds
since 00:00:00 on January 1, 1970 in the UTC time zone. ...

Strictly speaking not true, since the usual UNIX algorithm for calculating
this quantity takes no account of leap seconds. I presume this is all laid
down in POSIX somewhere.

And expecting this to work up to AD 200,000 seems an overkill (though
beyond 2038 would be helpful).

z= Copied header fields
Verifiers MUST NOT use the header field names or copied values
for checking the signature in any way. Copied header field
values are for diagnostic use only.

Why ever not? I can think of examples where a verifier might find it
exceedingly useful to be aware of the original state of some header which
might have been changed somewhere en route. And what potential
interoperability arises if a verifier makes some use of this information?

3.6 Key Management and Representation
public_key = dkim_find_key(q_val, d_val, s_val)

I do not find the operation 'dkim_find_key' defined or used anywhere else
in the draft.

3.6.1 Textual Representation
h= Acceptable hash algorithms
... Signers and Verifiers MUST
support the "sha256" hash algorithm. Verifiers MUST also support
the "sha1" hash algorithm.

Why MUST signers support the "sha256" hash algorithm (clearly, verifiers
MUST)? Surely a signer who, as a matter of policy, always chooses to use
sha-1 is not obliged to implement something he is never going to use?

k= Key type (plain-text; OPTIONAL, default is "rsa"). Signers and
verifiers MUST support the "rsa" key type.

Why MUST signers support the "rsa" key type (clearly, verifiers MUST)?
Surely a signer who, as a matter of policy, always chooses to use some
other key type is not obliged to implement something he is never going to
use?

3.7 Computing the Message Hashes
.... The header field MUST be presented to
the hash algorithm after the body of the message ...

Does "The header field" mean 'The "DKIM-Signature" header field under
construction'?

4. Semantics of Multiple Signatures
Signers should be cognizant that signing DKIM-Signature headers may
result in signature failures with intermediaries that do not
recognize that DKIM-Signatures are trace headers and unwittingly
reorder them.

This method of relying on the order of headers to distinguish between
multiple signatures seems far from robust. I would be happy to describe an
alternative and more reliable method, applicable at least to signing other
signatures, that I have in mind (but I promised to stick to questions for
now :-) ).

... For
example, a verifier that by policy chooses not to accept signatures
with deprecated cryptographic algorithms should consider such
signatures invalid. As with messages with a single signature,
verifiers are at liberty to use the presence of valid signatures as
an input to local policy; ...

Where are "valid" and "invalid" defined, and is "invalid" synonymous with
"failed"? I would hope not, but it is not clear.

5.1 Determine if the Email Should be Signed and by Whom
INFORMATIVE IMPLEMENTER ADVICE: SUBMISSION servers should not
sign Received header fields if the outgoing gateway MTA obfuscates
Received header fields, for example to hide the details of
internal topology.

I see several mentions of signing Received headers. Signing of any header
that may occur multiple times in a message is always risky (though I can
see a necessity for it in a few cases). Under what circumstances would
including a Received header within a signature provide a security benefit
(in the sense of countering some scam or threat) commensurate with this
risk?

5.3 Normalize the Message to Prevent Transport Conversions

I found this section absolutely astounding.

Message bodies written in Non-ASCII charsets have been commonplace now for
12 or more years, and they are most readily represented as 8-bit. 8BITMIME
has been around for the same length of time and is now almost universally
deployed. 8bit using 8BITMIME has become, or is well on the way to
becoming, the preferred CTE for charsets which will not fit into 7bits.
And yet you are now seriously proposing, for a protocol that needs to be
used in the great majority of future email messages if it is to fulfill
its purpose, to return to encodings that can be squashed into 7bits. That
is one monumental step backwards for the IETF.

Moreover, even as we speak, the ietf-eai WG, which is chartered to bring
in headers using UTF-8 and which can then nearly always be read and
understood by examination of the code as seen on the wire, is advocating
the universal use of 8BITMIME (and more) except when interfacing with
legacy systems which will, hopefully, have faded away by the time the next
12 years or so have elapsed.

And all this is entirely unnecessary. As I have already said, the
"meaning" of any email message is, by definition, independent of the CTE
with which it is transported. All you have to do is to arrange that the
canonicalization decodes any Quoted-Printable or Base64 that is
encountered, and uses the result of that in computing any hash. Was this
option considered and, if so, why was it rejected?

5.4 Determine the header fields to Sign
The

Eliot Lear

2006-10-31 06:27:14 UTC