In HTTP/1.1, RFC 2616, there is a protocol parameter called "Entity Tags" (section 3.11), defined as follows:
entity-tag = [ weak ] opaque-tag
weak = "W/"
opaque-tag = quoted-string
A quoted-string is defined as a string of text parsed as a single word surrounded by double-quotation marks.
Comparing these strings is done in accordance with "Weak and Strong Validators" (section 13.3.3), which defines two comparison functions:
- The strong comparison function: in order to be considered equal, both validators MUST be identical in every way, and both MUST NOT be weak.
- The weak comparison function: in order to be considered equal, both validators MUST be identical in every way, but either or both of them MAY be tagged as "weak" without affecting the result.
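As a rough illustration of the two functions above, here is a minimal Python sketch. The names `EntityTag`, `parse_entity_tag`, `strong_equal` and `weak_equal` are my own and do not come from the RFC or from any server. The sketch assumes the `W/` prefix is matched case-sensitively and ignores backslash escapes inside the quotes.

```python
import re
from typing import NamedTuple, Optional

# entity-tag = [ weak ] opaque-tag, with opaque-tag = quoted-string.
# Simplification (assumption): quoted-pair escapes are not handled; any
# character other than a double quote is accepted between the quotes.
_ENTITY_TAG = re.compile(r'(W/)?"([^"]*)"')


class EntityTag(NamedTuple):
    weak: bool
    opaque: str  # text between the quotes, with the quotes removed


def parse_entity_tag(raw: str) -> Optional[EntityTag]:
    """Return the parsed tag, or None if raw does not match the grammar."""
    m = _ENTITY_TAG.fullmatch(raw.strip())
    if m is None:
        return None
    return EntityTag(weak=m.group(1) is not None, opaque=m.group(2))


def strong_equal(a: EntityTag, b: EntityTag) -> bool:
    """Strong comparison: same opaque text and neither tag is weak."""
    return not a.weak and not b.weak and a.opaque == b.opaque


def weak_equal(a: EntityTag, b: EntityTag) -> bool:
    """Weak comparison: same opaque text; the weak flag is ignored."""
    return a.opaque == b.opaque
```

With this reading, `W/"a"` and `"a"` match under `weak_equal` but not under `strong_equal`, and an unquoted `a` is rejected by `parse_entity_tag` because the grammar only allows a quoted-string.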
This is interesting: what does "identical in every way" actually mean? If a client sends something that matches token (i.e., a string without double quotation marks surrounding it), is it equivalent to the quoted-string?
I would logically expect it to be so; however, neither IIS 6.0 nor Apache 2.0 treats the two as equivalent. If you argue that they are not equivalent, you end up asking yourself whether, e.g., Content-Type: text/plain;charset="UTF-8" refers to a character set called "UTF-8" (i.e., including the quotation marks) or to one called UTF-8 (i.e., excluding the quotation marks). For compatibility, you must exclude them.
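For the charset case, the parameter grammar in RFC 2616 (section 3.6) allows a value to be either a token or a quoted-string, so a receiver has to reduce both forms to the same value. A minimal Python sketch of that normalisation follows; `parameter_value` is a hypothetical helper name, not something taken from IIS, Apache or a library.

```python
def parameter_value(raw: str) -> str:
    """Return a parameter value given either as a token or a quoted-string."""
    raw = raw.strip()
    if len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"':
        inner = raw[1:-1]
        # Undo quoted-pair escapes: a backslash makes the next character literal.
        out = []
        chars = iter(inner)
        for ch in chars:
            if ch == "\\":
                ch = next(chars, "")
            out.append(ch)
        return "".join(out)
    # Not quoted: the token is the value as written.
    return raw


print(parameter_value('"UTF-8"') == parameter_value('UTF-8'))  # True
```

Here the quotes are treated purely as delimiters, which is the behaviour compatibility demands for charset.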
So should we strip the quotes to find the value? That, however, would make ETag: W/"a" and ETag: "W/a" equivalent. Are these identical in every way? In this case I'd say no, as in the latter case the W/ is within quotation marks and therefore part of a strong identifier. This, in turn, brings back the aforementioned problem with "UTF-8" vs. UTF-8. So what are we implementers meant to do? Anyone?
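To make the W/"a" versus "W/a" case concrete, here is a small Python sketch contrasting the two approaches. Both `strip_quotes` and `split_tag` are hypothetical helpers of my own, and the pattern ignores backslash escapes for brevity; this is only one possible reading of the grammar, not a statement of what the RFC requires.

```python
import re


def strip_quotes(raw: str) -> str:
    """Naive normalisation: drop every double quote."""
    return raw.replace('"', "")


# [ "W/" ] followed by a quoted-string (escapes ignored in this sketch).
_TAG = re.compile(r'(W/)?"([^"]*)"')


def split_tag(raw: str):
    """Return (is_weak, opaque_text), or None if raw is not an entity-tag."""
    m = _TAG.fullmatch(raw)
    if m is None:
        return None
    return (m.group(1) is not None, m.group(2))


# Naive stripping conflates the two header values:
print(strip_quotes('W/"a"') == strip_quotes('"W/a"'))  # True

# Parsing by structure keeps them apart:
print(split_tag('W/"a"'))  # (True, 'a')
print(split_tag('"W/a"'))  # (False, 'W/a')
```

Parsing by structure avoids the false match, but it also rejects a bare `W/a` or `a` outright, which is exactly the token question raised above.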