I use anchors in my URLs, allowing people to bookmark 'active pages' in a web application. I used anchors because they fit easily within the GWT history mechanism.
My existing implementation encodes navigation and data information into the anchor, separated by the '-' character. I.e. creating anchors like #location-location-key-value-key-value
Other than the fact that negative values (like -1) cause serious parsing problems, it works, but now I've found that having two separator characters would be better. Also, given the negative number issue, I'd like to ditch using '-'.
What other characters work in a URL anchor that won't interfere with the URL or its GET params? How stable will these be in the future?
-
Looking at the RFC for URLs (RFC 3986), section 3.5, a fragment identifier (which I believe is what you're referring to) is defined as
fragment = *( pchar / "/" / "?" )
and from Appendix A
pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
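To make the grammar easier to apply when picking separators, here is a minimal Python sketch that turns the quoted character classes into a lookup. The names `FRAGMENT_LITERALS` and `is_fragment_safe` are made up for this illustration, and the candidate list is just an example.

```python
import string

# Character classes copied from the RFC 3986 grammar quoted above.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
SUB_DELIMS = set("!$&'()*+,;=")
# fragment = *( pchar / "/" / "?" ), and pchar also allows ":" and "@".
FRAGMENT_LITERALS = UNRESERVED | SUB_DELIMS | set(":@/?")


def is_fragment_safe(ch: str) -> bool:
    """Return True if a single character may appear unescaped in a fragment."""
    return ch in FRAGMENT_LITERALS


# Check a few separator candidates (hypothetical list, not from the question).
candidates = "&=;,:!|#% "
allowed = [c for c in candidates if is_fragment_safe(c)]
print(allowed)
```

Anything the check rejects (for example `|`, `#`, a bare `%`, or a space) would have to be percent-encoded before it goes into the fragment.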
Interestingly, the spec also says that
"The characters slash ("/") and question mark ("?") are allowed to represent data within the fragment identifier."
So it appears that real anchors, like
<a href="#name?a=1&b=2"> .... <a name="name?a=1&b=2">
are supposed to be legal, and look very much like a normal URL query string. (A quick check verified that these work correctly in at least Chrome, Firefox, and IE.) Since this works, I'm assuming you can use your method to have URLs like
http://www.site.com/foo.html?real=1&parameters=2#fake=2&parameters=3
with no problem (i.e. the 'parameters' variable in the fragment shouldn't interfere with the one in the query string).
You can also use percent encoding when necessary... and there are many other characters defined in sub-delims that could be usable.
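As a rough sketch of how '?', '&' and '=' could replace the '-' separator, the Python below builds and parses a fragment of the form `location?key=value&key=value`. The helpers `build_fragment` and `parse_fragment` are hypothetical names for this example, not part of GWT or any library. Values are percent-encoded where needed, so a negative number or a value containing a separator survives the round trip.

```python
from urllib.parse import parse_qsl, quote, unquote, urlencode


def build_fragment(location: str, params: dict) -> str:
    """Encode a location and key/value pairs as 'location?k=v&k=v'."""
    # quote() with safe="" escapes '?' and '/' inside the location itself.
    encoded_location = quote(location, safe="")
    if not params:
        return encoded_location
    # urlencode() percent-encodes '&' and '=' inside keys and values.
    return f"{encoded_location}?{urlencode(params)}"


def parse_fragment(fragment: str) -> tuple:
    """Split a fragment into (location, params); inverse of build_fragment."""
    location, _, query = fragment.partition("?")
    return unquote(location), dict(parse_qsl(query, keep_blank_values=True))


frag = build_fragment("mapping", {"c": "-32", "d": "564"})
print(frag)
print(parse_fragment(frag))
```

Because '-' is an unreserved character, it is left alone inside values and no longer collides with the separators.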
NOTE:
Also from the spec:
"A fragment identifier component is indicated by the presence of a number sign ("#") character and terminated by the end of the URI."
So everything after the # is the fragment identifier, and should not interfere with GET parameters.
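One way to see the separation is with Python's standard `urllib.parse`, used here only as a convenient URL splitter. The URL is an example of the shape discussed above, with a `parameters` key on both sides of the '#'.

```python
from urllib.parse import parse_qs, urlsplit

# Example URL: 'parameters' appears in both the query string and the fragment.
url = "http://www.site.com/foo.html?real=1&parameters=2#fake=2&parameters=3"
parts = urlsplit(url)

# Everything after the first '#' ends up in parts.fragment, untouched.
query_params = parse_qs(parts.query)
fragment_params = parse_qs(parts.fragment)

print(parts.query)
print(parts.fragment)
print(query_params["parameters"], fragment_params["parameters"])
```

The two `parameters` values are parsed independently, so the fragment copy does not shadow the one in the query string.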
Paul W Homer : Wouldn't your first example be interpreted as an anchor of "name", and two parameters a and b? Is that second example legal?
Daniel LeCheminant : @Paul: No, because according to the spec, everything after the # in a URL is considered to be the fragment identifier. Normal URL parameters must come before the #.
Paul W Homer : So http://machine.domain.com:8232/path?a=2&b=2#mapping?c=-32&d=564 should work? Heck, that's way nicer than those pesky dashes :-)
Daniel LeCheminant : @Paul: It's supposed to work (and from what I can tell on my machine, it does work).
Paul W Homer : Thanks Daniel, I'll be editing my code, so I'll update this in a few (working) days. I like the idea of using '&' and '=' as separators.
Julian Reschke : Keep in mind that you also need to consider what HTML allows. XHTML 1.0 deprecated a/@name in favor of a/@id, and HTML5 removes it (if I recall correctly). But the syntax for a/@id is constrained to name characters, so many of the characters allowed in fragments will not be allowed there.