Version 0.4, 30 June 2003, http://www.neilvandyke.org/uriframe/
by
Neil W. Van Dyke
<neil@neilvandyke.org>
Copyright © 2003 Neil W. Van Dyke. This program is Free Software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose. See the GNU Lesser General Public License [LGPL] for more details.
NOTICE: This is an snapshot of work in progress. The documentation for existing code is incomplete, with many "!!!" placeholders. Expect substantial API changes.
UriFrame is a extensible object class framework for parsing, representing, manipulating, and formatting Web Uniform Resource Identifiers (URI) [RFC2396], which includes Uniform Resource Locators (URL). It uses the PLT Scheme object model.
UriFrame has standard support for many popular URI schemes, and contains components that can be used to add support for different schemes. Mixins are used extensively, and class-mapping abstractions support easy addition of application-specific URI behavior. UriFrame is currently oriented towards applications that need to support multiple URI schemes and that often require a full parse of each URI. (Applications that only need to support one URI scheme, such as an HTTP server, are likely better off using an application-specific URI representation.)
Programs using UriFrame should require the (lib "uriframe.ss"
"uriframe") module, and may also need (lib "class.ss").
All URI classes within UriFrame implement the uri<%> interface.
| uri<%> | Interface |
|
Basic interface for URI and URI reference objects. Notably, all URI
classes must implement the |
| fragment | Method on uri<%> |
| set-fragment! v | Method on uri<%> |
|
This pair of methods gets and sets (respectively) the fragment attribute of
a URI reference. The fragment attribute is either a string or (send (string->uri "http://a/b#My%20%231%20Foo") fragment)
=> "My #1 Foo"
|
| uri-scheme | Method on uri<%> |
|
Yields the scheme ID of the URI, which is a lowercase symbol as generated
by |
| parse/layout! str a o f base | Method on uri<%> |
|
Sent by Subclasses implementing this method normally propagate the message to their superclass before performing any other action. |
| absolute? | Method on uri<%> |
|
Predicate for whether or not the URI is absolute. |
| force-absolute! | Method on uri<%> |
|
Force a URI to be absolute, to the extent possible with the given information, such as by discarding excess relative path components. Note that a URI might remain relative despite this operation, such as when the scheme is unknown or there is some scheme-specific information that cannot be resolved. For example: (send* (string->uri "foo") (force-absolute!) (absolute?)) => #f
|
| resolve! base | Method on uri<%> |
|
Resolves relative components of the URI with respect to base URI base, yielding the resolved URI and possibly mutating the receiver. If the URI scheme of base is unknown, then behavior is undefined. However, base does not necessarily have to be absolute in all its components (e.g., it may consist of only a scheme, or it may have a relative path), since in practice this may be sensitive to the URI scheme. Implementing classes are free to ignore non-absolute components of base. The usual case is for the receiver of this message to be mutated and
returned. However, resolving a URI without an explicit scheme against a
base URI with a scheme can result in a new URI object being constructed for
the scheme class, so the code pattern |
| resolve/like! base | Method on uri<%> |
|
!!! |
| ->string | Method on uri<%> |
|
Yields a string representation of the URI in the canonical format, with proper escaping. !!! |
| write-uri port | Method on uri<%> |
|
Writes the same representation as (send u write-uri p)
==
(display (send u ->string) p)
!!! |
The default base class implementing uri<%> is provided by mixin
uri-base-mixin, which can have object% or an
application-specific class as a superclass.
| uri-base-mixin | Mixin |
|
Implements interface |
| authority-absolute? | Method on uri-base-mixin |
| opaque-absolute? | Method on uri-base-mixin |
|
!!! override in subclasses |
| resolve/like! base | Method on uri-base-mixin |
|
Resolve the URI with respect to base URI base, which is of the same URI scheme. This method will normally be overridden for each URI scheme class. !!! propagate up |
!!!
| write-authority-part port | Method on uri-base-mixin |
| write-opaque-part port | Method on uri-base-mixin |
|
!!! override in subclasses |
| plt-url-scheme | Method on uri-base-mixin |
| plt-url-host | Method on uri-base-mixin |
| plt-url-portnum | Method on uri-base-mixin |
| plt-url-path | Method on uri-base-mixin |
| plt-url-params | Method on uri-base-mixin |
| plt-url-query | Method on uri-base-mixin |
| plt-url-fragment | Method on uri-base-mixin |
|
These methods are used in the construction of PLT |
!!! [RFC2396 sec. 2.4]
!!! many applications will not call directly, since uri classes escape and unescape automatically from converting to/from canonical string format
| uri-unescape str [start-k [end-k [always-copy?]]] | Procedure |
| uri-unplusescape str [start-k [end-k [always-copy?]]] | Procedure |
|
!!! |
| uri-escape str [start-k [end-k [always-copy?]]] | Procedure |
| uri-plusescape str [start-k [end-k [always-copy?]]] | Procedure |
|
!!! escapes all except alphanum, minus, underscore, period, and tilde. !!! plusescape discouraged |
!!!
!!! [RFC2396 sec. 3.2]
| server-authority-uri<%> | Interface |
|
!!! [RFC2396 sec. 3.2.2] |
| server-userinfo | Method on server-authority<%> |
| server-host | Method on server-authority<%> |
| server-portnum | Method on server-authority<%> |
|
!!! |
| default-server-portnum | Method on server-authority<%> |
|
!!! defined by the scheme class/mixin !!! users don't normal call directly |
| effective-server-portnum | Method on server-authority<%> |
|
!!! call this to get the port number for initiating connections, etc |
| server-authority-uri-mixin | Mixin |
|
!!! |
!!! segment names representation:
(send (string->uri "http:/a/b") path-segments) => ("a" "b" )
(send (string->uri "http:/a/b/") path-segments) => ("a" "b" #f)
(send (string->uri "http:/a/b/.") path-segments) => !!!
!!! parameters representation:
(send (string->uri "http://host/aaa/bbb;p1;p2/ccc;p3/ddd")
path-segments)
=> ("aaa" ("bbb" "p1" "p2") ("ccc" "p3") "ddd")
| uri-path-segment-name segment | Procedure |
| uri-path-segment-params segment | Procedure |
|
!!! effect of mutating these values is undefined |
| hierarchical-uri<%> | Interface |
|
!!! note that last segment can have empty name |
| path-segments | Method on hierarchical-uri<%> |
| set-path-segments! v | Method on hierarchical-uri<%> |
|
!!! effect of mutating is undefined |
| path-uplevels | Method on hierarchical-uri<%> |
| set-path-uplevels! v | Method on hierarchical-uri<%> |
|
!!! (send (string->uri "http:a/b") path-uplevels) => 0
(send (string->uri "http:/a/b") path-uplevels) => #f
(send (string->uri "http:./a/b") path-uplevels) => 0
(send (string->uri "http:/./a/b") path-uplevels) => #f
(send (string->uri "http:/../a/b") path-uplevels) => 1
(send (string->uri "http:/../../a/b") path-uplevels) => 2
(send (string->uri "http:/../../../a/b") path-uplevels) => 3
(send (string->uri "http:/a/../b") path-uplevels) => #f
(send (string->uri "http:/a/../../b") path-uplevels) => 1
(send (string->uri "http:/a/../../../b") path-uplevels) => 2
(send (string->uri "http:a/../b") path-uplevels) => 0
(send (string->uri "http:a/../../b") path-uplevels) => 1
(send (string->uri "http:a/../../../b") path-uplevels) => 2
|
| path-absolute? | Method on hierarchical-uri<%> |
|
Predicate for whether or not path is absolute. |
| force-path-absolute! | Method on hierarchical-uri<%> |
|
Forces the path to be absolute. |
| resolve-path! base | Method on hierarchical-uri<%> |
|
!!! |
| path-string | Method on hierarchical-uri<%> |
| write-path port | Method on hierarchical-uri<%> |
|
!!! |
| hierarchical-uri-mixin | Mixin |
|
!!! |
!!!
(send (string->uri "http://h/srch?kw=fiendish+scheme&icase&x=1%2B2")
query-attrs)
=> (("kw" . "fiendish scheme") ("icase" . #t) ("x" . "1+2"))
| httplike-query-uri<%> | Interface |
|
!!! |
| parse-query str start-k end-k | Method on httplike-query-uri<%> |
|
!!! |
| query-attrs | Method on httplike-query-uri<%> |
|
!!! |
!!!
| query-string | Method on httplike-query-uri<%> |
| write-query port | Method on httplike-query-uri<%> |
|
!!! |
| httplike-query-uri-mixin | Mixin |
|
Accepts a |
!!! interface and mixin pairs.
!!! interfaces define the uri scheme's interface, mixins supply scheme-specific behavior that is not supplied by the component mixins above.
!!! some mixins have dependencies that can be supplied by the above component mixins or user-supplied mixins or classes.
| file-uri<%> | Interface |
|
!!! |
| file-uri-mixin | Mixin |
|
!!! |
| ftp-uri<%> | Interface |
|
!!! |
| ftp-uri-mixin | Mixin |
|
!!! |
| gopher-uri<%> | Interface |
|
!!! |
| gopher-uri-mixin | Mixin |
|
!!! |
| http-uri<%> | Interface |
|
!!! |
| http-uri-mixin | Mixin |
|
!!! |
| https-uri<%> | Interface |
|
!!! |
| https-uri-mixin | Mixin |
|
!!! |
!!! [RFC2192]
| imap-uri<%> | Interface |
|
!!! |
| imap-uri-mixin | Mixin |
|
!!! |
| ipp-uri<%> | Interface |
|
!!! |
| ipp-uri-mixin | Mixin |
|
!!! |
| javascript-uri<%> | Interface |
|
!!! |
| javascript-uri-mixin | Mixin |
|
!!! |
!!! [RFC2368]
| mailto-uri<%> | Interface |
|
!!! |
| mailto-uri-mixin | Mixin |
|
!!! |
| news-uri<%> | Interface |
|
!!! |
| news-uri-mixin | Mixin |
|
!!! |
!!! [RFC2224]
| nfs-uri<%> | Interface |
|
!!! |
| nfs-uri-mixin | Mixin |
|
!!! |
| telnet-uri<%> | Interface |
|
!!! |
| telnet-uri-mixin | Mixin |
|
!!! |
!!!
| urn-uri<%> | Interface |
|
!!! |
| urn-namespace | Method on urn-uri<%> |
|
!!! |
| urn-uri-mixin | Mixin |
|
!!! |
!!! this is used for adding new uri schemes, changing what uri schemes are
supported, adding a new uri base type, or changing individual uri class
mappings. users who just want the default uri scheme classes and mapping
can ignore this section and just use string->uri and
uri-scheme->class procedures.
| string->uri-scheme str | Procedure |
|
Accepts a URI scheme name in string form and yields a scheme identifier, which is a lowercase symbol: (string->uri-scheme "HtTp") => http
|
!!! why unknown scheme class is needed (for representing unknown scheme uri, especially til can be resolved), and how it differs (resolving unknown-scheme object can instantiate a new object of different class)
!!! Per [RFC2396 sec. 3.1], relative URI do not specify a scheme, although in practice they can have a scheme. Oracle example.
| unknown-scheme-uri<%> | Interface |
|
!!! |
| set-uri-scheme! v | Method on unknown-scheme-uri<%> |
|
!!! note doesn't cause a reparse/remap |
| unknown-scheme-uri-mixin | Mixin |
|
!!! note that parsing *does* copy raw-str. this is because we're keeping the string around for possible resolution later, and might be parsing from a reusable string buffer that's written in blocks, for example. |
!!!
| make-uri-scheme-mapper alist [unknown] | Procedure |
|
!!! |
| make-standard-uri-scheme-mapper [base [unknown? [mix]]] | Procedure |
|
!!! |
| uri-scheme-mapper | Parameter |
|
!!! |
| uri-scheme->class uri-scheme | Procedure |
|
Using the (implementation? (uri-scheme->class 'ftp) uri<%>) => #t
|
Two of the most commonly used procedures in application code will be those
that convert URI objects to and from the standard string syntax:
string->uri and uri->string.
| string/layout/scheme/base->uri str a o f uri-scheme base-uri | Procedure |
|
This method is used by the |
| string->uri str [base-uri [start-k [end-k]]] | Procedure |
|
The usual interface for constructing If the optional base-uri (which defaults to The optional start-k and end-k can be used to restrict parsing
of str to the given substring, with |
| uri->string uri | Procedure |
|
Yields !!! |
!!! note about loading (lib "uriframe-plt-url.ss" "uriframe")
!!! issue with parsing and representing parameters
(define s "http://h/foo;p1/bar;p2;p3")
(url-params (string->url s)) => "p1/bar;p2;p3"
(url-params (uri->plt-url (string->uri s))) => "p1;p2;p3"
| uri->plt-url uri | Procedure |
|
!!! |
| plt-url->uri url | Procedure |
|
!!! uses |
/ and foo/. are now parsed more consistently.
PLT URL conversion code broken off into separate module, to save memory
(approx. 900 KB on author's machine) when not needed.