[nycphp-talk] Some comments on the XML Talk
Elliotte Harold
elharo at metalab.unc.edu
Thu Nov 1 07:21:05 EDT 2007
Kenneth Downs wrote:
> Finally, I would have liked to hear more of Rusty's ideas about the
> relationship between the file system, the web server, and the database.
> Rusty, do you want to expand on that here?
>
Well, in most applications the database stores its data in the file
system. However, it's just one or a few files. The structure is inside
the files, just as it is with MySQL. The file system is just a
convenient interface to the hard drive. I suppose it's possible a big
XML DB might talk to the hard drive directly and bypass the file
system, just as Oracle does sometimes, but that's an implementation detail.
The web server is the part I'm still thinking about. In practice today
the web server is designed as an interface to the file system. URLs are
converted into paths which are used to serve files. Sometimes those
files are further processed by PHP or similar tools and what's served
isn't quite what's in the file. Sometimes we use mod_rewrite or similar
tools to remap some URLs to different file paths. However, the basic
design is that the URL structure mirrors one or more file system
hierarchies, and everything's layered on top of that.
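
To make the conventional design concrete, here is a minimal Python sketch of the URL-to-path step. It is only an illustration of the idea, not how any particular server implements it; the document root, the function name, and the index.html fallback are all assumptions.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlsplit

DOC_ROOT = PurePosixPath("/var/www/html")  # hypothetical document root


def url_to_path(url: str) -> PurePosixPath:
    """Map a URL onto a path under DOC_ROOT, as a file-serving server would."""
    path = unquote(urlsplit(url).path)
    parts = [p for p in path.split("/") if p not in ("", ".")]
    if ".." in parts:
        # Refuse anything that tries to climb out of the document root.
        raise ValueError("path escapes the document root")
    target = DOC_ROOT.joinpath(*parts)
    # A directory-style URL conventionally falls back to an index file.
    if path.endswith("/") or not parts:
        target = target / "index.html"
    return target
```

Everything else (PHP processing, rewriting) would hang off the path this function returns, which is the sense in which the URL space mirrors the file system.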
However, I'm starting to uncover a lot of applications where this
URL==filesystem design doesn't work very well. I want to map URLs to
something other than filesystems; for instance to database queries and
templates. I've been building one such system lately as an internal
controller for another application. All URLs are served by invoking
certain methods in a running program. It's a special purpose system, but
it's one for which the file system doesn't make sense.
I'm considering how one might genericize such a system. That is, what
would a general purpose web server that doesn't necessarily serve files
look like? How would one configure it, and tell it what to serve for
each URL requested? How does one tell it that http://www.example.com/foo
is a file but http://www.example.com/bar is a database query? Existing
solutions like PHP, Java servlets, and mod_rewrite are too inflexible
for what I envision. They're also too hard to use and too confusing.
That may be partially a result of poor design, but I suspect it's mostly
because they still implicitly assume that what we're doing is serving a
file system with a few small tweaks. Perhaps we can do better if we get
rid of the assumption that there must be a file system in place.
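
One possible shape for such a configuration, sketched in Python purely as an illustration: a table that maps each URL path to a handler, where a file and a database query are simply two kinds of handler. Every name here (Response, static_file, db_query, ROUTES, the file path, and the SQL text) is hypothetical, and nothing is actually read from disk or sent to a database.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Response:
    source: str  # which kind of backend would produce the body
    detail: str  # the file path or query text that would be used


def static_file(path: str) -> Callable[[], Response]:
    """Handler factory for a URL backed by a file (path is illustrative)."""
    return lambda: Response("file", path)


def db_query(sql: str) -> Callable[[], Response]:
    """Handler factory for a URL backed by a database query."""
    return lambda: Response("query", sql)


# The routing table is the whole configuration: no file system is assumed.
ROUTES: Dict[str, Callable[[], Response]] = {
    "/foo": static_file("/srv/site/foo.html"),
    "/bar": db_query("SELECT title, body FROM pages WHERE slug = 'bar'"),
}


def serve(url_path: str) -> Response:
    """Look up the handler for a URL path and invoke it."""
    handler = ROUTES.get(url_path)
    if handler is None:
        return Response("none", "404 Not Found")
    return handler()
```

With this table, serve("/foo") reports a file-backed response and serve("/bar") a query-backed one. An exact-match table like this sidesteps precedence questions only because it cannot express ranges or patterns.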
I don't have an answer yet. I'm mostly just musing on some
possibilities, and letting the ideas cook in my head for now. The tricky
bit is figuring out how to design this so that there aren't a lot of
confusing precedence rules for resolving conflicts between different
mappings, while still allowing arbitrary mappings. For instance, one
should be able to say that http://www.example.com/foo/bar/baz1 through
http://www.example.com/foo/bar/baz100 are all database queries except
for http://www.example.com/foo/bar/baz23 which is a static file, or that
http://www.example.com/foo/baz1 through
http://www.example.com/foo/baz100 are database queries unless there's a
static 23.html file in directory /baz, in which case that should be used
instead.
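
The two examples above imply two different precedence policies, which the following Python sketch spells out. It is a rough illustration under stated assumptions, not a proposed design: the regular expressions, the EXACT table, the returned "file:" and "query:" strings, and the static_dir parameter are all made up for the example.

```python
import re
from pathlib import Path
from typing import Optional

BAZ_UNDER_BAR = re.compile(r"^/foo/bar/baz(\d+)$")
BAZ_UNDER_FOO = re.compile(r"^/foo/baz(\d+)$")

# Hypothetical explicit exception: this one URL is a static file.
EXACT = {"/foo/bar/baz23": "file:baz23.html"}


def resolve_explicit(url_path: str) -> Optional[str]:
    """Policy 1: an exact mapping always beats a pattern mapping."""
    if url_path in EXACT:
        return EXACT[url_path]
    m = BAZ_UNDER_BAR.match(url_path)
    if m and 1 <= int(m.group(1)) <= 100:
        return f"query:id={int(m.group(1))}"
    return None


def resolve_with_override(url_path: str, static_dir: Path) -> Optional[str]:
    """Policy 2: a query by default, unless a matching static file exists."""
    m = BAZ_UNDER_FOO.match(url_path)
    if not (m and 1 <= int(m.group(1)) <= 100):
        return None
    n = int(m.group(1))
    candidate = static_dir / f"{n}.html"
    if candidate.is_file():
        return f"file:{candidate}"
    return f"query:id={n}"
```

Policy 1 is decided entirely by configuration, while policy 2 depends on what happens to be on disk at request time. Mixing the two in one server is exactly where the confusing precedence rules would start to creep in.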
It's possible I'm being too demanding. There may be a really clean 80/20
cut somewhere, but so far I don't see it. I may need to build a few more
applications along these lines first, just to see which features are
really needed and which are just paint on the lilies. In any case, I
don't have the answer yet, just the question.
This is orthogonal to the issue of whether the backend is an XML DB, a
SQL DB, or something else.
--
Elliotte Rusty Harold elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/