Allegro CL version 11.0
Common Lisp pathnames do not always map easily into operating-system filenames. In this document we describe the mapping chosen for Allegro CL on the Unix and Windows operating systems and discuss the implementation of logical pathnames.
Most symbols naming pathname functionality are standard CL symbols in the :common-lisp package. Extensions are usually in the :excl package.
Symbolic links are a feature of Unix filesystems. A symbolic link is a Unix file that is interpreted as a filename in a different location. Since a symbolic link can contain arbitrary filenames, a symbolic link can traverse an arbitrary number of hierarchical levels, can create back-reference loops, and can make directories on other filesystems appear local. The Common Lisp function truename resolves symbolic links by default. An additional keyword argument to that function (and to open when :direction is :probe, to probe-file, and to rename-file-raw), follow-symlinks, controls whether symbolic links are followed. When its value is true (the default for truename, open, and probe-file), symbolic links are followed. When the value is nil (the default for rename-file-raw), they are not. This means that (delete-file (truename p)) deletes the actual file while (delete-file (truename p :follow-symlinks nil)) deletes the symbolic link. See also rename-file-raw and pathname-resolve-symbolic-links.
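The difference between the two truename calls can be inspected before anything is deleted. The following is a minimal sketch for Allegro CL only; the function name link-and-target is hypothetical, and the only extension it relies on is the follow-symlinks keyword described above.

```lisp
;; Allegro CL only: :follow-symlinks is the extension described above.
;; LINK-AND-TARGET is a hypothetical helper name used for illustration.
#+allegro
(defun link-and-target (p)
  "Return two values: the truename of P with symbolic links left
unresolved, and the truename of P with symbolic links resolved."
  (values (truename p :follow-symlinks nil)
          (truename p)))
```

If p names a symbolic link, the two values should differ; passing the first to delete-file removes the link and passing the second removes the file it points to.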
The drive (usually a letter followed by a colon, e.g. c:) in a Windows pathname is the :device component of the pathname. It is not part of the :directory component. Therefore, this will fail:
(defvar *root* #+mswindows "d:/foo/bar/"
               #-mswindows "/usr/foo/")
(make-pathname :directory *root* ...)
You can either specify "d:" as the value of the device keyword argument to make-pathname or use the pathname function to convert the string and add additional components using merge-pathnames:
(defvar *root* (pathname #+mswindows "d:/foo/bar/"
                         #-mswindows "/usr/foo/"))
(merge-pathnames ... *root*)
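As a minimal sketch of the merging approach, the following portable code builds file pathnames under a root directory. The names *example-root* and file-under-root are hypothetical, and a Unix-style root is assumed; on Windows the root string would carry a drive, which pathname places in the :device component.

```lisp
;; Hypothetical root directory, converted from a string once.
(defvar *example-root* (pathname "/usr/foo/"))

(defun file-under-root (name type &optional (root *example-root*))
  "Return a pathname for NAME.TYPE located in ROOT.
The name and type come from the first argument to MERGE-PATHNAMES;
the directory (and any device) is filled in from ROOT."
  (merge-pathnames (make-pathname :name name :type type) root))
```

Because the root is parsed by pathname rather than passed to make-pathname as a :directory string, the same helper works whether or not the root names a drive.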
(pathname "c:foo.cl")
would, on a Unix file system, be interpreted as a logical pathname with host "c" and filename and type "foo.cl". On Windows, so long as "c" has not been defined as a logical host (in, for example, hosts.cl), it is interpreted as specifying the device. This is a deviation from the ANS which makes behavior on Windows more intuitive.
Suppose hosts.cl contains this line:
"r" '(";**;*.*" #p"sys:;code;")
but has no other single letter hosts defined. Then on Windows, we have this behavior:
cg-user(6): (pathname "r:foo.cl")
#P"r:foo.cl"
cg-user(7): (type-of *)
logical-pathname
cg-user(8): (pathname "s:foo.cl")
#P"s:foo.cl"
cg-user(9): (type-of *)
pathname
cg-user(10):
(pathname "r:foo.cl") returns a logical pathname because "r" is known to be a logical host. (pathname "s:foo.cl") returns a (regular) pathname: since "s" is not known to be a logical host, it is assumed to be a device in the Windows file system. On Unix systems, both are logical pathnames even if there is no definition of "r" or "s" as logical hosts.
When presented with a string naming a file, Allegro CL parses the string to create a pathname object as described in this section and its subsections.
Common Lisp pathnames have six components:
:host
:device
:version
:directory
:name
:type
On Unix systems, the :host, :device, and :version components are ignored. Only the other three have meaning. In this section, we describe how to transform a Unix pathname into an Allegro CL pathname object. There are four steps: preprocessing the pathname string, then determining the :directory, :name, and :type components in turn.
We then have several paragraphs describing unusual cases (anomalies).
Finally, there are some examples, collected in Table 1.
The tilde (~) character is used to denote the user's home directory. If Allegro CL encounters a tilde as the first character of a pathname string, Allegro CL converts it to the absolute pathname of the home directory of the user whose name follows, or, if a slash (/) follows, the home directory of the user running Lisp. Further, double slashes (//) are converted to single slashes (/) and /./ is also converted to a single slash (/). At this point, the pathname string will have the form of the following schematic:
[/][<dir1>/]...[<dirn>/][<name>][.<type>]
(1)( ... 2 ............)(.. 3 .)(.. 4 ..)
The brackets ([ ]) indicate that the elements may or may not appear. The contents of the angle brackets (< >) describe what type of object goes in a particular location. The suspension points (...) indicate that any number of objects of the specified type may appear.
The :directory component (a list in Allegro CL) is determined by parts (1), (2), and (3) of the schematic. Here are the rules:
If (1) is present, the first element of the list that is the :directory component is :absolute. If (1) is not present and (2) is not empty, the first element is :relative. If (1) and (2) are both empty, the :directory component is nil unless (3) is .. and (4) is empty. In the latter case (where the whole pathname is ..), the :directory component is :up. We now have the first element of the list that is the :directory component.
If (1) is present and (2) is empty, the entire list is
(:absolute :root)
If (2) is not empty, each <diri> is made into a string and added to the list unless it is two dots (..), in which case the keyword :up is added to the list. We have now resolved both (1) and (2).
(3) affects the :directory component only if it is .. and the type, (4), is empty. In that case, the keyword :up is added to the end of the list. If (3) is anything other than .. or if (4) is not empty, its value does not affect the :directory component.
Finally, if :up appears anywhere in the list following a string, the :up and the string are removed. For example
(:absolute "foo" :up "bar")
is resolved to
(:absolute "bar")
We have now determined the :directory component. See the Table of examples below.
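The final rule (removing an :up together with the string that precedes it) can be sketched as a small list operation. This is an illustration of that one rule only, not Allegro CL's parser; the function name collapse-up is hypothetical.

```lisp
(defun collapse-up (directory)
  "Return DIRECTORY with every :UP that directly follows a string
removed together with that string.  An :UP that follows a keyword
such as :RELATIVE or :ABSOLUTE is kept."
  (let ((result '()))                   ; built in reverse order
    (dolist (item directory (nreverse result))
      (if (and (eq item :up) (stringp (first result)))
          (pop result)                  ; drop the preceding string
          (push item result)))))

;; The example from the text:
;; (collapse-up '(:absolute "foo" :up "bar"))  =>  (:absolute "bar")
```

Working on the reversed list means the element most recently kept is always at hand, so chains such as "a" "b" :up :up collapse in a single pass.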
The :name component is determined from (3) in the schematic. Whatever appears as <name> is converted to a string to become the :name component unless the type, (4), is empty and <name> is all dots (., .., ..., etc.). In that case, a single dot (.) means the :name component will be nil. Two dots (..), as mentioned above, cause the keyword :up to be added to the list which is the value of the :directory component. Three or more dots are put in a string which becomes the value of the :name component.
The type, (4), must start with a dot, cannot contain another dot, and must contain at least one character other than the dot. In that case, everything after the dot (but not the dot itself) is made into a string and it becomes the value of the :type component.
There are anomalies because dots play so many roles in Unix pathnames. We have already discussed most of these. The remaining two are illustrated by the following cases:
.bar
bar.
The .bar looks like a type, as if the file has no name and type bar. Instead, .bar is taken to be the filename (including the dot) since the use of the dot in this case is to hide the file from the standard Unix ls command listing, not to specify a type.
In the case of bar., the :name is "bar" and the :type is the empty string (not nil).
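The name and type rules, including the two anomalies, can be approximated on plain strings. The sketch below is an assumption-laden illustration rather than Allegro CL's implementation: split-name-type is a hypothetical name, its argument is taken to be the final part of a namestring with no slashes in it, and for .. it simply returns no name, leaving the :up handling to the directory rules.

```lisp
(defun split-name-type (file)
  "Return two values, the name and the type, for FILE, which is
assumed to be the last part of a namestring (no slashes)."
  (let ((dot (position #\. file :from-end t)))   ; index of the last dot
    (cond ((string= file "")
           (values nil nil))
          ;; All dots: "." and ".." give no name here (".." is a
          ;; directory matter); three or more dots form the name.
          ((every (lambda (c) (char= c #\.)) file)
           (values (if (> (length file) 2) file nil) nil))
          ;; No dot at all, or only a leading dot (a hidden file):
          ;; the whole string is the name and there is no type.
          ((or (null dot) (zerop dot))
           (values file nil))
          ;; Otherwise split at the last dot; a trailing dot yields
          ;; the empty string as the type, not NIL.
          (t
           (values (subseq file 0 dot) (subseq file (1+ dot)))))))
```

For instance, (split-name-type "bar.") yields "bar" and "", while (split-name-type ".bar") yields ".bar" and nil, mirroring the two anomalies above.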
That completes the rules for converting pathnames in Allegro CL. Table 1 just below has many examples of pathnames including ones with dots in all locations. Following each example, we indicate which rules were used in producing the result. There are 5 Directory (D) rules. Rules for name (N) and type (T) are not subdivided. Anomalies are shown as A.
Table 1: Examples of converting Namestrings to Pathnames
Namestring | Directory | Name | Type | Rules |
"/" | (:absolute :root) | nil | nil | D 1,2 |
"/foo" | (:absolute :root) | "foo" | nil | D 1, N |
"/foo." | (:absolute :root) | "foo" | "" | D 1, N, T |
"/foo.b" | (:absolute :root) | "foo" | "b" | D 1, N, T |
"/foo.bar." | (:absolute :root) | "foo.bar" | "" | D 1, N, T |
"/foo.bar.baz" | (:absolute :root) | "foo.bar" | "baz" | D 1, N, T |
"/foo/bar" | (:absolute "foo") | "bar" | nil | D 1,3, N |
"/foo..bar" | (:absolute :root) | "foo." | "bar" | D 1, N, T |
"foo.bar" | nil | "foo" | "bar" | D 1, N, T |
"foo/" | (:relative "foo") | nil | nil | D 1,3 |
"foo/bar" | (:relative "foo") | "bar" | nil | D 1,3, N |
"foo/bar/baz" | (:relative "foo" "bar") | "baz" | nil | D 1,3, N |
"foo/bar/" | (:relative "foo" "bar") | nil | nil | D 1,3 |
"foo/bar/.." | (:relative "foo") | nil | nil | D 1,3,4,5 |
"/foo/../" | (:absolute :root) | nil | nil | D 1,3 |
".lisprc" | nil | ".lisprc" | nil | N, A |
"x.lisprc" | nil | "x" | "lisprc" | N, T |
"." | (:relative) | nil | nil | N |
".." | (:relative :up) | nil | nil | N |
"..." | nil | "..." | nil | N |
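To check any row of the table in a running Lisp, the three components that matter on Unix can be pulled out of a parsed namestring with standard accessors. The helper name show-components is hypothetical; the code itself is portable, but the values in Table 1 describe Allegro CL, and other implementations may parse the dotted cases differently.

```lisp
(defun show-components (namestring)
  "Parse NAMESTRING and return a property list of its directory,
name, and type components."
  (let ((p (pathname namestring)))
    (list :directory (pathname-directory p)
          :name      (pathname-name p)
          :type      (pathname-type p))))

;; Example call (the result depends on the implementation for
;; namestrings containing unusual dot patterns):
;; (show-components "foo/bar/baz")
```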
Merging of pathnames is handled by Allegro CL to take advantage of directory hierarchies. Allegro CL follows the Common Lisp standard for merging pathnames. This section provides examples showing the directory component of the resulting pathname.
Given two pathnames a and b, computing the result c of merging them may involve merging their directory components:
(setf c (merge-pathnames a b))
This merging follows these rules:
If pathname a does not have a directory component, then the directory component of pathname b becomes the directory component of the result c.
If pathname a's directory component is absolute (i.e. it begins with :absolute), then pathname c will have pathname a's directory component.
If pathname a has a directory component that is relative (that is, it begins with :relative), then the directory component of pathname c depends on the combined directory components of pathnames a and b. If pathname b's directory component is absolute, then the resulting directory component will also be absolute. When the directory components of a and b are combined, they are canonicalized by the appropriate removal of :back entries. For example:
cl-user(4): (let ((a #p"bar/")
(b #p"/foo/"))
(pathname-directory (merge-pathnames a b)))
(:absolute "foo" "bar")
cl-user(5): (let ((a #p"bar/")
(b #p"foo/"))
(pathname-directory (merge-pathnames a b)))
(:relative "foo" "bar")
cl-user(6): (let ((a #p"../bar/")
(b #p"foo/"))
(pathname-directory (merge-pathnames a b)))
(:relative "bar")
cl-user(7): (let ((a #p"../bar/")
(b #p"../foo/"))
(pathname-directory (merge-pathnames a b)))
(:relative :back "bar")
cl-user(8):
Finally, if pathname b does not have a directory component, the directory component of pathname a becomes c's directory component.
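The directory-merging rules above can be summarized as a function on directory lists. This is a sketch under stated assumptions, not the implementation inside merge-pathnames: the names drop-back-entries and merged-directory are hypothetical, and parent references are assumed to appear in the lists as :back.

```lisp
(defun drop-back-entries (directory)
  "Remove each :BACK that directly follows a string, together with
that string.  A :BACK with no preceding string is kept."
  (let ((result '()))                   ; built in reverse order
    (dolist (item directory (nreverse result))
      (if (and (eq item :back) (stringp (first result)))
          (pop result)
          (push item result)))))

(defun merged-directory (a-dir b-dir)
  "Directory component of (merge-pathnames a b), given the directory
components A-DIR and B-DIR of a and b."
  (cond ((null a-dir) b-dir)                     ; a has no directory
        ((null b-dir) a-dir)                     ; b has no directory
        ((eq (first a-dir) :absolute) a-dir)     ; absolute a wins
        (t                                       ; relative a: append to b
         (drop-back-entries
          (cons (first b-dir)
                (append (rest b-dir) (rest a-dir)))))))
```

With these definitions, (merged-directory '(:relative :back "bar") '(:relative :back "foo")) gives (:relative :back "bar"), which corresponds to the last transcript entry above.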
We have tried to make the handling of Windows pathnames (really DOS pathnames) as consistent as possible with the handling of UNIX pathnames. Note the following differences:
/ and \ can be used interchangeably, but if you use \ in a string, it should be doubled. (\ is the Lisp escape character and in a string means "treat the next character specially". The combination \\ in a string is interpreted as a single \.) Thus these strings name the same pathname:
"foo\\bar.cl"
"foo/bar.cl"
The drive, specified with a letter followed by a colon, is the device component of a pathname. It cannot be combined with the directory component. In particular:
(make-pathname :directory "c:\\foo\\" ...)
will fail. This will work:
(make-pathname :device "c" :directory "\\foo\\" ...)
UNC pathnames (e.g. \\<machine-name>\<share-name>\<path>) are supported as follows: the share name must be in the :device slot and the machine name in the :host slot. This classification of the parts of a UNC pathname is important when creating a pathname with make-pathname. If you apply pathname to a string naming a UNC pathname, the parts are classified correctly automatically. In the following example, the :host is hobart, the share name is cl, the directory is src, the filename is acl, and the type is mak.
(pathname "\\\\hobart\\cl\\src\\acl.mak")
See also Windows devices for a discussion of why (pathname "c:foo.cl") returns a regular pathname on Windows (unless "c" has been defined as a logical host) but a logical pathname on Unix.
Allegro CL provides some additional pathname manipulation functions that users might find useful:
nil if its argument does not exist or if its argument does not name a directory (i.e. if it names a file rather than a directory).
The logical pathname facility was added to the Common Lisp standard by the X3J13 committee. The logical pathname specification leaves various details of behavior up to the implementation. Here we describe these details of the Allegro CL implementation. We do not describe the whole facility.
Logical pathnames were added to the Common Lisp language by X3J13 as a facility for the portable specification of filenames comprising an application. This section describes various implementation specifics about logical pathnames on Allegro CL, and then gives suggestions on how logical pathnames can be used effectively in both developing and delivering a complex Lisp application with many files. A final section discusses issues raised by Unix symbolic links, first in general, and then with regard to logical pathnames.
A pathname or logical pathname is a Lisp object, but the language defines conversions of a pathname to and from a character string representation, called a namestring or logical-pathname namestring. The mapping between physical pathnames and namestrings is implementation dependent, but (with certain exceptions discussed in table 1 above) a physical pathname representing a file in a Unix filesystem will have a namestring representation equivalent to the filename used by Unix utilities to name that file. Since Unix is fairly permissive about which characters may be used as pathname components, almost any character other than / (slash) is allowed almost anywhere in a non-logical-pathname namestring. In Allegro, physical pathnames and physical pathname namestrings (again with certain minor exceptions) can represent all possible Unix filenames.
There is also a mapping between logical pathnames and logical-pathname namestrings. Logical pathnames exist so a portable application written in portable code can reference the files it needs to operate, so this mapping is not implementation dependent. It is not the purpose of logical-pathname namestrings to represent all filenames possible on an arbitrary host filesystem, e.g. Unix. Rather, logical-pathname namestrings are limited to a reasonable subset of possible filename syntax that can be accommodated by all plausible filesystems.
For this reason, characters other than alphabetics, decimal digits, and the minus sign are not supported in "words" of a logical-pathname namestring (see this section of the ANSI Spec, noting particularly the definition of a word). The intention is that a large multi-file system should limit its filenames in this way, and then the logical pathname mechanism will guarantee that the software can be ported to other platforms with minimal difficulty.
Many have found this restriction onerous. Because actual work is typically done on just a few platforms, forbidding characters such as an underscore (_), which in fact causes no problems on any popular platform, contributes nothing to actual portability and adds additional development rules whose purpose (to non-Lisp programmers and to many Lisp programmers) is obscure. For this reason, Allegro CL has added the variable *additional-logical-pathname-name-chars*. Characters on this list are permitted in logical pathname "words" regardless of the standard. The initial value of this variable is nil. Character objects can be pushed onto this list as desired.
To encourage portability the Allegro CL implementation will not convert any namestring containing incorrect logical-pathname syntax into a logical pathname (except as the allowed syntax is extended by characters in the list which is the value of *additional-logical-pathname-name-chars*). Thus, assuming that expert has been defined as a logical host, this call to pathname
(pathname "expert:;engine;steam-power.lisp")
will return a logical-pathname object with name component steam-power, type lisp, and directory (:relative "engine"). However, this call
(pathname "expert:;engine;steam_power.lisp")
cannot parse the string as a valid logical-pathname namestring and will instead return a physical pathname with name component expert:;engine;steam_power. This is obviously not what was intended; nonetheless, the implementation is not justified in signaling an error because this physical pathname is a perfectly legal Unix filename, however unlikely.
But if the underscore character, #\_, is added to the list which is the value of *additional-logical-pathname-name-chars*, then "expert:;engine;steam_power.lisp" will parse as a logical pathname:
(push #\_ *additional-logical-pathname-name-chars*)
(describe (pathname "expert:;engine;steam_power.lisp"))
#p"expert:;engine;steam_power.lisp" is a structure of
type logical-pathname.
It has these slots:
host "expert"
device nil
directory (:relative "engine")
name "steam_power"
[...]
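Pushing onto the variable changes parsing for the rest of the session. Where that is not wanted, the extension can be limited to a single parse by rebinding the variable. The sketch below is for Allegro CL only; the function name parse-allowing-underscore is hypothetical, and the excl: package prefix is an assumption based on the statement that extensions are usually in the :excl package.

```lisp
;; Allegro CL only.  The EXCL: prefix is an assumption; check the
;; home package of the variable in your image.
#+allegro
(defun parse-allowing-underscore (namestring)
  "Parse NAMESTRING with #\\_ temporarily permitted in logical
pathname words, leaving the global list unchanged afterwards."
  (let ((excl:*additional-logical-pathname-name-chars*
          (adjoin #\_ excl:*additional-logical-pathname-name-chars*)))
    (pathname namestring)))
```

A pathname created this way has the same print/read caveat as any logical pathname with non-standard words: its printed form reads back as a logical pathname only while the underscore is again permitted.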
Extending allowable logical pathname syntax is not, of course, portable. It may be necessary for a system to refer to platform-dependent files (perhaps preexisting library files) with non-conforming names in a portable way. There is no reason not to use the translation services provided by logical pathnames for such files as well, given that such files are platform dependent in the first place. While the pathname implementation will not parse such a namestring as a logical pathname, it is nonetheless possible to construct a logical pathname with arbitrary strings as words, and portably so, using make-pathname. For example, the expert:;engine;steam_power.lisp pathname above could be constructed with a form such as this, assuming logical host expert has been defined:
(make-pathname :host "expert"
:directory '(:relative "engine")
:name "steam_power"
:type "lisp")
[returns]
#p"expert:;engine;steam_power.lisp" ; a logical-pathname
This logical-pathname will violate print/read consistency if and when the printed representation is re-read because a physical pathname will be created. Otherwise, it will work just like any other logical pathname. One obvious place where this technique could be useful is in naming pre-existing foreign code system object files and libraries, perhaps in conjunction with defsystem.
:device and :version default to :unspecific. (The standard requires :device always to be :unspecific since devices are not supported in logical pathnames.)
In make-pathname, device and version default to :unspecific. An error is signaled if these arguments are not nil or :unspecific (or :newest for version). Hosts are represented as strings in Allegro CL. They are always compared case-insensitively.
make-pathname: returns a logical pathname if host is given and it is logical, that is if logical pathname translations have been defined for it.
The specification says of parse-namestring:
thing is recognized as a logical pathname namestring when host is logical or defaults is a logical pathname. In the latter case the host portion of the logical pathname namestring and its following colon are optional. If the host portion of the namestring and host are both present and do not match, an error is signaled.
Allegro CL implements the following extension:
Allegro CL recognizes a namestring as a logical pathname in one additional circumstance: the namestring has logical namestring syntax and host is given. In other words, the host need not already have translations defined for it.
merge-pathnames: 1. When pathname is a relative logical pathname and defaults is not a logical pathname with the same host, Allegro CL translates pathname before the merge. Keep in mind that the relative logical pathname could translate into an absolute physical pathname; this is the reason for this rule, as relative and absolute merging rules are different. For example:
cl-user(1): (setf (logical-pathname-translations "frob")
`((";foo;**;*.*" "/bar/**/*.*")))
((";foo;**;*.*" "/bar/**/*.*"))
cl-user(2): (merge-pathnames "frob:;foo;whack.cl" "/blam/")
#p"/bar/whack.cl"
cl-user(3): (setf (logical-pathname-translations "frob")
`((";foo;**;*.*" "bar/**/*.*")))
((";foo;**;*.*" "bar/**/*.*"))
cl-user(4): (merge-pathnames "frob:;foo;whack.cl" "/blam/")
#p"/blam/bar/whack.cl"
cl-user(5):
2. When pathname is an absolute physical pathname and defaults is a logical pathname, pathname is returned without further consideration, thus preventing obviously nonsensical merging such as:
(merge-pathnames "/foo/bar/baz.fasl" "sys:;wham;.fasl")
RETURNING #p"sys:foo;bar;baz.fasl"
That result would (in all probability) not make sense, especially if sys: has a translation into an absolute pathname, since rarely, if ever, would the concatenation of two absolute pathnames make any sense.
The ANSI specification says "merge-pathnames returns a logical pathname if and only if its first argument is a logical pathname, or its first argument is a logical pathname namestring with an explicit host, or its first argument does not specify a host and the default-pathname is a logical pathname." In Allegro CL, ANSI compliance is modified by 1. above. That is, in the case of a relative logical pathname (or logical namestring) and an absolute default, Allegro CL will return a physical not a logical pathname.
parse-namestring: a directory sub-component of "**" is parsed as :wild-inferiors and "*" as :wild.
There is a translation for the logical host "sys" which is defined as follows:
(setf (logical-pathname-translations "sys")
(list
(list "**;*."
(namestring <location of exe file>))))
The <location of exe file> is the location of the executable being run. With the initial installation, the executable is named mlisp.exe, alisp.exe, or allegro.exe on Windows and mlisp or alisp on Unix, or allegro.exe on platforms that support the IDE. You may have made copies of the executable with different names for general use or as part of a runtime distribution.
Implementation notes for load-logical-pathname-translations: see the description of this function below.
Implementation notes for translate-logical-pathname: if this function is called on a logical pathname for which no translation exists, it will in Allegro CL try calling load-logical-pathname-translations for the host rather than signaling an error immediately.
Implementation notes for translate-pathname and pathname-match-p: In addition to the meanings of * and ** as entire pathname words, standing for :wild and :wild-inferiors, Allegro CL also allows * within a single pathname word, with the normal UNIX globbing interpretation. These capabilities also apply to translate-logical-pathname, which calls these functions. On the right hand side of a translation rule, a * indicates where to substitute the wildcard-matched text. Only one * is allowed within each word on the right hand side of a rule.
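For reference, here is a minimal translation rule that uses only whole-word wildcards, which is the portable case that the within-word * extends. The host name expert is reused from the examples above; the target directory /usr/local/expert/ and the helper name expert-file are hypothetical.

```lisp
;; Define a single rule for the logical host "expert".  On the
;; left, ** matches any number of directory levels and *.* any
;; name and type; on the right, the same wildcards mark where the
;; matched text is substituted.
(setf (logical-pathname-translations "expert")
      '(("**;*.*" "/usr/local/expert/**/*.*")))

(defun expert-file (logical-namestring)
  "Translate LOGICAL-NAMESTRING to a physical pathname using the
translations currently defined for its host."
  (translate-logical-pathname logical-namestring))

;; Example call:
;; (expert-file "expert:engine;steam-power.lisp")
```

Under this rule the directory words matched by ** are carried over beneath the target directory, and the name and type are carried over unchanged apart from any case conversion the implementation applies.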
See also the section Windows devices for a discussion of why (pathname "c:foo.cl") returns a regular pathname on Windows (unless "c" has been defined as a logical host) but a logical pathname on Unix.
load-logical-pathname-translations
Arguments: host
Certain details of this standard Common Lisp function are explicitly implementation dependent. In all implementations, host must be a string naming a logical-pathname host and this function returns nil if the logical-pathname host named host is already defined. Otherwise, it searches for a logical pathname host definition as follows (this is the implementation-dependent part):
The function logical-pathname-translations-database-pathnames is called. It returns a list of strings or pathnames which can be coerced to pathnames naming files.
These files are examined in order (if a file does not exist, it is skipped) until a translation is found. The format of these files is:
[host (from-wildcard to-wildcard) ]*
where host is a string naming a logical host (e.g., "src") and from-wildcard and to-wildcard specify the translation (i.e., are the source and target), such as
(list "**;*.*" "sys:**;*.*")
The list of from-wildcard and to-wildcard is evaluated.
For example, logical-pathname-translations-database-pathnames initially returns the list ("sys:hosts.cl"), which tells the system to look at the file hosts.cl in the Allegro directory (usually the directory containing the executable).
Other files besides hosts.cl can be searched for logical pathname information. The function logical-pathname-translations-database-pathnames returns a list of strings naming files that will be searched for logical-pathname information. It initially returns ("sys:hosts.cl"). Additional strings can be added with pushnew and friends, like this:
(pushnew "~/myhosts"
(excl:logical-pathname-translations-database-pathnames))
logical-pathname-translations-database-pathnames will now return
("~/myhosts" "sys:hosts.cl")
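As an illustration of the file format described above, a file such as ~/myhosts might contain entries like the following. This is a hypothetical file: the two entries simply reuse the host names and translation lists shown elsewhere in this document, and each host string is followed by a form that is evaluated to produce the from-wildcard and to-wildcard pair.

```lisp
;; Hypothetical contents of ~/myhosts.
;; Each entry: a host string, then a form evaluating to a list of
;; from-wildcard and to-wildcard.
"src" (list "**;*.*" "sys:**;*.*")
"r"   '(";**;*.*" #p"sys:;code;")
```

Because this file is listed before sys:hosts.cl, and files are examined in order until a translation is found, a host defined here takes precedence over a definition of the same host in hosts.cl.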
The ANSI Specification nowhere defines how a logical host can be undefined, or whether a logical host may have an empty set of translations. However, Allegro CL interprets an empty set of translations as equivalent to the logical host not being defined at all. Therefore, a logical host may be undefined by setting its logical-pathname-translations to nil. Note that this behavior may be different from that of other Common Lisp implementations.
If you have set logical-pathname-translations to nil for a host defined in sys:hosts.cl and then call translate-logical-pathname on a logical pathname using that host, the system will, as usual, look in sys:hosts.cl to find translation rules, and, finding them, will use them.
All logical-pathname translations are flushed when an image dumped (by dumplisp) is restarted.
Pathnames can have wildcard components, which causes all files that match the remainder of the pathname to match. See directory, pathname-match-p, and wild-pathname-p. The newly added function glob also supports wildcards. This section describes wildcards in pathnames and how they are processed by Allegro CL.
The definition of wild is:
wild: adj. 1. (of a namestring) using an implementation-defined syntax for naming files, which might "match" any of possibly several possible filenames, and which can therefore be used to refer to the aggregate of the files named by those filenames. 2. (of a pathname) a structured representation of a name which might "match" any of possibly several pathnames, and which can therefore be used to refer to the aggregate of the files named by those pathnames. The set of wild pathnames includes, but is not restricted to, pathnames that have a :wild component, or that have a :wild or :wild-inferiors directory component. See the function wild-pathname-p.
That implementation-defined syntax in Allegro CL is this:
A * or ? can appear in the filename portion of a pathname or path namestring. * will match 0 or more characters and ? will match a single character. For example, the wild pathname #p"foo?.cl" will match #p"foo1.cl" and #p"foo2.cl", but not #p"foo12.cl".
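The matching rule for * and ? within a filename can be sketched on plain strings. This is an illustrative matcher only, not the code Allegro CL uses in pathname-match-p or directory; the function name glob-match-p is hypothetical.

```lisp
(defun glob-match-p (pattern string &optional (p 0) (s 0))
  "True if STRING matches PATTERN, where * in PATTERN matches zero
or more characters and ? matches exactly one character.  P and S
are the current positions in PATTERN and STRING."
  (cond ((= p (length pattern))
         ;; Pattern exhausted: succeed only if STRING is too.
         (= s (length string)))
        ((char= (char pattern p) #\*)
         ;; Either * matches nothing, or it absorbs one character
         ;; of STRING and we try again with the same *.
         (or (glob-match-p pattern string (1+ p) s)
             (and (< s (length string))
                  (glob-match-p pattern string p (1+ s)))))
        ((= s (length string))
         nil)                           ; pattern left, string empty
        ((or (char= (char pattern p) #\?)
             (char= (char pattern p) (char string s)))
         (glob-match-p pattern string (1+ p) (1+ s)))
        (t nil)))
```

Applied to the example in the text, the pattern "foo?.cl" matches "foo1.cl" and "foo2.cl" but not "foo12.cl", because ? consumes exactly one character.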
The above, in addition to the following two definitions complete the description of what the
A pathname with a name and/or type component of :wild. When a string is parsed as a pathname, a name or type of "*" is converted to :wild. That is:
cl-user(6): (describe (pathname "*.*"))
#p"*.*" is a structure of type pathname. It has these slots:
host nil
device :unspecific
directory nil
name :wild
type :wild
version :unspecific
namestring "*.*"
hash nil
dir-namestring "./"
plist nil
cl-user(7):
A pathname with a directory component containing an element of :wild or :wild-inferiors. When a string is parsed as a pathname, directory elements of "*" and "**" are converted to :wild and :wild-inferiors, respectively. For example:
cl-user(9): (describe (pathname "a/b/*/foo.cl"))
#p"a/b/*/foo.cl" is a structure of type pathname. It has these slots:
host nil
device :unspecific
directory (:relative "a" "b" :wild)
name "foo"
type "cl"
version :unspecific
namestring "a/b/*/foo.cl"
hash nil
dir-namestring "a/b/*/"
plist nil
cl-user(10): (describe (pathname "a/b/**/foo.cl"))
#p"a/b/**/foo.cl" is a structure of type pathname. It has these slots:
host nil
device :unspecific
directory (:relative "a" "b" :wild-inferiors)
name "foo"
type "cl"
version :unspecific
namestring "a/b/**/foo.cl"
hash nil
dir-namestring "a/b/**/"
plist nil
cl-user(11):
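Rather than reading the describe output, the standard function wild-pathname-p can report which components of a pathname are wild. The helper name wild-fields below is hypothetical; the calls it makes are standard Common Lisp.

```lisp
(defun wild-fields (namestring)
  "Return the list of pathname fields that are wild in the pathname
parsed from NAMESTRING."
  (let ((p (pathname namestring)))
    (remove-if-not (lambda (field) (wild-pathname-p p field))
                   '(:host :device :directory :name :type :version))))

;; Example calls:
;; (wild-fields "a/b/*/foo.cl")
;; (wild-fields "*.*")
```

For the pathnames described above, the first call should report only the directory as wild and the second the name and type.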
Copyright (c) Franz Inc. Lafayette, CA., USA. All rights reserved.