Re: [ATM] How to make clean(er) html for web pages in MS Word
On Mon, 3 Apr 2006, Mark Holm wrote:
> Different computer operating systems have different rules about what is
> legal and what is not in file names. Also, there are rules about what
> is legal in Internet url's. The safest and easiest way is to use only
> Ascii letters and numbers, that is A-Z, a-z and 0-9 (and the underscore
> if you want.) Use the period (full stop) only between a file's name and
> its extension as in thisisawebpage.html
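The underscore character, '_', is not allowed in the host name part of
a URL, though it is fine in the path and filename. This comes from the
rules for Internet host names (letters, digits and hyphens only), I believe.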
http://foo.bar_com.net/ not legal
http://foo.bar-com.net/foo_files_for_me.html legal
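A rough way to check the host part of a URL, as a sketch only. It assumes the usual letters-digits-hyphen rule for host name labels; the name host_is_legal is made up for this example and is not part of any library.

```python
import re
from urllib.parse import urlsplit

# One host name label: letters, digits and hyphens, 1 to 63 characters,
# with no hyphen at either end. (Assumed rule, see the lead-in.)
LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")

def host_is_legal(url):
    """Return True if every dot-separated label of the URL's host passes LABEL."""
    host = urlsplit(url).hostname
    if not host:
        return False
    labels = host.rstrip(".").split(".")
    return all(LABEL.fullmatch(label) is not None for label in labels)

print(host_is_legal("http://foo.bar_com.net/"))                       # False
print(host_is_legal("http://foo.bar-com.net/foo_files_for_me.html"))  # True
```

Only the host is examined; the underscores in the path of the second URL are never looked at.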
> A convention, though not a rule, is to use only the lower case Ascii
> letters in url's It makes them easier to remember if you don't have to
> remember capitalization. (Also, Windows does not always distinguish
> upper and lowercase letters in file names, while *nix systems usually do.)
In the host name part, case is leveled; one may use whatever case
one wishes, but it will be leveled. In the path and filename part, local
system rules prevail. (Local = the server).
http://FooBar.com/ , http://FOOBAR.COM/ and http://foobar.com/
refer to the same site. This is a feature not of URLs as much as it
is a feature of the DNS, which is case insensitive.
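A small illustration, assuming Python's standard urllib.parse is an acceptable stand-in for what a browser does: its hostname attribute folds the host to lower case, while the path is left exactly as typed. The helper name same_site is hypothetical.

```python
from urllib.parse import urlsplit

def same_site(a, b):
    """Compare two URLs by host only; .hostname is already lower-cased."""
    return urlsplit(a).hostname == urlsplit(b).hostname

print(same_site("http://FooBar.com/", "http://FOOBAR.COM/"))   # True
print(same_site("http://FooBar.com/", "http://foobar.com/"))   # True

# The path keeps its case; whether /Index.HTML and /index.html are the
# same file is up to the server's file system.
print(urlsplit("http://FooBar.com/Index.HTML").path)            # /Index.HTML
```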
Unix file systems always recognize and distinguish case, except for
a pathological special circumstance that I'll not mention here. Foreign
file systems hosted by a Unix system will follow the rules of the
foreign operating system, such as VMS (cases levelled to uppercase)
or MS (I don't know or care).
Unix file systems do not have the concept of "extension", and can
contain some pretty strange characters, including as many periods
as one wishes. The concept of "extension" is a gross design error,
dating from the 1960s (PDP-8?), I believe, but perpetuated in the
toy systems introduced by MS taken from CP/M.
/home/foo/this.is.my.file legal
/home/foo/this is my file legal, but most shells will want the blanks
escaped.
/home/foo/this%is)))my[[[file$name legal, but shells may misinterpret.
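If such names have to be handed to a shell from a script, one hedged approach is to let a library do the escaping. This sketch uses Python's shlex.quote, which targets POSIX-style shells; the list of names is just the three examples above.

```python
import shlex

names = [
    "/home/foo/this.is.my.file",
    "/home/foo/this is my file",
    "/home/foo/this%is)))my[[[file$name",
]

for name in names:
    # quote() returns the name unchanged when it is already safe,
    # otherwise it wraps it in single quotes.
    quoted = shlex.quote(name)
    # split() parses the quoted form the way a POSIX shell would,
    # so we should get the original name back as a single word.
    assert shlex.split(quoted) == [name]
    print(quoted)
```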
Even backspace and newline characters are valid in a unix filename.
Many fun tricks are based on this "feature".
I believe the only forbidden characters of the 256 possible in a
Unix ffs filename are '/' and the NUL byte. The file '/' is a special file, the root
of the filesystem. The names of directories, devices, pipes, sockets
and so on are not special or reserved.
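For anyone who wants to try this, here is a minimal sketch, assuming a Unix-like system and a writable temporary directory. The helper can_create is invented for the example. It also tries a NUL byte, which Python itself refuses to pass to the system.

```python
import os
import tempfile

def can_create(directory, name):
    """Try to create an empty file called `name` inside `directory`."""
    try:
        with open(os.path.join(directory, name), "w"):
            pass
    except (OSError, ValueError):
        # OSError: the system refused. A '/' in the name is read as a
        # directory separator, so "a/b" fails because "a" does not exist.
        # ValueError: Python rejects an embedded NUL byte.
        return False
    return True

with tempfile.TemporaryDirectory() as d:
    for name in ["this.is.my.file", "two\nlines", "back\bspace", "a/b", "nul\0byte"]:
        print(repr(name), can_create(d, name))
```

On a typical Unix file system the first three names should be created and the last two refused; other file systems may behave differently.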
It is very mysterious why Bill Gates elected to use '\' as a pathname
delimiter; possibly it was to sabotage things, or to just be stupid.
Early versions of MS-DOS accepted '/' as well as '\', or one could
set the delimiter character arbitrarily. '\' is used as an escape
character in proper shells under adult operating systems. It makes
it cumbersome to write source code that will operate under both
Unix and MS.
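One way around the delimiter problem in source code, sketched here with Python's standard pathlib, is to build paths from their components and never type the delimiter at all. The "pure" path classes only manipulate strings, so both flavors can be shown on any machine.

```python
from pathlib import PurePosixPath, PureWindowsPath

parts = ("home", "foo", "this.is.my.file")

# Same components, delimiter chosen by the path flavor.
print(PurePosixPath(*parts))     # home/foo/this.is.my.file
print(PureWindowsPath(*parts))   # home\foo\this.is.my.file

# The Windows flavor accepts either delimiter on input.
print(PureWindowsPath("home/foo/x") == PureWindowsPath("home\\foo\\x"))  # True
```

In real code one would normally use pathlib.Path, which picks the flavor of the system it is running on.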
The maximum length of filenames varies, too.
Dave
--
The law has converted plunder into a right and lawful defense
into a crime. -- Frederic Bastiat, 1850
_______________________________________________
ATM mailing list http://www.atmlist.net/