Q: I’ve been told my file has illegal characters. What is that?

A: Every operating system recognizes some set of characters. In the earlier days of computing that set was ASCII (American Standard Code for Information Interchange). It consists of 128 characters, including A-Z, a-z, 0-9, special characters such as !, and control characters such as carriage return.

And then someone realized that some places on the planet didn't use the Latin alphabet. This led to the development of Unicode. At the time of writing, Unicode includes 137,439 characters (including the original ASCII set), covering 146 modern and historic scripts (writing systems, which most would loosely call languages).

And hidden within these 137,439 characters can be found the illegal characters. An illegal character is one which the operating system has reserved for its own use, and should not be used when naming a file, folder, or application.

For example, in macOS, the period is an illegal character when used as the first character in a name. In fact, the Finder won't allow you to give a file a name starting with a period. Oh, sure, you can work around that. But then the fun starts. If you discover the way to do this, the file, folder, or application disappears! This is because a period as the first character tells the system to make the item invisible. (To be fully accurate, the period isn't an illegal character; it is a reserved character. But it makes the point.)

In the case of Windows, the convention is a single period just before the extension, such as document.txt. Additional periods are technically allowed (except at the very end of a name), but they can cause grief with some software. The best example I have seen is OneDrive, where an improperly named item cannot be synchronized between server and client.

The bad news is that each operating system (and often each version of a system) has its own subset of illegal characters.

Listed below are the illegal characters for various systems:

macOS

  • Illegal characters: colon ":"
  • File, folder, and application names are not permitted to begin with a period ".".
  • Name length: may not exceed 255 characters.
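The three macOS rules above are simple enough to check in a few lines. Here is a minimal Python sketch; the function name macos_name_problems is my own invention, and it only covers the rules listed here, not everything macOS may enforce.

```python
def macos_name_problems(name: str) -> list[str]:
    """Return which of the macOS rules listed above `name` breaks."""
    problems = []
    if ":" in name:
        problems.append("contains a colon")
    if name.startswith("."):
        # Allowed by the file system, but the item becomes invisible in the Finder.
        problems.append("begins with a period")
    if len(name) > 255:
        problems.append("longer than 255 characters")
    return problems


if __name__ == "__main__":
    for candidate in ("report.txt", ".hidden:file"):
        print(repr(candidate), macos_name_problems(candidate))
```

An empty list means the name passed all three checks.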

Windows

  • NTFS illegal characters: \ | / ? < > : * " and any control character (ASCII codes 0 through 31, the ones you type with the Ctrl key).
  • FAT illegal characters: \ | / ? < > : * " ^ and any control character (ASCII codes 0 through 31, the ones you type with the Ctrl key).
  • NTFS name length: may not exceed 255 characters.
  • FAT name length: may not exceed 255 characters.
  • Path length: may not exceed 260 characters.
  • Names may not end with a space or period. I've seen problems when the first character is a space, although that is not technically prohibited.
  • Reserved names (in any letter case, with or without an extension): com1, com2, com3, com4, com5, com6, com7, com8, com9, lpt1, lpt2, lpt3, lpt4, lpt5, lpt6, lpt7, lpt8, lpt9, aux, con, nul, and prn
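If you would rather let the computer do the checking, the Windows rules above can be sketched in Python roughly as follows. This is an illustration, not an official validator: the names WINDOWS_ILLEGAL, WINDOWS_RESERVED, and windows_name_problems are made up for this example. It treats backslash as illegal and aux as reserved, both of which Windows documents, and it flags a reserved name even when an extension follows (so nul.txt is caught too). It checks a single name only, not the 260-character path limit.

```python
# Characters Windows does not allow in a file or folder name.
WINDOWS_ILLEGAL = set('|/?<>:*"\\')

# Device names Windows reserves for itself (compared in lowercase).
WINDOWS_RESERVED = {"con", "nul", "prn", "aux"} | {
    f"{device}{number}" for device in ("com", "lpt") for number in range(1, 10)
}


def windows_name_problems(name: str) -> list[str]:
    """Return which of the Windows rules listed above `name` breaks."""
    problems = []
    bad = sorted({c for c in name if c in WINDOWS_ILLEGAL})
    if bad:
        problems.append("illegal characters: " + " ".join(bad))
    if any(ord(c) < 32 for c in name):
        problems.append("contains a control character")
    if name.endswith((" ", ".")):
        problems.append("ends with a space or period")
    # The part before the first period is what Windows matches against
    # its reserved device names.
    if name.split(".")[0].lower() in WINDOWS_RESERVED:
        problems.append("uses a reserved name")
    if len(name) > 255:
        problems.append("longer than 255 characters")
    return problems


if __name__ == "__main__":
    for candidate in ("report.txt", "NUL.txt", "a:b?", "notes."):
        print(repr(candidate), windows_name_problems(candidate))
```

As before, an empty list means the name passed every check in the sketch.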

Windows notes:

  • Although the Windows file system may support most of these conventions, the operating system may not. For example, NTFS allows paths up to 32,767 characters, with each component (folder, file, etc.) limited to 255 characters. However, some Windows applications (Explorer, for example) may not behave correctly in this circumstance.

What does this mean for me?

If your files never leave your computer, sorry. I just wasted two minutes of your time with the above information. But if your files may find themselves on another computer, be it server or client, it is wise to keep illegal characters in mind. A file name that works on one platform may not work on another.

The easiest way to ensure you use only legal characters is to stick with A-Z, a-z, 0-9, underscore, spaces only within the name (not at the beginning or end), and a single period just before the file extension.
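That rule of thumb translates neatly into a regular expression. The Python sketch below is one way to do it, under my own assumptions: the names PORTABLE, is_portable, and make_portable are hypothetical, replacing unsafe characters with an underscore is just one possible policy, and the extension is restricted to letters and digits. It does not check for the Windows reserved names.

```python
import re

# Letters, digits, underscore; spaces only inside the name;
# at most one period, and only just before the extension.
PORTABLE = re.compile(
    r"[A-Za-z0-9_](?:[A-Za-z0-9_ ]*[A-Za-z0-9_])?(?:\.[A-Za-z0-9]+)?"
)


def is_portable(name: str) -> bool:
    """True if `name` follows the stick-to-the-basics rule above."""
    return PORTABLE.fullmatch(name) is not None


def make_portable(name: str) -> str:
    """Rewrite `name` so that it passes is_portable (one possible policy)."""
    stem, dot, ext = name.rpartition(".")
    if not dot or not stem:
        # No period at all, or only a leading period: treat as no extension.
        stem, ext = name, ""
    stem = re.sub(r"[^A-Za-z0-9_ ]", "_", stem).strip(" ")
    ext = re.sub(r"[^A-Za-z0-9]", "", ext)
    stem = stem or "_"
    return f"{stem}.{ext}" if ext else stem


if __name__ == "__main__":
    for candidate in ("document.txt", " report.txt", "my.report: final?.txt"):
        print(repr(candidate), is_portable(candidate), repr(make_portable(candidate)))
```

Be aware that two different originals can end up with the same cleaned-up name, so check for collisions before renaming a whole folder.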

Don't have a life? Every detail about illegal characters for the major operating systems can be found at https://en.wikipedia.org/wiki/Filename.
