sanename is a set of rules for naming software.
First the format for package names is defined; translation rules are then defined for representing a name in 4 other common formatting conventions.
The package name is the name by which package managers refer to a library, i.e. the name that apt-get install or npm install uses.
Sanename naming restrictions can also be used for other software elements such as table names, file names and hash keys, allowing early, strict validation and easy communication about the naming rules.
The aim of a sanename is to produce a proper noun that uniquely identifies the package and is descriptive and concise. Package names should be unique in the realm in which they live: Node.js packages should be unique within npm. An attempt to avoid overloaded names from other areas is also recommended.
Simple names are good. Proper nouns fit naturally into natural-language conventions, for example fluent APIs. Avoid single English words, since they lead to confusion when a convention prevents capitalization from marking the proper noun.
Concatenating words to make a proper noun is a simple way of creating descriptive, unique names that are easy to search for: jsonbinder, logingreeter, logencrypter. If a project is assigned a composite name like this, it should be treated as a proper noun and always written the same way, e.g. sanename and sane-name may not be grouped together in text searches. Remaining consistent is an important part of this naming guide. The nodejs project styles its name Node.js, which puts a full stop in the wrong place in sentences; this complicates delimiting sentences that contain the project name in code, e.g. when automatically generating summaries from a larger body of text.
Concatenating more than two words starts to get difficult to read. Longer names should be divided into words: node-jsonbinder-utils. If a package is a sub-project, respect the original sanename, i.e. do not create a package node-json-binder-utils if it is based on or part of jsonbinder. Concatenating whole phrases should be avoided; only concatenate words to create proper nouns, e.g. prefer antinstaller-build-tool over antinstaller-buildtool.
This part of the rules resolves issues with mapping to CamelCase. In the sanename spec mp3 is a word, as are nsa, gchq and ibm. This makes little difference in lower-case package names such as ibm-nsa-gchq-tools; however, when transformed to CamelCase this name becomes IbmNsaGchqTools. This looks a bit odd at first sight because you know that IBM is a set of initials. It is better than IBMNSAGCHQTools because the transformation can be applied in both directions: code can determine a consistent package sanename from the class name and vice versa.
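The two-way mapping can be sketched in a few lines of Python. This is only an illustration, not part of the spec: the function names to_camel_case and from_camel_case are hypothetical, and the input is assumed to be a valid lower-case, hyphen-separated sanename (or its CamelCase form).

```python
import re


def to_camel_case(sanename: str) -> str:
    """Upper-case the first letter of each hyphen-separated word and join them."""
    return "".join(word[:1].upper() + word[1:] for word in sanename.split("-"))


def from_camel_case(class_name: str) -> str:
    """Start a new word at every upper-case letter, then lower-case everything."""
    words = re.findall(r"[A-Z][a-z0-9]*", class_name)
    return "-".join(word.lower() for word in words)


print(to_camel_case("ibm-nsa-gchq-tools"))   # IbmNsaGchqTools
print(from_camel_case("IbmNsaGchqTools"))    # ibm-nsa-gchq-tools
print(from_camel_case("IBMNSAGCHQTools"))    # i-b-m-n-s-a-g-c-h-q-tools
```

The last call shows why initials are treated as ordinary words: once every letter is upper case, the original word boundaries can no longer be recovered.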
Most code is written in US English, and it is the de facto standard. Check your spelling in an online US dictionary if there is any doubt. Blatantly incorrect spelling is encouraged since it helps generate uniqueness: tumblr, speling. Single nouns from any other language make good sanenames: ubuntu, paris, simba, equisemel. Avoid non-English words in the descriptive parts of a name, e.g. xml-parseador.
No upper-case letters are to be used: mixed case complicates translation to and from constants and camel case. Lower case is faster to type. Debian does it. Marketing departments may not like this rule since case style can be part of a trade mark; notwithstanding, the rule applies. A package name with an upper-case letter is not a sanename. Whitespace is used by almost all languages to delimit tokens, so excluding it should be obvious. While underscore is permitted in almost all languages, it is excluded from sanename. N.B. unlike in Debian package names, + is not permitted. Limiting sanenames to simple Latin characters simplifies validation and removes the need for any escaping in almost all string representations; + would create issues in URIs.
This rule is borrowed from C rules that have found their way into most other languages. It is not strictly necessary but, by enforcing it in sanename, there are more use cases where no translation of the project name is required, most importantly variable names in code, where leading digits denote numbers. Complete rewrites of a code base are often given a new package name; this can be represented by attaching a digit as a suffix, e.g. apache2, junit4. There are technical reasons projects are repackaged like this, primarily so that both versions can co-exist at runtime. sanename specifies that such suffixes sit inside the word boundary, i.e. apache-2 is invalid, and this is compatible with semver. Conceptually apache2 is a separate proper noun and a separate entity from apache. Since dots are forbidden, the temptation to name a project with a semver number is averted: web2.0 is not a valid sanename, use web2.
Avoid 3 letter names: they are more likely to clash and likely to result in false positives in searches. Ultimately pathnames will hit OS limits; the limit is 255 in DNS names and Windows, and it can be hit quite easily. Package names are often used as part of paths, e.g. repository URLs, so it is important to keep the name short. While very short aliases can be convenient to type, e.g. apache-bench has the command ab, ab is an invalid sanename.
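Because the character rules are so restrictive, early validation fits in a single regular expression. The following Python sketch is one possible reading of the rules, not a reference implementation: the name is_sanename is hypothetical, the grammar (lower-case words that start with a letter, joined by single hyphens) is inferred from the examples in this text, and the minimum length of 3 is an assumption based only on ab being invalid. No maximum length is enforced.

```python
import re

# Assumed grammar: one or more words joined by single hyphens. Each word starts
# with a lower-case Latin letter and continues with lower-case letters or digits.
SANENAME_PATTERN = re.compile(r"[a-z][a-z0-9]*(?:-[a-z][a-z0-9]*)*")

# Assumption: "ab" is rejected, so require at least three characters.
MIN_LENGTH = 3


def is_sanename(name: str) -> bool:
    """Return True if the whole name matches the assumed grammar and length."""
    return len(name) >= MIN_LENGTH and SANENAME_PATTERN.fullmatch(name) is not None


for candidate in ["apache2", "apache-2", "web2.0", "MyProject", "my_project", "ab"]:
    print(candidate, is_sanename(candidate))
```

Only apache2 passes in this run; the others fail on a digit-led word, a dot, upper case, an underscore and length respectively.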
Try to avoid formatting conventions such as MY_PROJECT and MyProject in package names; these have explicit meanings in some programming languages, typically constant and class respectively.
The four formatting conventions considered are as follows.
sanename guidelines make it possible to transform across all of the above use cases deterministically. That is to say, given a sanename in any one of these formats, it is possible to determine exactly the name that will be used in the other formats. One of these forms is generally compatible with existing naming conventions; however, there are some which sanename does not cater for, for example the convention for Linux libraries: "lib" plus the library name in all lower case without any word boundaries.
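As a small illustration of such a deterministic transformation, the Python sketch below converts between a package name and a constant-style name. It assumes the constant form is the MY_PROJECT style mentioned earlier (upper case with underscores as word boundaries); the function names to_constant and from_constant are hypothetical.

```python
def to_constant(sanename: str) -> str:
    """my-project -> MY_PROJECT (assumed constant convention)."""
    return sanename.replace("-", "_").upper()


def from_constant(constant: str) -> str:
    """MY_PROJECT -> my-project; the exact inverse of to_constant."""
    return constant.replace("_", "-").lower()


for name in ["node-jsonbinder-utils", "apache2", "ibm-nsa-gchq-tools"]:
    constant = to_constant(name)
    # The round trip always returns the original package name.
    print(name, constant, from_constant(constant) == name)
```

The round trip only works because sanenames contain no underscores and no upper-case letters, so the underscore and the case are free to carry the formatting information.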
A great many software projects already use sanenames.
It is common to concatenate 2 words to create a proper noun in English and other languages; the noun can therefore be descriptive.
github
sourceforge
nginx
apache
semver
Acronyms, backronyms, nacronyms and random collections of letters can be used as a word to great effect, e.g.
irc - internet relay chat.
twain - technology without an interesting name.
npm - which does not stand for node package manager.
java - just another vague acronym.
Copyright 2105 Paul Hinds. Permission is granted to copy this text freely.