Numbering Starts with Zero


When developing programs for computers, it often is advisable to start numbering with the number zero, not with the number one. This article is intended to give some reasons and hints in this regard.

Three Kinds of Numbers

There are three kinds of non-negative integral number usage:

Quantities 0, 1, 2, …: Quantities (cardinal numbers) are used to give the quantity (a count) of items. They always include zero, because zero is a possible quantity.

Position Words 1st, 2nd, 3rd, …: Position words give the position of an item within a sequence with a beginning. The item at the beginning of the sequence is called the “first” item, the next is called the “second”, and so on. The meaning of “first” is given by the English language, which also dictates that there is no “zeroth” item. “First” is also written as “1st”, “second” as “2nd”, and so on.

Identification Numbers 0, 1, 2, … or 1, 2, 3, … or any other schema: Identification numbers are just a methodical way to give numerical names to the items of a list (to “number” them). While any imaginable schema could be invented (like A, B, C, …), the most common schemas start with 0 (0, 1, 2, …) or with 1 (1, 2, 3, …). The whole debate is not about whether quantities should include 0 or whether position words should start at “1st”. It is only about whether identification numbers should start at 0 (like quantities) or at 1 (like position words).

Names and Descriptions

Even if the first entry of a sequence is called “entry 0”, it is an error of language to refer to the “zeroth” thing of a sequence in any context. By its meaning in the English language, the ordinal-number word “the first (thing)” always refers to the first entry within a sequence, even if this entry is called “zero”. Thus, one might say, for example, “Zero is the first natural number.” There cannot be anything in front of the “first thing”; therefore, there is no “zeroth” thing. Whether the first house of a street is called “house 0” (“house zero”), “house 1”, or “house 2”, it always remains the first house of that street. One might rightly call the hour beginning at midnight “hour 0”, but it is the first hour of the new day, not the “zeroth”.

So, whoever wants to always start counting with zero still must not speak of a “zeroth” entry of a sequence. “Zero” is the name that we give to a certain number; “the first natural number” is a description of that number. A name can be assigned arbitrarily, while a description must use words with the meaning given to them by the English language.

Everything Starts With Zero

Normally, when programming, it is important to allow zero when a number is to be designated.

One might think of a program to allow indentation of a line: The indentation specifies by how many positions a line is shifted to the right.

Indentation

12345678 
  text   -- indented by two character widths
 text    -- indented by one character width
text     -- not indented (zero character widths)

When the first position of a line is named “1”, the first non-blank character appears at position “1” when indenting by 0 character widths. When indenting by 1 character width, it appears at position “2”, and when indenting by 2 character widths, it appears at position “3”. To get the position of the first character from its indentation, one always has to add 1. The first non-blank character of a line indented by i characters is located at position i + 1 when this numbering schema is used.

However, once the positions are named beginning with the number zero, this addition is no longer needed.

Indentation

01234567 
  text   -- indented by two character widths
 text    -- indented by one character width
text     -- not indented (zero character widths)

Now, the first non-blank character of a line appears at position i if the line was indented by i character widths.
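The relation between indentation and position can be sketched in C, whose strings are indexed from zero. The function name `indentation` is only an illustrative assumption, and the sketch treats only the space character as a blank.

```c
#include <stddef.h>

/* Counts the leading spaces of a line.
   Because positions are numbered from zero, the returned count is at the
   same time the position of the first non-blank character (or of the
   terminating '\0' if the line consists of spaces only). */
size_t indentation(const char *line)
{
    size_t i = 0;
    while (line[i] == ' ')
        ++i;
    return i;
}
```

With positions starting at 1, the same function would have to return one value for the indentation and that value plus 1 for the position.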

This simplification in relating a displacement to the position resulting from it can also be found in many other cases. It is one of the reasons to start numbering with the number zero.

Although adding or subtracting the value 1 does not seem to be a great effort, it is a frequent source of errors when it is forgotten or applied in the wrong place. Leaving out this step thus eliminates one source of the common off-by-one errors.

If numbering is started with zero, every number correctly gives its difference to the first number.

If one counts “1, 2, 3”, the difference of the number “3” to the first number “1” is 2, so it is not the number 3 itself. If one counts “0, 1, 2”, the difference of the number “2” to the first number “0” is 2, so it is the number itself. This property simplifies a lot when counting is started with zero.

If one times 10 swings of a pendulum, starts the stopwatch at a zero-crossing of the amplitude, and begins counting at one at that moment, one has to stop the stopwatch when counting “eleven”, because when one counts “n”, there have been n − 1 oscillations since counting started. If one starts counting at zero, every number gives the quantity of intervals that have occurred before it. So, in this case, one counts up to “ten” in order to count ten swings, and one has again eliminated a possible cause of an off-by-one error.

If numbering is started with zero, every number gives the quantity of the unit intervals counted so far.

That is, the number of these unit intervals is just the difference of the number to zero.

Positions on Chess Boards

Often two-dimensional schemas occur, such as in the case of a chess board or a spreadsheet.

Within the following rectangular schema with 3 rows and 3 columns, every cell is numbered using numbers from 1 to 9. Numbering was started with the number 1 everywhere.

A Rectangle

   1 2 3 < column

1  1 2 3 
2  4 5 6 
3  7 8 9 
^ 
row

The column c of a cell can be obtained from its identification number k by c = ((k − 1) mod 3) + 1, where "mod" is the modulo operation ("m mod n" is the remainder of the division of m by n).

Now, the same situation is depicted, just with the numbering starting at zero.

A Rectangle

   0 1 2 < column

0  0 1 2 
1  3 4 5 
2  6 7 8 
^ 
row

Now, the column c of a cell can be obtained from its identification number k as c = k mod 3. This time, two offsets of 1 have disappeared and, therefore, two possibilities for off-by-one errors.
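Both variants of the rectangle can be written down in C for comparison. This is only a sketch: the names `row0`, `col0`, `row1`, and `col1` are made up for illustration, and the row formulas are an addition that follows the same pattern as the column formulas given in the text.

```c
#define COLUMNS 3

/* Numbering from zero: cell k (0..8), rows and columns 0..2. */
int row0(int k) { return k / COLUMNS; }
int col0(int k) { return k % COLUMNS; }

/* Numbering from one: cell k (1..9), rows and columns 1..3.
   The argument has to be shifted down by 1 and the result up by 1. */
int row1(int k) { return (k - 1) / COLUMNS + 1; }
int col1(int k) { return (k - 1) % COLUMNS + 1; }
```

For example, cell 5 of the zero-based rectangle lies in row 1, column 2, and cell 6 of the one-based rectangle lies in row 2, column 3; both are the same square.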

Such schemas (with two or more dimensions) often appear in programming. But similar situations also appear when the identification number of a day is to be obtained from the date of a (Gregorian) calendar.

The Calendar

Calendar calculations often need the distance of a day from the start of the month as an intermediate value. The often-needed distance of a day from the start of the year is the distance of the first day of its month from the first day of the year plus the distance of the day from the first day of the month. The distance of a day from the first day of a month is the identification number of this day minus 1.

This extra subtraction of the value 1 would not be required for such calculations if the numbering of the days of a month started with zero.
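A minimal C sketch of this calculation, assuming a non-leap year for simplicity. The table and both function names are illustrative assumptions, not part of any library.

```c
#include <assert.h>

/* Distance of the first day of each month from January 1 in a non-leap
   year. The table itself is indexed from zero (index 0 is January). */
static const int days_before_month[12] =
    { 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334 };

/* Conventional numbering: month 1..12, day 1..31.
   Both values need a subtraction of 1 before they can be used. */
int days_since_new_year(int month, int day)
{
    assert(month >= 1 && month <= 12 && day >= 1);
    return days_before_month[month - 1] + (day - 1);
}

/* Hypothetical numbering from zero: month0 0..11, day0 0..30.
   No correction is needed. */
int days_since_new_year0(int month0, int day0)
{
    assert(month0 >= 0 && month0 < 12 && day0 >= 0);
    return days_before_month[month0] + day0;
}
```

For March 1, the conventional call is `days_since_new_year(3, 1)` and the zero-based call is `days_since_new_year0(2, 0)`; both give a distance of 59 days.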

The classic Gregorian calendar does not include a year zero; the first year is the year one. This seems to be a simplification, but in fact it provokes many errors, such as the opinion that the 20th century extends from 1900 until 1999. Since the first century, that is, the first hundred years, includes the years from 1 to 100, the 20th century includes the years from 1901 until 2000, and the third millennium started on January 1, 2001.
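The century calculation shows the same pair of corrections as before. The following C sketch uses made-up function names; the zero-based variant describes a hypothetical calendar in which both years and centuries are numbered from zero.

```c
/* Conventional numbering: years 1, 2, 3, ... and centuries 1, 2, 3, ...
   The year is shifted down by 1, the result is shifted up by 1. */
int century_of_year(int year)
{
    return (year - 1) / 100 + 1;
}

/* Hypothetical numbering from zero for both years and centuries. */
int century0_of_year0(int year0)
{
    return year0 / 100;
}
```

With the conventional numbering, the year 1900 yields 19, while 1901 and 2000 both yield 20, in agreement with the text. Forgetting either correction produces exactly the popular error described above.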

Counting From Zero is not Alien to Our Culture

The idea of naming the first day of the month “day zero” might seem strange at first sight.

This is not so ludicrous if one remembers that this schema is applied to the hours of a day. When one calls the speaking clock at midnight, one hears “At the third stroke it will be 0 hours, 0 minutes and 10 seconds.” The numbering of hours, minutes, and seconds really does start at zero and thus simplifies many calculations involving them. Thus, at 5:20 just 5·60+20 minutes, or 5·3600+20·60 seconds, have passed since midnight. An extra addition or subtraction of 1 is not necessary within those calculations.
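Because hours, minutes, and seconds are all numbered from zero, the time elapsed since midnight is a plain weighted sum (an hour has 3600 seconds, a minute has 60). A small C sketch with an illustrative function name:

```c
/* Seconds elapsed since midnight for a time of day given as
   hour 0..23, minute 0..59, second 0..59. No term needs a correction by 1. */
long seconds_since_midnight(int hour, int minute, int second)
{
    return hour * 3600L + minute * 60L + second;
}
```

For 5:20:00 this gives 5·3600 + 20·60 = 19200 seconds, and for the speaking-clock example 0:00:10 it gives 10.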

The hour zero is the hour between 0 and 1 o'clock: At every instant of this hour 0 hours (and several minutes) have passed since 0 o'clock. As the hour of all these 0-hours instants it bears its name for a perfectly good reason.

Players of wind instruments and sometimes guitar players use “0” for the thumb, “1” for the index finger, and so on. Thus, starting at zero is not even unusual when counting fingers within our culture.

For string instruments, “0” indicates a non-depressed (open) string.

In 2005, it was reported that, according to observations by Irene Pepperberg of Brandeis University in Waltham (USA), the African gray parrot Alex spontaneously invented the concept “zero” and applied it properly in counting tasks.

Report: http://www.alexfoundation.org/papers/JCPAlexComp.pdf; Journal of Comparative Psychology, vol. 119 (2), p. 197

Zero is the First Natural Number

To define the natural numbers, one usually refers to Peano's axiom system. According to these axioms, zero is a natural number: “Zero es numero” (Peano's axioms, from G. Peano, Formulario Mathematico, Fratelli Bocca (given on the title page as “Fratres Bocca”) Editore, Torino 1908, p. 21). There are, however, also sources claiming that Peano sometimes started with one, and it was not possible to find out whether zero was indeed the first natural number to him.

According to the German standard DIN 5473 as of June 1976 the set of natural numbers N contains the number zero.

Zero is Compatible with Range Designations

Intervals of natural numbers are often given by an inequality "m ≤ i < t". Therein, "m" is the minimum of the interval, "i" is a value from the interval, and "t" is the top of the interval. (The top is one more than the maximum.)

If the first relation were "<" instead of "≤", an interval containing zero could not be designated, because that would require a number smaller than zero in front of "<", and no such number exists among the natural numbers. (The same problem would occur at the lower end if 1 were chosen as the first natural number.) If the second relation were "≤" instead of "<", the empty interval beginning with the first number, written "0 ≤ i < 0" in the asymmetric form, could not be designated using natural numbers only, because its upper bound would have to be −1. (Some other empty interval could be written instead, but sometimes only the second number can be chosen freely.)

Asymmetric interval designations, such as "m ≤ i < t", have the pleasant property that the difference "t − m" between the two values gives the quantity of the values within the interval.

To write an interval containing n values, one might use the condition "1 ≤ i < 1+n" or "0 ≤ i < n". The second form is obviously simpler. In fact, programmers prefer such interval designations because they help to avoid off-by-one errors: the top "n" is exactly the number of values in the interval. Thus, in the programming language C, a loop "for( int i = 0; i < n; ++i );" will repeat n times. This would be different if the operator "<=" were used in place of "<", or if the variable "i" were initialized with the value "1".

When starting with "0", the value of "i" at the beginning of each loop cycle specifies how often the loop has been repeated so far. This assertion is still valid after the end of the loop (provided that "i" is declared before the loop, so that it is still in scope). This property simplifies the wording of so-called loop invariants. Starting the numbering at zero also allows the condition to be written as "i != n" (in place of "i < n"), which simplifies the proof that the assertion "i = n" is true after the end of the loop. This condition would have to be written as "i != n+1" if the counting had started at the number 1.
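The loop invariant can be spelled out in a small C function. This is a sketch under the assumption that n is not negative (otherwise the condition "i != n" would never become false in a meaningful way); the function name `sum_first` is illustrative.

```c
#include <assert.h>

/* Sums the first n components of the array a; n may be zero. */
long sum_first(const int *a, int n)
{
    long sum = 0;
    int i; /* declared before the loop so that it is still visible afterwards */

    assert(n >= 0);
    for (i = 0; i != n; ++i) {
        /* Invariant at this point: the loop body has been executed
           i times so far, and sum == a[0] + ... + a[i-1]. */
        sum += a[i];
    }

    /* The loop can only end when its condition is false, so i == n here,
       and the invariant says that sum covers all n components. */
    assert(i == n);
    return sum;
}
```

For n equal to zero the body is never executed and the function returns the empty sum 0, which is the interval "0 ≤ i < 0" from above in action.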

A Digression About Asymmetric Interval Designations

To the ancient Romans, an interval would always include both the first and the last value of its designation. Thus, the interval “from Monday to Monday” would include eight days, which is why in some languages even today “within eight days” means “within a week”. However, this convention led to off-by-one errors once in a while: The leap years, which were supposed to occur every four years, were specified in such a manner that, if the first leap year occurred in, say, year 1, the next would take place in year 4. Counting from year 1 to year 4, including both, gives four years. In fact, from 1-01-01 to 4-01-01 only three years have passed (since the fourth year is only just beginning). So the leap years were applied incorrectly (every three years instead of every four) for several decades, until the error was eventually noticed.

Zero is Compatible With Address Arithmetic

Numbers used as names are also called “addresses” (one might think of house numbers), and arithmetic involving such numbers is called “address arithmetic”. In programming languages like C, arrays of objects are indexed by such addresses. The identification number of an object within an array is called the index of this object; a number corresponding to an index is also called an “offset”. The properties described here hold not only for the programming language C but, more generally, for every low-level (machine) language and for many other languages. The following assertions about sub-arrays hold for every language with arrays.

When "a" is the address of an array, the address of its first component is "a+0" (not "a+1"). This suggests using the number zero as the identification number of this component. When one is dealing with sub-arrays, this proves to be a simplification: The first component of the sub-array beginning at index "i" has the index "i+0", so that the offset "0" doubles as the natural relative index of the first component of this sub-array. (The relative index is the identification number of an object within the sub-array, while the absolute index is the identification number of an object within the whole array.) If one insisted on numbering the sub-array beginning at "i" with index values "j" starting at 1, the absolute index corresponding to the index "j" would be "j+i−1"; if one starts at zero, it is simply "j+i".
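In C, a sub-array is just the address of its first component, so the relation between relative and absolute index can be shown directly. The following sketch uses illustrative names; the one-based function only models the alternative convention, in which the sub-array, its start "i", and the resulting absolute index are all counted from 1.

```c
#include <stddef.h>

/* Returns the component with relative index j (counted from zero) of the
   sub-array of a that begins at absolute index i (also counted from zero). */
int subarray_element(const int *a, size_t i, size_t j)
{
    const int *sub = a + i; /* address of the sub-array's first component */
    return sub[j];          /* the same object as a[i + j] */
}

/* Absolute index when all indices are counted from zero. */
size_t absolute_index0(size_t i, size_t j) { return i + j; }

/* Absolute index when all indices are counted from one (i >= 1, j >= 1). */
size_t absolute_index1(size_t i, size_t j) { return i + j - 1; }
```

The zero-based version composes without corrections: a sub-array of a sub-array simply adds another offset.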

Zero Is a Possible State of Memory

With the usual representation of numbers by tuples of bits, the value 0 comes up naturally. If one does not use this value as an address, one loses a possible value.

Four Two-Bit-Addresses when using 0

00 Address 0
01 Address 1
10 Address 2
11 Address 3

Only three Two-Bit-Addresses when not using 0

00 (unused)
01 Address 1
10 Address 2
11 Address 3

Zero is Compatible with Linear Operations

Zero is also a helpful starting point for intervals in arithmetic with fractional (non-integral) numbers, because many linear or similar operations are easier to apply in this case. For example, a random number generator often delivers floating-point numbers from a range starting with zero. Using the “floor” operation (which gives the largest integral number that is less than or equal to its argument), one directly obtains integral numbers from a range that also starts with zero.

Using a simple division or multiplication (a linear operation), the upper limit, and with it the whole range, can be rescaled to another range. The multiplication does not change the lower limit of the range (zero). If the range of random numbers started with 1 instead, such linear operations could not be applied directly, because they would also change the lower limit of the range (which usually is not wanted).
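The rescaling can be sketched in C as follows. The function names are illustrative, and the value u is assumed to come from some generator that delivers numbers from the range 0 ≤ u < 1. For non-negative values, the conversion to int discards the fractional part and therefore acts like the “floor” operation. The sketch does not deal with floating-point rounding for values of u extremely close to 1.

```c
#include <assert.h>

/* Maps u from the range 0 <= u < 1 to an integer from the range 0 <= k < n.
   One multiplication rescales the range; the lower limit stays at zero. */
int scale_to_index(double u, int n)
{
    assert(u >= 0.0 && u < 1.0 && n > 0);
    return (int)(u * n); /* truncation equals floor for non-negative values */
}

/* If the result is wanted from the range 1..n instead, an extra
   addition of 1 is required after the scaling. */
int scale_to_number_from_one(double u, int n)
{
    return scale_to_index(u, n) + 1;
}
```

With n equal to 6, the value u = 0.5 is mapped to 3 in the zero-based range and to 4 in the range starting with 1.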

Zero is the Neutral Element of Addition

When the natural numbers are regarded merely as a set, "0" and "1" are only two distinct names. As soon as the usual structure is added, however, a significant difference appears: "0", not "1", is the neutral element of addition. The natural numbers without "0" do not include such a neutral element, and its absence is sometimes felt.

What is a “number”?

It has sometimes been argued that zero is not a “proper number”. Such beliefs depend on the culture one lives in. In fact, to the ancient Greeks, one was not a number either, but the unit from which all numbers could be constructed.

Comments
Comment 2 (`2008-10-15T15:20:07+02:00`)

From Marc H.

Re https://www.purl.org/stefan_ram/pub/zero

The URL given at the top of the page seems to miss a trailing slash, if you copy/paste it you get "HTTP/1.1 403 Access Denied."

https://www.purl.org/stefan_ram/pub/zero

This one works:

https://www.purl.org/stefan_ram/pub/zero/

Answer 2008-10-18T03:06:54+02:00 Hi Marc,

there have been annoying automatic requests of the first URI, so, temporarily, additional access restrictions were added for the first URI. The intention was to disallow some repeated automatic requests but to allow requests by a web browser. The detection of the type of request sometimes fails and, unfortunately, sometimes the access is denied to legitimate readers. This is what you have observed.
The second URI actually is wrong, but still it is accepted by the web server; it will return the same contents as the first URI. Because the additional access restrictions were not added for the second URI, more requests are answered with the document when the second URI is being used.
I apologize to all legitimate readers who suffer from spurious “403 Access Denied” messages. I will try to adjust my filter settings. For now, you have found a way to circumvent the access restrictions by using the second URI, so I hope that you can use this as long as I have not adjusted my filter settings to allow more legitimate requests.

Thanks, again!

Stefan Ram

Comment 1 (`2008-10-13T17:44:33+02:00`)

From Marc H.

Re https://www.purl.org/stefan_ram/pub/zero

Hi Stefan,

Many thanks for your article "Numbering from Zero", the best Google found me so far. Here is a minor enhancement suggestion:

- While any imaginable schema could be invented,

- the most common schemas are 0, 1, 2, … or 1, 2, 3, ….

+ While any imaginable schema could be invented (like A, B, C,…),

+ the most common schemas are 0, 1, 2, … or 1, 2, 3, ….

Cheers,

〈E-Mail address 〉

Answer 2008-10-18T02:43:44+02:00 Hi Marc,

I'm glad to read that you like the article and have added your suggestion.

Thank you!