There are quite a few different conventions for binary datetime, depending on the platform or protocol. Some of these have severe drawbacks. For example, people using Unix time (seconds since Jan 1, 1970, usually in a 32-bit integer) think that they are safe until near the year 2038. But cases can and do arise where arithmetic manipulations cause serious problems. Consider the computation of the average of two datetimes, for example: if one calculates it with averageTime = (time1 + time2)/2, the intermediate sum overflows a signed 32-bit integer even with dates beginning in 2004. Moreover, even if these problems don't occur, there is the issue of conversion back and forth between different systems.
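The averaging problem can be reproduced in a few lines. The following is a minimal sketch, not taken from ICU; the class and method names are hypothetical, and the two sample values are simply Unix times chosen to fall after early 2004, when the seconds count passes 2^30.

```java
// Demonstrates 32-bit overflow when averaging two Unix times (seconds since 1970).
// UnixTimeAverage and its methods are illustrative names, not part of any library.
final class UnixTimeAverage {
    // Naive form from the text: the sum can exceed Integer.MAX_VALUE and wrap around.
    static int naiveAverage(int time1, int time2) {
        return (time1 + time2) / 2;
    }

    // Widening to 64 bits before adding keeps the intermediate sum in range.
    static int safeAverage(int time1, int time2) {
        return (int) (((long) time1 + (long) time2) / 2);
    }

    public static void main(String[] args) {
        int time1 = 1_100_000_000; // a moment in November 2004
        int time2 = 1_200_000_000; // a moment in January 2008
        System.out.println(naiveAverage(time1, time2)); // negative, i.e. a date before 1970
        System.out.println(safeAverage(time1, time2));  // 1150000000
    }
}
```

Widening the operands is only one way to avoid the wraparound; computing time1 + (time2 - time1) / 2 also works when both values have the same sign.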
Binary datetimes differ in a number of ways: the data type, the unit, and the epoch (origin). We'll refer to these as time scales. For example: (Sorted by epoch and unit, descending. In Java, int64_t=long and int32_t=int.)
All of the epochs start at 00:00 (the earliest possible time on the day in question), and are usually assumed to be UTC.
The ranges, in years, for different data types are given in the following table. The range for integer types includes the entire range expressible with positive and negative values of the data type. The range for double is the range that would be allowed without losing precision to the corresponding unit.
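As a rough illustration of how such ranges follow from the data type and the unit, the sketch below computes the total span in years for a few combinations. It is an approximation under stated assumptions: a tick is taken to be 100 nanoseconds (as in .NET), a year is averaged at 365.2425 days, and a double is treated as 54 bits of exact signed integer range. The class and method names are hypothetical.

```java
// Approximate range, in years, of a time scale given its integer width and unit.
// TimeScaleRange is an illustrative name, not an ICU class.
final class TimeScaleRange {
    // Average length of a Gregorian year in seconds (365.2425 days).
    static final double SECONDS_PER_YEAR = 365.2425 * 86_400;

    // Total span (negative and positive values together) covered by `bits` bits
    // of integer range when one count lasts `unitSeconds` seconds.
    static double rangeInYears(int bits, double unitSeconds) {
        return Math.pow(2, bits) * unitSeconds / SECONDS_PER_YEAR;
    }

    public static void main(String[] args) {
        // 32-bit seconds: about 136 years in total.
        System.out.println(rangeInYears(32, 1.0));
        // 64-bit milliseconds: hundreds of millions of years.
        System.out.println(rangeInYears(64, 1e-3));
        // 64-bit ticks, assuming 100 ns per tick: roughly 58,000 years in total.
        System.out.println(rangeInYears(64, 1e-7));
        // double milliseconds: integers are exact up to 2^53 in magnitude, so 54 bits.
        System.out.println(rangeInYears(54, 1e-3));
    }
}
```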
ICU implements a universal time scale that is similar to the .NET framework's System.DateTime. The universal time scale is a 64-bit integer that holds ticks (100-nanosecond units) since midnight, January 1st, 0001. Negative values are supported. This has enough range to guarantee that calculations involving dates around the present are safe.
The universal time scale always measures time according to the proleptic Gregorian calendar. That is, the Gregorian calendar's leap year rules are used for all times, even before 1582 when it was introduced. (This is different from the default ICU calendar which switches from the Julian to the Gregorian calendar in 1582. See GregorianCalendar::setGregorianChange() and ucal_setGregorianChange().)
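To make the proleptic rule concrete, here is a small sketch that applies the Gregorian leap year rules uniformly, including to years before 1582. It is not ICU code; the class and method names are hypothetical, and it only handles years from 1 onward.

```java
// Day counting under the proleptic Gregorian calendar.
// ProlepticGregorian is an illustrative name, not an ICU class.
final class ProlepticGregorian {
    // Gregorian rule applied to every year, even before the calendar was introduced.
    static boolean isLeapYear(long year) {
        return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    }

    // Days from January 1, 0001 to January 1 of `year` (assumes year >= 1).
    static long daysBeforeYear(long year) {
        long y = year - 1; // number of complete years elapsed
        return 365 * y + y / 4 - y / 100 + y / 400;
    }

    public static void main(String[] args) {
        // 1500 was a leap year in the Julian calendar then in use,
        // but it is not one under the proleptic Gregorian rules.
        System.out.println(isLeapYear(1500)); // false
        // Days, then seconds, between the universal epoch and the Unix epoch.
        System.out.println(daysBeforeYear(1970));           // 719162
        System.out.println(daysBeforeYear(1970) * 86_400L); // 62135596800
    }
}
```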
ICU provides conversion functions to and from all other major time scales, allowing datetimes in any time scale to be converted to the universal time scale, safely manipulated, and converted back to any other datetime time scale.
So how did we decide what to use for the universal time scale? Java time has plenty of range, but cannot represent a .NET System.DateTime value without severe loss of precision. ICU4C time addresses this by using a double that is otherwise equivalent to the Java time. Doubles degrade much more gracefully in arithmetic operations, but they have a disadvantage: with only 53 bits of precision, they lose precision when converting back and forth to ticks. What would really be nice would be a long double (80 bits, with a 64-bit mantissa), but that is not supported on most systems.
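The precision loss is easy to observe. The sketch below, which is illustrative only and uses hypothetical names, round-trips a tick count through a double. It assumes 100-nanosecond ticks counted from January 1, 0001, so the sample value is one tick after the start of 1970.

```java
// Shows that a 64-bit tick count does not survive a round trip through a double.
// DoublePrecisionLoss is an illustrative name, not an ICU class.
final class DoublePrecisionLoss {
    // Converts to double and back; exact only when the value fits in 53 bits
    // of mantissa (after removing trailing zero bits).
    static long roundTripThroughDouble(long ticks) {
        return (long) (double) ticks;
    }

    public static void main(String[] args) {
        // Assumed: 100 ns ticks since 0001-01-01; this is 1970-01-01 plus one tick.
        long ticks = 621_355_968_000_000_001L;
        long back = roundTripThroughDouble(ticks);
        System.out.println(ticks == back); // false: the final tick is rounded away
        System.out.println(ticks - back);  // 1
    }
}
```

At this magnitude adjacent doubles are 128 ticks apart, so any tick count that is not a multiple of 128 is altered by the round trip.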
The Unix extended time uses a structure with two components: time in seconds and a fractional field (microseconds). However, this is clumsy, slow, and prone to error (you always have to keep track of overflow and underflow in the fractional field). BigDecimal would allow for arbitrary precision and arbitrary range, but we did not want to use this as the normal type, because it is slow and does not have a fixed size.
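The bookkeeping that a two-field representation requires can be sketched as follows. This is a hypothetical illustration of the carry and borrow handling, not a model of any particular system structure; the class name and field names are assumptions.

```java
// A seconds + microseconds pair that must be renormalized after every operation.
// SplitTime is an illustrative name, not a platform or ICU type.
final class SplitTime {
    final long seconds;
    final int micros; // kept normalized to the range 0..999_999

    SplitTime(long seconds, long micros) {
        // floorDiv carries overflow (or borrows on underflow) into the seconds field;
        // floorMod leaves a non-negative fractional part.
        this.seconds = seconds + Math.floorDiv(micros, 1_000_000L);
        this.micros = (int) Math.floorMod(micros, 1_000_000L);
    }

    SplitTime plus(SplitTime other) {
        return new SplitTime(seconds + other.seconds, (long) micros + other.micros);
    }

    SplitTime minus(SplitTime other) {
        return new SplitTime(seconds - other.seconds, (long) micros - other.micros);
    }

    public static void main(String[] args) {
        SplitTime sum = new SplitTime(1, 700_000).plus(new SplitTime(2, 600_000));
        System.out.println(sum.seconds + " s, " + sum.micros + " us");   // 4 s, 300000 us
        SplitTime diff = new SplitTime(5, 100_000).minus(new SplitTime(2, 600_000));
        System.out.println(diff.seconds + " s, " + diff.micros + " us"); // 2 s, 500000 us
    }
}
```

With a single 64-bit tick count, both operations reduce to one integer addition or subtraction.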
Because of these issues, we concluded that the .NET System.DateTime is the best timescale to use. However, we use the full range allowed by the data type, allowing for datetimes back to 29,000 BC and up to 29,000 AD. (System.DateTime uses only 62 bits and only supports dates from 0001 AD to 9999 AD.) This time scale is very fine grained, does not lose precision, and covers a range that will meet almost all requirements. It will not handle the range that Java times do, but frankly, being able to handle dates before 29,000 BC or after 29,000 AD is of very limited interest.
ICU provides routines to convert from other timescales to the universal time scale, to convert from the universal time scale to other timescales, and to get information about a particular timescale. In all of these routines, the timescales are referenced using an integer constant, according to the following table:
The routine that gets a particular piece of information about a timescale takes an integer constant that identifies the particular piece of information, according to the following table:
Here is what the values mean:
Precision – the size of one unit of the timescale, in ticks (for example, 10,000 for a timescale measured in milliseconds).
Epoch offset – the distance from the universal timescale's epoch to the timescale's epoch, in the timescale's precision.
Minimum “from” value – the minimum timescale value that can safely be converted to the universal timescale.
Maximum “from” value – the maximum timescale value that can safely be converted to the universal timescale.
Minimum “to” value – the minimum universal timescale value that can safely be converted to the timescale.
Maximum “to” value – the maximum universal timescale value that can safely be converted to the timescale.
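The way the precision and epoch offset combine in a conversion can be sketched as below. This is a simplified illustration, not ICU's implementation: the class is hypothetical, the Unix values (10,000,000 ticks per second and an offset of 62,135,596,800 seconds) are assumptions based on 100-nanosecond ticks, and the rounding of fractional units is reduced to a floor division. Overflow is reported with IllegalArgumentException, in the spirit of the range limits listed above.

```java
// Converts between one time scale and a tick count since 0001-01-01.
// SimpleTimeScale is an illustrative name; it does not call or reproduce ICU.
final class SimpleTimeScale {
    final long units;       // size of one unit of this scale, in ticks
    final long epochOffset; // universal epoch to this scale's epoch, in this scale's units

    SimpleTimeScale(long units, long epochOffset) {
        this.units = units;
        this.epochOffset = epochOffset;
    }

    // Assumed values for Unix time in seconds.
    static final SimpleTimeScale UNIX_SECONDS =
            new SimpleTimeScale(10_000_000L, 62_135_596_800L);

    // Shift to the universal epoch, then scale to ticks, failing on overflow.
    long toUniversal(long value) {
        try {
            return Math.multiplyExact(Math.addExact(value, epochOffset), units);
        } catch (ArithmeticException e) {
            throw new IllegalArgumentException("value out of range: " + value, e);
        }
    }

    // Scale ticks down to this scale's unit (floor), then shift to its epoch.
    long fromUniversal(long universal) {
        try {
            return Math.subtractExact(Math.floorDiv(universal, units), epochOffset);
        } catch (ArithmeticException e) {
            throw new IllegalArgumentException("value out of range: " + universal, e);
        }
    }

    public static void main(String[] args) {
        long universal = UNIX_SECONDS.toUniversal(0L);
        System.out.println(universal);                             // 621355968000000000
        System.out.println(UNIX_SECONDS.fromUniversal(universal)); // 0
    }
}
```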
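You can convert from other timescale values to the universal timescale using the “from” methods. In ICU4C, use utmscale_fromInt64, passing the value to convert, the constant that identifies its timescale, and a pointer to a UErrorCode.
In ICU4J, use UniversalTimeScale.from, passing the value to convert and the constant that identifies its timescale.
You can convert values in the universal timescale to other timescales using the “to” methods. In ICU4C, use utmscale_toInt64, passing the universal timescale value, the constant that identifies the target timescale, and a pointer to a UErrorCode.
In ICU4J, use UniversalTimeScale.toLong, passing the universal timescale value and the constant that identifies the target timescale.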
That's all there is to it! If the conversion is out of range, the ICU4C routines will set the error code to U_ILLEGAL_ARGUMENT_ERROR, and the ICU4J methods will throw IllegalArgumentException. In ICU4J, you can avoid out-of-range conversions by using the BigDecimal methods, UniversalTimeScale.bigDecimalFrom and UniversalTimeScale.toBigDecimal.
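To see why an arbitrary-precision type sidesteps the range limit, consider the sketch below. It does not call ICU4J; it repeats the conversion arithmetic with java.math.BigDecimal, using hypothetical names and the same assumed Unix values (10,000,000 ticks per second, an epoch offset of 62,135,596,800 seconds).

```java
import java.math.BigDecimal;

// Converts Unix seconds to ticks since 0001-01-01 without a 64-bit range limit.
// BigDecimalConversion is an illustrative name, not an ICU class.
final class BigDecimalConversion {
    static final BigDecimal UNITS = BigDecimal.valueOf(10_000_000L);           // ticks per second (assumed)
    static final BigDecimal EPOCH_OFFSET = BigDecimal.valueOf(62_135_596_800L); // seconds from 0001 to 1970

    // Never overflows: the result grows to whatever size is needed.
    static BigDecimal toUniversal(long unixSeconds) {
        return BigDecimal.valueOf(unixSeconds).add(EPOCH_OFFSET).multiply(UNITS);
    }

    // Reports whether a result could also have been returned as a long.
    static boolean fitsInLong(BigDecimal universal) {
        return universal.compareTo(BigDecimal.valueOf(Long.MAX_VALUE)) <= 0
                && universal.compareTo(BigDecimal.valueOf(Long.MIN_VALUE)) >= 0;
    }

    public static void main(String[] args) {
        BigDecimal present = toUniversal(1_100_000_000L);
        System.out.println(fitsInLong(present)); // true
        BigDecimal extreme = toUniversal(Long.MAX_VALUE);
        System.out.println(fitsInLong(extreme)); // false, yet the value is still exact
    }
}
```

The trade-off is the one noted earlier for BigDecimal as a general representation: the values are slower to work with and do not have a fixed size.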
Currently, ICU does not support direct formatting or parsing of Universal Time Scale values. If you want to format a Universal Time Scale value, you will need to convert it to an ICU time scale value first. Use UDTS_ICU4C_TIME with ICU4C, and UniversalTimeScale.JAVA_TIME with ICU4J.
When you parse a datetime string, the result will be an ICU time scale value. You can convert this value to a Universal Time Scale value using UDTS_ICU4C_TIME with ICU4C, and UniversalTimeScale.JAVA_TIME with ICU4J.
See the previous section, Converting, for details of how to do the conversion.
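To get information about a particular timescale in ICU4C, use utmscale_getTimeScaleValue, passing the constant that identifies the timescale, the constant that selects the piece of information, and a pointer to a UErrorCode.
In ICU4J, use UniversalTimeScale.getTimeScaleValue, passing the constant that identifies the timescale and the constant that selects the piece of information.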
If the integer constants for selecting the timescale or the timescale value are out of range, the ICU4C routines will set the error code to U_ILLEGAL_ARGUMENT_ERROR, and the ICU4J methods will throw IllegalArgumentException.