深入理解GO时间处理(time.Time)

1. 前言

时间包括时间值和时区, 没有包含时区信息的时间是不完整的、有歧义的. 和外界传递或解析时间数据时, 应当像HTTP协议或unix-timestamp那样, 使用没有时区歧义的格式, 如果使用某些没有包含时区的非标准的时间表示格式(如yyyy-mm-dd HH:MM:SS), 是有隐患的, 因为解析时会使用场景的默认设置, 如系统时区, 数据库默认时区可能引发事故. 确保服务器系统、数据库、应用程序使用统一的时区, 如果因为一些历史原因, 应用程序各自保持着不同时区, 那么编程时要小心检查代码, 知道时间数据在使用不同时区的程序之间交换时的行为. 第三节会详细解释go程序在不同场景下time.Time的行为.

2. Time的数据结构

go1.9之前, time.Time的定义为

type Time struct {
// sec gives the number of seconds elapsed since
// January 1, year 1 00:00:00 UTC.
sec int64
// nsec specifies a non-negative nanosecond
// offset within the second named by Seconds.
// It must be in the range [0, 999999999].
nsec int32
// loc specifies the Location that should be used to
// determine the minute, hour, month, day, and year
// that correspond to this Time.
// The nil location means UTC.
// All UTC times are represented with loc==nil, never loc==&utcLoc.
loc *Location
}

sec表示从公元1年1月1日00:00:00UTC到要表示的整数秒数, nsec表示余下的纳秒数, loc表示时区. sec和nsec处理没有歧义的时间值, loc处理偏移量.

因为2017年闰一秒, 国际时钟调整, Go程序两次取time.Now()相减的时间差得到了意料之外的负数, 导致cloudFlare的CDN服务中断, 详见https://blog.cloudflare.com/how-and-why-the-leap-second-affected-cloudflare-dns/, go1.9在不影响已有应用代码的情况下修改了time.Time的实现. go1.9的time.Time定义为

// A Time represents an instant in time with nanosecond precision.
//
// Programs using times should typically store and pass them as values,
// not pointers. That is, time variables and struct fields should be of
// type time.Time, not *time.Time.
//
// A Time value can be used by multiple goroutines simultaneously except
// that the methods GobDecode, UnmarshalBinary, UnmarshalJSON and
// UnmarshalText are not concurrency-safe.
//
// Time instants can be compared using the Before, After, and Equal methods.
// The Sub method subtracts two instants, producing a Duration.
// The Add method adds a Time and a Duration, producing a Time.
//
// The zero value of type Time is January 1, year 1, 00:00:00.000000000 UTC.
// As this time is unlikely to come up in practice, the IsZero method gives
// a simple way of detecting a time that has not been initialized explicitly.
//
// Each Time has associated with it a Location, consulted when computing the
// presentation form of the time, such as in the Format, Hour, and Year methods.
// The methods Local, UTC, and In return a Time with a specific location.
// Changing the location in this way changes only the presentation; it does not
// change the instant in time being denoted and therefore does not affect the
// computations described in earlier paragraphs.
//
// Note that the Go == operator compares not just the time instant but also the
// Location and the monotonic clock reading. Therefore, Time values should not
// be used as map or database keys without first guaranteeing that the
// identical Location has been set for all values, which can be achieved
// through use of the UTC or Local method, and that the monotonic clock reading
// has been stripped by setting t = t.Round(0). In general, prefer t.Equal(u)
// to t == u, since t.Equal uses the most accurate comparison available and
// correctly handles the case when only one of its arguments has a monotonic
// clock reading.
//
// In addition to the required “wall clock” reading, a Time may contain an optional
// reading of the current process's monotonic clock, to provide additional precision
// for comparison or subtraction.
// See the “Monotonic Clocks” section in the package documentation for details.
//
type Time struct {
// wall and ext encode the wall time seconds, wall time nanoseconds,
// and optional monotonic clock reading in nanoseconds.
//
// From high to low bit position, wall encodes a 1-bit flag (hasMonotonic),
// a 33-bit seconds field, and a 30-bit wall time nanoseconds field.
// The nanoseconds field is in the range [0, 999999999].
// If the hasMonotonic bit is 0, then the 33-bit field must be zero
// and the full signed 64-bit wall seconds since Jan 1 year 1 is stored in ext.
// If the hasMonotonic bit is 1, then the 33-bit field holds a 33-bit
// unsigned wall seconds since Jan 1 year 1885, and ext holds a
// signed 64-bit monotonic clock reading, nanoseconds since process start.
wall uint64
ext int64
// loc specifies the Location that should be used to
// determine the minute, hour, month, day, and year
// that correspond to this Time.
// The nil location means UTC.
// All UTC times are represented with loc==nil, never loc==&utcLoc.
loc *Location
}

3. time的行为

  1. 构造时间-获取现在时间-time.Now(), time.Now()使用本地时间, time.Local即本地时区, 取决于运行的系统环境设置, 优先取”TZ”这个环境变量, 然后取/etc/localtime, 都取不到就用UTC兜底.

    func Now() Time {
    sec, nsec := now()
    return Time{sec + unixToInternal, nsec, Local}
    }
  1. 构造时间-获取某一时区的现在时间-time.Now().In(), Time结构体的In()方法仅设置loc, 不会改变时间值. 特别地, 如果是获取现在的UTC时间, 可以使用Time.Now().UTC().
    时区不能为nil. time包中只有两个时区变量time.Local和time.UTC. 其他时区变量有两种方法取得, 一个是通过time.LoadLocation函数根据时区名字加载, 时区名字见IANA Time Zone database, LoadLocation首先查找系统zoneinfo, 然后查找$GOROOT/lib/time/zoneinfo.zip.另一个是在知道时区名字和偏移量的情况下直接调用time.FixedZone("$zonename", $offsetSecond)构造一个Location对象.

    // In returns t with the location information set to loc.
    //
    // In panics if loc is nil.
    func (t Time) In(loc *Location) Time {
    if loc == nil {
    panic("time: missing Location in call to Time.In")
    }
    t.setLoc(loc)
    return t
    }
    // LoadLocation returns the Location with the given name.
    //
    // If the name is "" or "UTC", LoadLocation returns UTC.
    // If the name is "Local", LoadLocation returns Local.
    //
    // Otherwise, the name is taken to be a location name corresponding to a file
    // in the IANA Time Zone database, such as "America/New_York".
    //
    // The time zone database needed by LoadLocation may not be
    // present on all systems, especially non-Unix systems.
    // LoadLocation looks in the directory or uncompressed zip file
    // named by the ZONEINFO environment variable, if any, then looks in
    // known installation locations on Unix systems,
    // and finally looks in $GOROOT/lib/time/zoneinfo.zip.
    func LoadLocation(name string) (*Location, error) {
    if name == "" || name == "UTC" {
    return UTC, nil
    }
    if name == "Local" {
    return Local, nil
    }
    if zoneinfo != "" {
    if z, err := loadZoneFile(zoneinfo, name); err == nil {
    z.name = name
    return z, nil
    }
    }
    return loadLocation(name)
    }
  1. 构造时间-手动构造时间-time.Date(), 传入年元日时分秒纳秒和时区变量Location构造一个时间. 得到的是指定location的时间.

    func Date(year int, month Month, day, hour, min, sec, nsec int, loc *Location) Time {
    if loc == nil {
    panic("time: missing Location in call to Date")
    }
    .....
    }
  1. 构造时间-从unix时间戳中构造时间, time.Unix(), 传入秒和纳秒构造.
  2. 序列化反序列化时间-文本和JSON, fmt.Sprintf,fmt.SScanf, json.Marshal, json.Unmarshal时的, 使用的时间格式均包含时区信息, 序列化使用RFC3339Nano()”2006-01-02T15:04:05.999999999Z07:00”, 反序列化使用RFC3339()”2006-01-02T15:04:05Z07:00”, 反序列化没有纳秒值也可以正常序列化成功.

    // String returns the time formatted using the format string
    // "2006-01-02 15:04:05.999999999 -0700 MST"
    func (t Time) String() string {
    return t.Format("2006-01-02 15:04:05.999999999 -0700 MST")
    }
    // MarshalJSON implements the json.Marshaler interface.
    // The time is a quoted string in RFC 3339 format, with sub-second precision added if present.
    func (t Time) MarshalJSON() ([]byte, error) {
    if y := t.Year(); y < 0 || y >= 10000 {
    // RFC 3339 is clear that years are 4 digits exactly.
    // See golang.org/issue/4556#c15 for more discussion.
    return nil, errors.New("Time.MarshalJSON: year outside of range [0,9999]")
    }
    b := make([]byte, 0, len(RFC3339Nano)+2)
    b = append(b, '"')
    b = t.AppendFormat(b, RFC3339Nano)
    b = append(b, '"')
    return b, nil
    }
    // UnmarshalJSON implements the json.Unmarshaler interface.
    // The time is expected to be a quoted string in RFC 3339 format.
    func (t *Time) UnmarshalJSON(data []byte) error {
    // Ignore null, like in the main JSON package.
    if string(data) == "null" {
    return nil
    }
    // Fractional seconds are handled implicitly by Parse.
    var err error
    *t, err = Parse(`"`+RFC3339+`"`, string(data))
    return err
    }
  1. 序列化反序列化时间-HTTP协议中的date, 统一GMT, 代码位于net/http/server.go:878

    // TimeFormat is the time format to use when generating times in HTTP
    // headers. It is like time.RFC1123 but hard-codes GMT as the time
    // zone. The time being formatted must be in UTC for Format to
    // generate the correct format.
    //
    // For parsing this time format, see ParseTime.
    const TimeFormat = "Mon, 02 Jan 2006 15:04:05 GMT"
  1. 序列化反序列化时间-time.Format("$layout"), time.Parse("$layout","$value"), time.ParseInLocation("$layout","$value","$Location")

    • time.Format("$layout")格式化时间时, 时区会参与计算. 调time.Time的Year()Month()Day()等获取年月日等时时区会参与计算, 得到一个使用偏移量修正过的正确的时间字符串, 若$layout有指定显示时区, 那么时区信息会体现在格式化后的时间字符串中. 如果$layout没有指定显示时区, 那么字符串只有时间没有时区, 时区是隐含的, time.Time对象中的时区.
    • time.Parse("$layout","$value"), 若$layout有指定显示时区, 那么时区信息会体现在格式化后的time.Time对象. 如果$layout没有指定显示时区, 那么使用会认为这是一个UTC时间, 时区是UTC.
    • time.ParseInLocation("$layout","$value","$Location") 使用传参的时区解析时间, 建议用这个, 没有歧义.

      // Parse parses a formatted string and returns the time value it represents.
      // The layout defines the format by showing how the reference time,
      // defined to be
      // Mon Jan 2 15:04:05 -0700 MST 2006
      // would be interpreted if it were the value; it serves as an example of
      // the input format. The same interpretation will then be made to the
      // input string.
      //
      // Predefined layouts ANSIC, UnixDate, RFC3339 and others describe standard
      // and convenient representations of the reference time. For more information
      // about the formats and the definition of the reference time, see the
      // documentation for ANSIC and the other constants defined by this package.
      // Also, the executable example for time.Format demonstrates the working
      // of the layout string in detail and is a good reference.
      //
      // Elements omitted from the value are assumed to be zero or, when
      // zero is impossible, one, so parsing "3:04pm" returns the time
      // corresponding to Jan 1, year 0, 15:04:00 UTC (note that because the year is
      // 0, this time is before the zero Time).
      // Years must be in the range 0000..9999. The day of the week is checked
      // for syntax but it is otherwise ignored.
      //
      // In the absence of a time zone indicator, Parse returns a time in UTC.
      //
      // When parsing a time with a zone offset like -0700, if the offset corresponds
      // to a time zone used by the current location (Local), then Parse uses that
      // location and zone in the returned time. Otherwise it records the time as
      // being in a fabricated location with time fixed at the given zone offset.
      //
      // No checking is done that the day of the month is within the month's
      // valid dates; any one- or two-digit value is accepted. For example
      // February 31 and even February 99 are valid dates, specifying dates
      // in March and May. This behavior is consistent with time.Date.
      //
      // When parsing a time with a zone abbreviation like MST, if the zone abbreviation
      // has a defined offset in the current location, then that offset is used.
      // The zone abbreviation "UTC" is recognized as UTC regardless of location.
      // If the zone abbreviation is unknown, Parse records the time as being
      // in a fabricated location with the given zone abbreviation and a zero offset.
      // This choice means that such a time can be parsed and reformatted with the
      // same layout losslessly, but the exact instant used in the representation will
      // differ by the actual zone offset. To avoid such problems, prefer time layouts
      // that use a numeric zone offset, or use ParseInLocation.
      func Parse(layout, value string) (Time, error) {
      return parse(layout, value, UTC, Local)
      }
      // ParseInLocation is like Parse but differs in two important ways.
      // First, in the absence of time zone information, Parse interprets a time as UTC;
      // ParseInLocation interprets the time as in the given location.
      // Second, when given a zone offset or abbreviation, Parse tries to match it
      // against the Local location; ParseInLocation uses the given location.
      func ParseInLocation(layout, value string, loc *Location) (Time, error) {
      return parse(layout, value, loc, loc)
      }
      func parse(layout, value string, defaultLocation, local *Location) (Time, error) {
      .....
      }
  2. 序列化反序列化时间-go-sql-driver/mysql中的时间处理.
    MySQL驱动解析时间的前提是连接字符串加了parseTime和loc, 如果parseTime为false, 会把mysql的date类型变成[]byte/string自行处理, parseTime为true才处理时间, loc指定MySQL中存储时间数据的时区, 如果没有指定loc, 用UTC. 序列化和反序列化均使用连接字符串中的设定的loc, SQL语句中的time.Time类型的参数的时区信息如果和loc不同, 则会调用t.In(loc)方法转时区.

    • 解析连接字符串的代码位于parseDSNParams函数https://github.com/go-sql-driver/mysql/blob/master/dsn.go#L467-L490

      // Time Location
      case "loc":
      if value, err = url.QueryUnescape(value); err != nil {
      return
      }
      cfg.Loc, err = time.LoadLocation(value)
      if err != nil {
      return
      }
      // time.Time parsing
      case "parseTime":
      var isBool bool
      cfg.ParseTime, isBool = readBool(value)
      if !isBool {
      return errors.New("invalid bool value: " + value)
      }
    • 解析SQL语句中time.Time类型的参数的代码位于mysqlConn.interpolateParams方法https://github.com/go-sql-driver/mysql/blob/master/connection.go#L230-L273

      case time.Time:
      if v.IsZero() {
      buf = append(buf, "'0000-00-00'"...)
      } else {
      v := v.In(mc.cfg.Loc)
      v = v.Add(time.Nanosecond * 500) // To round under microsecond
      year := v.Year()
      year100 := year / 100
      year1 := year % 100
      month := v.Month()
      day := v.Day()
      hour := v.Hour()
      minute := v.Minute()
      second := v.Second()
      micro := v.Nanosecond() / 1000
      buf = append(buf, []byte{
      '\'',
      digits10[year100], digits01[year100],
      digits10[year1], digits01[year1],
      '-',
      digits10[month], digits01[month],
      '-',
      digits10[day], digits01[day],
      ' ',
      digits10[hour], digits01[hour],
      ':',
      digits10[minute], digits01[minute],
      ':',
      digits10[second], digits01[second],
      }...)
      if micro != 0 {
      micro10000 := micro / 10000
      micro100 := micro / 100 % 100
      micro1 := micro % 100
      buf = append(buf, []byte{
      '.',
      digits10[micro10000], digits01[micro10000],
      digits10[micro100], digits01[micro100],
      digits10[micro1], digits01[micro1],
      }...)
      }
      buf = append(buf, '\'')
      }
    • 从MySQL数据流中解析时间的代码位于textRows.readRow方法https://github.com/go-sql-driver/mysql/blob/master/packets.go#L772-L777, 注意只要MySQL连接字符串设置了parseTime=true, 就会解析时间, 不管你是用string还是time.Time接收的.

      if !isNull {
      if !mc.parseTime {
      continue
      } else {
      switch rows.rs.columns[i].fieldType {
      case fieldTypeTimestamp, fieldTypeDateTime,
      fieldTypeDate, fieldTypeNewDate:
      dest[i], err = parseDateTime(
      string(dest[i].([]byte)),
      mc.cfg.Loc,
      )
      if err == nil {
      continue
      }
      default:
      continue
      }
      }
      }

4. time时区处理不当案例

  1. 有个服务频繁使用最新汇率, 所以缓存了最新汇率对象, 汇率对象的过期时间设为第二天北京时间零点, 汇率过期则从数据库中去最新汇率, 设置过期时间的代码如下:

    var startTime string = time.Now().UTC().Add(8 * time.Hour).Format("2006-01-02")
    tm2, _ := time.Parse("2006-01-02", startTime)
    lastTime = tm2.Unix() + 24*60*60

    这段代码使用了time.Parse, 如果时间格式中没有指定时区, 那么会得到使用本地时区下的第二天零点, 服务器时区设置为UTC0, 于是汇率缓存在UTC零点即北京时间八点才更新.

  2. 公共库中有一个GetBjTime()方法, 注释写着将服务器UTC转成北京时间, 代码如下

    // 原版
    func GetBjTime() time.Time {
    // 将服务器UTC转成北京时间
    uTime := time.Now().UTC()
    dur, _ := time.ParseDuration("+8h")
    return uTime.Add(dur)
    }
    // 改
    func GetBjTime() time.Time {
    // 将服务器UTC转成北京时间
    uTime := time.Now()
    return uTime.In(time.FixedZone("CST", 8*60*60))
    }

    同事用这个方法将得到的time.Time参与计算, 发现多了8个小时. 觉得有问题, 同事和我讨论了之后, 我们得出结论后就大意地直接把原有函数改了, 我们都没有意识到这是个非常危险操作, 只所以危险是因为这个函数已经在很多服务的代码里用着(要稳!不能乱动公共库!!!). 之前用这个函数是因为老Java项目运行在时区为东八区的系统上, 大量代码使用东八区时间, 但数据库MySQL时区设置为UTC, go项目也运行在UTC时区. 也就是说, Java项目在把时区为UTC数据库当做是东八区来用, Java程序往MySQL写东八区的时间字符串, 在sequel软件中看表内容时虽然字符串是一样的, 但其实内部是UTC的时间, go代码的mysql连接字符串中loc选项为空, 就会使用UTC时区去解析数据, 拿到的数据会多八个小时. 例如Java代码往mysql插入一条”2017-10-29 22:00:00”数据本意是东八区2017年10月29日22点, 但在MySQL内部看来, 这是UTC的2017年10月29日22点, 换算成东八区时间为2017年10月30日6点, 如果其它程序解析时认为时间数据是MySQL的UTC时区, 那么会得到一个错误的时间. 所以才会在GO中要往Java代码创建的表写入数据时用time.Now().UTC().Add(time.Hour*8)直接相加八小时使得Java项目行为一致, 拿UTC的数据库存东八区时间.

    后面想想, 面对这种数据库中有时区不一致数据的情况, 在没有办法统一UTC时区的情况下, 应当使用MySQL时间字符串而不是time.Time来传递以避免时区隐含转换问题, 写入时参数传string类型的时间字符串, 解析时先拿到时间字符串, 然后自行判断建表时这个字段用的是东八区的时间字符串还是UTC时间字符串进行time.ParseInLocation得到时间对象, MySQL连接字符串的parseTime选项要设置为false. 比如我想在MySQL中存东八区的当前时间, SQL参数用Format后的字符串而不是传time.Time, 原版的time.Now().UTC().Add(time.Hour*8).Format("2006-01-02 15:04:05")和修改的time.Now().In(time.FixedZone("CST", 8*60*60))的输出将是一样, 但后者是正确的东八区现在时间. 原版的GetBjTime()返回time.Time可能用GetBeijingNowTimeString返回string更能体现本意吧.

5. 时间有关的标准

  • UTC

    协调世界时(英语:Coordinated Universal Time,法语:Temps Universel Coordonné,简称UTC)是最主要的世界时间标准,其以原子时秒长为基础,在时刻上尽量接近于格林尼治标准时间。中华民国采用CNS 7648的《资料元及交换格式–资讯交换–日期及时间的表示法》(与ISO 8601类似)称之为世界协调时间。中华人民共和国采用ISO 8601:2000的国家标准GB/T 7408-2005《数据元和交换格式 信息交换 日期和时间表示法》中亦称之为协调世界时。
    协调世界时是世界上调节时钟和时间的主要时间标准,它与0度经线的平太阳时相差不超过1秒[4],并不遵守夏令时。协调世界时是最接近格林威治标准时间(GMT)的几个替代时间系统之一。对于大多数用途来说,UTC时间被认为能与GMT时间互换,但GMT时间已不再被科学界所确定。

  • ISO 8601 计算某一天在一年的第几周/循环时间RRlue/会用到此标准

    国际标准ISO 8601,是国际标准化组织的日期和时间的表示方法,全称为《数据存储和交换形式·信息交换·日期和时间的表示方法》。目前是第三版“ISO8601:2004”以替代第一版“ISO8601:1988”与第二版“ISO8601:2000”。

  • UNIX时间

    UNIX时间,或称POSIX时间是UNIX或类UNIX系统使用的时间表示方式:从协调世界时1970年1月1日0时0分0秒起至现在的总秒数,不考虑闰秒[1]。 在多数Unix系统上Unix时间可以通过date +%s指令来检查。

  • 时区

    时区列表

分享到 评论

Go如何精确计算小数-Decimal研究-Tidb MyDecimal问题

##1 浮点数为什么不精确
先看两个case

// case1: 135.90*100 ====
// float32
var f1 float32 = 135.90
fmt.Println(f1 * 100) // output:13589.999
// float64
var f2 float64 = 135.90
fmt.Println(f2 * 100) // output:13590

浮点数在单精度下, 135.9*100即出现了偏差, 双精度下结果正确.

// case2: 0.1 add 10 times ===
// float32
var f3 float32 = 0
for i := 0; i < 10; i++ {
f3 += 0.1
}
fmt.Println(f3) //output:1.0000001
// float64
var f4 float64 = 0
for i := 0; i < 10; i++ {
f4 += 0.1
}
fmt.Println(f4) //output:0.9999999999999999

0.1加10次, 这下无论是float32和float64都出现了偏差.

为什么呢, Go和大多数语言一样, 使用标准的IEEE754表示浮点数, 0.1使用二进制表示结果是一个无限循环数, 只能舍入后表示, 累加10次之后就会出现偏差.

此外, 还有几个隐藏的坑https://play.golang.org/p/bQPbirROmN

  1. float32和float64直接互转会精度丢失, 四舍五入后错误.
  2. int64转float64在数值很大的时候出现偏差.
  3. 合理但须注意: 两位小数乘100强转int, 比期望值少了1.
package main
import (
"fmt"
)
func main() {
// case: float32==>float64
// 从数据库中取出80.45, 历史代码用float32接收
var a float32 = 80.45
var b float64
// 有些函数只能接收float64, 只能强转
b = float64(a)
// 打印出值, 强转后出现偏差
fmt.Println(a) //output:80.45
fmt.Println(b) //output:80.44999694824219
// ... 四舍五入保留小数点后1位, 期望80.5, 结果是80.4
// case: int64==>float64
var c int64 = 987654321098765432
fmt.Printf("%.f\n", float64(c)) //output:987654321098765440
// case: int(float64(xx.xx*100))
var d float64 = 1129.6
var e int64 = int64(d * 100)
fmt.Println(e) //output:112959
}

##2 数据库是怎么做的
MySQL提供了decimal(p,d)/numberlic(p,d)类型的定点数表示法, 由p位数字(不包括符号、小数点)组成, 小数点后面有d位数字, 占p+2个字节, 计算性能会比double/float类型弱一些.

##3 Go代码如何实现Decimal
Java有成熟的标准库java.lang.BigDecimal,Python有标准库Decimal, 可惜GO没有. 在GitHub搜decimal, star数量比较多的是TiDB里的MyDecimal和ithub.com/shopspring/decimal的实现.

  • shopspring的Decimal实现比较简单, 思路是使用十进制定点数表示法, 有多少位小数就小数点后移多少位, value保存移之后的整数, exp保存小数点后的数位个数, number=value*10^exp, 因为移小数点后的整数可能很大, 所以这里借用标准包里的math/big表示这个大整数. exp使用了int32, 所以这个包最多能表示小数点后有32个十进制数位的情况.

    Decimal结构体的定义如下

    // Decimal represents a fixed-point decimal. It is immutable.
    // number = value * 10 ^ exp
    type Decimal struct {
    value *big.Int
    // NOTE(vadim): this must be an int32, because we cast it to float64 during
    // calculations. If exp is 64 bit, we might lose precision.
    // If we cared about being able to represent every possible decimal, we
    // could make exp a *big.Int but it would hurt performance and numbers
    // like that are unrealistic.
    exp int32
    }
  • TiDB里的MyDecimal定义位于github.com/pingcap/tidb/util/types/mydecimal.go, 实现比shopspring的Decimal复杂多了, 也更底层(不依赖math/big), 性能也更好(见下面的benchmark). 其思路是:
    digitsInt保存数字的整数部分数字个数, digitsFrac保存数字的小数部分数字个数, resultFrac保存计算及序列化时保留至小数点后几位, negative标明数字是否为负数, wordBuf是一个定长的int32数组(长度为9), 数字去掉小数点的主体保存在这里, 一个int32有32个bit, 最大值为(2**31-1)2147483647(10个十进制数), 所以一个int32最多能表示9个十进制数位, 因此wordBuf 最多能容纳9*9个十进制数位.

    // MyDecimal represents a decimal value.
    type MyDecimal struct {
    digitsInt int8 // the number of *decimal* digits before the point.
    digitsFrac int8 // the number of decimal digits after the point.
    resultFrac int8 // result fraction digits.
    negative bool
    // wordBuf is an array of int32 words.
    // A word is an int32 value can hold 9 digits.(0 <= word < wordBase)
    wordBuf [maxWordBufLen]int32
    }

看看这两种decimal类型在文首的两个case下的结果, 同时跑个分.

main_test.go

package main
import (
"testing"
"github.com/shopspring/decimal"
"github.com/pingcap/tidb/util/types"
"log"
)
var case1String = "135.90"
var case1Bytes = []byte(case1String)
var case2String = "0"
var case2Bytes = []byte("0")
func ShopspringDecimalCase1() decimal.Decimal {
dec1, err := decimal.NewFromString(case1String)
if err != nil {
log.Fatal(err)
}
dec2 := decimal.NewFromFloat(100)
dec3 := dec1.Mul(dec2)
return dec3
}
func TidbDecimalCase1() *types.MyDecimal {
dec1 := new(types.MyDecimal)
err := dec1.FromString(case1Bytes)
if err != nil {
log.Fatal(err)
}
dec2 := new(types.MyDecimal).FromInt(100)
dec3 := new(types.MyDecimal)
err = types.DecimalMul(dec1, dec2, dec3)
if err != nil {
log.Fatal(err)
}
return dec3
}
func ShopspringDecimalCase2() decimal.Decimal {
dec1, err := decimal.NewFromString(case2String)
if err != nil {
log.Fatal(err)
}
dec2 := decimal.NewFromFloat(0.1)
for i := 0; i < 10; i++ {
dec1 = dec1.Add(dec2)
}
return dec1
}
func TidbDecimalCase2() *types.MyDecimal {
dec1 := new(types.MyDecimal)
dec1.FromString(case2Bytes)
dec2 := new(types.MyDecimal)
dec2.FromFloat64(0.1)
for i := 0; i < 10; i++ {
types.DecimalAdd(dec1, dec2, dec1)
}
return dec1
}
// case1: 135.90*100 ====
func BenchmarkShopspringDecimalCase1(b *testing.B) {
for i := 0; i < b.N; i++ {
ShopspringDecimalCase1()
}
b.Log(ShopspringDecimalCase1()) // output: 13590
}
func BenchmarkTidbDecimalCase1(b *testing.B) {
for i := 0; i < b.N; i++ {
TidbDecimalCase1()
}
b.Log(TidbDecimalCase1()) // output: 13590.00
}
// case2: 0.1 add 10 times ===
func BenchmarkShopspringDecimalCase2(b *testing.B) {
for i := 0; i < b.N; i++ {
ShopspringDecimalCase2()
}
b.Log(ShopspringDecimalCase2()) // output: 1
}
func BenchmarkTidbDecimalCase2(b *testing.B) {
for i := 0; i < b.N; i++ {
TidbDecimalCase2()
}
b.Log(TidbDecimalCase2()) // output: 1.0
}
BenchmarkShopspringDecimalCase1-8 2000000 664 ns/op 340 B/op 10 allocs/op
BenchmarkTidbDecimalCase1-8 20000000 99.2 ns/op 48 B/op 1 allocs/op
BenchmarkShopspringDecimalCase2-8 300000 5210 ns/op 4294 B/op 111 allocs/op
BenchmarkTidbDecimalCase2-8 3000000 517 ns/op 83 B/op 3 allocs/op

可见两种实现在上面两个case下表示准确, TiDB的decimal实现的性能高于shopspring的实现, 堆内存分配次数也更少.

##4. MyDecimal的已知问题

用了一段时间后, tidb.MyDecimal也有一些问题

  1. 原版除法有bug, 可以通过除数和被除数同时放大一定倍数临时修复, 更好的解决方法需要官方人员解决, 已提issue, 这个bug真是匪夷所思. https://github.com/pingcap/tidb/issues/4873, 2017.11.3官方修复decimal除法问题:https://github.com/pingcap/tidb/pull/4995/files.
  2. 原版乘法有小问题, 行为不一致, 原版的from1和to不能为同一个指针, 但 Add Sub Div却可以. 可以通过copy参数修复.
  3. 移位小坑, 右移属于扩大数值, 没有问题. 左移有问题, 注意1左移两位不会变成0.01, 所以shift不要传负数.
  4. round, 目前这个库的Round模式ModeHalfEven实际上是ModeHalfUp, 正常的四舍五入, 不是float的ModeHalfEven. 3.5=>4, 4.5=>5, 5.5=>6, 注意后期是否有变更.
分享到 评论