Go Interfaces: Why Is a nil Interface Not Equal to nil?

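The static nature of an interface shows up in the fact that every interface-typed variable has a static type.

For example, in var err error the static type of the variable err is error. Having a static type means the compiler type-checks every assignment to an interface-typed variable at compile time: it verifies that the type of the right-hand value implements every method in the interface's method set. If it does not, compilation fails: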

var err error = 1 // cannot use 1 (type int) as type error in assignment: int does not implement error (missing Error method)

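The dynamic nature of an interface shows up in the fact that, at run time, an interface-typed variable also stores the actual type of the value assigned to it. That actual type is called the dynamic type of the interface variable. Consider this example: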

var err error
err = errors.New("error1")
fmt.Printf("%T\n", err)  // *errors.errorString

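Here we use errors.New to construct an error value and assign it to err, a variable of the error interface type. fmt.Printf then reports that the dynamic type of err is *errors.errorString.

An interface-typed variable can be assigned values of different dynamic types while the program runs, and the dynamic type information it stores changes with each assignment. This gives Go the flexibility of duck typing found in dynamic languages such as Python. Duck typing means that what a type can do (for example, whether it can be assigned to a given interface type) is determined not by its lineage (such as a parent class in C++) but by the behavior it exhibits (such as the methods it has).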

Consider the following example:

type QuackableAnimal interface {
    Quack()
}

type Duck struct{}

func (Duck) Quack() {
    println("duck quack!")
}

type Dog struct{}

func (Dog) Quack() {
    println("dog quack!")
}

type Bird struct{}

func (Bird) Quack() {
    println("bird quack!")
}                         
                          
func AnimalQuackInForest(a QuackableAnimal) {
    a.Quack()             
}                         
                          
func main() {             
    animals := []QuackableAnimal{new(Duck), new(Dog), new(Bird)}
    for _, animal := range animals {
        AnimalQuackInForest(animal)
    }  
}

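In this example, the interface type QuackableAnimal stands for animals that can make a sound. Duck, Bird, and Dog all have that trait, so pointers to values of all three types (created with new) can be assigned to the QuackableAnimal interface variable a. Each assignment stores different dynamic type information in a, and the result of calling Quack depends on the dynamic type stored in a.

Let's first look at some code adapted from an example in the Go FAQ: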

type MyError struct {
    error
}

var ErrBad = MyError{
    error: errors.New("bad things happened"),
}

func bad() bool {
    return false
}

func returnsError() error {
    var p *MyError = nil
    if bad() {
        p = &ErrBad
    }
    return p
}

func main() {
    err := returnsError()
    if err != nil {
        fmt.Printf("error occur: %+v\n", err)
        return
    }
    fmt.Println("ok")
}

Running this program prints:

error occur: <nil>
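The program reports an error even though p was never pointed at ErrBad. One way to avoid the surprise, sketched here under the assumption that we are free to change returnsError, is to return a plain nil explicitly instead of a nil *MyError. The name returnsErrorFixed is hypothetical and not part of the original example.

```go
package main

import (
	"errors"
	"fmt"
)

type MyError struct {
	error
}

var ErrBad = MyError{
	error: errors.New("bad things happened"),
}

func bad() bool {
	return false
}

// returnsErrorFixed is a hypothetical variant of returnsError. It never
// stores a nil *MyError inside the returned error interface value.
func returnsErrorFixed() error {
	if bad() {
		return &ErrBad
	}
	return nil // plain nil: the interface carries no dynamic type
}

func main() {
	err := returnsErrorFixed()
	if err != nil {
		fmt.Printf("error occur: %+v\n", err)
		return
	}
	fmt.Println("ok") // prints "ok"
}
```

The only change is that the nil pointer never passes through the error return value, so the caller's err != nil check behaves as expected.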

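Because interface types are both static and dynamic, the internal representation of an interface variable is far less simple than that of an ordinary statically typed variable such as an int or float64. We can find the run-time representation of interface variables in $GOROOT/src/runtime/runtime2.go: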

// $GOROOT/src/runtime/runtime2.go
type iface struct {
    tab  *itab
    data unsafe.Pointer
}

type eface struct {
    _type *_type
    data  unsafe.Pointer
}

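So where do they differ? eface represents the empty interface type, which has no method list, so its first pointer field points to a _type structure that describes the dynamic type of the interface variable. It is defined like this: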

// $GOROOT/src/runtime/type.go

type _type struct {
    size       uintptr
    ptrdata    uintptr // size of memory prefix holding all pointers
    hash       uint32
    tflag      tflag
    align      uint8
    fieldAlign uint8
    kind       uint8
    // function for comparing objects of this type
    // (ptr to object A, ptr to object B) -> ==?
    equal func(unsafe.Pointer, unsafe.Pointer) bool
    // gcdata stores the GC type data for the garbage collector.
    // If the KindGCProg bit is set in kind, gcdata is a GC program.
    // Otherwise it is a ptrmask bitmap. See mbitmap.go for details.
    gcdata    *byte
    str       nameOff
    ptrToThis typeOff
}

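iface, on the other hand, must store not only the dynamic type information but also information about the interface itself (its type information, method list, and so on) and about the methods implemented by the dynamic type. Its first field therefore points to an itab structure, defined as follows: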

// $GOROOT/src/runtime/runtime2.go
type itab struct {
    inter *interfacetype
    _type *_type
    hash  uint32 // copy of _type.hash. Used for type switches.
    _     [4]byte
    fun   [1]uintptr // variable sized. fun[0]==0 means _type does not implement inter.
}

// $GOROOT/src/runtime/type.go
type interfacetype struct {
    typ     _type
    pkgpath name
    mhdr    []imethod
}

Next, let's use examples to make the structure of eface and iface concrete. First, an empty interface variable, which is represented by eface:

type T struct {
    n int
    s string
}

func main() {
    var t = T {
        n: 17,
        s: "hello, interface",
    }
    
    var ei interface{} = t // the Go runtime represents ei with an eface structure
    _ = ei                 // keeps the compiler from rejecting the unused variable
}

Now a more complex example: a non-empty interface variable, which is represented by iface:

type T struct {
    n int
    s string
}

func (T) M1() {}
func (T) M2() {}

type NonEmptyInterface interface {
    M1()
    M2()
}

func main() {
    var t = T{
        n: 18,
        s: "hello, interface",
    }
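    var i NonEmptyInterface = t // the Go runtime represents i with an iface structure
    _ = i                       // keeps the compiler from rejecting the unused variable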
}

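At compile time, the compiler replaces println with a specific function chosen by the type of the argument being printed. These functions are defined in $GOROOT/src/runtime/print.go, and the ones for eface and iface are implemented as follows: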

// $GOROOT/src/runtime/print.go
func printeface(e eface) {
    print("(", e._type, ",", e.data, ")")
}

func printiface(i iface) {
    print("(", i.tab, ",", i.data, ")")
}

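As we know, an interface variable that has not been given an initial value is nil; such a variable is a nil interface variable. Here is an example that prints the internal representation of such variables: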

func printNilInterface() {
  // nil interface variables
  var i interface{} // empty interface type
  var err error     // non-empty interface type
  println(i)
  println(err)
  println("i = nil:", i == nil)
  println("err = nil:", err == nil)
  println("i = err:", i == err)
}

Running this function prints:

(0x0,0x0)
(0x0,0x0)
i = nil: true
err = nil: true
i = err: true

Here is an example that prints the internal representation of empty interface variables:

  func printEmptyInterface() {
      var eif1 interface{} // empty interface type
      var eif2 interface{} // empty interface type
      var n, m int = 17, 18
  
      eif1 = n
      eif2 = m

      println("eif1:", eif1)
      println("eif2:", eif2)
      println("eif1 = eif2:", eif1 == eif2) // false
  
      eif2 = 17
      println("eif1:", eif1)
      println("eif2:", eif2)
      println("eif1 = eif2:", eif1 == eif2) // true
 
      eif2 = int64(17)
      println("eif1:", eif1)
      println("eif2:", eif2)
      println("eif1 = eif2:", eif1 == eif2) // false
 }

Running this example prints:

eif1: (0x10ac580,0xc00007ef48)
eif2: (0x10ac580,0xc00007ef40)
eif1 = eif2: false
eif1: (0x10ac580,0xc00007ef48)
eif2: (0x10ac580,0x10eb3d0)
eif1 = eif2: true
eif1: (0x10ac580,0xc00007ef48)
eif2: (0x10ac640,0x10eb3d8)
eif1 = eif2: false

Likewise, let's go straight to an example that prints the internal representation of non-empty interface variables:

type T int

func (t T) Error() string { 
    return "bad error"
}

func printNonEmptyInterface() { 
    var err1 error // non-empty interface type
    var err2 error // non-empty interface type
    err1 = (*T)(nil)
    println("err1:", err1)
    println("err1 = nil:", err1 == nil)

    err1 = T(5)
    err2 = T(6)
    println("err1:", err1)
    println("err2:", err2)
    println("err1 = err2:", err1 == err2)

    err2 = fmt.Errorf("%d\n", 5)
    println("err1:", err1)
    println("err2:", err2)
    println("err1 = err2:", err1 == err2)
}

Running this example prints:

err1: (0x10ed120,0x0)
err1 = nil: false
err1: (0x10ed1a0,0x10eb310)
err2: (0x10ed1a0,0x10eb318)
err1 = err2: false
err1: (0x10ed1a0,0x10eb310)
err2: (0x10ed0c0,0xc000010050)
err1 = err2: false

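Note the first assignment in this example, err1 = (*T)(nil). For it, println prints err1 as (0x10ed120,0x0): the type information of the interface variable is not empty even though its data pointer is nil, so err1 does not equal nil, which prints as (0x0,0x0).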
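When an interface value prints with a non-zero type word and a zero data pointer, as err1 does right after the (*T)(nil) assignment, comparing it with nil is false. A minimal sketch of how such a value could be detected with the reflect package follows. The helper holdsNilPointer is a hypothetical name, and the check only covers pointer dynamic types.

```go
package main

import (
	"fmt"
	"reflect"
)

type T int

func (t T) Error() string {
	return "bad error"
}

// holdsNilPointer is a hypothetical helper: it reports whether err is a
// non-nil interface value whose dynamic value is a nil pointer.
func holdsNilPointer(err error) bool {
	if err == nil {
		return false // no dynamic type at all: a true nil interface
	}
	v := reflect.ValueOf(err)
	return v.Kind() == reflect.Ptr && v.IsNil()
}

func main() {
	var err error = (*T)(nil)
	fmt.Println(err == nil)            // false
	fmt.Println(holdsNilPointer(err))  // true
	fmt.Println(holdsNilPointer(T(5))) // false
}
```

In ordinary code it is usually simpler to avoid creating such values in the first place; the helper is only meant to make the (type, data) pair visible from portable code.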

Here is an example that compares a non-empty interface variable with an empty interface variable:

func printEmptyInterfaceAndNonEmptyInterface() {
  var eif interface{} = T(5)
  var err error = T(5)
  println("eif:", eif)
  println("err:", err)
  println("eif = err:", eif == err)

  err = T(6)
  println("eif:", eif)
  println("err:", err)
  println("eif = err:", eif == err)
}

This example prints:

eif: (0x10b3b00,0x10eb4d0)
err: (0x10ed380,0x10eb4d8)
eif = err: true
eif: (0x10b3b00,0x10eb4d0)
err: (0x10ed380,0x10eb4e0)
eif = err: false

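Note, however, that eface, iface, and their components in the runtime may change from one Go version to the next, so this approach of copying the runtime definitions into our own code is not portable across versions. In other words, code copied from Go 1.17 may only work when compiled with Go 1.17. Let's take Go 1.17 as the example here: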

// dumpinterface.go 
type eface struct {
    _type *_type
    data  unsafe.Pointer
}

type tflag uint8
type nameOff int32
type typeOff int32

type _type struct {
    size       uintptr
    ptrdata    uintptr // size of memory prefix holding all pointers
    hash       uint32
    tflag      tflag
    align      uint8
    fieldAlign uint8
    kind       uint8
    // function for comparing objects of this type
    // (ptr to object A, ptr to object B) -> ==?
    equal func(unsafe.Pointer, unsafe.Pointer) bool
    // gcdata stores the GC type data for the garbage collector.
    // If the KindGCProg bit is set in kind, gcdata is a GC program.
    // Otherwise it is a ptrmask bitmap. See mbitmap.go for details.
    gcdata    *byte
    str       nameOff
    ptrToThis typeOff
}

type iface struct {
    tab  *itab
    data unsafe.Pointer
}

type itab struct {
    inter *interfacetype
    _type *_type
    hash  uint32 // copy of _type.hash. Used for type switches.
    _     [4]byte
    fun   [1]uintptr // variable sized. fun[0]==0 means _type does not implement inter.
}

... ...

const ptrSize = unsafe.Sizeof(uintptr(0))

func dumpEface(i interface{}) {
    ptrToEface := (*eface)(unsafe.Pointer(&i))
    fmt.Printf("eface: %+v\n", *ptrToEface)

    if ptrToEface._type != nil {
        // dump _type info
        fmt.Printf("\t _type: %+v\n", *(ptrToEface._type))
    }

    if ptrToEface.data != nil {
        // dump data
        switch i.(type) {
        case int:
            dumpInt(ptrToEface.data)
        case float64:
            dumpFloat64(ptrToEface.data)
        case T:
            dumpT(ptrToEface.data)

        // other cases ... ...
        default:
            fmt.Printf("\t unsupported data type\n")
        }
    }
    fmt.Printf("\n")
}

func dumpItabOfIface(ptrToIface unsafe.Pointer) {
    p := (*iface)(ptrToIface)
    fmt.Printf("iface: %+v\n", *p)

    if p.tab != nil {
        // dump itab
        fmt.Printf("\t itab: %+v\n", *(p.tab))
        // dump inter in itab
        fmt.Printf("\t\t inter: %+v\n", *(p.tab.inter))

        // dump _type in itab
        fmt.Printf("\t\t _type: %+v\n", *(p.tab._type))

        // dump fun in tab
        funPtr := unsafe.Pointer(&(p.tab.fun))
        fmt.Printf("\t\t fun: [")
        for i := 0; i < len((*(p.tab.inter)).mhdr); i++ {
            tp := (*uintptr)(unsafe.Pointer(uintptr(funPtr) + uintptr(i)*ptrSize))
            fmt.Printf("0x%x(%d),", *tp, *tp)
        }
        fmt.Printf("]\n")
    }
}

func dumpDataOfIface(i interface{}) {
    // this is a trick as the data part of eface and iface are same
    ptrToEface := (*eface)(unsafe.Pointer(&i))

    if ptrToEface.data != nil {
        // dump data
        switch i.(type) {
        case int:
            dumpInt(ptrToEface.data)
        case float64:
            dumpFloat64(ptrToEface.data)
        case T:
            dumpT(ptrToEface.data)

        // other cases ... ...

        default:
            fmt.Printf("\t unsupported data type\n")
        }
    }
    fmt.Printf("\n")
}

func dumpT(dataOfIface unsafe.Pointer) {
    var p *T = (*T)(dataOfIface)
    fmt.Printf("\t data: %+v\n", *p)
}
... ...

Let's use these three functions to print information about the interface variables from the printEmptyInterfaceAndNonEmptyInterface function above:

package main

import "unsafe"

type T int

func (t T) Error() string {
    return "bad error"
}

func main() {
    var eif interface{} = T(5)
    var err error = T(5)
    println("eif:", eif)
    println("err:", err)
    println("eif = err:", eif == err)
    
    dumpEface(eif)
    dumpItabOfIface(unsafe.Pointer(&err))
    dumpDataOfIface(err)
}

Running this example produces the following output:

eif: (0x10b38c0,0x10e9b30)
err: (0x10eb690,0x10e9b30)
eif = err: true
eface: {_type:0x10b38c0 data:0x10e9b30}
   _type: {size:8 ptrdata:0 hash:1156555957 tflag:15 align:8 fieldAlign:8 kind:2 equal:0x10032e0 gcdata:0x10e9a60 str:4946 ptrToThis:58496}
   data: bad error

iface: {tab:0x10eb690 data:0x10e9b30}
   itab: {inter:0x10b5e20 _type:0x10b38c0 hash:1156555957 _:[0 0 0 0] fun:[17454976]}
     inter: {typ:{size:16 ptrdata:16 hash:235953867 tflag:7 align:8 fieldAlign:8 kind:20 equal:0x10034c0 gcdata:0x10d2418 str:3666 ptrToThis:26848} pkgpath:{bytes:<nil>} mhdr:[{name:2592 ityp:43520}]}
     _type: {size:8 ptrdata:0 hash:1156555957 tflag:15 align:8 fieldAlign:8 kind:2 equal:0x10032e0 gcdata:0x10e9a60 str:4946 ptrToThis:58496}
     fun: [0x10a5780(17454976),]
   data: bad error

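The output shows that eif's _type (0x10b38c0) matches err's tab._type (0x10b38c0), and the data their data pointers refer to ("bad error") matches as well, so the expression eif == err evaluates to true.

To stress this once more: the implementation above may have been tested only on Go 1.17. Also, when printing the data part of an iface or eface, it only implements reading for the int, float64, and T types rather than for every type; you can implement the remaining data types as you need them. You can find the complete code of dumpinterface.go here.

We now know that interface types have a complex internal structure, so assigning a value to an interface variable cannot be as simple as var i int = 5. What does the assignment actually involve? It is a "boxing" process.

In Go, assigning a value of any type to an interface variable is a boxing operation. From what we have learned about the internal representation of interface variables, we know that boxing into an interface really means creating an eface or an iface. Let's briefly describe that process, that is, how interface boxing works.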
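Because the unsafe-based dump is tied to one Go version, it may help to see what can be recovered portably. The following is a minimal sketch that uses the standard reflect package to report the dynamic type stored in an interface value. It exposes far less than the dump above (no addresses, no itab), and the helper name describe is hypothetical.

```go
package main

import (
	"fmt"
	"reflect"
)

type T int

func (t T) Error() string {
	return "bad error"
}

// describe is a hypothetical helper: it summarizes the dynamic type held
// by an interface value using only the public reflect API.
func describe(i interface{}) string {
	if i == nil {
		return "nil interface" // neither type information nor data
	}
	t := reflect.TypeOf(i)
	return fmt.Sprintf("type=%v kind=%v methods=%d", t, t.Kind(), t.NumMethod())
}

func main() {
	var eif interface{} = T(5)
	var err error = T(5)

	// Both variables hold the same dynamic type, so both lines match,
	// even though eif is an empty interface and err a non-empty one.
	fmt.Println(describe(eif))
	fmt.Println(describe(err))
	fmt.Println(describe(nil)) // nil interface
}
```

Passing err to describe converts it to an empty interface, but the dynamic type travels with the value, which is the portable counterpart of eif's _type matching err's tab._type.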

// interface_internal.go

  type T struct {
      n int
      s string
  }
  
  func (T) M1() {}
  func (T) M2() {}
  
  type NonEmptyInterface interface {
      M1()
      M2()
  }
  
  func main() {
      var t = T{
          n: 17,
          s: "hello, interface",
      }
      var ei interface{}
      ei = t
 
      var i NonEmptyInterface
      i = t
      fmt.Println(ei)
      fmt.Println(i)
  }

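In this example, the assignments to both ei and i trigger boxing. To find out what Go does behind the scenes, we need to go one level down and dump the assembly that corresponds to the Go code above: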

$ go tool compile -S interface_internal.go > interface_internal.s

    0x0026 00038 (interface_internal.go:24) MOVQ    $17, ""..autotmp_15+104(SP)
    0x002f 00047 (interface_internal.go:24) LEAQ    go.string."hello, interface"(SB), CX
    0x0036 00054 (interface_internal.go:24) MOVQ    CX, ""..autotmp_15+112(SP)
    0x003b 00059 (interface_internal.go:24) MOVQ    $16, ""..autotmp_15+120(SP)
    0x0044 00068 (interface_internal.go:24) LEAQ    type."".T(SB), AX
    0x004b 00075 (interface_internal.go:24) LEAQ    ""..autotmp_15+104(SP), BX
    0x0050 00080 (interface_internal.go:24) PCDATA  $1, $0
    0x0050 00080 (interface_internal.go:24) CALL    runtime.convT2E(SB)

    0x005f 00095 (interface_internal.go:27) MOVQ    $17, ""..autotmp_15+104(SP)
    0x0068 00104 (interface_internal.go:27) LEAQ    go.string."hello, interface"(SB), CX
    0x006f 00111 (interface_internal.go:27) MOVQ    CX, ""..autotmp_15+112(SP)
    0x0074 00116 (interface_internal.go:27) MOVQ    $16, ""..autotmp_15+120(SP)
    0x007d 00125 (interface_internal.go:27) LEAQ    go.itab."".T,"".NonEmptyInterface(SB), AX
    0x0084 00132 (interface_internal.go:27) LEAQ    ""..autotmp_15+104(SP), BX
    0x0089 00137 (interface_internal.go:27) PCDATA  $1, $1
    0x0089 00137 (interface_internal.go:27) CALL    runtime.convT2I(SB)

// $GOROOT/src/runtime/iface.go

func convT2E(t *_type, elem unsafe.Pointer) (e eface) {
    if raceenabled {
        raceReadObjectPC(t, elem, getcallerpc(), funcPC(convT2E))
    }
    if msanenabled {
        msanread(elem, t.size)
    }
    x := mallocgc(t.size, t, true)
    typedmemmove(t, x, elem)
    e._type = t
    e.data = x
    return
}

func convT2I(tab *itab, elem unsafe.Pointer) (i iface) {
    t := tab._type
    if raceenabled {
        raceReadObjectPC(t, elem, getcallerpc(), funcPC(convT2I))
    }
    if msanenabled {
        msanread(elem, t.size)
    }
    x := mallocgc(t.size, t, true)
    typedmemmove(t, x, elem)
    i.tab = tab
    i.data = x
    return
}

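convT2E converts a value of any type into an eface, and convT2I converts a value of any type into an iface. The two functions work alike: using the type information passed in (_type for convT2E, tab._type for convT2I), they allocate a block of memory and copy the data that elem points to into it. The type information passed in becomes the type information of the returned structure, and the returned structure's data pointer points to the newly allocated memory.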

func main() {
  var n int = 61
  var ei interface{} = n
  n = 62  // n has changed
  fmt.Println("data in box:", ei) // still prints 61
}
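The same copy happens for larger values, and it is worth contrasting with boxing a pointer. The sketch below is only an illustration; point and boxThenMutate are hypothetical names. Boxing a struct copies the struct, while boxing a pointer copies just the pointer, so later changes to the original are visible through the boxed pointer but not through the boxed value.

```go
package main

import "fmt"

type point struct {
	x, y int
}

// boxThenMutate is a hypothetical helper: it boxes p by value and by
// pointer, changes p afterwards, and returns what each box observes.
func boxThenMutate() (fromValue, fromPointer int) {
	p := point{x: 1, y: 2}

	var byValue interface{} = p    // boxing copies the whole struct
	var byPointer interface{} = &p // boxing copies only the pointer

	p.x = 100 // modify the original after both boxes were created

	return byValue.(point).x, byPointer.(*point).x
}

func main() {
	v, ptr := boxThenMutate()
	fmt.Println("boxed value:", v)     // boxed value: 1
	fmt.Println("boxed pointer:", ptr) // boxed pointer: 100
}
```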

// $GOROOT/src/runtime/iface.go
func convT16(val any) unsafe.Pointer     // val must be uint16-like
func convT32(val any) unsafe.Pointer     // val must be uint32-like
func convT64(val any) unsafe.Pointer     // val must be uint64-like
func convTstring(val any) unsafe.Pointer // val must be a string
func convTslice(val any) unsafe.Pointer  // val must be a slice

// $GOROOT/src/runtime/iface.go
// staticuint64s is used to avoid allocating in convTx for small integer values.
var staticuint64s = [...]uint64{
    0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
    0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
  ... ...
}
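The comment on staticuint64s says the table exists to avoid allocating for small integer values. One rough way to observe this from user code is testing.AllocsPerRun from the standard library. The sketch below assumes the boxed value is not a compile-time constant and is forced to escape through a package-level variable; allocsWhenBoxing and sink are hypothetical names. The exact numbers are an implementation detail and may differ between Go versions, so no expected output is shown.

```go
package main

import (
	"fmt"
	"testing"
)

// sink is package-level so that the boxed value escapes and the compiler
// cannot keep the interface value on the stack.
var sink interface{}

// allocsWhenBoxing is a hypothetical helper: it reports the average number
// of heap allocations needed to box v into an empty interface.
func allocsWhenBoxing(v int) float64 {
	return testing.AllocsPerRun(1000, func() {
		sink = v // boxing happens here
	})
}

func main() {
	// A value small enough to be served from a static table, if the
	// runtime in use applies that optimization.
	fmt.Println("small value:", allocsWhenBoxing(17))
	// A value well outside any small-value table.
	fmt.Println("large value:", allocsWhenBoxing(1<<20))
}
```

If the optimization applies, the small value should report fewer allocations per run than the large one.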

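To understand these two properties of Go interfaces better, we need to go down to how interfaces are represented at run time. Interface variables are represented at run time as eface and iface: eface for empty interface variables and iface for non-empty ones. Two interface variables are equal only when their type information (eface._type/iface.tab._type) is the same and the data their data pointers (eface.data/iface.data) refer to is the same.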