A Study of Go: Interfaces

The empty interface: interface{}

Assigning to interface{}

interface{} is somewhat like void* in C/C++: in Go it can hold a value of any type, including int, string, struct, function, map, and nil:

var a interface{} = 1 // the literal 1 has type int
var v interface{} = nil
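As a small illustration of the point above, the sketch below stores several kinds of values in interface{} and prints the dynamic type each one carries. The helper name describe is made up for this example.

```go
package main

import "fmt"

// describe is a hypothetical helper: it reports the dynamic type
// held by an interface{} value, using the %T verb.
func describe(v interface{}) string {
	return fmt.Sprintf("%T", v)
}

func main() {
	values := []interface{}{
		1,                      // int
		"go",                   // string
		3.14,                   // float64
		[]int{1, 2},            // slice
		map[string]int{"a": 1}, // map
		struct{}{},             // struct
		nil,                    // no type and no value
	}
	for _, v := range values {
		fmt.Printf("%-16s %v\n", describe(v), v)
	}
}
```

The same slice element type accepts every value, yet each element still remembers what it really is.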

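Because every Go variable carries type information, a value stored in an interface{} keeps its type information as well. That is what makes runtime features such as reflection possible, and it is where interface{} differs from void*. An interface{} can also be converted back to a concrete type with a type assertion: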

var a interface{} = 1
b := a.(int)
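The single-value assertion shown above panics if the dynamic type does not match. A minimal sketch of the two safer forms, the two-value ("comma ok") assertion and the type switch, follows; the function names asInt and kindOf are invented for this example.

```go
package main

import "fmt"

// asInt uses the two-value assertion form, which reports failure
// through ok instead of panicking.
func asInt(v interface{}) (int, bool) {
	n, ok := v.(int)
	return n, ok
}

// kindOf uses a type switch to branch on the dynamic type.
func kindOf(v interface{}) string {
	switch v.(type) {
	case nil:
		return "nil"
	case int:
		return "int"
	case string:
		return "string"
	default:
		return "other"
	}
}

func main() {
	var a interface{} = 1

	n, ok := asInt(a)
	fmt.Println(n, ok) // 1 true

	_, ok = asInt("1")
	fmt.Println(ok) // false

	fmt.Println(kindOf(a), kindOf("x"), kindOf(nil), kindOf(1.5)) // int string nil other
}
```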

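Under the hood, the empty interface interface{} is implemented by the eface struct (short for "empty interface"). An eface is essentially a pair<type, data>: type records the actual type of the value and data points to the value itself. The definitions are: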

type eface struct {
    _type *_type
    data  unsafe.Pointer
}

type _type struct {
    size       uintptr // size of the type in bytes
    ptrdata    uintptr // size of memory prefix holding all pointers
    hash       uint32  // hash of type; avoids computation in hash tables
    tflag      tflag   // extra type information flags
    align      uint8   // alignment of a variable of this type
    fieldalign uint8   // alignment of a struct field of this type
    kind       uint8   // broad category of the type, matching Kind in reflect
    alg        *typeAlg // algorithm function pointers (hash and equal in this version)
    gcdata    *byte    // garbage collection data
    str       nameOff  // string form
    ptrToThis typeOff  // type for pointer to this type, may be zero
}
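The two-word layout can be observed from user code with unsafe. The sketch below is for illustration only: the struct efaceWords and both helpers are made up here, and the code assumes the gc toolchain's current representation of interface{}, which the language specification does not guarantee.

```go
package main

import (
	"fmt"
	"unsafe"
)

// efaceWords mirrors the (type, data) pair described above.
// Assumption: it matches the layout used by the gc toolchain.
type efaceWords struct {
	typ  unsafe.Pointer
	data unsafe.Pointer
}

// words reinterprets an interface{} variable as its two raw words.
func words(p *interface{}) efaceWords {
	return *(*efaceWords)(unsafe.Pointer(p))
}

// sameType reports whether two interface values share a type word.
func sameType(a, b interface{}) bool {
	return words(&a).typ == words(&b).typ
}

func main() {
	var x interface{} = 1
	var y interface{} = 2
	var z interface{} = "s"
	var n interface{} // nil interface

	fmt.Println(sameType(x, y)) // both hold int
	fmt.Println(sameType(x, z)) // int versus string

	w := words(&n)
	fmt.Println(w.typ == nil, w.data == nil)

	// An interface{} occupies exactly two machine words.
	fmt.Println(unsafe.Sizeof(x) == 2*unsafe.Sizeof(uintptr(0)))
}
```

Code like this should stay in experiments; production code should reach the same information through reflect.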

In the Go 1.7 source, assigning a variable to an interface{} goes through convT2E:

func convT2E(t *_type, elem unsafe.Pointer, x unsafe.Pointer) (e eface) {
    if raceenabled {
        raceReadObjectPC(t, elem, getcallerpc(unsafe.Pointer(&t)), funcPC(convT2E))
    }   
    if msanenabled {
        msanread(elem, t.size)
    }   
    if isDirectIface(t) {
        throw("direct convT2E")
    }   
    if x == nil {
        x = newobject(t)
        // TODO: We allocate a zeroed object only to overwrite it with
        // actual data. Figure out how to avoid zeroing. Also below in convT2I.
    }   
    typedmemmove(t, x, elem)
    e._type = t 
    e.data = x 
    return
}

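As the code shows, the runtime copies the value with typedmemmove, so data does not simply point at the original variable's storage. This is probably also why modifying a value through reflection fails unless a pointer was passed: the change would only apply to the copy.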

The following experiment shows the effect:

package main

import (
    "fmt"
)

type User struct {
    id int
    name string 
}

func main() {
    u := User{1, "Tom"}
    var i interface{} = u
    u.id = 2
    u.name = "Jack"
    fmt.Printf("u: %#v\n", u)
    fmt.Printf("i: %#v\n", i)
    
    u2 := &User{2, "Tom2"}
    var i2 interface{} = u2
    u2.id = 2
    u2.name = "Jack2"
    fmt.Printf("u2: %#v\n", u2)
    fmt.Printf("i2: %#v\n", i2)
}

The output is:

u: main.User{id:2, name:"Jack"}
i: main.User{id:1, name:"Tom"}
u2: &main.User{id:2, name:"Jack2"}
i2: &main.User{id:2, name:"Jack2"}

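This confirms the copy seen in convT2E: i still holds the original User because it received a copy of u, whereas i2 holds the pointer itself and therefore sees the update.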
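The remark about reflection can be checked in a similar way. In this sketch the User fields are exported, unlike in the experiment above, because reflect refuses to set unexported fields regardless of how the value was passed; the helper names are invented for the example.

```go
package main

import (
	"fmt"
	"reflect"
)

// User uses exported fields here (an assumption of this sketch),
// since reflect can never set unexported ones.
type User struct {
	ID   int
	Name string
}

// canSetViaValue passes User by value, so reflect only sees a copy
// and reports its fields as not settable.
func canSetViaValue(u User) bool {
	return reflect.ValueOf(u).Field(0).CanSet()
}

// renameViaPointer passes a pointer; Elem() reaches the original
// variable, which is addressable and therefore settable.
func renameViaPointer(u *User, name string) {
	reflect.ValueOf(u).Elem().FieldByName("Name").SetString(name)
}

func main() {
	u := User{ID: 1, Name: "Tom"}

	fmt.Println(canSetViaValue(u)) // false

	renameViaPointer(&u, "Jack")
	fmt.Println(u.Name) // Jack
}
```

Calling SetString on the by-value reflect.Value instead would panic, which is reflect's way of refusing to modify a copy silently.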

interface{} and nil

When nil is assigned to an interface{} variable, both the type and data fields are set to nil, so the interface value itself is nil.

When a nil value of some other type is assigned to an interface{}, however, type holds that concrete type and only data is nil, so the resulting eface is not nil.

package main
  
import (
    "fmt"
)

type User struct {
    id int
    name string 
}

func main() {
    var i1 interface{} = nil   // type and data are both nil
    fmt.Printf("%v\n", i1 == nil)

    var u2 *User
    var i2 interface{} = u2   // type is *User, data is nil
    fmt.Printf("%v\n", i2 == nil)
}

Output:

true
false

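This applies not only to the empty interface interface{}. An interface with methods that is assigned a nil value of a concrete type does not compare equal to nil either; only an interface assigned nil directly is truly nil. Put another way, the literal nil assigned to an interface is the one whose type and data are both nil.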
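This behaviour most often causes trouble with the error interface. The sketch below, with a made-up MyErr type and two made-up functions, shows how returning a nil *MyErr through an error return type produces a non-nil error.

```go
package main

import "fmt"

// MyErr is a hypothetical error type used only in this example.
type MyErr struct{ msg string }

// Error tolerates a nil receiver so that calling it never panics.
func (e *MyErr) Error() string {
	if e == nil {
		return "<nil MyErr>"
	}
	return e.msg
}

// bad returns a nil *MyErr wrapped in the error interface:
// the type word is *MyErr, so the result is not equal to nil.
func bad() error {
	var e *MyErr
	return e
}

// good returns the literal nil, so both words are nil.
func good() error {
	return nil
}

func main() {
	fmt.Println(bad() == nil)  // false
	fmt.Println(good() == nil) // true
}
```

A common way to avoid the trap is to return the literal nil explicitly on the success path rather than a typed pointer variable.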

Non-empty interfaces

Assigning to a non-empty interface

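Non-empty interfaces are generally used for runtime polymorphism, similar to C++. When a variable (a struct, for instance) is assigned to a non-empty interface, the compiler first checks that its type implements every method the interface requires, and reports an error if it does not. For example: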

type I interface {
    String()
}
var a int = 5
var b I = a

The compiler reports:

cannot use a (type int) as type I in assignment: int does not implement I (missing String method)
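One way to make such an assignment compile is to declare a named type and give it the required method. The sketch below is a variation, not the original snippet: String returns a string here (the conventional fmt.Stringer shape) so the result can be inspected, and MyInt is a hypothetical type, needed because methods cannot be declared on int itself.

```go
package main

import "fmt"

// I is a variation on the interface above: String returns a string.
type I interface {
	String() string
}

// MyInt is a hypothetical named type based on int.
type MyInt int

// String gives MyInt the method that I requires.
func (m MyInt) String() string {
	return fmt.Sprintf("MyInt(%d)", int(m))
}

// Compile-time assertion: the build fails if MyInt stops implementing I.
var _ I = MyInt(0)

func main() {
	var a MyInt = 5
	var b I = a // accepted: MyInt has a String method
	fmt.Println(b.String())
}
```

The blank-identifier assignment is a common idiom for running the compiler's check without creating a usable variable.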

At runtime, the assignment is backed by the iface struct:

type iface struct {
    tab  *itab
    data unsafe.Pointer
}

// layout of Itab known to compilers
// allocated in non-garbage-collected memory
// Needs to be in sync with
// ../cmd/compile/internal/gc/reflect.go:/^func.dumptypestructs.
type itab struct {
    inter  *interfacetype
    _type  *_type
    link   *itab
    bad    int32
    unused int32
    fun    [1]uintptr // variable sized
}

type interfacetype struct {
	typ     _type
	pkgpath name
	mhdr    []imethod
}

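The itab struct holds two types: 1) the type of the interface itself, *interfacetype; 2) the actual type of the concrete implementation that data points to, *_type. interfacetype wraps _type and adds data specific to interfaces, so that it can describe an interface type. Its mhdr member is the interface's method set, but note that this is only function prototype metadata, not function bodies; the bodies are defined by the type that implements the interface.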

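Unlike an empty interface, a non-empty interface must also carry the concrete methods that implement it; these are stored in itab.fun. Although fun is declared as an array of one element, the function pointers are actually laid out contiguously in memory, one after another.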

teh-cmc, a French blogger, walks through the assembly in go-internals to show in detail how the runtime fills in each member of the itab struct; interested readers can look it up.

Once the itab has been filled in, the runtime can assign the variable to the non-empty interface by calling convT2I:

func convT2I(tab *itab, elem unsafe.Pointer) (i iface) {
	t := tab._type
	if raceenabled {
		raceReadObjectPC(t, elem, getcallerpc(unsafe.Pointer(&tab)), funcPC(convT2I))
	}
	if msanenabled {
		msanread(elem, t.size)
	}
	if isDirectIface(t) {
		// This case is implemented directly by the compiler.
		throw("direct convT2I")
	}
	x := newobject(t)
	typedmemmove(t, x, elem)
	i.tab = tab
	i.data = x
	return
}

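Here x := newobject(t) allocates an object of type t on the heap. So in this version of the runtime, wherever the variable being assigned to the non-empty interface lives, the assignment creates a new object on the heap and stores that object's type and pointer in the interface, which may also cause the variable to escape. The conversion is therefore relatively expensive. Consider the following benchmark: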

package main

import "testing"

type Addifier interface{ Add(a, b int32) int32 }

type Adder struct{ id int32 }

//go:noinline
func (adder Adder) Add(a, b int32) int32 { return a + b }

func BenchmarkDirect(b *testing.B) {
    adder := Adder{id: 6754}
    for i := 0; i < b.N; i++ {
        adder.Add(10, 32)
    }
}

func BenchmarkInterface(b *testing.B) {
    adder := Adder{id: 6754}
    for i := 0; i < b.N; i++ {
        Addifier(adder).Add(10, 32)
    }
}
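The allocation described for convT2I can also be observed directly with testing.AllocsPerRun. The sketch below makes several assumptions of its own: Adder is widened to four int64 fields, the value is built from a changing counter, and the result is stored in a package-level variable named sink. Newer compilers avoid the allocation in a number of cases (constants, very small values, interfaces that do not escape), and these choices are meant to rule those out.

```go
package main

import (
	"fmt"
	"testing"
)

type Addifier interface{ Add(a, b int32) int32 }

// Adder is deliberately larger than one machine word in this sketch.
type Adder struct{ id, x, y, z int64 }

func (adder Adder) Add(a, b int32) int32 { return a + b }

// sink is package-level so the stored interface outlives each call.
var sink Addifier

// allocsByValue counts allocations per conversion of an Adder value.
func allocsByValue() float64 {
	n := int64(0)
	return testing.AllocsPerRun(1000, func() {
		n++
		// The value is copied into a freshly allocated object.
		sink = Adder{id: n, x: n, y: n, z: n}
	})
}

// allocsByPointer counts allocations per conversion of a *Adder.
func allocsByPointer() float64 {
	p := &Adder{id: 1}
	return testing.AllocsPerRun(1000, func() {
		// The pointer is stored in the data word as-is.
		sink = p
	})
}

func main() {
	fmt.Println("by value:  ", allocsByValue())
	fmt.Println("by pointer:", allocsByPointer())
}
```

When the conversion sits on a hot path, storing a pointer in the interface is one way to avoid the per-conversion copy, at the cost of sharing the underlying object.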