How Go Maps Grow: An Analysis of the Underlying Mechanism

Backend

2023-10-07 09:14:03

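Map Growth in Go: A Deep Dive into the Implementation

In earlier posts we covered the basic usage of slices and maps in Go, as well as how slices grow. This post turns to map growth and walks through its underlying implementation from the perspective of the runtime source.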

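Overview of Maps

A map is one of Go's core data structures: a hash table that stores key-value pairs. Keys can be of any comparable type and values of any type, and lookups, insertions and deletions take constant time on average. When the number of entries passes a threshold, the runtime automatically enlarges the map's bucket array to make room for more data. This process is called map growth.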
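To make the "automatic" part concrete, here is a minimal sketch showing that growth is invisible to calling code. The helper name fill and the sizes used are illustrative choices, not anything from the runtime. The second call passes a size hint to make, which lets the runtime allocate enough room up front so that it does not need to grow while the map is being filled.

```go
package main

import "fmt"

// fill inserts n entries into a map created with the given size hint.
// A hint of 0 starts with a small map that grows as entries are added.
func fill(n, hint int) map[int]string {
	m := make(map[int]string, hint)
	for i := 0; i < n; i++ {
		m[i] = fmt.Sprintf("value-%d", i)
	}
	return m
}

func main() {
	grown := fill(1000, 0)       // grows several times along the way
	presized := fill(1000, 1000) // room for 1000 entries is reserved up front

	// Both maps behave identically from the caller's point of view.
	fmt.Println(len(grown), len(presized))
	v, ok := grown[42]
	fmt.Println(v, ok)
}
```

When the final size is known in advance, passing it as the hint is usually the simplest way to avoid paying for repeated growth.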

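How a Map Grows

In the classic bucket-based implementation (Go releases before 1.24, which switched to a Swiss Table design), growth can be broken into the following steps:

Check the growth condition: When a new key is inserted, the runtime checks whether the average number of entries per bucket would exceed the load factor of 6.5. Each bucket holds up to 8 entries, so this corresponds to buckets being about 81% full.
Choose the new size: If the load factor is exceeded, the number of buckets doubles. If the map has instead accumulated too many overflow buckets, it is rebuilt with the same number of buckets to compact its entries.
Allocate new memory: The runtime allocates a new bucket array of the chosen size.
Copy the data: Entries are moved from the old buckets into the new ones. The move is incremental: each later insertion or deletion relocates a small number of old buckets, so no single operation pays for copying the whole map.
Update the map pointers: The map header points to the new bucket array and keeps a reference to the old array until every old bucket has been moved, after which the old array can be reclaimed.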
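The growth condition in the first step can be sketched in a few lines. The constants below mirror the values used by the classic runtime (8 entries per bucket, load factor 13/2 = 6.5), and overLoadFactor follows the shape of the runtime check of the same name. The helper growthPoints is hypothetical and exists only for this illustration: it assumes a map that starts with a single bucket and ignores same-size growth caused by overflow buckets.

```go
package main

import "fmt"

const (
	bucketCnt     = 8  // entries per bucket
	loadFactorNum = 13 // loadFactorNum/loadFactorDen = 6.5
	loadFactorDen = 2
)

// overLoadFactor reports whether count entries placed in 1<<B buckets
// would exceed an average of 6.5 entries per bucket.
func overLoadFactor(count int, B uint8) bool {
	buckets := uint64(1) << B
	return count > bucketCnt && uint64(count) > loadFactorNum*(buckets/loadFactorDen)
}

// growthPoints simulates inserting n distinct keys into an empty map and
// returns the insertion numbers at which the bucket count would double.
func growthPoints(n int) []int {
	var B uint8
	var points []int
	for count := 0; count < n; count++ {
		// The check runs before entry number count+1 is stored.
		if overLoadFactor(count+1, B) {
			B++
			points = append(points, count+1)
		}
	}
	return points
}

func main() {
	fmt.Println(growthPoints(100)) // [9 14 27 53]
}
```

Under these assumptions the map doubles on the 9th, 14th, 27th and 53rd insertion, which is why the cost of growth is spread thinly over a long run of inserts.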

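Source Analysis of Map Growth

In Go releases before 1.24, map growth is implemented in runtime/map.go, where hashGrow starts a growth and growWork and evacuate move the entries. The listings below are conceptual sketches of the allocation and transfer steps, not verbatim runtime code: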

// growWork here is a conceptual sketch, not the runtime function of the same
// name: it computes the new map size and allocates two chunks of that size.
func growWork(sizelog uint8) (oldsize, newsize uintptr, ptr *uint8) {
    // Compute the new allocation size. Raising sizelog by one doubles
    // newsize, so the new map can hold up to twice as many entries as
    // the old map.

    newsize = uintptr(1) << sizelog
    if sizelog > 20 {
        panic("map size too large")
    }
    if newsize < 4*sys.PtrSize {
        newsize = 4 * sys.PtrSize
    }

    // Allocate room for two chunks of newsize bytes in a single call.
    // oldsize is half of newsize because the table doubles when it grows.

    oldsize = newsize / 2
    ptr = allocm(newsize + newsize)

    return
}

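This sketch first computes the new map size and then allocates room for two chunks of that size in one call. Here oldsize is the previous size of the map, newsize is the new size, and ptr points to the newly allocated memory.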
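Doubling has a useful consequence for where entries end up. The classic runtime picks a bucket from the low-order B bits of the key's hash, so after doubling, one extra hash bit decides whether an entry stays at its old index or moves exactly 2^B positions higher. The sketch below illustrates this; bucketIndex is a hypothetical helper and the hash values are made up.

```go
package main

import "fmt"

// bucketIndex returns the bucket selected by the low B bits of hash,
// for a table that has 1<<B buckets.
func bucketIndex(hash uint64, B uint8) uint64 {
	return hash & (uint64(1)<<B - 1)
}

func main() {
	const oldB = 2 // 4 buckets before growth, 8 after
	for _, h := range []uint64{0x13, 0x17, 0x2a, 0x2e} {
		before := bucketIndex(h, oldB)
		after := bucketIndex(h, oldB+1)
		fmt.Printf("hash %#x: bucket %d -> bucket %d\n", h, before, after)
	}
	// hash 0x13: bucket 3 -> bucket 3
	// hash 0x17: bucket 3 -> bucket 7
	// hash 0x2a: bucket 2 -> bucket 2
	// hash 0x2e: bucket 2 -> bucket 6
}
```

Because each old bucket splits into at most two new buckets, old buckets can be moved independently of one another, which is what makes an incremental transfer practical.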

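// transfer is a conceptual sketch; the runtime has no function of this name
// and instead moves entries incrementally in evacuate. It allocates a new
// map span and transfers the contents of the old map into it.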
func transfer(h *hmap, old *hmap) {
    // allocate a new map span.
    newsize := uintptr(1) << h.sizelog
    if h.sizelog > 20 {
        panic("map size too large")
    }
    if newsize < 4*sys.PtrSize {
        newsize = 4 * sys.PtrSize
    }
    newmap := allocm(newsize)

    // transfer old buckets into new map
    var b *bmap
    for i := uintptr(0); i < h.B; i++ {
        b = h.buckets[i]
        if b == nil {
            continue
        }

        // Make sure elements have correct hash in new table.
        // This handles the case where the hash depends on the map size.
        b.tov(h.sizelog)
        b.overflow = nil

        // link b into old chain so we can free it later.
        b.next = old.oldbuckets
        old.oldbuckets = b

        // update new map with new bucket.
        b.next = newmap.buckets[b.hash%uintptr(newsize)]
        newmap.buckets[b.hash%uintptr(newsize)] = b
    }

    // free old map span.
    mspan_free(old.spans[0], old.spans[1], false)

    // install new map.
    lock(&h.lock)
    h.spans[0] = newmap.spans[0]
    h.spans[1] = newmap.spans[1]
    h.B = newmap.B
    h.buckets = newmap.buckets
    h.oldbuckets = old.oldbuckets
    unlock(&h.lock)

    // free old buckets.
    for b := old.oldbuckets; b != nil; b = b.next {
        mspan_free(b.spans[0], b.spans[1], false)
    }
}
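The listing above moves everything in one pass. As a contrast, here is a toy table that moves entries incrementally, in the spirit of the classic runtime: each write first moves the old bucket it is about to touch, plus one more to guarantee progress. Everything in it is an assumption made for illustration. The table type and its methods are not runtime code, each key serves as its own hash, buckets are plain slices with no overflow chains, and there is no deletion or concurrency handling.

```go
package main

import "fmt"

type entry struct {
	key   uint64
	value string
}

// table is a toy hash table that grows incrementally.
type table struct {
	buckets    [][]entry // current bucket array; length is a power of two
	oldbuckets [][]entry // previous array; non-nil only while growing
	done       []bool    // done[i] reports whether old bucket i was moved
	pending    int       // old buckets still waiting to be moved
	nevacuate  int       // every old bucket below this index is done
	count      int
}

func newTable() *table { return &table{buckets: make([][]entry, 1)} }

func (t *table) growing() bool { return t.oldbuckets != nil }

// startGrow doubles the bucket array but moves no entries yet.
func (t *table) startGrow() {
	t.oldbuckets = t.buckets
	t.buckets = make([][]entry, 2*len(t.oldbuckets))
	t.done = make([]bool, len(t.oldbuckets))
	t.pending, t.nevacuate = len(t.oldbuckets), 0
}

// evacuate moves old bucket i into the new array, unless already moved.
func (t *table) evacuate(i uint64) {
	if t.done[i] {
		return
	}
	mask := uint64(len(t.buckets) - 1)
	for _, e := range t.oldbuckets[i] {
		t.buckets[e.key&mask] = append(t.buckets[e.key&mask], e)
	}
	t.oldbuckets[i], t.done[i] = nil, true
	t.pending--
	if t.pending == 0 {
		t.oldbuckets, t.done = nil, nil // growth has finished
	}
}

// growWork moves the old bucket that key maps to, plus one more.
func (t *table) growWork(key uint64) {
	t.evacuate(key & uint64(len(t.oldbuckets)-1))
	if t.growing() {
		for t.done[t.nevacuate] {
			t.nevacuate++
		}
		t.evacuate(uint64(t.nevacuate))
	}
}

func (t *table) put(key uint64, value string) {
	if t.growing() {
		t.growWork(key)
	}
	i := key & uint64(len(t.buckets)-1)
	for j := range t.buckets[i] {
		if t.buckets[i][j].key == key {
			t.buckets[i][j].value = value
			return
		}
	}
	// Start growing once the average load would exceed 6.5 per bucket.
	if !t.growing() && 2*(t.count+1) > 13*len(t.buckets) {
		t.startGrow()
		t.growWork(key)
		i = key & uint64(len(t.buckets)-1)
	}
	t.buckets[i] = append(t.buckets[i], entry{key, value})
	t.count++
}

func (t *table) get(key uint64) (string, bool) {
	b := t.buckets[key&uint64(len(t.buckets)-1)]
	if t.growing() {
		if i := key & uint64(len(t.oldbuckets)-1); !t.done[i] {
			b = t.oldbuckets[i] // not moved yet, so look in the old array
		}
	}
	for _, e := range b {
		if e.key == key {
			return e.value, true
		}
	}
	return "", false
}

func main() {
	t := newTable()
	for k := uint64(0); k < 100; k++ {
		t.put(k, fmt.Sprint("v", k))
	}
	v, ok := t.get(42)
	fmt.Println(v, ok, len(t.buckets), t.growing()) // v42 true 16 false
}
```

The key invariant is that a write always moves the relevant old bucket before storing anything, so a lookup only needs to consult the old array when that bucket has not been moved yet.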