Understanding Atomic Operations with Go's sync/atomic

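In concurrent programming, managing shared state correctly is essential for avoiding data races and guaranteeing predictable behavior. Go provides several concurrency control mechanisms, including mutexes, channels, and wait groups. A mutex (`sync.Mutex`) is a robust way to protect a critical section, but for simple operations such as incrementing a counter, the cost of locking and unlocking can be noticeable overhead. This is where atomic operations come in.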

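Go's sync/atomic package provides low-level primitives for common operations such as adding to a value, compare-and-swap, and loading, all performed in a thread-safe way without explicit locks. These operations are typically implemented with special CPU instructions that guarantee the operation completes as a single, indivisible step, even on multicore machines. That makes them very efficient for certain use cases.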

Why Use Atomic Operations?

Consider a scenario in which several goroutines need to increment a shared counter. A naive approach looks like this:

package main

import (
	"fmt"
	"sync"
)

func main() {
	counter := 0

	numGoroutines := 1000

	var wg sync.WaitGroup

	wg.Add(numGoroutines)

	for i := 0; i < numGoroutines; i++ {
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				counter++ // data race!
			}
		}()
	}

	wg.Wait()

	fmt.Println("Final Counter (potential race):", counter)
}

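If you run this code several times, the result can vary, and the final counter value will most likely be less than 1,000,000. The reason is that counter++ is not atomic: it involves three steps, a read, an increment, and a write. Goroutines running in parallel on different cores, or a context switch between those steps, can interleave them so that updates are lost. Running the program with the race detector (go run -race) reports the data race.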

One way to fix this is to use sync.Mutex:

package main

import (
	"fmt"
	"sync"
)

func main() {
	counter := 0

	numGoroutines := 1000

	var wg sync.WaitGroup

	var mu sync.Mutex // mutex that protects the counter

	wg.Add(numGoroutines)

	for i := 0; i < numGoroutines; i++ {
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				mu.Lock()
				counter++
				mu.Unlock()
			}
		}()
	}

	wg.Wait()

	fmt.Println("Final Counter (with mutex):", counter) // should be 1,000,000
}

Acquiring and releasing a mutex for every small increment is correct, but it can add unnecessary overhead. For simple arithmetic or value swaps, sync/atomic offers a more efficient alternative.

Core Atomic Operations

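The sync/atomic package provides atomic operations on several integer types (int32, int64, uint32, uint64, uintptr) and on pointers (unsafe.Pointer). Boolean flags are traditionally represented as integers (0 or 1); since Go 1.19 the package also offers typed wrappers such as atomic.Bool and atomic.Int64. The most commonly used functions are described below.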

1. `Add*` Functions

These functions atomically add a delta to a value and return the new value.

atomic.AddInt32(addr *int32, delta int32) (new int32)
atomic.AddInt64(addr *int64, delta int64) (new int64)
atomic.AddUint32(addr *uint32, delta uint32) (new uint32)
atomic.AddUint64(addr *uint64, delta uint64) (new uint64)

Let's refactor the counter example to use atomic.AddInt64:

package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

func main() {
	var counter int64 // use int64 for atomic operations

	numGoroutines := 1000

	var wg sync.WaitGroup

	wg.Add(numGoroutines)

	for i := 0; i < numGoroutines; i++ {
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				atomic.AddInt64(&counter, 1) // atomically add 1
			}
		}()
	}

	wg.Wait()

	fmt.Println("Final Counter (with atomic):", counter) // should be 1,000,000
}

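This version is not only correct; for a plain increment it is also generally more efficient than the mutex-based approach, because it avoids the overhead of managing a mutex and relies on low-level CPU instructions.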
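
Two details of `Add*` are easy to miss: the return value is the new value, and a decrement is just an addition. The sketch below illustrates both. The helper names (nextID, release, decUint32, countUniqueIDs) are hypothetical and exist only for this illustration; the `^uint32(0)` idiom for decrementing an unsigned value comes from the package documentation.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// nextID (hypothetical helper) hands out unique, increasing IDs by using
// the value returned by AddInt64 instead of re-reading the variable.
func nextID(seq *int64) int64 { return atomic.AddInt64(seq, 1) }

// release (hypothetical helper) decrements a signed gauge: a negative
// delta subtracts.
func release(inFlight *int64) int64 { return atomic.AddInt64(inFlight, -1) }

// decUint32 (hypothetical helper) decrements an unsigned counter. Adding
// ^uint32(0) is equivalent to subtracting 1.
func decUint32(n *uint32) uint32 { return atomic.AddUint32(n, ^uint32(0)) }

// countUniqueIDs starts n goroutines that each take one ID and reports
// how many distinct IDs were handed out.
func countUniqueIDs(n int) int {
	var seq int64
	ids := make([]int64, n)
	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func(slot int) {
			defer wg.Done()
			ids[slot] = nextID(&seq) // each goroutine writes only its own slot
		}(i)
	}
	wg.Wait()

	seen := make(map[int64]bool, n)
	for _, id := range ids {
		seen[id] = true
	}
	return len(seen)
}

func main() {
	fmt.Println("distinct IDs:", countUniqueIDs(100))

	var inFlight int64 = 3
	fmt.Println("after release:", release(&inFlight))

	var n uint32 = 5
	fmt.Println("after decUint32:", decUint32(&n))
}
```

Using the returned value matters: calling AddInt64 and then loading the variable separately would be two operations, and another goroutine could change the value in between.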

2. `Load*` Functions

These functions atomically load (read) the value stored at an address.

atomic.LoadInt32(addr *int32) (val int32)
atomic.LoadInt64(addr *int64) (val int64)
atomic.LoadUint32(addr *uint32) (val uint32)
atomic.LoadUint64(addr *uint64) (val uint64)
atomic.LoadPointer(addr *unsafe.Pointer) (val unsafe.Pointer)

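Whenever a value may be written atomically by another goroutine, always read it with an atomic load. Mixing atomic writes with plain reads is still a data race; an atomic load guarantees that you observe a complete, consistent value.

Example: reading an atomic counter: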

package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

func main() {
	var counter int64

	stop := make(chan struct{})

	go func() {
		for {
			select {
			case <-stop:
				return
			default:
				atomic.AddInt64(&counter, 1) // increment the counter
				time.Sleep(time.Millisecond) // simulate a little work
			}
		}
	}()

	time.Sleep(5 * time.Second) // run for 5 seconds

	// Atomically load the current value of the counter
	currentValue := atomic.LoadInt64(&counter)
	fmt.Println("Current counter value:", currentValue)

	close(stop)
	time.Sleep(100 * time.Millisecond) // give the goroutine time to stop
	fmt.Println("Final counter value:", atomic.LoadInt64(&counter))
}
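
Since Go 1.19 the same counter can also be written with the typed atomic.Int64, whose Add and Load methods make it harder to mix atomic and plain access by accident. The following is a minimal sketch under that assumption; countEvents is a hypothetical helper name used only for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countEvents (hypothetical helper) lets `workers` goroutines each record
// `perWorker` events in a shared atomic.Int64 and returns the total.
func countEvents(workers, perWorker int) int64 {
	var counter atomic.Int64 // the zero value is ready to use

	var wg sync.WaitGroup
	wg.Add(workers)
	for i := 0; i < workers; i++ {
		go func() {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				counter.Add(1) // method form of atomic.AddInt64
			}
		}()
	}
	wg.Wait()

	return counter.Load() // method form of atomic.LoadInt64
}

func main() {
	fmt.Println("events:", countEvents(8, 1000))
}
```

Because the underlying integer is not exported, every access has to go through an atomic method, which removes a whole class of accidental plain reads and writes.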

3. `Store*` Functions

These functions atomically store (write) a new value at an address.

atomic.StoreInt32(addr *int32, val int32)
atomic.StoreInt64(addr *int64, val int64)
atomic.StoreUint32(addr *uint32, val uint32)
atomic.StoreUint64(addr *uint64, val uint64)
atomic.StorePointer(addr *unsafe.Pointer, val unsafe.Pointer)

Example: storing a new state value:

package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

const (
	StateRunning = 0
	StatePaused  = 1
	StateStopped = 2
)

func main() {
	var currentState int32 = StateRunning // initial state

	var wg sync.WaitGroup
	wg.Add(2)

	// Goroutine that changes the state
	go func() {
		defer wg.Done()
		fmt.Println("Service: Changing state to Paused...")
		atomic.StoreInt32(&currentState, StatePaused) // atomically set the state
		time.Sleep(time.Second)

		fmt.Println("Service: Changing state to Stopped...")
		atomic.StoreInt32(&currentState, StateStopped)
	}()

	// Goroutine that monitors the state
	go func() {
		defer wg.Done()
		for i := 0; i < 5; i++ {
			// Atomically load the current state
			val := atomic.LoadInt32(&currentState)
			fmt.Printf("Monitor: Current state is %d\n", val)
			time.Sleep(500 * time.Millisecond)
			if val == StateStopped {
				break
			}
		}
	}()

	wg.Wait()
	fmt.Println("All done.")
}
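
A common use of Store paired with Load is a stop flag that one goroutine sets and another polls. The sketch below is one possible shape of that pattern, not a recommended shutdown mechanism (a channel or context is usually the idiomatic choice). The names runUntilStopped, flagRun, and flagStop are hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

const (
	flagRun  int32 = 0 // hypothetical flag values for this sketch
	flagStop int32 = 1
)

// runUntilStopped (hypothetical helper) starts a worker that loops until
// the stop flag is set, then returns how many iterations it completed.
func runUntilStopped() int {
	var stop int32 // starts as flagRun
	started := make(chan struct{})
	result := make(chan int)

	go func() {
		iterations := 0
		close(started) // tell the caller the worker is running
		for atomic.LoadInt32(&stop) == flagRun {
			iterations++
			runtime.Gosched() // yield so the sketch does not spin hot
		}
		result <- iterations
	}()

	<-started
	atomic.StoreInt32(&stop, flagStop) // atomically request the stop
	return <-result                    // the worker observes the flag and exits
}

func main() {
	fmt.Println("worker iterations before stop:", runUntilStopped())
}
```

The iteration count varies from run to run; what the atomic store and load guarantee is that the worker eventually observes the flag and terminates.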

4. `Swap*` Functions

These functions atomically exchange the value at an address with a new value and return the old value.

atomic.SwapInt32(addr *int32, new int32) (old int32)
atomic.SwapInt64(addr *int64, new int64) (old int64)
atomic.SwapUint32(addr *uint32, new uint32) (old uint32)
atomic.SwapUint64(addr *uint64, new uint64) (old uint64)
atomic.SwapPointer(addr *unsafe.Pointer, new unsafe.Pointer) (old unsafe.Pointer)

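Swap is useful when you need to replace a value and obtain the previous one in a single step, for example when implementing lock-free queues or clearing a flag.

Example: setting a flag and checking its previous state: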

package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

func main() {
	var isProcessing int32 = 0 // 0 means false, 1 means true

	// Simulates a worker trying to acquire a "lock", i.e. signal that it is processing
	go func() {
		for i := 0; i < 3; i++ {
			// Set isProcessing to 1 (true) and fetch the previous value.
			// A previous value of 0 means the "lock" was acquired.
			oldVal := atomic.SwapInt32(&isProcessing, 1)
			if oldVal == 0 {
				fmt.Printf("Worker %d: Acquired processing lock. Doing work...\n", i+1)
				time.Sleep(time.Second)             // simulate work
				atomic.StoreInt32(&isProcessing, 0) // release the lock
				fmt.Printf("Worker %d: Released processing lock.\n", i+1)
			} else {
				fmt.Printf("Worker %d: Could not acquire lock, already busy.\n", i+1)
				time.Sleep(200 * time.Millisecond) // wait briefly before retrying
			}
		}
	}()

	time.Sleep(3 * time.Second) // keep the main goroutine alive
	fmt.Println("Final processing state:", atomic.LoadInt32(&isProcessing))
}
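
Another place where Swap fits naturally is a read-and-reset counter, where a collector periodically takes whatever has accumulated and leaves zero behind. The sketch below assumes that use case; drain and produceAndDrain are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// drain (hypothetical helper) atomically takes the accumulated count and
// resets it to zero in one step.
func drain(pending *int64) int64 { return atomic.SwapInt64(pending, 0) }

// produceAndDrain (hypothetical helper) lets writers add to a pending
// counter while the caller drains it concurrently. It returns the sum of
// everything drained.
func produceAndDrain(workers, perWorker int) int64 {
	var pending int64

	var wg sync.WaitGroup
	wg.Add(workers)
	for i := 0; i < workers; i++ {
		go func() {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				atomic.AddInt64(&pending, 1)
			}
		}()
	}

	// Drain while the writers are still running.
	var total int64
	for i := 0; i < 100; i++ {
		total += drain(&pending)
	}

	wg.Wait()
	total += drain(&pending) // pick up whatever arrived after the last drain
	return total
}

func main() {
	fmt.Println("total drained:", produceAndDrain(4, 1000))
}
```

A separate Load followed by Store(0) would lose any increments that land between the two calls; Swap closes that window, so every increment is counted exactly once.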

5. `CompareAndSwap*` (CAS) Functions

These are arguably the most powerful atomic operations. They change the value at an address only if its current value matches the expected value.

atomic.CompareAndSwapInt32(addr *int32, old, new int32) (swapped bool)
atomic.CompareAndSwapInt64(addr *int64, old, new int64) (swapped bool)
atomic.CompareAndSwapUint32(addr *uint32, old, new uint32) (swapped bool)
atomic.CompareAndSwapUint64(addr *uint64, old, new uint64) (swapped bool)
atomic.CompareAndSwapPointer(addr *unsafe.Pointer, old, new unsafe.Pointer) (swapped bool)

CAS is the basic building block of many lock-free algorithms, including non-blocking queues, stacks, and other thread-safe data structures. The typical pattern is a read-modify-write loop:

for {
	oldVal := atomic.Load*(addr) // 1. read the current value
	newVal := calculateNewValue(oldVal) // 2. compute the new value from the old one
	if atomic.CompareAndSwap*(addr, oldVal, newVal) { // 3. attempt the swap
		break // success!
	}
	// otherwise another goroutine changed the value, so retry
}

Example: a thread-safe maximum-value updater:

package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

func main() {
	var maxVal int64 = 0 // initial maximum

	numGoroutines := 10
	var wg sync.WaitGroup
	wg.Add(numGoroutines)

	for i := 0; i < numGoroutines; i++ {
		go func(id int) {
			defer wg.Done()
			for j := 0; j < 100; j++ {
				newValue := int64(id*100 + j) // generate increasing values
				for {
					oldVal := atomic.LoadInt64(&maxVal) // read the current maximum
					if newValue > oldVal {
						// Try to set the new value only if the current maximum is still oldVal
						if atomic.CompareAndSwapInt64(&maxVal, oldVal, newValue) {
							// Updated successfully
							// fmt.Printf("Goroutine %d: Updated max from %d to %d\n", id, oldVal, newValue)
							break
						}
						// CAS failed, so another goroutine updated the value. Loop and retry with a fresh oldVal.
					} else {
						// The new value is not larger, so no update is needed
						break
					}
				}
			}
		}(i)
	}

	wg.Wait()
	fmt.Println("Final Max Value:", atomic.LoadInt64(&maxVal)) // should be 999
}

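This ensures that maxVal is updated only when the new value really is larger than the value observed by LoadInt64. CompareAndSwapInt64 makes the update succeed only if maxVal still holds that observed value; if another goroutine changed it in the meantime, the loop re-reads and tries again.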
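
The same loop shape also covers conditional updates that a plain Add cannot express, such as "increment only while below a limit". The sketch below assumes that requirement; tryAcquire and countAcquired are hypothetical names, and slots are never released in this example so the outcome is easy to reason about.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// tryAcquire (hypothetical helper) claims one of `limit` slots. It returns
// false when all slots are already in use.
func tryAcquire(inUse *int32, limit int32) bool {
	for {
		cur := atomic.LoadInt32(inUse) // 1. read the current value
		if cur >= limit {
			return false // full: nothing to update
		}
		// 2. and 3. compute cur+1 and install it only if nobody else got there first
		if atomic.CompareAndSwapInt32(inUse, cur, cur+1) {
			return true
		}
		// CAS failed: another goroutine changed inUse, so retry with a fresh read
	}
}

// countAcquired (hypothetical helper) lets `attempts` goroutines compete
// for `limit` slots and reports how many succeeded.
func countAcquired(attempts int, limit int32) int {
	var inUse, granted int32

	var wg sync.WaitGroup
	wg.Add(attempts)
	for i := 0; i < attempts; i++ {
		go func() {
			defer wg.Done()
			if tryAcquire(&inUse, limit) {
				atomic.AddInt32(&granted, 1)
			}
		}()
	}
	wg.Wait()

	return int(atomic.LoadInt32(&granted))
}

func main() {
	fmt.Println("slots granted:", countAcquired(100, 10))
}
```

With AddInt32 alone the counter could overshoot the limit before anyone notices; the CAS loop ties the limit check and the increment to the same observed value.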

6. `atomic.Pointer[T]` (Go 1.19+)

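Go 1.19 introduced atomic.Pointer[T], a generic type that provides atomic operations on a pointer to a value of any type T. It uses unsafe.Pointer internally and hides the explicit unsafe.Pointer casts, which greatly improves type safety and usability when managing pointers atomically.

Its methods mirror the package-level atomic functions: Load(), Store(), Swap(), and CompareAndSwap().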

package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

type Config struct {
	LogLevel string
	MaxConns int
	Timeout  time.Duration
}

func main() {
	// Initialize the atomic.Pointer with the initial configuration
	var currentConfig atomic.Pointer[Config]
	currentConfig.Store(&Config{
		LogLevel: "INFO",
		MaxConns: 10,
		Timeout:  5 * time.Second,
	})

	var wg sync.WaitGroup
	wg.Add(2)

	// Goroutine that updates the configuration
	go func() {
		defer wg.Done()
		time.Sleep(2 * time.Second)
		fmt.Println("Updater: Updating config...")
		newConfig := &Config{
			LogLevel: "DEBUG",
			MaxConns: 20,
			Timeout:  10 * time.Second,
		}
		currentConfig.Store(newConfig) // atomically store the new configuration
		fmt.Println("Updater: Config updated.")

		time.Sleep(2 * time.Second)
		newerConfig := &Config{
			LogLevel: "ERROR",
			MaxConns: 5,
			Timeout:  2 * time.Second,
		}
		// Example of CompareAndSwap on a pointer
		oldConfig := currentConfig.Load()
		if currentConfig.CompareAndSwap(oldConfig, newerConfig) {
			fmt.Printf("Updater: Successfully CASed config from %s to %s\n", oldConfig.LogLevel, newerConfig.LogLevel)
		} else {
			fmt.Println("Updater: CAS failed, config changed by someone else.")
		}
	}()

	// Goroutine that reads the configuration
	go func() {
		defer wg.Done()
		for i := 0; i < 5; i++ {
			cfg := currentConfig.Load() // atomically load the configuration
			fmt.Printf("Reader: Current config - LogLevel: %s, MaxConns: %d\n", cfg.LogLevel, cfg.MaxConns)
			time.Sleep(1 * time.Second)
		}
	}()

	wg.Wait()
	fmt.Println("Final Config:", currentConfig.Load())
}

atomic.Pointer[T] is useful for scenarios such as hot-reloading configuration, implementing copy-on-write data structures, or safely exchanging complex objects between goroutines without a mutex.
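
A copy-on-write update can be sketched by combining atomic.Pointer[T] with the CAS loop: copy the current snapshot, modify the copy, and install it only if the pointer has not changed. The Settings type and the update and concurrentHits helpers below are hypothetical and serve only as an illustration.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Settings is a hypothetical snapshot type used only for this sketch.
type Settings struct {
	Hits int
	Tags []string
}

// update (hypothetical helper) applies fn to a private copy of the current
// snapshot and publishes the copy with CompareAndSwap, retrying on conflict.
// fn may run more than once, so it should have no side effects of its own.
func update(p *atomic.Pointer[Settings], fn func(*Settings)) {
	for {
		old := p.Load()
		next := *old                                   // copy the struct
		next.Tags = append([]string(nil), old.Tags...) // copy the slice so published snapshots stay immutable
		fn(&next)
		if p.CompareAndSwap(old, &next) {
			return
		}
		// Another goroutine published first: retry from its snapshot
	}
}

// concurrentHits (hypothetical helper) performs n concurrent updates and
// returns the final Hits value.
func concurrentHits(n int) int {
	var p atomic.Pointer[Settings]
	p.Store(&Settings{Tags: []string{"initial"}})

	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done()
			update(&p, func(s *Settings) { s.Hits++ })
		}()
	}
	wg.Wait()

	return p.Load().Hits
}

func main() {
	fmt.Println("hits:", concurrentHits(50))
}
```

Readers call Load and get an immutable snapshot without taking any lock. The price is one allocation and copy per update, so this tends to suit read-heavy data that changes rarely.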

When to Use sync/atomic vs. sync.Mutex

Choosing between atomic operations and a mutex depends on the complexity of the shared state and of the operations performed on it.

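Use sync/atomic when:
- You need simple read, write, add, swap, or compare-and-swap operations on basic integer types or pointers.
- Performance is critical and the operation is truly atomic at the hardware level.
- You want to avoid mutex lock/unlock overhead.
- You are implementing lock-free data structures (this is advanced and error-prone).

Use sync.Mutex (or sync.RWMutex) when:
- The shared state is a complex data structure (for example a map, slice, or struct) in which several fields are modified in related ways, or several individual reads and writes must be grouped into a single logical unit.
- You perform operations more complex than simple arithmetic or assignment (for example appending to a slice or deleting from a map).
- You need to protect an entire critical section rather than a single value.
- Simplicity and correctness take priority over optimization in a complex scenario. Mutexes are generally easier to understand and less error-prone, which makes them the better fit.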

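It is also important to remember that sync/atomic guarantees atomicity only for a single operation. If several atomic operations must take effect together, you may need a mutex or a more sophisticated concurrency primitive to guarantee atomicity for the group.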
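
To make that concrete, consider two related fields, a running sum and a count, that must always be read as a matching pair. Updating each with its own atomic Add would let a reader observe the new sum with the old count. The sketch below guards both with one mutex; the stats type and consistentSnapshots helper are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// stats is a hypothetical type whose two fields must change together.
type stats struct {
	mu    sync.Mutex
	sum   int64
	count int64
}

// record updates both fields inside one critical section.
func (s *stats) record(v int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.sum += v
	s.count++
}

// snapshot returns a pair that belongs together.
func (s *stats) snapshot() (sum, count int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.sum, s.count
}

// consistentSnapshots (hypothetical helper) records the value 2 from many
// goroutines while taking snapshots, and reports whether every snapshot
// satisfied sum == 2*count and the final count is complete.
func consistentSnapshots(workers, perWorker int) bool {
	var s stats

	var wg sync.WaitGroup
	wg.Add(workers)
	for i := 0; i < workers; i++ {
		go func() {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				s.record(2)
			}
		}()
	}

	ok := true
	for i := 0; i < 1000; i++ {
		if sum, count := s.snapshot(); sum != 2*count {
			ok = false // would indicate a torn pair
		}
	}

	wg.Wait()
	sum, count := s.snapshot()
	return ok && sum == 2*count && count == int64(workers*perWorker)
}

func main() {
	fmt.Println("all snapshots consistent:", consistentSnapshots(4, 1000))
}
```

An alternative that stays lock-free is to put both fields in one struct and replace the whole struct through an atomic.Pointer, at the cost of an allocation per update.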

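Alignment Requirements

The sync/atomic functions that operate on 64-bit values (int64, uint64) require the variable's address to be 64-bit aligned. On 64-bit platforms the compiler takes care of this automatically. On 32-bit platforms (386, ARM, and 32-bit MIPS), however, the caller is responsible for alignment: according to the package documentation, only the first word of an allocated struct, array, or slice, of a global variable, or of a local variable can be relied upon to be 64-bit aligned. If you embed a 64-bit value in a struct, pay attention to the layout and either place the atomic field first or pad explicitly to preserve alignment. Otherwise the program may panic at run time because of an unaligned 64-bit atomic operation. The typed values atomic.Int64 and atomic.Uint64 added in Go 1.19 are aligned automatically, even on 32-bit platforms.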
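
The two layout options can be sketched as follows: keep the 64-bit field first when using the package-level functions, or use atomic.Int64 and let the compiler handle alignment. The struct and helper names are hypothetical, and the offsets printed depend on the platform, so no particular output is claimed here.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"unsafe"
)

// legacyCounters (hypothetical) is meant for the package-level functions.
// The 64-bit field comes first so that it is 64-bit aligned in an allocated
// struct, even on 32-bit platforms.
type legacyCounters struct {
	hits  int64
	ready bool
}

// typedCounters (hypothetical) uses atomic.Int64, which is aligned
// automatically regardless of where it sits in the struct.
type typedCounters struct {
	ready bool
	hits  atomic.Int64
}

// hitsOffsets reports the byte offset of the hits field in each struct.
func hitsOffsets() (legacy, typed uintptr) {
	var l legacyCounters
	var t typedCounters
	return unsafe.Offsetof(l.hits), unsafe.Offsetof(t.hits)
}

// bump increments both counters once and returns their values.
func bump() (int64, int64) {
	l := new(legacyCounters) // allocated struct: first word is aligned
	t := new(typedCounters)
	atomic.AddInt64(&l.hits, 1)
	t.hits.Add(1)
	return atomic.LoadInt64(&l.hits), t.hits.Load()
}

func main() {
	legacy, typed := hitsOffsets()
	fmt.Println("offset of hits (int64 first):", legacy)
	fmt.Println("offset of hits (atomic.Int64):", typed)
	fmt.Println(bump())
}
```

For new code that targets Go 1.19 or later, the typed form is usually the simpler choice because field order no longer matters.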

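The Go memory model also gives guarantees for sync/atomic operations such as Load and Store. If one goroutine writes value v to variable x with atomic.Store*, and another goroutine later reads x with atomic.Load*, the read observes v or a value written after v. This provides the visibility and ordering guarantees that correctness depends on.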
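
One practical consequence is that an atomic flag can publish ordinary data: a plain write made before the atomic store is visible to a goroutine that observes the stored flag with an atomic load. The sketch below assumes Go 1.19+ for atomic.Bool; publishAndRead is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// publishAndRead (hypothetical helper) has one goroutine prepare a payload
// with a plain write and then set an atomic flag. The caller waits for the
// flag and only then reads the payload.
func publishAndRead() string {
	var payload string    // plain, non-atomic variable
	var ready atomic.Bool // publication flag

	go func() {
		payload = "hello" // 1. plain write
		ready.Store(true) // 2. atomic store publishes it
	}()

	for !ready.Load() { // 3. atomic load observes the store
		runtime.Gosched()
	}
	return payload // 4. safe: this read is ordered after the plain write
}

func main() {
	fmt.Println(publishAndRead()) // prints "hello"
}
```

Reversing steps 1 and 2, or reading the flag without an atomic load, would reintroduce a data race.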

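Conclusion

The sync/atomic package is a powerful tool in Go's concurrency toolbox, offering very efficient, low-level primitives for managing shared state. By using CPU-level atomic instructions it gives fine-grained control and can significantly improve the performance of simple, heavily contended operations compared with a mutex. Its usefulness, however, is limited to basic types and operations. For more complex concurrent access patterns or data structures, sync.Mutex and channels remain Go's primary and more general concurrency mechanisms. Using sync/atomic appropriately, and knowing when to use it, is key to writing robust, high-performance concurrent Go applications.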
