Table of Contents
- Understanding Concurrency vs. Parallelism
- What Are Goroutines?
- How Goroutines Work Under the Hood
- Creating and Managing Goroutines
- Synchronization in Goroutines
- Common Pitfalls and Best Practices
- Real-World Example: Concurrent Web Scraper
- Conclusion
- References
1. Understanding Concurrency vs. Parallelism
Before diving into goroutines, it’s essential to clarify two often-confused terms: concurrency and parallelism.
Concurrency: “Dealing with Multiple Tasks”
Concurrency is about managing multiple tasks at the same time, even if they don’t execute simultaneously. Think of a chef juggling multiple orders: they might chop vegetables for one dish, then switch to stirring a sauce for another, then check on a baking pastry—interleaving tasks to make progress on all.
In programming, concurrency is achieved by switching between tasks (e.g., when one task is waiting for I/O, another runs). This gives the illusion of parallelism, even on a single-core CPU.
Parallelism: “Doing Multiple Tasks at Once”
Parallelism is about executing multiple tasks simultaneously, requiring multiple CPU cores. Using the chef analogy, this would be two chefs working side-by-side: one chopping, the other stirring—truly working in parallel.
Parallelism is a subset of concurrency. Go excels at both: goroutines enable efficient concurrency, and the Go runtime automatically distributes goroutines across multiple cores to leverage parallelism.
2. What Are Goroutines?
A goroutine is a lightweight execution unit managed by the Go runtime, not the operating system (OS). Unlike OS threads (which are heavyweight, with megabytes of stack space), goroutines are extremely lightweight, starting with a stack size of just 2KB (which grows/shrinks dynamically as needed).
This lightweight nature allows you to spawn thousands (or even millions) of goroutines without overwhelming the system. For comparison:
- An OS thread typically uses ~1MB of stack space.
- A goroutine uses ~2KB initially, scaling dynamically.
Goroutines are the building blocks of concurrency in Go. They enable you to write code that performs multiple tasks concurrently, with minimal overhead.
3. How Goroutines Work Under the Hood
To understand goroutines, we need to explore the Go runtime’s scheduler, which manages how goroutines are executed on OS threads. The scheduler uses a model called M:N scheduling, where:
- M (Machine): Represents an OS thread.
- G (Goroutine): Represents a goroutine.
- P (Processor): Acts as a “middleman” that binds M and G, and holds a queue of goroutines ready to run.
Key Concepts:
- P (Processor): A logical processor (count controlled by GOMAXPROCS, defaulting to the number of logical CPUs). Each P has a local queue of runnable Gs.
- M (Machine): An OS thread that runs Gs. An M must be bound to a P to execute Gs.
- Global Queue: A shared run queue for all Ps. If a P’s local queue is empty, it takes Gs from the global queue or “steals” Gs from other Ps’ queues (work stealing) to balance load.
Scheduling Flow:
- When you start a goroutine (go func()), it’s added to the local queue of the current P.
- The M bound to the P dequeues Gs from the P’s local queue and executes them.
- If a G blocks (e.g., on I/O, mutex, or channel), the M is unbound from the P, and the P is free to bind to another M (which can run other Gs from the P’s queue).
- When the blocked G resumes, it’s added back to a P’s queue (local or global) to be executed.
This design ensures efficient use of CPU cores and minimizes idle time, making goroutines far more scalable than OS threads.
4. Creating and Managing Goroutines
Creating a goroutine is simple: prefix a function call with the go keyword.
Basic Syntax:
package main

import "fmt"

func sayHello() {
	fmt.Println("Hello, Goroutine!")
}

func main() {
	go sayHello() // Start a new goroutine
	fmt.Println("Hello, Main!")
}
What Happens Here?
- main() is the entry point and runs in its own goroutine (the “main goroutine”).
- go sayHello() starts a new goroutine, which runs sayHello() concurrently with main().
Problem: Goroutine Termination
If you run the code above, you might not see “Hello, Goroutine!” printed. Why? Because the main goroutine exits immediately after printing “Hello, Main!”, and the program terminates—killing all other goroutines.
To fix this, we need synchronization to ensure the main goroutine waits for other goroutines to finish.
Example: Starting Multiple Goroutines
Let’s start 5 goroutines in a loop. Without synchronization, the main goroutine may exit before they run:
package main

import "fmt"

func printNumber(i int) {
	fmt.Printf("Number: %d\n", i)
}

func main() {
	for i := 0; i < 5; i++ {
		go printNumber(i) // Start 5 goroutines
	}
	// Main exits here, killing all goroutines!
}
Output (unpredictable, may show nothing):
// No output, or partial output (e.g., "Number: 3")
To fix this, we need to synchronize the main goroutine with the spawned goroutines.
5. Synchronization in Goroutines
Goroutines often need to coordinate—e.g., waiting for others to finish, or sharing data safely. Go provides two primary tools for this: sync.WaitGroup and channels.
Using sync.WaitGroup
sync.WaitGroup tracks a set of goroutines and blocks until all have completed. It has three methods:
- Add(n): Register n goroutines to wait for.
- Done(): Decrement the count (called by a goroutine when it finishes).
- Wait(): Block until the count reaches zero.
Example: Waiting for Goroutines with WaitGroup
package main

import (
	"fmt"
	"sync"
)

func printNumber(i int, wg *sync.WaitGroup) {
	defer wg.Done() // Decrement count when function exits
	fmt.Printf("Number: %d\n", i)
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 5; i++ {
		wg.Add(1)              // Register a new goroutine
		go printNumber(i, &wg) // Pass WaitGroup by pointer
	}
	wg.Wait() // Block until all goroutines call Done()
	fmt.Println("All goroutines finished!")
}
Output (order may vary):
Number: 2
Number: 0
Number: 1
Number: 3
Number: 4
All goroutines finished!
Channels: Communication Between Goroutines
Channels are Go’s primary mechanism for safely communicating between goroutines. They enable goroutines to send and receive values, enforcing synchronization and avoiding race conditions (where multiple goroutines access shared data unsafely).
Basic Channel Syntax:
// Create an unbuffered channel (sends block until a receive is ready)
ch := make(chan int)

// Send value to channel (blocks until received)
go func() {
	ch <- 42 // Send 42 to ch
}()

// Receive value from channel (blocks until sent)
value := <-ch
fmt.Println(value) // Output: 42
Buffered Channels:
A buffered channel has a fixed capacity. Sends block only when the buffer is full; receives block only when the buffer is empty.
ch := make(chan int, 2) // Buffer size 2
ch <- 1 // No block (buffer has space)
ch <- 2 // No block (buffer has space)
// ch <- 3 // Blocks (buffer is full)
fmt.Println(<-ch) // 1 (buffer now has 1)
fmt.Println(<-ch) // 2 (buffer now empty)
Closing Channels:
Close a channel with close(ch) to signal no more values will be sent. Receivers can check if a channel is closed using the two-value receive:
ch := make(chan int, 2)
ch <- 1
ch <- 2
close(ch)

// Iterate over channel until closed
for val := range ch {
	fmt.Println(val) // Output: 1, 2
}
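The two-value receive form mentioned above reports whether the channel is still open; a short sketch:

```go
package main

import "fmt"

func main() {
	ch := make(chan int, 1)
	ch <- 10
	close(ch)

	// ok is true: a buffered value was still pending when we received.
	v, ok := <-ch
	fmt.Println(v, ok) // Output: 10 true

	// ok is false: the channel is closed and drained; v is the zero value.
	v, ok = <-ch
	fmt.Println(v, ok) // Output: 0 false
}
```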
Example: Collecting Results with Channels
Channels are ideal for collecting results from concurrent tasks. Here’s how to fetch data from multiple goroutines and aggregate results:
package main

import (
	"fmt"
	"sync"
)

func fetchData(id int, ch chan<- string) { // ch is send-only
	result := fmt.Sprintf("Result from goroutine %d", id)
	ch <- result // Send result to channel
}

func main() {
	const numGoroutines = 3
	results := make(chan string, numGoroutines) // Buffered channel
	var wg sync.WaitGroup

	// Start goroutines, tracking each one with the WaitGroup
	for i := 0; i < numGoroutines; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			fetchData(id, results)
		}(i)
	}

	// Close the channel once all senders finish, so the range loop ends
	go func() {
		wg.Wait()
		close(results)
	}()

	// Collect results
	for res := range results {
		fmt.Println(res)
	}
}
6. Common Pitfalls and Best Practices
1. Race Conditions
A race condition occurs when two goroutines access shared data without synchronization, leading to unpredictable behavior.
Example of a Race Condition:
package main

import (
	"fmt"
	"sync"
)

var count int

func increment() {
	count++ // Not thread-safe!
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			increment()
		}()
	}
	wg.Wait()
	fmt.Println("Count:", count) // May not be 1000!
}
Fix: Use channels or sync.Mutex to synchronize access:
var (
	count int
	mu    sync.Mutex // Mutual exclusion lock
)

func increment() {
	mu.Lock()
	defer mu.Unlock()
	count++ // Safe now!
}
Detect Race Conditions: Use go run -race yourfile.go or go test -race to detect races.
2. Leaking Goroutines
A goroutine leaks if it’s started but never exits (e.g., stuck in an infinite loop or waiting on a channel that’s never closed).
Example of a Leak:
package main

import "time"

func leakyGoroutine() {
	for { // Infinite loop—never exits!
		time.Sleep(time.Second)
	}
}

func main() {
	go leakyGoroutine() // Leaks!
	time.Sleep(5 * time.Second)
}
Fix: Use context.Context to cancel goroutines:
package main

import (
	"context"
	"time"
)

func safeGoroutine(ctx context.Context) {
	for {
		select {
		case <-ctx.Done():
			return // Exit on cancellation
		default:
			time.Sleep(time.Second)
		}
	}
}

func main() {
	// The context is cancelled automatically after 5 seconds.
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel() // Release the context's resources
	go safeGoroutine(ctx)
	time.Sleep(10 * time.Second) // Goroutine exits after 5s
}
3. Overusing Goroutines
Starting too many goroutines can overwhelm the scheduler and lead to overhead. Use goroutines judiciously—e.g., limit concurrency with a worker pool.
Best Practices:
- Prefer Channels Over Shared State: Use channels to communicate data between goroutines instead of sharing memory.
- Use sync.WaitGroup for Waiting: Simplifies waiting for multiple goroutines to finish.
- Close Channels When Done: Prevents receivers from blocking indefinitely.
- Limit Concurrency with Worker Pools: Use a fixed number of goroutines to process tasks (e.g., for API requests to avoid overwhelming a server).
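The worker-pool pattern mentioned above keeps the number of concurrent goroutines fixed no matter how many tasks arrive. A minimal sketch (the process function and the task count are placeholders for illustration):

```go
package main

import (
	"fmt"
	"sync"
)

// process stands in for real work (an API call, a file read, etc.).
func process(task int) string {
	return fmt.Sprintf("processed task %d", task)
}

func main() {
	const numWorkers = 3
	tasks := make(chan int)
	results := make(chan string)
	var wg sync.WaitGroup

	// Start a fixed number of workers; each drains the tasks channel.
	for w := 0; w < numWorkers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for t := range tasks {
				results <- process(t)
			}
		}()
	}

	// Feed tasks, then close so the workers' range loops exit.
	go func() {
		for i := 0; i < 10; i++ {
			tasks <- i
		}
		close(tasks)
	}()

	// Close results once all workers finish, ending the loop below.
	go func() {
		wg.Wait()
		close(results)
	}()

	for r := range results {
		fmt.Println(r)
	}
}
```

Only numWorkers goroutines ever run at once, regardless of how many tasks are queued, which keeps pressure on downstream services bounded.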
7. Real-World Example: Concurrent Web Scraper
Let’s build a concurrent web scraper that fetches multiple URLs in parallel, using goroutines and channels to coordinate.
Step 1: Define Dependencies
We’ll use net/http for HTTP requests and golang.org/x/net/html to parse HTML (install with go get golang.org/x/net/html).
Step 2: Scraper Code
package main

import (
	"fmt"
	"net/http"
	"sync"

	"golang.org/x/net/html"
)

// fetchURL fetches a URL and sends its title (or an error message) to results.
func fetchURL(url string, wg *sync.WaitGroup, results chan<- string) {
	defer wg.Done() // Notify WaitGroup when done
	resp, err := http.Get(url)
	if err != nil {
		results <- fmt.Sprintf("Error fetching %s: %v", url, err)
		return
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		results <- fmt.Sprintf("%s returned status: %s", url, resp.Status)
		return
	}
	// Parse HTML to extract title
	doc, err := html.Parse(resp.Body)
	if err != nil {
		results <- fmt.Sprintf("Error parsing %s: %v", url, err)
		return
	}
	var title string
	var extractTitle func(*html.Node)
	extractTitle = func(n *html.Node) {
		if n.Type == html.ElementNode && n.Data == "title" && n.FirstChild != nil {
			title = n.FirstChild.Data
			return
		}
		for c := n.FirstChild; c != nil; c = c.NextSibling {
			extractTitle(c)
		}
	}
	extractTitle(doc)
	results <- fmt.Sprintf("Title of %s: %q", url, title)
}

func main() {
	urls := []string{
		"https://golang.org",
		"https://google.com",
		"https://github.com",
	}
	results := make(chan string, len(urls))
	var wg sync.WaitGroup
	// Start one fetcher goroutine per URL
	for _, url := range urls {
		wg.Add(1)
		go fetchURL(url, &wg, results)
	}
	// Close results channel after all fetchers finish
	go func() {
		wg.Wait()
		close(results)
	}()
	// Print results as they come in
	for result := range results {
		fmt.Println(result)
	}
}
Explanation:
- Concurrency: We start one goroutine per URL. For a large URL list, cap this with a worker pool so you don’t overwhelm the target servers.
- Synchronization: sync.WaitGroup ensures we wait for all fetchers to finish before closing the results channel.
- Communication: The results channel collects output from the goroutines, which the main goroutine prints.
8. Conclusion
Goroutines are a powerful feature that makes Go a leader in concurrent programming. They are lightweight, efficient, and easy to use, enabling you to write scalable applications that leverage multi-core processors.
By mastering goroutines, synchronization with sync.WaitGroup and channels, and avoiding common pitfalls like race conditions and leaks, you’ll be able to build high-performance, concurrent systems with Go.
Start small: experiment with simple goroutines, then move to channels and synchronization. As you practice, you’ll develop an intuition for when and how to use concurrency effectively.