Reading a file in Go looks straightforward, and the standard library offers several ways to do it. That variety brings subtleties that can cause confusion, especially for people who are just starting. In this tutorial, we will discuss different methods of reading files in Go and the use cases each one suits.
Read entire file into memory
The simplest way to read a file in Go is to load its entire content into memory at once. Older code does this with ioutil.ReadFile from the io/ioutil package. As of Go 1.16, that function is deprecated and simply calls os.ReadFile, which is used here:
package main
import (
	"fmt"
	"log"
	"os"
)
func main() {
	// os.ReadFile opens the file, reads all of it, and closes it again.
	data, err := os.ReadFile("filepath/file.config")
	if err != nil {
		log.Fatal(err)
	}
	content := string(data)
	fmt.Println(content)
}
The os.ReadFile function reads the entire file and returns its content as a byte slice.
It’s convenient for simple cases like small config files, templates, and other files of manageable size. However, it is a poor fit for large files: the whole file has to fit in memory at once, which drives up memory usage, can hurt performance, and in the worst case gets the process terminated for running out of memory.
Read file line by line
The bufio package in Go offers a convenient way to read a file line by line. The bufio.NewScanner function creates a scanner that splits its input into lines by default. The scanner.Scan() method advances to the next line, and scanner.Text() returns the current line without its trailing newline. This approach is memory-efficient because it holds only one line, plus a small buffer, in memory at a time.
package main
import (
	"bufio"
	"fmt"
	"log"
	"os"
)
func main() {
	file, err := os.Open("filepath/file.txt")
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()
	scanner := bufio.NewScanner(file)
	for scanner.Scan() {
		line := scanner.Text()
		fmt.Println(line)
	}
	if err := scanner.Err(); err != nil {
		log.Fatal(err)
	}
}
Line-by-line scanning is often recommended for large files, but it suits only specific use cases, such as parsing logs or data files where each line is independent. It’s not suitable for binary files or for cases where you need to process large blocks of data at once: by default the scanner accepts lines of at most 64 KB, and a longer line makes scanner.Scan() return false with scanner.Err() reporting bufio.ErrTooLong.
Read file with a buffer
A more general way to read both text and binary files in Go is through a buffered reader.
This method uses buffered I/O to reduce the number of system calls needed to read the file. The bufio.NewReader(file) call creates a buffered reader around the file, which reads data from the file in larger chunks internally. By default, bufio.Reader uses a buffer size of 4 KB (4096 bytes); you can choose a different size by creating the reader with bufio.NewReaderSize. The example below keeps the default size and copies the data out through a separate 1024-byte slice:
package main
import (
	"bufio"
	"fmt"
	"io"
	"log"
	"os"
)
func main() {
	file, err := os.Open("filepath/file.txt")
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()
	reader := bufio.NewReader(file)
	buf := make([]byte, 1024)
	for {
		n, err := reader.Read(buf)
		// Print before checking err: a reader may return data together with io.EOF.
		// fmt.Print avoids inserting a newline at every chunk boundary.
		fmt.Print(string(buf[:n]))
		if err == io.EOF {
			break
		}
		if err != nil {
			log.Fatal(err)
		}
	}
}
Here, we convert buf[:n] rather than the whole buffer to a string, so that only the n bytes actually read are printed and not the unused or stale remainder of the buffer.
Read file in fixed-size chunks
When working with large files, we often need to read them in fixed-size chunks. In Go, you can do this with the io.ReadFull function.
package main
import (
	"fmt"
	"io"
	"log"
	"os"
)
func main() {
	file, err := os.Open("filepath/file.txt")
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()
	chunkSize := 1024
	buf := make([]byte, chunkSize)
	for {
		n, err := io.ReadFull(file, buf)
		if err == io.EOF {
			// No bytes were left to read.
			break
		}
		if err == io.ErrUnexpectedEOF {
			// The last chunk is shorter than chunkSize.
			fmt.Println(string(buf[:n]))
			break
		}
		if err != nil {
			log.Fatal(err)
		}
		fmt.Println(string(buf))
	}
}
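In this example, we read the file in 1024-byte chunks. The io.ReadFull function reads exactly chunkSize bytes unless it encounters the end of the file or an error. If the file ends partway through a chunk, it returns the bytes read so far together with io.ErrUnexpectedEOF; if no bytes were left at all, it returns io.EOF.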
Stream file content with io.Reader
The io.Reader interface provides a common way to read data from various sources, including files. This abstraction allows you to process file content in a stream-oriented manner.
This interface has a single method, Read(), which takes a byte slice as its parameter and returns two values: the number of bytes read and an error, which is nil if everything goes well and io.EOF once the input is exhausted. The slice serves as the buffer, so its length is the maximum number of bytes a single call can read.
package main
import (
	"io"
	"log"
	"os"
	"strings"
)
func main() {
	// strings.NewReader returns an io.Reader backed by a string.
	// An *os.File from os.Open also satisfies io.Reader and could be passed to io.Copy instead.
	reader := strings.NewReader("This is a string reader.")
	if _, err := io.Copy(os.Stdout, reader); err != nil {
		log.Fatal(err)
	}
}
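In this example, I have used io.Reader with a string reader. The io.Copy function copies the content from the reader to standard output through a fixed-size internal buffer, so this approach is useful for processing large streams of data without loading everything into memory. Because *os.File also implements io.Reader, the same call works with a file returned by os.Open.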
Read file as JSON (encoding/json)
Since JSON requires deserialization into native data structures, reading the raw bytes is only the first step. The encoding/json package in Go provides convenient functions to parse JSON data read from files. It also maps JSON keys to struct fields automatically, guided by the json struct tags, so you don’t have to do a lot of manual parsing.
package main
import (
	"encoding/json"
	"fmt"
	"log"
	"os"
)
type Data struct {
	Name  string `json:"name"`
	Value int    `json:"value"`
}
func main() {
	// os.ReadFile replaces the deprecated ioutil.ReadFile (Go 1.16 and later).
	data, err := os.ReadFile("filepath/data.json")
	if err != nil {
		log.Fatal(err)
	}
	var jsonData Data
	err = json.Unmarshal(data, &jsonData)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%+v\n", jsonData)
}
In this example, we read a JSON file and unmarshal the data into a Go struct. The json.Unmarshal function parses the JSON data and populates the struct fields.
Read large files using a memory-mapped file
The syscall.Mmap function in Go lets you read very large files with fast access. It uses memory mapping, which maps a file directly into the application’s address space and provides fast random access. The syscall package is platform-specific, and this function is available on Unix-like systems such as Linux and macOS.
package main
import (
	"log"
	"os"
	"syscall"
)
func main() {
	file, err := os.Open("filepath/large_file.txt")
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()
	fileInfo, err := file.Stat()
	if err != nil {
		log.Fatal(err)
	}
	fileSize := fileInfo.Size()
	data, err := syscall.Mmap(int(file.Fd()), 0, int(fileSize), syscall.PROT_READ, syscall.MAP_SHARED)
	if err != nil {
		log.Fatal(err)
	}
	defer syscall.Munmap(data)
	// Write straight from the mapping; string(data) would copy the whole file into memory.
	if _, err := os.Stdout.Write(data); err != nil {
		log.Fatal(err)
	}
}
This example uses the syscall.Mmap function to map the entire file into memory, making the file content accessible as a byte slice. The file is mapped with read-only permissions (syscall.PROT_READ) and shared access (syscall.MAP_SHARED). This allows the program to read the file’s contents directly, with the operating system loading pages on demand instead of the program reading the whole file into memory up front. After processing, syscall.Munmap() is called to unmap the file from memory.
Wrapping Up
Go offers various approaches to reading files, whether it’s a small config file or a large data stream. Understanding these methods helps you pick the right approach for your specific use case and optimize for performance and resource usage.