The versions of dup above operate in a “streaming” mode in which input is
read and broken into lines as needed, so in principle these programs can handle
an arbitrary amount of input.
An alternative approach is to read the entire input into memory in one big gulp, split it into lines all at once, then process the lines.
The following version, dup3, operates in that fashion. It introduces the
function ReadFile (from the io/ioutil package), which reads the entire
contents of a named file, and strings.Split, which splits a string into a
slice of substrings. (Split is the opposite of strings.Join, which we saw
earlier.)
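As a small illustration of how Split and Join mirror each other, the following sketch splits a string on newlines and joins the pieces back together. The helper name roundTrip and the sample text are made up for this example and do not appear in dup3.

```go
package main

import (
	"fmt"
	"strings"
)

// roundTrip is a hypothetical helper for this example only. It splits s on
// sep and joins the pieces back together with the same separator, so the
// result should equal s.
func roundTrip(s, sep string) string {
	return strings.Join(strings.Split(s, sep), sep)
}

func main() {
	text := "alpha\nbeta\nalpha"

	// Split breaks the string at every newline.
	lines := strings.Split(text, "\n")
	fmt.Println(len(lines), lines) // prints: 3 [alpha beta alpha]

	// Joining with the same separator reproduces the original string.
	fmt.Println(roundTrip(text, "\n") == text) // prints: true
}
```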
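We’ve simplified dup3 somewhat. First, it reads only named files, not the
standard input, since ReadFile requires a file name argument. Second, the
counting of lines is done directly in main, since it is now needed in only
one place.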
// Dup3 reads the entire input files into memory, splits them into
// lines all at once, then processes the lines by counting each one.
package main

import (
	"fmt"
	"io/ioutil"
	"os"
	"strings"
)

func main() {
	counts := make(map[string]int)
	for _, filename := range os.Args[1:] {
		data, err := ioutil.ReadFile(filename)
		if err != nil {
			fmt.Fprintf(os.Stderr, "dup3: %v\n", err)
			continue
		}
		for _, line := range strings.Split(string(data), "\n") {
			counts[line]++
		}
	}
	for line, n := range counts {
		if n > 1 {
			fmt.Printf("%d\t%s\n", n, line)
		}
	}
}
Listing 1.11: gopl.io/ch1/dup3
ReadFile returns a byte slice that must be converted into a string so it
can be split by strings.Split.
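The conversion step can be sketched in isolation as follows. The helper splitLines is hypothetical (dup3 does this inline), and the literal byte slice stands in for the data ReadFile would return. One detail worth noticing: when the input ends with a newline, Split yields an empty string as the final element.

```go
package main

import (
	"fmt"
	"strings"
)

// splitLines is a hypothetical helper, not part of dup3. It converts the
// byte slice to a string and splits it on newlines, as dup3 does inline.
func splitLines(data []byte) []string {
	return strings.Split(string(data), "\n")
}

func main() {
	// Stand-in for the contents ReadFile would return for a three-line file.
	data := []byte("a\nb\na\n")

	// The trailing newline produces a final empty element.
	fmt.Printf("%q\n", splitLines(data)) // prints: ["a" "b" "a" ""]
}
```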
Under the covers, bufio.Scanner, ioutil.ReadFile, and ioutil.WriteFile use
the Read and Write methods of *os.File, but most programmers rarely need to
access those lower-level routines directly. The higher-level functions like
those from bufio and io/ioutil are easier to use.