The versions of dup above operate in a “streaming” mode in which input is read and broken into lines as needed, so in principle these programs can handle an arbitrary amount of input.
An alternative approach is to read the entire input into memory in one big gulp, split it into lines all at once, then process the lines.
The following version, dup3, operates in that fashion. It introduces the function ReadFile (from the io/ioutil package), which reads the entire contents of a named file, and strings.Split, which splits a string into a slice of substrings. (Split is the opposite of strings.Join, which we saw earlier.)
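As a rough illustration of how Split and Join mirror each other, the sketch below splits a small string on newlines and joins it back. The helper name splitLines and the sample text are made up for this example; only strings.Split and strings.Join come from the standard library.

```go
package main

import (
	"fmt"
	"strings"
)

// splitLines is a hypothetical helper that breaks text into lines
// on "\n". It only wraps strings.Split.
func splitLines(text string) []string {
	return strings.Split(text, "\n")
}

func main() {
	text := "alpha\nbeta\nalpha"
	lines := splitLines(text)
	fmt.Println(len(lines), lines)

	// Joining with the same separator rebuilds the original string.
	fmt.Println(strings.Join(lines, "\n") == text)
}
```

Note that text ending in a newline produces a final empty element, so splitLines("a\nb\n") has three elements, the last of which is the empty string.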
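We’ve simplified dup3 somewhat. First, it reads only named files, not the standard input, since ReadFile requires a file name argument. Second, the counting of lines moves back into main, since it is now needed in only one place.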
    // Dup3 reads the entire input files into memory, splits them into
    // lines all at once, then processes the lines by counting each one.
    package main

    import (
        "fmt"
        "io/ioutil"
        "os"
        "strings"
    )

    func main() {
        counts := make(map[string]int)
        for _, filename := range os.Args[1:] {
            data, err := ioutil.ReadFile(filename)
            if err != nil {
                fmt.Fprintf(os.Stderr, "dup3: %v\n", err)
                continue
            }
            for _, line := range strings.Split(string(data), "\n") {
                counts[line]++
            }
        }
        for line, n := range counts {
            if n > 1 {
                fmt.Printf("%d\t%s\n", n, line)
            }
        }
    }
ReadFile returns a byte slice that must be converted into a string so it can be split by strings.Split.
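The conversion step can be looked at in isolation. In this minimal sketch, a byte slice literal stands in for the data that ReadFile would return, and the helper countLines is a hypothetical name introduced only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// countLines is a hypothetical helper. It takes bytes such as those
// returned by ReadFile and counts how often each line occurs.
func countLines(data []byte) map[string]int {
	counts := make(map[string]int)
	// strings.Split accepts a string, not a []byte, so the
	// conversion string(data) is required here.
	for _, line := range strings.Split(string(data), "\n") {
		counts[line]++
	}
	return counts
}

func main() {
	data := []byte("go\nrust\ngo") // stands in for a file's contents
	counts := countLines(data)
	fmt.Println(counts["go"], counts["rust"])
}
```

Passing data to strings.Split without the conversion would be a compile-time type error, which is why dup3 writes string(data).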
Under the covers, bufio.Scanner, ioutil.ReadFile, and ioutil.WriteFile use the Read and Write methods of *os.File, but it’s rare that most programmers need to access those lower-level routines directly. The higher-level functions like those from bufio and io/ioutil are easier to use.