Next: , Previous: , Up: Finding Duplicate Lines   [Index]


1.7.2 Dup2

Many programs read either from their ‘standard input’, as above, or from a sequence of named files. The next version of dup can read from the ‘standard input’ or handle a list of file names, using os.Open to open each one:

// Dup2 prints the count and text of lines that appear more than once
// in the input.  It reads from stdin or from a list of named files.
package main

import (
        "bufio"
        "fmt"
        "os"
)

func main() {
        counts := make(map[string]int)
        files := os.Args[1:]
        if len(files) == 0 {
                countLines(os.Stdin, counts)
        } else {
                for _, arg := range files {
                        f, err := os.Open(arg)
                        if err != nil {
                                fmt.Fprintf(os.Stderr, "dup2: %v\n", err)
                                continue
                        }
                        countLines(f, counts)
                        f.Close()
                }
        }
        for line, n := range counts {
                if n > 1 {
                        fmt.Printf("%d\t%s\n", n, line)
                }
        }
}

func countLines(f *os.File, counts map[string]int) {
        input := bufio.NewScanner(f)
        for input.Scan() {
                counts[input.Text()]++
        }
        // NOTE: ignoring potential errors from input.Err()
}

Listing 1.10: gopl.io/ch1/dup2

Function os.Open and File Type *os.File

The function os.Open returns two values.

  1. The first is an open file (*os.File) that is used in subsequent reads by the Scanner.
  2. The second result of os.Open is a value of the build-in error type. If err equals the special built-in value ‘nil’, the file was opened successfully. The file is read, and when the end of the input is reached, close closes the file and releases any resources.

    On the other hand, if err is not ‘nil’, something went wrong. In that case, the error value describes the problem. The error handling prints a message on the ‘standard error’ stream using Fprintf and the verb ‘%v’, which displays a value of any type in a default format, and dup then carries on with the next file; the continue statement goes to the next iteration of the enclosing for loop.

Notice that the call to countLines precedes its declaration. Functions and other package-level entities may be declared in any order.

A map Is a Reference

A ‘map’ is reference to the data structure created by make. When a ‘map’ is passed to a function, the function receives a copy of the reference, so any changes the called function makes to the underlying data structure will be visible through the caller’s ‘map’ reference too. In the example, the values inserted into the countsmap’ by countLines are seen by make.


Next: Dup3, Previous: Dup1, Up: Finding Duplicate Lines   [Index]