A better way to use Scanner for multiple tokens per line?


I'm trying to parse a file with lines that consist of a key, a space, a number and then a newline.

My code works, but it doesn't smell right to me. Is there a better way to use Scanner? In particular, I don't like calling Scan() inside the for loop without checking its result.

func TestScanner(t *testing.T) {
    const input = `key1 62128128\n
key2 8337182720\n
key3 7834959872\n
key4 18001920\n
key5 593104896\n`
    scanner := bufio.NewScanner(strings.NewReader(input))
    scanner.Split(bufio.ScanWords)
    for scanner.Scan() {
        key := scanner.Text()
        scanner.Scan()
        value := scanner.Text()
        fmt.Printf("k: %v, v: %v\n", key, value)
    }
}

Solution

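  • Don't write \n in the input. Inside a raw (backquoted) string literal, \n is not an escape sequence but a literal backslash followed by the letter n, so it ends up attached to each value; the line breaks in the literal are already real newlines. Also, always check for errors, including the result of the second Scan() call.
    Working sample code: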

    package main
    
    import (
        "bufio"
        "fmt"
        "strings"
    )
    
    func main() {
        const input = `key1 62128128
    key2 8337182720
    key3 7834959872
    key4 18001920
    key5 593104896`
        scanner := bufio.NewScanner(strings.NewReader(input))
        scanner.Split(bufio.ScanWords)
        for scanner.Scan() {
            key := scanner.Text()
            if !scanner.Scan() {
                break
            }
            value := scanner.Text()
            fmt.Printf("k: %v, v: %v\n", key, value)
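        }
        // Scan returns false on EOF and on failure; Err tells the two apart.
        if err := scanner.Err(); err != nil {
            fmt.Println("scan error:", err)
        }
    }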
    

    output:

    k: key1, v: 62128128
    k: key2, v: 8337182720
    k: key3, v: 7834959872
    k: key4, v: 18001920
    k: key5, v: 593104896  
    

    Alternatively, you can use fmt.Fscan, which parses each token directly into a variable of the desired type, like this:

    package main
    
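    import (
        "fmt"
        "io"
        "strings"
    )
    
    func main() {
        const input = `key1 62128128
    key2 8337182720
    key3 7834959872
    key4 18001920
    key5 593104896`
        rdr := strings.NewReader(input)
        for {
            // v is an int, which is 64 bits wide on most current platforms;
            // use int64 if the values must also fit on 32-bit targets.
            k, v := "", 0
            _, err := fmt.Fscan(rdr, &k, &v)
            if err == io.EOF {
                break // clean end of input
            }
            if err != nil {
                // e.g. a key without a value, or a value that is not a number
                fmt.Println("parse error:", err)
                break
            }
            fmt.Printf("%T: %[1]v, %T: %[2]v\n", k, v)
        }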
    }
    

    output:

    string: key1, int: 62128128
    string: key2, int: 8337182720
    string: key3, int: 7834959872
    string: key4, int: 18001920
    string: key5, int: 593104896