bufio scanner and handling new lines


I have two processes communicating over TCP sockets. Side A sends a string to Side B, and part of that string is sometimes encrypted using the standard crypto/cipher package. The resulting string may contain a newline character, which Side B's bufio scanner interprets as the end of the request. I want Side B to keep accepting lines, append them, and wait for a known end-of-command character before processing the command. Side B returns a response to Side A, so the connection stays open and I cannot use a connection close as the command delimiter.

Everything is working fine for single-line commands, but these new line characters in the encrypted output cause issues (about 10% of the time).

Side A will send in the following formats (the third is a legitimate example of a problem string I'm trying to process correctly):

callCommand()

callCommand("one","two","three")

callCommand("string","encrypted-data-to-follow","[7b��Cr��l��G���bH�@x��������� �(z�$�a��0��ڢ5Y7+��U�QT�ΐl�K�(�n�U��J����QK�BX�+�l\8H��-g�y.�.�1�f��I�C�Ȓ㳿���o�xz�8?��c�e ��Tb��?4�hD W��� �<���Е�gc�������N�V���ۓP8 �����O3")

We can fairly reliably say that a command ends with a closing parenthesis ")" followed by a newline character.

Side A's function to send to side B:

func writer(text string) string {
    conn, err := net.Dial("tcp", TCPdest)
    t := time.Now()
    if err != nil {
        if _, t := err.(*net.OpError); t {
            fmt.Println("Some problem connecting.\r\n")
        } else {
            fmt.Println("Unknown error: " + err.Error()+"\r\n")
        }
    } else {
        conn.SetWriteDeadline(time.Now().Add(1 * time.Second))
        _, err = conn.Write([]byte(text+"\r\n"))
        if err != nil {
            fmt.Println("Error writing to stream.\r\n")
        } else {
            timeNow := time.Now()           
            if timeNow.Sub(t.Add(time.Duration(5*time.Second))).Seconds() > 5 {
                return "timeout"
            }
            scanner := bufio.NewScanner(conn)
            for {
                ok := scanner.Scan()
                if !ok {
                    break
                }
                if strings.HasPrefix(scanner.Text(), "callCommand(") && strings.HasSuffix(scanner.Text(), ")") {
                    conn.Close()
                    return scanner.Text()
                }
            }
        }
    }
    return "unspecified error"
}

Side B's handling of incoming connections:

src := "192.168.68.100:9000"
listener, _ := net.Listen("tcp", src)

defer listener.Close()

for {
    conn, err := listener.Accept()
    if err != nil {
        // Printf is needed for the %s verb; skip this iteration because conn is not usable.
        fmt.Printf("Some connection error: %s\r\n", err)
        continue
    }
    go handleConnection(conn)
}   

func handleConnection(conn net.Conn) {
    remoteAddr := conn.RemoteAddr().String()
    fmt.Println("Client connected from " + remoteAddr + "\r\n")

    scanner := bufio.NewScanner(conn)
    wholeString := ""
    for {
        ok := scanner.Scan()

        if !ok {
            break
        }

        //Trying to find the index of a new-line character, to help me understand how it's being processed
        fmt.Println(strings.Index(scanner.Text(), "\n"))
        fmt.Println(strings.Index(wholeString, "\n"))

        //for the first line received, add it to wholeString
        if len(wholeString) == 0 {
            wholeString = scanner.Text()
        }

        re := regexp.MustCompile(`[a-zA-Z]+\(.*\)\r?\n?`)

        if re.Match([]byte(wholeString)) {
            fmt.Println("Matched command format")
            handleRequest(wholeString, conn)
        } else if len(wholeString) > 0 && !re.Match([]byte(wholeString)) {
            //Since we didn't match regex, we can assume there's a new-line mid string, so append to wholeString
            wholeString += "\n"+scanner.Text()
        }

    }
    conn.Close()
    fmt.Println("Client at " + remoteAddr + " disconnected.\r\n")
}

func handleRequest(request string, conn net.Conn) {
    fmt.Println("Received: "+request)
}

I'm not really sure this approach on Side B is correct, but I've included my code above. I've seen a few implementations, but many seem to rely on the connection closing before they begin processing the request, which doesn't suit my scenario.

Any pointers appreciated, thanks.


Solution

  • Your communication "protocol" (one line is one message, which is not quite a protocol) cannot carry binary data, because any byte of the ciphertext can happen to be a newline. If you want to keep a text-based protocol, convert the binary data to text first, for example with Base64 encoding. You would also need some convention to indicate which text was converted from binary.

    Alternatively, change the protocol to handle binary data natively: prefix the binary data with its length, so that the receiver knows exactly how many bytes to read as raw data and does not interpret a newline within them as the end of the message.

    Many existing protocols already do this well, so you may not need to design your own. If you want to send text messages, HTTP is simple to use, and you could format your data as JSON, with the binary data converted to text using Base64:

    {
        "command": "string",
        "args": [
            "binaryDataAsBase64"
        ]
    }