go sockets https tcp proxy

How does a browser know where the data ends?


I am trying to write an HTTPS proxy in Go. I know that the browser first sends a header ending in \r\n, and that the socket blocks in read() until those characters have been read. But when the traffic is encrypted (SSL/TLS) and uses HTTP/1.1 (keep-alive connections):

  • How does the browser know where the data ends?
  • Does it read byte by byte until some special character at the end? (Is that a good approach for large data at all?)
  • Or is the size of the data sent first (as suggested in other topics about TCP sockets)?
  • As a proxy, how can I detect the end of the data when streaming or loading a simple HTML page?

This is only part of the code. It works when I run it on a local network, but on a server (VPS) it blocks on read until the connection is closed. Complete code here.

func write(client_to_proxy net.Conn, browser_to_client net.Conn) {
    defer client_to_proxy.Close()
    buffer := make([]byte, 1024)

    reader := bufio.NewReader(browser_to_client)

    for {
        length, err := reader.Read(buffer)
        if length > 0 {
            fmt.Println(time.Now().Format(time.Stamp) + " READ from client to browser: " + strconv.Itoa(length))
            //fmt.Println(string(buffer[:length]))

            writeLength, err := client_to_proxy.Write(buffer[:length])
            if writeLength > 0 {
                fmt.Println(time.Now().Format(time.Stamp) + " WRITE from client to browser: " + strconv.Itoa(writeLength))
            }
            if err != nil {
                fmt.Println("ERR6 ", err)
                return
            }
        }
        if err != nil {
            fmt.Println("ERR5 ", err)
            return
        }
    }
}

func read(client_to_proxy net.Conn, browser_to_client net.Conn) {
    defer browser_to_client.Close()
    buffer := make([]byte, 1024)

    reader := bufio.NewReader(client_to_proxy)
    length, err := reader.Read(buffer)

    fmt.Println(time.Now().Format(time.Stamp) + " READ from proxy to client: " + strconv.Itoa(length))
    fmt.Println(string(buffer[:length])) // print only the bytes actually read

    if length > 0 {
        writeLength, err := browser_to_client.Write(buffer[:length])
        fmt.Println(time.Now().Format(time.Stamp) + " WRITE from client to browser: " + strconv.Itoa(writeLength))
        if err != nil {
            fmt.Println("ERR7 ", err)
            return
        }
    }
    if err != nil {
        return
    }

    go write(client_to_proxy, browser_to_client)

    for {
        length, err := reader.Read(buffer)
        fmt.Println(time.Now().Format(time.Stamp) + " READ from proxy to client: " + strconv.Itoa(length))
        //fmt.Println(string(buffer[:length]))
        if length > 0 {
            writeLength, err := browser_to_client.Write(buffer[:length])
            fmt.Println(time.Now().Format(time.Stamp) + " WRITE from client to browser: " + strconv.Itoa(writeLength))
            if err != nil {
                fmt.Println("ERR8 ", err)
                return
            }
        }
        if err != nil {
            return
        }
    }
}

EDIT 1: I use a client and a server Go app, like this: browser->client->proxy->so.com, then so.com->proxy->client->browser. I don't want the encrypted data! My problem is in the 'client' app: I don't know how many bytes I should read to unblock read()!


Solution

  • How does the browser know where the data ends?

    By implementing the standard.

    For HTTP/1.1, see RFC 7230, section 3.3 (Message Body Length), for the details of how the length of the body is determined. In short, it is derived from information in the HTTP header, such as Content-Length and Transfer-Encoding.

  • As a proxy, how can I detect the end of the data when streaming or loading a simple HTML page?

    A proxy handles plain HTTP messages in the same way.

    HTTPS traffic is passed through the proxy end-to-end encrypted between client and server, using a tunnel. The tunnel is created with a CONNECT request and ends only when the TCP connection is closed. The proxy has no insight into the connection and therefore cannot determine where the HTTP message headers and bodies of the various requests and responses lie within the encrypted traffic.