I'd like to write a unit test that runs an ephemeral gRPC server, started in a separate goroutine within the test and stopped after the test finishes. To this end, I've tried adapting the 'Hello, world' example from this tutorial (https://grpc.io/docs/languages/go/quickstart/) so that, instead of a server and a client with separate main.go files, there is a single test function which starts the server asynchronously and then makes the client connection using the grpc.WithBlock() option.
I've put the simplified example in this repository, https://github.com/kurtpeek/grpc-helloworld; here is the main_test.go:
package main

import (
	"context"
	"fmt"
	"log"
	"net"
	"testing"
	"time"

	"github.com/stretchr/testify/require"
	"google.golang.org/grpc"
	"google.golang.org/grpc/examples/helloworld/helloworld"
)

const (
	port = ":50051"
)

type server struct {
	helloworld.UnimplementedGreeterServer
}

func (s *server) SayHello(ctx context.Context, in *helloworld.HelloRequest) (*helloworld.HelloReply, error) {
	log.Printf("Received: %v", in.GetName())
	return &helloworld.HelloReply{Message: "Hello " + in.GetName()}, nil
}

func TestHelloWorld(t *testing.T) {
	lis, err := net.Listen("tcp", port)
	require.NoError(t, err)

	s := grpc.NewServer()
	helloworld.RegisterGreeterServer(s, &server{})
	go s.Serve(lis)
	defer s.Stop()

	log.Println("Dialing gRPC server...")
	conn, err := grpc.Dial(fmt.Sprintf("localhost:%s", port), grpc.WithInsecure(), grpc.WithBlock())
	require.NoError(t, err)
	defer conn.Close()

	c := helloworld.NewGreeterClient(conn)
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()

	log.Println("Making gRPC request...")
	r, err := c.SayHello(ctx, &helloworld.HelloRequest{Name: "John Doe"})
	require.NoError(t, err)
	log.Printf("Greeting: %s", r.GetMessage())
}
The problem is that when I run this test, it times out:
> go test -timeout 10s ./... -v
=== RUN TestHelloWorld
2020/06/30 11:17:45 Dialing gRPC server...
panic: test timed out after 10s
I'm having trouble seeing why a connection is not made. It seems to me that the server is started correctly...
It seems the code you posted here has a typo:
fmt.Sprintf("localhost:%s", port)
If I run your test function without the grpc.WithBlock() option, c.SayHello gives the following error:
rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: address localhost::50051: too many colons in address"
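The culprit is the doubled colon in localhost::50051: the port constant already starts with a colon, and the format string adds a second one.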
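After removing the extra colon, the test passes. You can drop it from the format string (fmt.Sprintf("localhost%s", port)) or from the const declaration; in the latter case net.Listen needs the colon added back, e.g. net.Listen("tcp", ":"+port), because a bare "50051" is not a valid listen address.
const (
	port = "50051" // without the colon
)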
Output:
2020/06/30 23:59:01 Dialing gRPC server...
2020/06/30 23:59:01 Making gRPC request...
2020/06/30 23:59:01 Received: John Doe
2020/06/30 23:59:01 Greeting: Hello John Doe
However, the documentation of grpc.WithBlock() says:
"Without this, Dial returns immediately and connecting the server happens in background."
It would seem to follow that with this option, any connection errors should be returned straight away from the grpc.Dial call:
conn, err := grpc.Dial("bad connection string", grpc.WithBlock()) // can't connect
if err != nil {
	panic(err) // should panic, right?
}
So why does your code hang?
The answer is in the source code of the grpc package (I built the test against v1.30.0):
// A blocking dial blocks until the clientConn is ready.
if cc.dopts.block {
	for {
		s := cc.GetState()
		if s == connectivity.Ready {
			break
		} else if cc.dopts.copts.FailOnNonTempDialError && s == connectivity.TransientFailure {
			if err = cc.connectionError(); err != nil {
				terr, ok := err.(interface {
					Temporary() bool
				})
				if ok && !terr.Temporary() {
					return nil, err
				}
			}
		}
		if !cc.WaitForStateChange(ctx, s) {
			// ctx got timeout or canceled.
			if err = cc.connectionError(); err != nil && cc.dopts.returnLastError {
				return nil, err
			}
			return nil, ctx.Err()
		}
	}
}
So s at this point is indeed in the TransientFailure state, but the FailOnNonTempDialError option defaults to false, so the early return is skipped. WaitForStateChange returns false only when the context expires or is canceled, and that never happens here because Dial runs with the background context:
// Dial creates a client connection to the given target.
func Dial(target string, opts ...DialOption) (*ClientConn, error) {
	return DialContext(context.Background(), target, opts...)
}
At this point I don't know whether this is intended behavior, as some of these APIs are marked as experimental as of v1.30.0.
In any case, to make sure you catch this kind of error on Dial, you can rewrite your code as:
// insecure is "google.golang.org/grpc/credentials/insecure"
conn, err := grpc.Dial(
	"localhost:50051",
	grpc.WithTransportCredentials(insecure.NewCredentials()),
	grpc.FailOnNonTempDialError(true),
	grpc.WithBlock(),
)
With a bad connection string, this fails immediately with the appropriate error message.