Using Fiber v2.38.1, which in turn use fasthttp/websocket v1.5.0.
App crashed with following error:
2022-10-15T04:05:42.983563+00:00 app[web.1]: time="2022-10-15T04:05:42Z" level=info msg="close msg received: &{418 close}"
2022-10-15T04:05:42.983564+00:00 app[web.1]: conn is nil: false, msg: &{418 close <nil> conn 0}
2022-10-15T04:05:42.986035+00:00 app[web.1]: panic: runtime error: invalid memory address or nil pointer dereference
2022-10-15T04:05:42.986035+00:00 app[web.1]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x78 pc=0x8db967]
2022-10-15T04:05:42.986036+00:00 app[web.1]:
2022-10-15T04:05:42.986037+00:00 app[web.1]: goroutine 86 [running]:
2022-10-15T04:05:42.986037+00:00 app[web.1]: github.com/fasthttp/websocket.(*Conn).beginMessage(0x0, 0xc000348f60, 0x1)
2022-10-15T04:05:42.986037+00:00 app[web.1]: /tmp/codon/tmp/cache/go-path/pkg/mod/github.com/fasthttp/websocket@v1.5.0/conn.go:479 +0x27
2022-10-15T04:05:42.986038+00:00 app[web.1]: github.com/fasthttp/websocket.(*Conn).NextWriter(0x0, 0x1)
2022-10-15T04:05:42.986038+00:00 app[web.1]: /tmp/codon/tmp/cache/go-path/pkg/mod/github.com/fasthttp/websocket@v1.5.0/conn.go:520 +0x45
2022-10-15T04:05:42.986039+00:00 app[web.1]: github.com/fasthttp/websocket.(*Conn).WriteJSON(0xe79a80?, {0xc143a0, 0xc00041e600})
2022-10-15T04:05:42.986039+00:00 app[web.1]: /tmp/codon/tmp/cache/go-path/pkg/mod/github.com/fasthttp/websocket@v1.5.0/json.go:24 +0x45
2022-10-15T04:05:42.986039+00:00 app[web.1]: github.com/kuchaguangjie/go-fit/model.(*WsClient).WriteMsg(0xc00056a280, 0xc00041e600)
2022-10-15T04:05:42.986040+00:00 app[web.1]: /tmp/build_0b2fe533/model/ws_model.go:97 +0xf9
2022-10-15T04:05:42.986040+00:00 app[web.1]: github.com/kuchaguangjie/go-fit/controller.(*ContentController).Ws.func1.2()
2022-10-15T04:05:42.986041+00:00 app[web.1]: /tmp/build_0b2fe533/controller/content_ctl.go:623 +0x171
2022-10-15T04:05:42.986041+00:00 app[web.1]: created by github.com/kuchaguangjie/go-fit/controller.(*ContentController).Ws.func1
2022-10-15T04:05:42.986042+00:00 app[web.1]: /tmp/build_0b2fe533/controller/content_ctl.go:608 +0x10cc
2022-10-15T04:05:43.113930+00:00 heroku[web.1]: Process exited with status 2
It crashed at conn.go:479
, in function beginMessage()
:
if c.writer != nil {
I've checked that c
is not nil (as shown in log), so how it crashed ...
BTW, it's deployed on heroku, I saw following log on heroku:
2022-10-15T04:59:30.344791+00:00 heroku[router]: at=error code=H15 desc="Idle connection"
Not sure is it relevent.
content_ctl.go
): // handle close - from a channel,
// TODO: use goroutine pool ?
go func() {
if r := recover(); r != nil { // TODO: is this proper ?,
fmt.Printf("recover from panic: %v\n", r)
}
closeMsg := <-(*client).CloseChan // handle close,
log.Infof("close msg received: %v", closeMsg)
// send close msg,
closeResp := &model.WsResp{
Status: closeMsg.HttpCode,
Source: model.RespSourceConn,
Msg: closeMsg.Msg,
}
log.Infof("is conn nil: %v, msg: %v\n", client.WsConn == nil, closeResp)
err = client.WriteMsg(closeResp)
if err != nil {
log.Errorf("error - on write close msg: %v", closeResp)
}
ctl.cwm.CloseAndCleanup(client.Id) // close & clean up,
}()
WsClient
struct, and its method WriteMsg()
(part of ws_model.go
): // websocket client,
type WsClient struct {
WsUser
ContentUuid string `json:"contentUuid"` // content uuid id,
WsConn *websocket.Conn `json:"-"` // websocket connection,
CloseChan chan *CloseMsg `json:"-"` // close channel,
WriteLock *sync.Mutex // write lock, to avoid concurrent write to ws connection, which will cause panic, acquire lock when write or close,
}
// write ws msg, with lock,
func (wsc *WsClient) WriteMsg(msg *WsResp) error {
if wsc.WsConn == nil {
return errutil.ErrNoConn
}
wsc.WriteLock.Lock()
err := wsc.WsConn.WriteJSON(msg)
wsc.WriteLock.Unlock()
return err
}
CloseMsg
struct (part of ws_model.go
): // close msg sent to close chan,
type CloseMsg struct {
HttpCode int `json:"httpCode"`
Msg string `json:"msg"`
}
Problem status:
httpCode
in the CloseMsg
, if it's 418
, then I won't send the close msg.Some background:
CloseMsg
's HttpCode
.
418
is the default one, it happens only when unexpected io issue happen on the connection (e.g heroku terminate the connection due to idle ?).418
.418
might not be the best http code to use, (it's from a joke), but that doesn't effect the problem I guess.WsClient
's WriteMsg()
method, I've always checked whether the conn nil, before send msg.2022-10-16T05:19:03.088622+00:00 heroku[router]: at=error code=H15 desc="Idle connection" method=GET path="/api/v1/content/ws/RgPx8znal7AJb
2022-10-16T05:19:03.065493+00:00 app[web.1]: time="2022-10-16T05:19:03Z" level=info msg="error - on read: websocket: close 1006 (abnormal closure): unexpected EOF"
2022-10-16T05:20:57.758379+00:00 app[web.1]: time="2022-10-16T05:20:57Z" level=info msg="cleanup client, id = 'wscid-ee670cc5-4100-49d6-9857-8284d93a6d33'"
2022-10-16T05:20:57.758505+00:00 app[web.1]: time="2022-10-16T05:20:57Z" level=info msg="close msg received: &{418 close}"
Seems heroku will kill the websocket connection if there is no msg for a while. That's when 418
appear.
And, I've asked a question about heroku killing the websocket connection on SO: heroku keep killing idle connections (websocket)
This is a fasthttp bug, and fixed in v1.5.1-rc.8
refer: https://github.com/fasthttp/websocket/issues/30#issuecomment-1326860975