I'm developing a middleware in Go that intercepts HTTP requests, reads JSON data from the request body, and indexes it as a document in Elasticsearch based on this documentation.
However, although the document appears to be indexed in Elasticsearch, the process returns an Invalid JSON format: EOF error. This error prevents the middleware from proceeding to the main handler that performs additional database operations.
Here is the relevant part of my middleware code:
package middlewares

import (
	"encoding/json"
	"net/http"

	"github.com/elastic/go-elasticsearch/v8"
	"github.com/elastic/go-elasticsearch/v8/typedapi/types"
	"github.com/google/uuid"
)

// IndexDocumentMiddleware creates a middleware to index documents into Elasticsearch
func IndexDocumentMiddleware(es *elasticsearch.TypedClient) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			ctx := r.Context()

			// Read and decode the request body into a generic map to determine the type of document
			var doc map[string]interface{}
			if err := json.NewDecoder(r.Body).Decode(&doc); err != nil {
				http.Error(w, "Error parsing request body", http.StatusBadRequest)
				return
			}

			var indexName string
			if typeName, ok := doc["type"].(string); ok {
				indexName = typeName
			} else {
				http.Error(w, "Error: 'type' is not a string or is missing", http.StatusBadRequest)
				return
			}

			existsRes, err := es.Indices.Exists(indexName).Do(ctx)
			if err != nil {
				http.Error(w, "Error existsRes: "+err.Error(), http.StatusInternalServerError)
				return
			}
			if !existsRes {
				_, err := es.Indices.Create(indexName).Mappings(types.NewTypeMapping()).Do(ctx)
				if err != nil {
					http.Error(w, "Error creating index: "+err.Error(), http.StatusInternalServerError)
					return
				}
			}

			docID := uuid.New().String()
			_, err = es.Index(indexName).
				Id(docID).
				Document(doc).Do(ctx)
			if err != nil {
				http.Error(w, "Error indexing document: "+err.Error(), http.StatusInternalServerError)
				return
			}

			next.ServeHTTP(w, r)
		})
	}
}
Any insights or suggestions to address this issue would be greatly appreciated!
.Decode(&doc) → Decoding Makes the Body Empty
This issue arises because http.Request.Body is an io.ReadCloser: once you read from it, the data is consumed and is not available for subsequent reads. This is a common mistake, especially when multiple handlers or middleware need to access the request body.
To fix this, you need to read the entire body first, and then restore it so that it can be re-read by subsequent handlers or middleware. Here’s the essential piece of code that accomplishes this:
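// Requires the "bytes" and "io" imports in addition to the existing ones.

// Read the entire body once
bodyBytes, err := io.ReadAll(r.Body)
if err != nil {
	http.Error(w, "Error reading request body", http.StatusInternalServerError)
	return
}
// Replace the drained body with a fresh reader over the same bytes,
// so that subsequent handlers can read it again
r.Body = io.NopCloser(bytes.NewBuffer(bodyBytes))

// Decode from bodyBytes rather than r.Body, so the restored body stays unread
var doc map[string]interface{}
if err := json.Unmarshal(bodyBytes, &doc); err != nil {
	http.Error(w, "Error parsing request body", http.StatusBadRequest)
	return
}
// ... the rest of the middleware continues unchanged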
Explanation:
Consuming the Body: The json.NewDecoder(r.Body).Decode(&doc) operation reads from the io.ReadCloser stream. Once the data has been read, the stream is exhausted, because an io.ReadCloser does not support rewinding. This effectively "consumes" the body, so any subsequent attempt to read it gets an EOF.
Restoring the Body: After reading, r.Body is re-assigned using io.NopCloser(bytes.NewBuffer(bodyBytes)). This creates a new io.ReadCloser over the bodyBytes read earlier, so the next reader sees the original content. io.NopCloser wraps an io.Reader (here the *bytes.Buffer returned by bytes.NewBuffer) in an io.ReadCloser whose Close method does nothing, since the buffer does not need to be closed. The middleware itself should then decode from bodyBytes (for example with json.Unmarshal) rather than from the restored r.Body, because reading r.Body again would drain it a second time.
This approach ensures that any middleware or handler that runs after this code will still have access to the full request body as if it were untouched.
Why This Is Important:
Not handling the request body correctly can lead to subtle bugs, especially in larger applications where multiple parts of the middleware chain may need to inspect or modify the request. This technique ensures that all parts of your HTTP server can operate correctly without interfering with each other.