Tags: go, pointers, casting

What is the smallest number of times I can copy the data in order to return the contents of an io.Reader as a string pointer?


A coworker implemented a function that makes an HTTP call and returns the response body as a string. Simplifying a bit for brevity (no, we're not really ignoring all the errors):

func getStuff(id string) string {
    response, _ := http.Get(fmt.Sprintf("/some/url/%s", id))
    body, _ := ioutil.ReadAll(response.Body)
    return string(body)
}

The response is typically fairly large, so I want to avoid unnecessary copying. As I understand it, as written, we are making three copies of the response data:

  1. ioutil.ReadAll copies the data from the incoming HTTP connection to a byte slice.
  2. string(body) copies the byte slice into a string.
  3. return makes a new copy of the string for use in the calling function.

So, first of all, do I understand the current state correctly?

The easy first step is to return a pointer:

response, _ := http.Get(fmt.Sprintf("/some/url/%s", id))
body, _ := ioutil.ReadAll(response.Body)
result := string(body)
return &result

That avoids the third copy. Cool. But I'm still making two copies of the data, and I'd like to make just one.

I could have him change the return type to *[]byte, and then we can just return &body. But then all of the callers would need to convert the result to string themselves, and then all I've accomplished is to spread the logic that makes the second copy around to multiple other places instead of keeping it consolidated here.

I could use strings.Builder and io.Copy:

builder := new(strings.Builder)
_, _ = io.Copy(builder, response.Body)
result := builder.String()
return &result

And that might be a tiny bit more efficient (I don't really know; is it?), but I still end up with two copies of the data.

Is it possible to do this with just a single copy of the data?

I think it's not; just wondering if I'm wrong!


Solution

  • Copying a string copies only the string header, which is two words: a pointer to the array holding the string data, and the length. It does not copy the string contents, so returning a string from a function does not copy the data.

    If you are passing that string to something like JSON unmarshaling, you can return the []byte, or even the reader from the response body, and process that directly. If you need the result as a string, then two copies is the best you can do: one to read it from the body, and a second to convert it to a string.