Search code examples
c++c++11segmentation-faultstdstring

Why is this segfaulting?


I have the following builder that builds Tokens:

// token_builder.h
class TokenBuilder
{
    Keyword m_keyword;
    TokenType m_type;
    std::string m_symbol;

public:
    TokenBuilder& set_keyword(Keyword);
    TokenBuilder& set_toktype(TokenType);
    TokenBuilder& set_symbol(std::string);
    std::unique_ptr<Token>&& build();
};
// token_builder.cpp
TokenBuilder& TokenBuilder::set_keyword(Keyword k)
  {
      m_keyword = k;
      return *this;
  }

  TokenBuilder& TokenBuilder::set_toktype(TokenType t)
  {
      m_type = t;
      return *this;
  }

  TokenBuilder& TokenBuilder::set_symbol(std::string sym)
  {
      m_symbol = sym; // This is the point where it segfaults
      return *this;
  }

  std::unique_ptr<Token>&& TokenBuilder::build()
  {
      std::unique_ptr<Token> token;
      token->m_keyword = m_keyword;
      token->m_type = m_type;
      token->m_symbol = m_symbol;
      return std::move(token);
  }

The program halts with SIGSEGV at the point where I am trying to initialize m_symbol by calling TokenBuilder::set_symbol. Here is how main looks like:

int main()
{
    auto token = Token::builder()
            .set_keyword(Keyword::IF)
            .set_toktype(TokenType::ID)
            .set_symbol("if")
            .build();

    std::cout << token->get_symbol() << std::endl;
}

As you can see, call to set_keyword and set_toktype don't result in any errors. But passing the literal "if" (or a std::string object) to set_symbol causes the program to crash. Passing by value is intentional. Stepping into set_symbol and analyzing the state of TokenBuilder doesn't reveal anything. As expected, m_keyword is set to Keyword::IF and m_type to TokenType::ID. This is the line that crashes: m_symbol = sym.

The program above might have other, non-trivial issues (am I using std::move correctly? returning to C++ after a long time) but this kind of feels like a silly mistake a rookie would make. I am using MinGW-w64 compiler on Windows 10 (64-bit).

Here is the entire code for conveniently copy pasting to your machine:

// src/include/token_builder.h
#ifndef TOKEN_BUILDER_H
#define TOKEN_BUILDER_H

#include <memory>
#include "token.h"

namespace funk
{
class Token;

class TokenBuilder
{
    Keyword m_keyword;
    TokenType m_type;
    std::string m_symbol;

public:
    TokenBuilder& set_keyword(Keyword);
    TokenBuilder& set_toktype(TokenType);
    TokenBuilder& set_symbol(std::string);
    std::unique_ptr<Token>&& build();
};

}

#endif
// src/token_builder.cpp
#include "include/token.h"
#include "include/token_builder.h"

namespace funk
{
  TokenBuilder& TokenBuilder::set_keyword(Keyword k)
  {
      m_keyword = k;
      return *this;
  }

  TokenBuilder& TokenBuilder::set_toktype(TokenType t)
  {
      m_type = t;
      return *this;
  }

  TokenBuilder& TokenBuilder::set_symbol(std::string sym)
  {
      m_symbol = sym;
      return *this;
  }

  std::unique_ptr<Token>&& TokenBuilder::build()
  {
      std::unique_ptr<Token> token;

      token->m_is_keyword = m_keyword != Keyword::_NONE_;
      token->m_keyword = m_keyword;
      token->m_type = m_type;
      token->m_symbol = m_symbol;

      return std::move(token);
  }
}
// src/include/token.h
#ifndef TOKEN_H
#define TOKEN_H
#include <memory>
#include "keyword.h"
#include "token_type.h"
#include "token_builder.h"

namespace funk
{
  class TokenBuilder;

  class Token
  {
      friend class TokenBuilder;

      Keyword m_keyword;
      bool m_is_keyword;
      std::string m_symbol;
      TokenType m_type;

      Token();

  public:

      Keyword& get_keyword ();
      bool is_keyword();
      std::string& get_symbol();
      TokenType& get_type();
      static TokenBuilder&& builder();
  };
}

#endif // END TOKEN_H
// src/token.cpp
#include <string>
#include <memory>
#include "include/keyword.h"
#include "include/token_builder.h"
#include "include/token_type.h"

namespace funk
{
  Keyword& Token::get_keyword()
  {
    return m_keyword;
  }

  bool Token::is_keyword()
  {
    return m_is_keyword;
  }

  std::string& Token::get_symbol()
  {
    return m_symbol;
  }

  TokenType& Token::get_type()
  {
    return m_type;
  }

  TokenBuilder&& Token::builder()
  {
    TokenBuilder builder;
    return std::move(builder);
  }
}

// src/main.cpp
#include <iostream>
#include "include/buffered_reader.h"
#include "include/token.h"
#include "include/token_builder.h"

int main()
{
    using namespace funk;

    auto token = Token::builder()
            .set_keyword(Keyword::IF)
            .set_toktype(TokenType::ID)
            .set_symbol("if")
            .build();

    std::cout << token->get_symbol() << std::endl;
}

Note: I looked up for similar questions on StackOverflow as best as I could but didn't find one that matches my case. If you think it is similar to a question you previously encountered, feel free to mark duplicate and post the link to the original question. Thanks!


Solution

  •   std::unique_ptr<Token>&& TokenBuilder::build()
      {
          std::unique_ptr<Token> token;
          // ...
          return std::move(token);
      }
    

    You return reference to a local variable. At the end of the function, the local variable is destroyed, and the returned reference will be left referring to an object outside of its lifetime. When you later attempt to access the non-existing object through the dangling reference, the behaviour of the program is undefined.