Tags: regex, optimization, rust, copy, clone

How do I avoid calling ".clone()" on the same String multiple times in a match?


Background:

I have some Rust code that finds regex matches and assigns the captured values to fields of a struct named Article (all fields are of type String):

pub struct Article {
  // user facing data
  title: String,
  category: String,
  subcategory: String,
  genre: String,
  published: String,
  estimated_read_time: String,
  description: String,
  tags: String,
  keywords: String,
  image: String,
  artwork_credit: String,
  // meta data
  metas: String,
  // location
  path: String,
  slug: String,
  // file data
  content: String
}

A regular expression ("//\- define (.*?): (.*?)\n") is used to extract comments from the article's template that define data for that article:

// iterate through HTML property pattern matches
for capture in re_define.captures_iter(&file_content as &str) {
  // remove the declaration from the HTML output
  article_content = article_content.replace(&capture[0].to_string(), "");
  // get the property value
  let property_value: &String = &capture[2].to_string();
  // determine what field to assign the property to and assign it
  match capture[1].to_lowercase().as_str() {
    "title" => article.title = property_value.clone(),
    "category" => article.category = property_value.clone(),
    "subcategory" => article.subcategory = property_value.clone(),
    "genre" => article.genre = property_value.clone(),
    "published" => article.published = property_value.clone(),
    "estimated_read_time" => article.estimated_read_time = property_value.clone(),
    "description" => article.description = property_value.clone(),
    "tags" => article.tags = property_value.clone(),
    "keywords" => article.keywords = property_value.clone(),
    "image" => article.image = property_value.clone(),
    unknown_property @ _ => {
      println!("Ignoring unknown property: {}", &unknown_property);
    }
  }
}

Note: article is an instance of Article.

Issue:

The code works, but I'm concerned about the following part:

"title" => article.title = property_value.clone(),
"category" => article.category = property_value.clone(),
"subcategory" => article.subcategory = property_value.clone(),
"genre" => article.genre = property_value.clone(),
"published" => article.published = property_value.clone(),
"estimated_read_time" => article.estimated_read_time = property_value.clone(),
"description" => article.description = property_value.clone(),
"tags" => article.tags = property_value.clone(),
"keywords" => article.keywords = property_value.clone(),
"image" => article.image = property_value.clone(),

It calls .clone() on the same String (property_value) for every match (10 matches per article template), for every article template (a couple dozen templates in total), and I don't think it's the most efficient way to do it.

Note: I'm not sure if match is cloning for non-matches.

What I tried:

I tried referencing the property_value String, but I got an error for each reference.

Error from IDE (VS Code):

mismatched types
expected struct `std::string::String`, found `&&std::string::String`
expected due to the type of this binding
try using a conversion method: `(`, `).to_string()`

Error from cargo check:

error[E0308]: mismatched types
  --> src/article.rs:84:38
   |
84 |           "image" => article.image = &property_value,
   |                      -------------   ^^^^^^^^^^^^^^^ expected struct `std::string::String`, found `&&std::string::String`
   |                      |
   |                      expected due to the type of this binding
   |
help: try using a conversion method
   |
84 |           "image" => article.image = (&property_value).to_string(),
   |                                      +               +++++++++++++

I did try using .to_string(), but I'm not sure that converting a String to the same type is the most efficient way to do it either.

Question:

How do I avoid calling .clone() on property_value so many times?


Solution

  • Going by the types, you should just be able to drop the borrow in property_value and then you don't need the .clone()s.

    let property_value: &String = &capture[2].to_string();
    // change to
    let property_value: String = capture[2].to_string();
    // or just simply
    let property_value = capture[2].to_string();
    

    I'm assuming the borrow was added because capture[2] yields a str (an unsized type), which does need the &. Once to_string() converts it to the owned type String, though, the value is fine on its own. Dropping the borrow costs nothing, as to_string() already copies the text; it only removes the extra copy made by .clone().
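
To make the ownership flow concrete, here is a minimal sketch that uses only the standard library. It rests on several assumptions: Article is trimmed to three fields, the regex captures are replaced by plain (name, value) string pairs standing in for capture[1] and capture[2], and assign_property and build_article are hypothetical helper names that do not appear in the question. Only the arm that matches is executed, so the owned String is moved into exactly one field and nothing is cloned for the arms that do not match.

```rust
// Trimmed-down stand-in for the question's Article struct (assumption:
// three fields are enough to show the pattern).
#[derive(Debug, Default)]
pub struct Article {
    pub title: String,
    pub category: String,
    pub image: String,
}

// Hypothetical helper: takes ownership of `value` and moves it into the
// matching field. Returns false when the property name is unknown.
pub fn assign_property(article: &mut Article, name: &str, value: String) -> bool {
    match name.to_lowercase().as_str() {
        // Each arm moves `value`; only the arm that matches actually runs.
        "title" => article.title = value,
        "category" => article.category = value,
        "image" => article.image = value,
        _ => return false,
    }
    true
}

// Stand-in for the captures_iter loop: each pair plays the role of
// (&capture[1], &capture[2]) from the question.
pub fn build_article(pairs: &[(&str, &str)]) -> Article {
    let mut article = Article::default();
    for (name, raw_value) in pairs {
        // One allocation per capture, like capture[2].to_string().
        let property_value = raw_value.to_string();
        if !assign_property(&mut article, name, property_value) {
            println!("Ignoring unknown property: {}", name);
        }
    }
    article
}
```

Calling build_article(&[("Title", "Hello"), ("bogus", "x")]) should fill in title, leave the other fields empty, and print the "Ignoring unknown property" message for the second pair. The same shape carries over to the original loop: keep let property_value = capture[2].to_string(); and assign property_value directly in each arm.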