I'm attempting to get all non-whitespace characters from a string using regex, but I keep coming back to the same error.
extern crate regex; // 1.0.2

use regex::Regex;
use std::vec::Vec;

pub fn string_split<'a>(s: &'a String) -> Vec<&'a str> {
    let mut returnVec = Vec::new();
    let re = Regex::new(r"\S+").unwrap();

    for cap in re.captures_iter(s) {
        returnVec.push(&cap[0]);
    }
    returnVec
}

pub fn word_n(s: &String, n: i32) -> &str {
    let bytes = s.as_bytes();
    let mut num = 0;
    let mut word_start = 0;
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' || item == b'\n' {
            num += 1;
            if num == n {
                return &s[word_start..i].trim();
            }
            word_start = i;
            continue;
        }
    }
    &s[..]
}
The error:
error[E0597]: `cap` does not live long enough
  --> src/main.rs:11:25
   |
11 |         returnVec.push(&cap[0]);
   |                         ^^^ borrowed value does not live long enough
12 |     }
   |     - borrowed value only lives until here
   |
note: borrowed value must be valid for the lifetime 'a as defined on the function body at 6:1...
  --> src/main.rs:6:1
   |
6  | pub fn string_split<'a>(s: &'a String) -> Vec<&'a str> {
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Plus more information:
$ rustc --explain E0597
This error occurs because a borrow was made inside a variable which has a
greater lifetime than the borrowed one.
Example of erroneous code:
```
struct Foo<'a> {
    x: Option<&'a u32>,
}
let mut x = Foo { x: None };
let y = 0;
x.x = Some(&y); // error: `y` does not live long enough
```
In here, `x` is created before `y` and therefore has a greater lifetime. Always
keep in mind that values in a scope are dropped in the opposite order they are
created. So to fix the previous example, just make the `y` lifetime greater than
the `x`'s one:
```
struct Foo<'a> {
    x: Option<&'a u32>,
}
let y = 0;
let mut x = Foo { x: None };
x.x = Some(&y);
```
At this point I've tried several ways of extending the lifetime of the `cap` variable, but even after reading the borrowing and lifetime sections of the Rust book I haven't been able to get anything to work.
The documentation of `impl<'t> Index<usize> for Captures<'t>` (this is the `cap[0]` in your code) says:
The text can't outlive the Captures object if this method is used, because of how Index is defined (normally a[i] is part of a and can't outlive it); to do that, use get() instead.
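To make the quoted restriction concrete, here is a minimal std-only sketch. `Caps` is a hypothetical stand-in for `regex::Captures`, not the crate's actual implementation, and `first_word` is likewise made up for illustration. It shows that `Index::index` can only hand out a reference tied to the borrow of `self`, while a `get`-style method is free to return a slice tied to the text's own lifetime `'t`:

```rust
use std::ops::Index;

// Hypothetical stand-in for `regex::Captures<'t>`: it borrows the searched
// text and remembers the byte range of each group.
struct Caps<'t> {
    text: &'t str,
    ranges: Vec<(usize, usize)>,
}

impl<'t> Caps<'t> {
    // The returned slice is tied to the text's lifetime 't, not to `&self`,
    // so it may outlive this `Caps` value.
    fn get(&self, i: usize) -> Option<&'t str> {
        let text: &'t str = self.text; // copy the reference out of `self`
        let &(start, end) = self.ranges.get(i)?;
        Some(&text[start..end])
    }
}

impl<'t> Index<usize> for Caps<'t> {
    type Output = str;

    // The trait fixes this signature: the result borrows from `&self`,
    // so it cannot outlive the `Caps` value it was taken from.
    fn index(&self, i: usize) -> &str {
        self.get(i).expect("no such group")
    }
}

fn first_word<'t>(text: &'t str) -> &'t str {
    let end = text.find(' ').unwrap_or(text.len());
    let caps = Caps { text, ranges: vec![(0, end)] };
    // Returning `&caps[0]` here is rejected by the borrow checker because it
    // borrows from the local `caps`; `get` borrows from `text` instead.
    caps.get(0).unwrap()
}

fn main() {
    println!("{}", first_word("Hello, world!"));
}
```

Swapping the last line of `first_word` for `&caps[0]` reproduces the same kind of "does not live long enough" error as in the question.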
So with `get` it works (note that I have replaced the `&'a String` argument with `&'a str`):
use regex::Regex;

pub fn string_split<'a>(s: &'a str) -> Vec<&'a str> {
    let mut return_vec = Vec::new();
    let re = Regex::new(r"\S+").unwrap();
    for cap in re.captures_iter(s) {
        // `get(0)` is the whole match; `as_str()` borrows from `s`, not from `cap`.
        return_vec.push(cap.get(0).unwrap().as_str());
    }
    return_vec
}

fn main() {
    println!("{:?}", string_split("Hello, world!"));
}
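If the regex is not needed for anything else, the same split can probably be done with the standard library alone. The sketch below assumes that the Unicode whitespace definition used by `str::split_whitespace` is acceptable in place of the `\S+` pattern; the function name is kept from the question only for comparison:

```rust
/// Splits `s` into its runs of non-whitespace characters.
/// The returned slices borrow directly from `s`, so no lifetime
/// workaround is needed.
pub fn string_split(s: &str) -> Vec<&str> {
    s.split_whitespace().collect()
}

fn main() {
    println!("{:?}", string_split("Hello, world!"));
}
```

This also avoids recompiling the regex on every call.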