I am creating a project in Go that parses Solidity code. In my project, I created a function analyzeFile() that statically detects issues in each smart contract (.sol) using regular expressions:
func analyzeFile(issues []Issue, file string) (map[string][]Finding, error) {
	findings := make(map[string][]Finding)
	readFile, err := os.Open(file)
	if err != nil {
		return nil, err
	}
	defer readFile.Close()
	contents, _ := ioutil.ReadFile(file)
	scanner := bufio.NewScanner(readFile)
	lineNumber := 0
	for scanner.Scan() {
		lineNumber++
		line := scanner.Text()
		for _, issue := range issues {
			if issue.ParsingMode == "SingleLine" {
				matched, _ := regexp.MatchString(issue.Pattern, line)
				if matched {
					findings[issue.Identifier] = append(findings[issue.Identifier], Finding{
						IssueIdentifier: issue.Identifier,
						File:            file,
						LineNumber:      lineNumber,
						LineContent:     strings.TrimSpace(line),
					})
				}
			}
		}
	}
When the regexes only have to check code on a single line, everything is fine. However, I also need to check constructs in the .sol files that span multiple lines, for instance to detect this piece of code:
require(
    _disputeID < disputeCount &&
    disputes[_disputeID].status == Status.Active,
    "Disputes::!Resolvable"
);
I tried to add the following code in the analyzeFile() function:
contents, _ := ioutil.ReadFile(file)
for _, issue := range issues {
	if issue.ParsingMode == "MultiLine" {
		contents_to_string := string(contents)
		//s := strings.ReplaceAll(contents_to_string, "\n", " ")
		//sr := strings.ReplaceAll(s, "\r", " ")
		r := regexp.MustCompile(`((require)([(])\n.*[&&](?s)(.*?)([;]))`)
		finds := r.FindStringSubmatch(contents_to_string)
		for _, find := range finds {
			findings[issue.Identifier] = append(findings[issue.Identifier], Finding{
				IssueIdentifier: issue.Identifier,
				File:            file,
				LineContent:     (find),
			})
		}
	}
}
But I get wrong results: once the source code is converted to a string, the whole file becomes one string with embedded \n line-break characters, and that breaks my regex checks.
One workaround is to capture the whole call with (?s)require\((.*?)\); and then split the captured group on \n. The (?s) flag lets . match newlines, so the lazy (.*?) can span several lines:
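package main

import (
	"fmt"
	"regexp"
	"strings"
)

func main() {
	// (?s) makes . match \n, so the capture group covers the whole argument list.
	var re = regexp.MustCompile(`(?s)require\((.*?)\);`)
	var str = `require(
_disputeID < disputeCount &&
disputes[_disputeID].status == Status.Active,
"Disputes::!Resolvable"
);`
	matches := re.FindAllStringSubmatch(str, -1)
	for _, match := range matches {
		// match[1] is the text between "require(" and ");".
		lines := strings.Split(match[1], "\n")
		for _, line := range lines {
			fmt.Println(line)
		}
	}
}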
https://go.dev/play/p/Omn5ULHun_-
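As a side note on why the attempt in the question gives odd results: FindStringSubmatch returns only the first match followed by its capture groups, so ranging over its result records every group as a separate finding, and [&&] is a character class that matches a single &, not the && operator. Below is a small sketch comparing it with FindAllString, using the pattern from the question. The helper name compare and the sample input are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// Pattern copied from the question.
var questionRe = regexp.MustCompile(`((require)([(])\n.*[&&](?s)(.*?)([;]))`)

// compare (hypothetical helper) returns how many strings each call yields
// for the same input.
func compare(src string) (submatch, all int) {
	// First match only: the full match plus one entry per capture group.
	submatch = len(questionRe.FindStringSubmatch(src))
	// One entry per match found in the whole input.
	all = len(questionRe.FindAllString(src, -1))
	return submatch, all
}

func main() {
	// Assumed sample input with two require calls.
	src := "require(\n  a &&\n  b,\n  \"x\"\n);\nrequire(\n  c &&\n  d,\n  \"y\"\n);\n"
	s, a := compare(src)
	fmt.Printf("FindStringSubmatch: %d strings, FindAllString: %d matches\n", s, a)
}
```

With five capture groups in the pattern, the first call yields six strings for a single require, while FindAllString yields one string per require call.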
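To process the captured body line by line with a regex instead of strings.Split, a multi-line pattern such as (?m)^[^\S\r\n]*(.*)[^\S\r\n](\S+)$ can be applied to the content between require( and );. Note that this pattern needs at least one space or tab on a line, so a line without any horizontal whitespace is skipped.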
package main

import (
	"fmt"
	"regexp"
)

func main() {
	var re = regexp.MustCompile(`(?s)require\((.*?)\);`)
	var str = `require(
_disputeID < disputeCount &&
disputes[_disputeID].status == Status.Active,
"Disputes::!Resolvable"
);`
	// (?m) makes ^ and $ match at the start and end of every line.
	var multilineRe = regexp.MustCompile(`(?m)^[^\S\r\n]*(.*)[^\S\r\n](\S+)$`)
	matches := re.FindAllStringSubmatch(str, -1)
	for _, match := range matches {
		submatches := multilineRe.FindAllStringSubmatch(match[1], -1)
		for _, submatch := range submatches {
			fmt.Println(submatch[0])
		}
	}
}
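To fit this back into analyzeFile(), one option is to run each "MultiLine" pattern against the whole file contents and derive the line number from the match offset. The following is a rough sketch under assumptions: the Issue and Finding structs are reconstructed from the fields used in the question (their real definitions are not shown), multiLineFindings is a hypothetical helper name, and each issue.Pattern is assumed to carry its own (?s) flag.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Issue and Finding mirror the fields used in the question; the real
// definitions are not shown there, so these are assumptions.
type Issue struct {
	Identifier  string
	Pattern     string
	ParsingMode string
}

type Finding struct {
	IssueIdentifier string
	File            string
	LineNumber      int
	LineContent     string
}

// multiLineFindings (hypothetical helper) runs every "MultiLine" issue
// pattern against the full file contents.
func multiLineFindings(issues []Issue, file, contents string) (map[string][]Finding, error) {
	findings := make(map[string][]Finding)
	for _, issue := range issues {
		if issue.ParsingMode != "MultiLine" {
			continue
		}
		// Compile once per issue and report bad patterns instead of ignoring them.
		re, err := regexp.Compile(issue.Pattern)
		if err != nil {
			return nil, fmt.Errorf("issue %s: %w", issue.Identifier, err)
		}
		for _, loc := range re.FindAllStringIndex(contents, -1) {
			match := contents[loc[0]:loc[1]]
			findings[issue.Identifier] = append(findings[issue.Identifier], Finding{
				IssueIdentifier: issue.Identifier,
				File:            file,
				// 1-based line on which the match starts.
				LineNumber: strings.Count(contents[:loc[0]], "\n") + 1,
				// Collapse newlines and indentation into single spaces.
				LineContent: strings.Join(strings.Fields(match), " "),
			})
		}
	}
	return findings, nil
}

func main() {
	issues := []Issue{{Identifier: "REQ", Pattern: `(?s)require\((.*?)\);`, ParsingMode: "MultiLine"}}
	src := "contract C {\n    require(\n        a < b &&\n        c == d,\n        \"msg\"\n    );\n}\n"
	found, err := multiLineFindings(issues, "C.sol", src)
	if err != nil {
		panic(err)
	}
	for _, f := range found["REQ"] {
		fmt.Printf("%s:%d %s\n", f.File, f.LineNumber, f.LineContent)
	}
}
```

Counting the "\n" characters before the match start gives the line number without scanning the file twice, and flattening the match with strings.Fields keeps LineContent on one line like the single-line findings.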