Matching a newline in a Scala regex when reading from a file

For processing a file with SQL statements such as:

ALTER TABLE ONLY the_schema.the_big_table
    ADD CONSTRAINT the_schema_the_big_table_pkey PRIMARY KEY (the_id);

I am using the regex:

val primaryKeyConstraintNameCatchingRegex: Regex = "([a-z]|_)+\\.([a-z]|_)+\n\\s*(ADD CONSTRAINT)\\s*([a-z]|_)+\\s*PRIMARY KEY".r

The problem is that this regex does not return any results, even though both of the following regexes

val alterTableRegex = "ALTER TABLE ONLY\\s+([a-z]|_)+\\.([a-z]|_)+".r

val addConstraintRegex = "ADD CONSTRAINT\\s*([a-z]|_)+\\s*PRIMARY KEY".r

match the intended sequences.

I thought the problem could be the newline, and so far I have tried \\s+, \\W+, \\s*, \\W*, \\n*, \n*, \n+, \r+, \r*, \r\\s*, \n*\\s*, \\s*\n*\\s*, and other combinations to match the whitespace between the table name and ADD CONSTRAINT, to no avail.

I would appreciate any help with this.


This is the code I am using:

import scala.io.Source
import scala.util.matching.Regex

object Hello extends Greeting with App {

  val primaryKeyConstraintNameCatchingRegex: Regex = "([a-z]|_)+\\.([a-z]|_)+\r\\s*(ADD CONSTRAINT)\\s*([a-z]|_)+\\s*PRIMARY KEY".r

  def readFile: Unit = {
    val fname = "dump.sql"
    val fSource = Source.fromFile(fname)

    // Each iteration sees a single line of the file
    for (line <- fSource.getLines) {
      val matchExp = primaryKeyConstraintNameCatchingRegex.findAllIn(line).foreach(
        segment => println(segment)
      )
    }
  }
}

Edit 2

Another strange behavior is that when matching with


the matches happen and they include A, but when I use


which is only different in DD, no sequence is matched.


  • The problem is that you read the file line by line (see the for (line <- fSource.getLines) loop), so the regex only ever sees one line at a time and the line break itself is never part of the input.

    You need to grab the contents as a single string to be able to match across line breaks.

    val fSource = Source.fromFile(fname).mkString
    val matchExps = primaryKeyConstraintNameCatchingRegex.findAllIn(fSource)

    Now fSource contains the whole file as one string, and matchExps is an iterator over all matches found in it.
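
    The difference can be checked without a file. The following is a minimal sketch, not the asker's exact code: the object name PrimaryKeyRegexDemo is hypothetical, the inline string stands in for the contents of dump.sql, and the pattern uses [a-z_]+ and \\s+ as a simplification of the original ([a-z]|_)+ and \n\\s* (or \r\\s*). Using \\s+ between the table name and ADD CONSTRAINT should cover \n, \r\n and the indentation alike, so the pattern does not depend on the file's line-ending style.

```scala
import scala.util.matching.Regex

object PrimaryKeyRegexDemo {
  // Assumption: an inline stand-in for the contents of dump.sql.
  val sql: String =
    "ALTER TABLE ONLY the_schema.the_big_table\n" +
    "    ADD CONSTRAINT the_schema_the_big_table_pkey PRIMARY KEY (the_id);\n"

  // \\s+ matches the line break plus the indentation that follows it.
  // Groups: 1 = schema, 2 = table, 3 = constraint name.
  val primaryKeyRegex: Regex =
    "([a-z_]+)\\.([a-z_]+)\\s+ADD CONSTRAINT\\s+([a-z_]+)\\s+PRIMARY KEY".r

  // Line-by-line matching: no single line contains the whole pattern.
  val perLineMatches: List[String] =
    sql.split("\n").toList.flatMap(line => primaryKeyRegex.findAllIn(line).toList)

  // Whole-string matching: the pattern is free to span the line break.
  val wholeTextMatches: List[String] =
    primaryKeyRegex.findAllIn(sql).toList

  // Extract only the constraint name from each match.
  val constraintNames: List[String] =
    primaryKeyRegex.findAllMatchIn(sql).map(_.group(3)).toList

  def main(args: Array[String]): Unit = {
    println(s"per-line matches:   ${perLineMatches.size}")
    println(s"whole-text matches: ${wholeTextMatches.size}")
    constraintNames.foreach(println)
  }
}
```

    With a real file, the sql value would be replaced by the result of Source.fromFile(fname).mkString, as in the answer above. Note that Source.fromFile keeps the file open until the source is closed, so longer-running programs may want to call close() on it once the string has been built.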