c# .net stringbuilder

Merging lines of text where a line ends with a specific keyword


I've recently been given a new, custom-built tool for manipulating text data and loading it into a database, and one feature has me stumped. I have no experience with C#, and my colleagues who do haven't had time to come up with a solution.

The tool I've been given has an Expression Builder in order to apply rules to clean up plain text. This is the extent of the instruction I have been provided:

Use C# code to write your expressions. Use the helper 'Text' string variable to refer to the whole text or the helper 'Lines' string[] variable to refer to the individual text lines. You can also use the 'Builder' (StringBuilder) helper variable to build your output. The expression should either return a string value or a string array.
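To try an expression outside the tool, the helpers described above can be approximated in a small console program. This is only a sketch: the `ExpressionHarness` class and its `Run` method are hypothetical, and how the real tool populates `Text`, `Lines`, and `Builder` (for example, which line separator it splits on) is an assumption.

```csharp
using System;
using System.Text;

// Hypothetical stand-in for the tool's expression host. The real tool's
// behavior (line splitting, handling of the return value) is assumed.
public static class ExpressionHarness
{
    public static string Run(string Text)
    {
        // Assumption: 'Lines' is 'Text' split on line breaks.
        string[] Lines = Text.Split(new[] { "\r\n", "\n" }, StringSplitOptions.None);
        var Builder = new StringBuilder();

        // --- expression body goes here; this one upper-cases every line ---
        foreach (var line in Lines)
        {
            Builder.AppendLine(line.ToUpperInvariant());
        }
        return Builder.ToString();
    }

    public static void Main()
    {
        Console.Write(Run("first line\nsecond line"));
    }
}
```

Swapping the marked body for a real rule makes it possible to step through the rule in a debugger before pasting it into the Expression Builder.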

I'm creating rules to clean up data where specific keywords appear at the end of a line, and I need an expression/rule that merges a line ending in one of those keywords with the next line. I have a functioning rule for moving lines UP when a line starts with a specific keyword, but I need one that merges DOWN when a line ends in a keyword.

Sample Input Data

Mr. John and
Mrs. Mary Smith
The Foundation for
the Lord's Children
Widgets Incorporated
Loyal Order of
Bullwinkle the Moose

Expected Output

Mr. John and Mrs. Mary Smith
The Foundation for the Lord's Children
Widgets Incorporated
Loyal Order of Bullwinkle the Moose

For further background, here's a working expression that merges a line starting with a keyword up into the prior line (joining line[i-1] and line[i]):

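for (var i = 0; i < Lines.Length; i++) {
    // Start a new output line only when this line does NOT begin with a
    // continuation keyword; otherwise it stays on the previous output line.
    if (i > 0 &&
        !Lines[i].StartsWith(" ") &&
        !Lines[i].StartsWith("and ") &&
        !Lines[i].StartsWith("of ") &&
        !Lines[i].StartsWith("for ") &&
        !Lines[i].StartsWith("at ") &&
        !Lines[i].StartsWith("the ")) {
        Builder.AppendLine();
    }
    // Every line is appended with a leading space.
    Builder.Append(" ").Append(Lines[i]);
}
return Builder.ToString();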

It works with the following sample data and expected output:

Sample Input:
John
and Mary
and Andy Smith
Loyal Order
of Moose
Cineplex Movie Theater
Center
for the Blind

Expected Output:
John and Mary and Andy Smith
Loyal Order of Moose
Cineplex Movie Theater
Center for the Blind
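The merge-up rule above can also be written with the keywords held in an array, which makes the list easier to extend. The following is a minimal sketch, not the tool's own API: the `MergeUpSketch` class and `MergeUp` method are hypothetical names, `lines` stands in for the tool's `Lines` helper, and it assumes `System.Linq` is available to expressions. Unlike the expression above, it does not prepend a space to the start of each output line.

```csharp
using System;
using System.Linq;
using System.Text;

public static class MergeUpSketch
{
    // Keywords taken from the expression above; a line starting with one of
    // these (or with a space) is joined onto the previous output line.
    static readonly string[] Prefixes = { " ", "and ", "of ", "for ", "at ", "the " };

    // 'lines' plays the role of the tool's 'Lines' helper (assumption).
    public static string MergeUp(string[] lines)
    {
        var builder = new StringBuilder();
        for (var i = 0; i < lines.Length; i++)
        {
            bool continues = Prefixes.Any(p => lines[i].StartsWith(p, StringComparison.Ordinal));
            if (i > 0)
            {
                // Join with a space when continuing, otherwise start a new line.
                if (continues) builder.Append(' ');
                else builder.AppendLine();
            }
            builder.Append(lines[i].Trim());
        }
        return builder.ToString();
    }

    public static void Main()
    {
        var input = new[] { "John", "and Mary", "and Andy Smith", "Loyal Order", "of Moose" };
        Console.WriteLine(MergeUp(input));
    }
}
```

Inside the Expression Builder, only the loop body would be needed, with `Lines` and `Builder` substituted for the local names.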

I hope this is a simple problem, but note that this is a simple expression builder that was custom built for our company. I don't know its full limitations or capabilities, so I have little in the way of details to offer. I will provide any clarification that I can, but I have no 'sample' solution to show, as I haven't been able to make any headway modifying the working expression to work in, fundamentally, the other direction.

To clarify the question: How do I write a loop that examines all lines and merges/concatenates line[i] and line[i+1] when line[i] ends in a specific string (for sample purposes, " and", " the", " of", " for", " at")?

Any and all help is greatly appreciated!

EDIT: The question was closed for not being clear enough; however, a solution was eventually delivered. In case anyone else has a similar issue, here is a working solution:

for (var i = 0; i < Lines.Length; i++) {
    Builder.Append(Lines[i]);
    
    if (Lines[i].EndsWith(" and") || Lines[i].EndsWith(" of") ||
        Lines[i].EndsWith(" for") || Lines[i].EndsWith(" at") ||
        Lines[i].EndsWith(" the")) {
        
        if (i < (Lines.Length - 1)) {
            Builder.Append(" ").Append(Lines[i + 1]);
            i++;
        }
    }
        
    Builder.AppendLine("");
}
return Builder.ToString();

Solution

  • Here is a working solution that was developed:

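    for (var i = 0; i < Lines.Length; i++) {
        // Always emit the current line.
        Builder.Append(Lines[i]);

        // If it ends in a keyword, pull the next line up onto it.
        if (Lines[i].EndsWith(" and") || Lines[i].EndsWith(" of") ||
            Lines[i].EndsWith(" for") || Lines[i].EndsWith(" at") ||
            Lines[i].EndsWith(" the")) {

            // Guard against a keyword on the very last line.
            if (i < (Lines.Length - 1)) {
                Builder.Append(" ").Append(Lines[i + 1]);
                i++; // skip the line that was just merged
            }
        }

        // Terminate the (possibly merged) output line.
        Builder.AppendLine("");
    }
    return Builder.ToString();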
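One limitation of the solution above is that it merges at most one following line: if the merged-in line itself ends in a keyword, the chain stops there. Trailing whitespace on a line will also hide the keyword from `EndsWith`. A possible variant that handles both cases is sketched below. The `MergeDownSketch` class and `MergeDown` method are hypothetical names, `lines` stands in for the tool's `Lines` helper, and it assumes `System.Linq` is available to expressions.

```csharp
using System;
using System.Linq;
using System.Text;

public static class MergeDownSketch
{
    // Suffixes taken from the solution above.
    static readonly string[] Suffixes = { " and", " of", " for", " at", " the" };

    static bool EndsWithKeyword(string line) =>
        Suffixes.Any(s => line.EndsWith(s, StringComparison.Ordinal));

    // 'lines' plays the role of the tool's 'Lines' helper (assumption).
    public static string MergeDown(string[] lines)
    {
        var builder = new StringBuilder();
        for (var i = 0; i < lines.Length; i++)
        {
            // Trailing whitespace would otherwise hide a keyword from EndsWith.
            var current = lines[i].TrimEnd();
            builder.Append(current);

            // Keep pulling in following lines while the most recently
            // appended line ends in a keyword and another line remains.
            while (EndsWithKeyword(current) && i < lines.Length - 1)
            {
                i++;
                current = lines[i].TrimEnd();
                builder.Append(' ').Append(current);
            }
            builder.AppendLine();
        }
        return builder.ToString();
    }

    public static void Main()
    {
        var input = new[] { "Loyal Order of ", "Bullwinkle and", "the Moose", "Widgets Incorporated" };
        Console.Write(MergeDown(input));
    }
}
```

As in the original solution, a line consisting only of a keyword (for example "the" on its own) is not treated as a match, because every suffix starts with a space.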