I have a text file that contains some hymns in a particular format. An example is below.
1 Praise to the Lord
1
Praise to the Lord, the Almighty, the King of creation!
O my soul, praise Him, for He is thy health and salvation!
All ye who hear, now to His temple draw near;
Join ye in glad adoration!
2
Praise to the Lord, Who o'er all things so wondrously reigneth,
Shieldeth thee under His wings, yea, so gently sustaineth!
Hast thou not seen how thy desires e'er have been
Granted in what He ordaineth?
3
Praise to the Lord, who doth prosper thy work and defend thee;
Surely His goodness and mercy here daily attend thee.
Ponder anew what the Almighty can do,
If with His love He befriend thee.
I want to extract these hymns, place them into objects, and then insert them into an SQLite database. I am trying to split the text into hymns and verses, but I am not getting anywhere so far. This is my attempt.
Main function
    //FileInfo object wraps the file path.
    var file = new FileInfo(@"C:\HymnWords.txt");
    //StreamReader reads from the existing file.
    var reader = file.OpenText();
    string line;
    int number = 0;
    var hymns = new List<Hymn>();
    var check = false; //this is set to indicate that all the lines that follow will be part of the hymn.
    while ((line = reader.ReadLine()) != null)
    {
        //line contains both letters and digits
        if (line.Any(char.IsLetter) && line.Any(char.IsDigit))
        {
        }
        if (check)
        {
            //line contains a non-zero digit and no letters
            if (line.Any(c => char.IsDigit(c) && c != '0') && !line.Any(char.IsLetter))
            {
            }
        }
    }
Model for the hymn
    public class Hymn
    {
        public string Name { set; get; }
        public List<String> Verses { set; get; }
    }
When storing the verses, I need to preserve the line breaks. Is inserting a \n after each line, before inserting the verse into the object or the database, the best way to do this?
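To make the question concrete, here is a minimal sketch of the "\n" approach, assuming each verse is stored as one string. The `VerseText` class and its `Join` and `Split` methods are hypothetical names used only for illustration; they are not part of the code above.

```csharp
using System.Collections.Generic;

public static class VerseText
{
    // Join the lines of one verse into a single string,
    // with one "\n" between consecutive lines (none at the end).
    public static string Join(IEnumerable<string> lines)
    {
        return string.Join("\n", lines);
    }

    // Split a stored verse back into its individual lines.
    public static string[] Split(string verse)
    {
        return verse.Split('\n');
    }
}
```

With this, `VerseText.Join(lines)` would produce the value to store in the object or a text column, and `VerseText.Split(stored)` would recover the lines for display. Joining rather than appending "\n" after every line avoids a trailing empty line when splitting.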
This is the final solution that I arrived at with the help of Shawn McLean.
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text.RegularExpressions;

    namespace HymmParser
    {
        class Program
        {
            const string TITLE_REGEX = @"\s*\d+\s{2,}[a-zA-Z]+";

            static void Main(string[] args)
            {
                var hymns = new List<Hymn>();
                //read the file
                string[] lines = System.IO.File.ReadAllLines(@"C:\HymnWords.txt");
                for (int i = 0; i < lines.Count(); i++)
                {
                    //regex to check for optional white space, a number, 2 or more white spaces then words after.
                    if (Regex.IsMatch(lines[i], TITLE_REGEX))
                    {
                        var hymn = new Hymn
                        {
                            //TODO: Add your title parse logic here.
                            Title = lines[i]
                        };
                        //find verses under this hymn
                        for (i++; i < lines.Count(); i++)
                        {
                            //ensure this line is not a title, else break out of it.
                            if (Regex.IsMatch(lines[i], TITLE_REGEX))
                            {
                                //back up first, otherwise the outer loop's i++ would skip this title
                                i--;
                                break;
                            }
                            //if number only found, this is the start of a verse
                            if (Regex.IsMatch(lines[i], @"^\s*\d+$"))
                            {
                                var verse = new Verse(int.Parse(lines[i]));
                                //gather up verse lines
                                for (i++; i < lines.Count(); i++)
                                {
                                    //if the line contains a number (next verse number or next title), break.
                                    if (Regex.IsMatch(lines[i], @"\s*\d+"))
                                    {
                                        //back up and break, else the enclosing loop will increment and miss this line
                                        i--;
                                        break;
                                    }
                                    else if (string.IsNullOrWhiteSpace(lines[i]))
                                    {
                                        //if whitespace, then we may have finished the verse, break out
                                        break;
                                    }
                                    else
                                    {
                                        verse.VerseLines.Add(lines[i]);
                                    }
                                }
                                hymn.Verses.Add(verse);
                            }
                        }
                        hymns.Add(hymn);
                    }
                }
                foreach (var hymn in hymns)
                {
                    Console.WriteLine(hymn.Title);
                    foreach (var verse in hymn.Verses)
                    {
                        Console.WriteLine(verse.VerseNumber);
                        foreach (var line in verse.VerseLines)
                        {
                            Console.WriteLine(line);
                        }
                    }
                    Console.WriteLine("\n");
                }
                Console.WriteLine("Hymns Found: {0}", hymns.Count);
                Console.ReadLine();
            }
        }

        public class Hymn
        {
            public Hymn()
            {
                Verses = new List<Verse>();
            }

            public string Title { set; get; }
            public List<Verse> Verses { set; get; }
        }

        public class Verse
        {
            public Verse(int verseNumber)
            {
                VerseNumber = verseNumber;
                VerseLines = new List<string>();
            }

            public int VerseNumber { get; private set; }
            public List<string> VerseLines { set; get; }
        }
    }
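The TODO in the hymn initializer leaves the title parsing open. One possible way to fill it in is sketched below, assuming a title line is a hymn number, then whitespace, then the hymn name (as in "1 Praise to the Lord"). `TitleParser` and `TryParse` are hypothetical names, and the pattern here accepts a single space after the number, which is looser than `TITLE_REGEX` above.

```csharp
using System.Text.RegularExpressions;

public static class TitleParser
{
    // Optional leading whitespace, the hymn number, at least one whitespace
    // character, then the name (trailing whitespace is dropped).
    private static readonly Regex TitleLine = new Regex(@"^\s*(\d+)\s+(.*\S)\s*$");

    // Returns true and fills number and name when the line has the
    // "number, whitespace, name" shape; returns false otherwise.
    public static bool TryParse(string line, out int number, out string name)
    {
        number = 0;
        name = null;
        Match m = TitleLine.Match(line);
        if (!m.Success)
        {
            return false;
        }
        number = int.Parse(m.Groups[1].Value);
        name = m.Groups[2].Value;
        return true;
    }
}
```

A line consisting only of a verse number, such as "2", has no whitespace and name after the digits, so it is not treated as a title. A verse line that happens to begin with a number would be, so the stricter two-space check may still be the safer test for deciding what counts as a title.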