I am currently working on a feature for a Discord bot that lets the user grab the lyrics of the currently playing song. Sadly, Discord limits each embed field value to 1024 characters, so songs with long lyrics get cut off or raise an error.
To avoid this, I tried to split the lyrics into separate pages of 200 words each. (Obviously this can still fail with long words, and it isn't really optimized for this use case.)
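import re

def create_embed(lyrics, song):
    n = 200  # maximum number of tokens per page
    # Tokens are runs of non-whitespace characters, plus each line break as its own token
    words = re.findall(r"\S+|\n", lyrics)
    pages = [" ".join(words[i:i + n]) for i in range(0, len(words), n)]
    num_pages = len(pages)  # avoids counting an extra page when len(words) is a multiple of n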
The problem is that, since these are lyrics, the text gets split in really awkward positions, such as the middle of a sentence, which makes it hard to read.
What I want to do is treat n = 200 as an upper bound and cut each page at the last linebreak within that range. Let's say I have this text:
Shadows fall over my heart \n I black out the moon \n
With n = 10 (each \n counts as a word), this leaves me with
Shadows fall over my heart \n I black out the
but instead I want it to stop at the last linebreak in this string meaning:
Shadows fall over my heart \n
What is the simplest way to implement something like this? Would I need to search backwards using a for loop with a negative step? That seems like a rather forced approach.
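For illustration, here is a minimal sketch of the behaviour described above, assuming the same tokenisation as in create_embed. The name split_on_linebreaks is hypothetical, and the backwards search is just one possible way to find the last "\n" token in each window; if a window contains no linebreak at all, the sketch falls back to cutting at n tokens.

```python
import re

def split_on_linebreaks(lyrics, n=200):
    """Split lyrics into pages of at most n tokens, ending each page at a linebreak if possible.

    Hypothetical helper: the name and the fallback behaviour are assumptions.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    tokens = re.findall(r"\S+|\n", lyrics)  # words, with "\n" kept as its own token
    pages = []
    start = 0
    while start < len(tokens):
        end = min(start + n, len(tokens))
        if end < len(tokens):
            # Walk backwards through the window to find the last linebreak token
            for i in range(end - 1, start, -1):
                if tokens[i] == "\n":
                    end = i + 1  # include the linebreak in the current page
                    break
            # If no linebreak was found, end stays at start + n (hard cut)
        pages.append(" ".join(tokens[start:end]))
        start = end
    return pages

example = "Shadows fall over my heart \n I black out the moon \n"
pages = split_on_linebreaks(example, n=10)
```

With the example text and n = 10, the first page ends after "heart" and its linebreak, and the second page holds the remaining line.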
I revisited this problem a while ago, and this is what I finally came up with. There is probably a much faster or easier way, but I wanted to share it in case anyone else runs into the same problem.
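start_idx = 0
length = 1023  # maximum number of characters per chunk
end_idx = 0
while end_idx < len(lyrics):
    print(f"end_idx:{end_idx} | len:{len(lyrics)}")  # debug output
    # Last "\n" inside the current window; + 1 keeps the linebreak in this chunk
    end_idx = lyrics.rfind("\n", start_idx, length + start_idx) + 1
    print(lyrics[start_idx:end_idx])
    start_idx = end_idx  # the next chunk starts right after the linebreak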
Basically, it walks through the text in windows of 1023 characters and finds the last occurrence of "\n" in each window using Python's str.rfind(). The position right after that linebreak then becomes the next start_idx.
Just make sure the lyrics end with "\n". Otherwise the loop never terminates: for the final chunk rfind() finds no linebreak and returns -1, so end_idx resets to 0 and stays smaller than the length of the string. The same happens if a single line is longer than 1023 characters.
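As a small illustration of how str.rfind() behaves with a start and end argument (the sample text and variable names here are made up for the example):

```python
text = "line one\nline two\nline three\n"

# Search only text[0:20]; the last "\n" in that window is at index 17,
# so adding 1 gives the position just after the linebreak.
cut = text.rfind("\n", 0, 20) + 1
first_chunk = text[:cut]

# If the window contains no "\n", rfind() returns -1 instead of raising an error.
missing = text.rfind("\n", 18, 25)
```

Here first_chunk is "line one\nline two\n" and missing is -1, which is the case the loop has to watch out for.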
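If you would rather not depend on the trailing "\n", one possible variation is sketched below. The function name split_lyrics and the limit parameter are hypothetical, and falling back to a hard cut when a window has no linebreak is my own assumption about what is acceptable for lyrics.

```python
def split_lyrics(lyrics, limit=1023):
    """Split lyrics into chunks of at most `limit` characters, cutting after a linebreak if possible.

    Hypothetical helper: a sketch, not a definitive implementation.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    chunks = []
    start = 0
    while start < len(lyrics):
        end = start + limit
        if end >= len(lyrics):
            # The rest fits into one chunk, so no trailing "\n" is needed
            chunks.append(lyrics[start:])
            break
        # Position just after the last linebreak in the window (0 if none was found)
        cut = lyrics.rfind("\n", start, end) + 1
        if cut <= start:
            # No linebreak in this window: cut at the limit instead of looping forever
            cut = end
        chunks.append(lyrics[start:cut])
        start = cut
    return chunks

chunks = split_lyrics("Shadows fall over my heart\nI black out the moon\n", limit=30)
```

Each chunk keeps its linebreaks, so joining the chunks gives back the original text, and every chunk can be used as one embed field value.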