I'm currently working on a bot to read Twitch chat using Python, Selenium, and chromedriver. I'm having a problem where the list of elements containing the chat messages stops growing once it reaches 150 elements. Here's a shortened version of my code:
while True:
    # Scans for the class name of each chat element
    chat_messages = driver.find_elements_by_class_name(chatbox_classname)
    if len(chat_messages) > prev_chats_len:
        for chat in chat_messages[prev_chats_len:]:
            # Extract author name and message contents from element and print them
        prev_chats_len = len(chat_messages)
The problems begin to occur once len(chat_messages) reaches 150. The scan does not detect more than 150 chat elements at a time. My previous fix was to use driver.refresh() to remove all previous chat elements, but in doing so, I miss any messages sent during the time it takes to refresh. I also tried deleting the seen chat elements, but this deleted the new ones as well.
Is there a fix for this issue? I've scoured the internet in search of a possible fix to no avail. Any help is greatly appreciated!
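This appears to be a feature of Twitch rather than a bug in Selenium. The standard Twitch chatbox element (not popped out) has a limit of around 150 messages or lines, meaning that you cannot scroll back further than about 150 messages. Once that limit is reached, each new message pushes the oldest one out, so the list length stays at 150 and the len(chat_messages) > prev_chats_len check never passes again. Your code is likely seeing new messages without being aware of it. Twitch offers (and recommends) an API for chatbots and chat scraping, but a Selenium solution would be to remember the last message you handled and resume from its position in the list, something like this: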
last_message = None
while True:
    # Scans for the class name of each chat element
    chat_messages = driver.find_elements_by_class_name(chatbox_classname)
    if chat_messages:
        if last_message is not None:
            try:
                # Resume right after the last message handled on the previous pass
                start_position = chat_messages.index(last_message) + 1
            except ValueError:
                # list.index raises ValueError once that message has dropped out of the list
                start_position = 0
        else:
            start_position = 0
        for chat in chat_messages[start_position:]:
            # Extract author name and message contents from element and print them
            print(chat.text)  # placeholder: the element's visible text
        last_message = chat_messages[-1]
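If you want to check the tracking logic without opening a browser, the same idea can be sketched as a plain function. This is only an illustration: `unseen_messages` is a hypothetical helper name, the `deque` with `maxlen=150` is an assumed stand-in for the capped chatbox, and integers stand in for the chat elements. It relies on elements comparing equal when they refer to the same message, which is what `list.index` needs.

```python
from collections import deque


def unseen_messages(chat_messages, last_message):
    """Return the messages that come after last_message.

    If last_message is None, or has already dropped out of the capped
    list, every message currently in the list is treated as new.
    """
    if last_message is None:
        return list(chat_messages)
    try:
        start = chat_messages.index(last_message) + 1
    except ValueError:
        # The last handled message is gone, so restart from the top.
        start = 0
    return list(chat_messages[start:])


# Simulated chatbox that keeps only the newest 150 lines (assumed limit).
chat_box = deque(maxlen=150)
seen = []
last = None
next_id = 0

# Each number is how many messages arrive between two polls.
for burst in (100, 100, 30, 0, 200):
    for _ in range(burst):
        chat_box.append(next_id)
        next_id += 1
    snapshot = list(chat_box)
    seen.extend(unseen_messages(snapshot, last))
    if snapshot:
        last = snapshot[-1]

print(len(seen), "of", next_id, "messages captured")
```

In the final round of this simulation, 200 messages arrive between two polls, which is more than the box can hold, so the 50 oldest of them are never seen. The same applies to the real loop: it only catches everything if it polls often enough that fewer than about 150 messages arrive between scans.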