I've written a little program in Python that basically does the following:
I'd like to add the ability to execute commands over a network as follows (and learn to use Twisted on the way):
Note: Entering commands locally (using the code below) and remotely should be possible.
After some thinking I couldn't come up with any other way to implement this other than:
Since I don't have that much experience with threads and none with network programming, I couldn't think of any other scheme that makes sense to me.
Is the scheme stated above overly complicated? I would appreciate some insight before trying to implement it this way.
The code for the Python program (without the client) is:
The main (which is the start() method):
class Controller:
    def __init__(self, listener, executor):
        self.listener = listener
        self.executor = executor
    def start(self):
        while True:
            text = self.listener.listen_for_hotword()
            if self.executor.is_hotword(text):
                text = self.listener.listen_for_command()
                if self.executor.has_matching_command(text):
                    self.executor.execute_command(text)
                else:
                    tts.say("No command found. Please try again")
The Listener (gets input from the user):
class TextListener(Listener):
    def listen_for_hotword(self):
        text = raw_input("Hotword: ")
        text = ' '.join(text.split()).lower()
        return text
    def listen_for_command(self):
        text = raw_input("What would you like me to do: ")
        text = ' '.join(text.split()).lower()
        return text
The executor (the class that executes the given command):
import os
class Executor:
    #TODO: Define default path
    def __init__(self, parser, audio_path='../Misc/audio.wav'):
        self.command_parser = parser
        self.audio_path = audio_path
    def is_hotword(self, hotword):
        return self.command_parser.is_hotword(hotword)
    def has_matching_command(self, command):
        return self.command_parser.has_matching_command(command)
    def execute_command(self, command):
        val = self.command_parser.getCommand(command)
        print val
        val = os.system(val) #So we don't see the return value of the command
The command file parser:
import jellyfish
import yaml
class KeyNotFoundException(Exception):
    pass
class YAMLParser:
    THRESHOLD = 0.6
    def __init__(self, path='Configurations/commands.yaml'):
        with open(path, 'r') as f:
            # safe_load avoids constructing arbitrary Python objects from the file
            self.parsed_yaml = yaml.safe_load(f)
    def getCommand(self, key):
        try:
            matching_command = self.find_matching_command(key)
            return self.parsed_yaml["Commands"][matching_command]
        except KeyError:
            raise KeyNotFoundException("No key matching {}".format(key))
    def has_matching_command(self, key):
        try:
            for command in self.parsed_yaml["Commands"]:
                if jellyfish.jaro_distance(command, key) >= self.THRESHOLD:
                    return True
        except KeyError:
            return False
    def find_matching_command(self, key):
        for command in self.parsed_yaml["Commands"]:
            if jellyfish.jaro_distance(command, key) >= 0.5:
                return command
    def is_hotword(self, hotword):
        return jellyfish.jaro_distance(self.parsed_yaml["Hotword"], hotword) >= self.THRESHOLD
Example configuration file:
Commands:
  echo : echo hello
Hotword: start
I'm finding it extremely difficult to follow the background in your question, but I'll take a stab at the questions themselves.
As you noted in your question, the typical way to write a "walk and chew-gum" style app is to design your code in a threaded or an event-loop style.
Given that you're talking about both threading and Twisted (which is event-loop style), I'm worried that you may be thinking about mixing the two.
I view them as fundamentally different styles of programming (each with places they excel) and that mixing them is generally a path to hell.
Let me give you some background to explain.
How to think of the concept (threads):
I have multiple things I need to do at the same time, and I want my operating system to figure out how and when to run those separate tasks.
Pluses:
'The' way to let one program use multiple processor cores at the same time
In the POSIX world the only way to let one process run on multiple CPU cores at the same time is via threads (with the typical ideal number of threads being no more than the number of cores in a given machine)
Easier to start with
The same code that you were running inline can be tossed off into a thread usually without needing a redesign (without GIL some locking would be required but more on that later)
Much easier to use with tasks that will eat all the CPU you can give them
i.e. in most cases math is way easier to deal with using threading solutions than using event/async frameworks
Minuses:
Python has a special problem with threads
In CPython, the global interpreter lock (GIL) can negate threads' ability to multitask (making threads nearly useless for CPU-bound Python code). Avoiding the GIL is messy and can undo all the ease of use of working in threads.
As you add threads (and locks) things get complicated fast, see this SO: Threading Best Practices
Rarely optimal for IO/user-interacting tasks
While threads are very good at dealing with small numbers of tasks that want to use lots of CPU (ideally one thread per core), they are far less optimal at dealing with large numbers of tasks that spend most of their time waiting.
Best use:
Computationally expensive things.
If you have big chunks of math that you want to run concurrently, it's very unlikely that you're going to be able to schedule the CPU utilization more intelligently than the operating system.
(Given CPython's GIL problem, threading shouldn't manually be used for math; instead a library that threads internally, like NumPy, should be used.)
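To make the "easier to start with" point concrete, here is a minimal sketch of handing unchanged inline code to a thread pool. It is written for Python 3 using the standard library's `concurrent.futures`; the `word_count` function and the sample strings are hypothetical and not taken from the question. Because of the GIL caveat above, this shows the ease of the style, not a speed-up for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(text):
    """Hypothetical task: count whitespace-separated words in a string."""
    return len(text.split())


# Hypothetical sample inputs, loosely modelled on the prompts in the question.
texts = ["start", "echo hello", "what would you like me to do"]

# Inline version: runs one call after another in the current thread.
inline_results = [word_count(t) for t in texts]

# Threaded version: the same function, unchanged, handed to a small pool.
# pool.map returns results in the same order as the inputs.
with ThreadPoolExecutor(max_workers=2) as pool:
    threaded_results = list(pool.map(word_count, texts))
```

Nothing in `word_count` had to change to move it into threads, which is the appeal; the cost shows up later, once shared state and locks enter the picture.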
How to think of the concept (event-loop / async):
I have multiple things I need to do at the same time, but I (the programmer) want direct control over, and a direct implementation of, how and when my sub-tasks are run
How you should be thinking about your code:
Think of all your sub-tasks as one big intertwined whole. Your mind should always have the thought of "will this code run fast enough that it doesn't goof up the other sub-tasks I'm managing?"
Pluses:
Extraordinarily efficient with network/IO/UI connections, including large counts of connections
Event-loop style programs are one of the key technologies that solved the c10k problem. Frameworks like Twisted can literally handle tens of thousands of connections in one Python process running on a small machine.
Predictable (small) increase in complexity as other connections/tasks are added
While there is a fairly steep learning curve (particularly in Twisted), once the basics are understood new connection types can be added to projects with a minimal increase in the overall complexity. Moving from a program that offers a keyboard interface to one that offers keyboard/telnet/web/ssl/ssh connections may just be a few lines of glue code per interface (this is highly variable by framework, but regardless of the framework, an event loop's complexity is more predictable than that of threads as connection counts increase).
Minuses:
Harder to get started.
Event/async programming requires a different style of design from the first line of code (though you'll find that style of design is somewhat portable from language to language)
One event-loop, one core
While event-loops can let you handle a spectacular number of IO connections at the same time, they can't, in and of themselves, run on multiple cores at the same time. The conventional way to deal with this is to write programs in such a way that multiple instances of the program can be run at the same time, one for each core (this technique bypasses the GIL problem).
Multiplexing high CPU tasks can be difficult
Event programming requires cutting high CPU tasks into pieces such that each piece takes an (ideally predictably) small amount of CPU; otherwise the event system ceases to multitask whenever the high CPU task is run.
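As a rough illustration of cutting a CPU task into pieces, here is a minimal sketch. It uses Python 3's standard-library `asyncio` event loop as a stand-in for Twisted so that it runs without extra installs; the function names, the chunk size, and the "heartbeat" task are all assumptions made for the example.

```python
import asyncio


async def chunked_sum_of_squares(n, chunk_size=1000):
    """Sum i*i for i < n in small pieces, yielding to the loop between pieces."""
    total = 0
    for start in range(0, n, chunk_size):
        stop = min(start + chunk_size, n)
        total += sum(i * i for i in range(start, stop))
        # Hand control back to the event loop so other tasks can run.
        await asyncio.sleep(0)
    return total


async def heartbeat(ticks, stop_event):
    """Stand-in for 'the other sub-tasks': record each time it gets to run."""
    while not stop_event.is_set():
        ticks.append(1)
        await asyncio.sleep(0)


async def main(n=10000):
    ticks = []
    stop_event = asyncio.Event()
    beat = asyncio.create_task(heartbeat(ticks, stop_event))
    total = await chunked_sum_of_squares(n)
    stop_event.set()
    await beat
    return total, len(ticks)


total, tick_count = asyncio.run(main())
```

If the sum were computed in one go with no `await` between pieces, the heartbeat would not run until the whole calculation had finished, which is exactly the "ceases to multitask" failure described above.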
Best use:
IO based things
While your application doesn't seem to be exclusively IO based, none of it seems to be CPU based (it looks like you're currently playing audio via an os.system call; os.system spins off an independent process every time it's called, so its work doesn't burn your process's CPU. Note, though, that os.system blocks, so it's a no-no in Twisted; you have to use different calls in Twisted).
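To show what "different calls" means in event-loop terms, here is a minimal sketch of launching a child process without blocking the loop. It uses Python 3's standard-library `asyncio` rather than Twisted so it runs as-is; Twisted has its own process support (for example `reactor.spawnProcess` and `twisted.internet.utils.getProcessOutput`). The `run_command` helper and the child command are assumptions for the example.

```python
import asyncio
import sys


async def run_command(*argv):
    """Run a child process without blocking the event loop; return its stdout."""
    proc = await asyncio.create_subprocess_exec(
        *argv, stdout=asyncio.subprocess.PIPE
    )
    out, _ = await proc.communicate()
    return out.decode().strip()


async def main():
    # Hypothetical command: a child Python that prints "hello".
    cmd = [sys.executable, "-c", "print('hello')"]
    # Both children run concurrently; the loop stays free while they work.
    return await asyncio.gather(run_command(*cmd), run_command(*cmd))


outputs = asyncio.run(main())
```

The difference from `os.system` is that waiting for the child is an `await`, so other connections and timers keep being serviced in the meantime.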
Your question also doesn't suggest you're concerned about maxing out multiple cores.
Therefore, given that you specifically talked about Twisted, and an event-loop solution seems to be the best match for your application, I would recommend looking at Twisted and -not- threads.
Given the 'Best use' listed above you might be tempted to think that mixing Twisted and threads is the way to go, but when doing that, if you do anything even slightly wrong you will disable the advantages of both the event-loop (you'll block) and threading (the GIL won't let the threads multitask) and have something super complex that provides you no advantage.
The 'scheme' you gave is:
After some thinking I couldn't come up with any other way to implement this other than:
- Have the above program run as process #1 (the program that runs locally as I've written at the beginning).
- A Twisted client will be run as process #2 and receive the commands from remote clients. Whenever a command is received, the Twisted client will initialize a thread that'll parse the command, check for its validity and execute it if it's valid.
Since I don't have that much experience with threads and none with network programming, I couldn't think of any other scheme that makes sense to me.
In answer to "Is the scheme ... overly complicated", I would say almost certainly yes, because you're talking about Twisted and threads (see the tl;dr above).
Given my certainly incomplete (and confused) understanding of what you want to build, I would imagine a Twisted solution for you would look like:
If, as you state in your question, you really need a server, you could write a second Twisted program to provide that (you'll see examples of all that in the krondo guide). Though I'm guessing that when you understand Twisted's library support you'll realize you don't have to build any extra servers, and that you can just include whichever protocols you need in your base code.
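A minimal sketch of that last idea, local and remote input sharing one handler inside a single event loop, might look like the following. It is written against Python 3's standard-library `asyncio` so that it runs without installing anything; in Twisted the rough equivalents would be a `LineReceiver` protocol served with `reactor.listenTCP`. The `COMMANDS` table and the `handle_line` and `handle_remote` names are hypothetical stand-ins for the question's parser and executor, and the sketch only looks commands up, it does not execute them.

```python
import asyncio

# Hypothetical stand-in for the question's YAML "Commands" section.
COMMANDS = {"echo": "echo hello"}


def handle_line(text):
    """Normalize a line the way TextListener does, then look up its command."""
    key = " ".join(text.split()).lower()
    return COMMANDS.get(key, "No command found. Please try again")


async def handle_remote(reader, writer):
    """Serve one remote client: one command per line, one reply per line."""
    while True:
        raw = await reader.readline()
        if not raw:  # empty read means the client closed the connection
            break
        writer.write((handle_line(raw.decode()) + "\n").encode())
        await writer.drain()
    writer.close()


async def demo():
    # Port 0 asks the OS for any free port, which keeps the demo self-contained.
    server = await asyncio.start_server(handle_remote, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    # Remote path: a client connects and sends a command over TCP.
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"  ECHO \n")
    await writer.drain()
    remote_reply = (await reader.readline()).decode().strip()
    writer.close()
    await writer.wait_closed()

    server.close()
    await server.wait_closed()

    # Local path: the same handler, called directly with typed input.
    local_reply = handle_line("echo")
    return local_reply, remote_reply


local_reply, remote_reply = asyncio.run(demo())
```

The point of the design is that the network side is only glue: both paths end up in `handle_line`, so no second process and no threads are needed.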