At the moment, I am struggling to understand how to get a Python script to perform multiple tasks in parallel.
For this case, I set myself a target:
write a script that takes a URL passed via HTTP GET, downloads the video behind the URL, converts it to an mp3 file and performs some "post-download things" like setting mp3 tags. The challenge is to accept new "download requests" while another download/convert/post-download process is still active.
Whether this use case makes sense is not the point of this question (I know there is already software available for video-to-mp3 downloads). I am just trying to understand how to use Python to provide a service (httpd) while performing other tasks (download/convert/post-download).
To start, I tried to keep things as basic as possible, so I decided to use BaseHTTPServer and youtube-dl. BaseHTTPServer lets me serve and handle HTTP inside my script, and youtube-dl manages the download/convert. Handling the post-download actions is my problem.
At the moment I am able to accept multiple "download requests" and start multiple child processes, but how can I start the "post-download" things (like setting mp3 tags) after one download/convert has finished? I have no clue how to find out that the download/convert of a specific file has finished successfully.
Here is my code so far:
#!/usr/bin/env python
from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer
import SocketServer
import subprocess


class S(BaseHTTPRequestHandler):
    def _set_headers(self):
        self.send_response(200)
        self.send_header('Content-type', 'text/html')
        self.end_headers()

    def do_GET(self):
        # set youtube-dl command and arguments
        args = ['youtube-dl', '--extract-audio', '--audio-format', 'mp3', '--output', '%(title)s.%(ext)s', '--no-playlist', '--quiet']
        # building HTTP Header and extract path from it
        self._set_headers()
        passed = self.path  # catch the passed url
        url = passed[1:]  # cutoff leading /
        if url:
            # append the url as the last argument to args
            args.append(url)
            # download
            subprocess.Popen(args)
        else:
            print('empty request')


def run(server_class=HTTPServer, handler_class=S, port=8000):
    server_address = ('', port)
    httpd = server_class(server_address, handler_class)
    print 'Starting httpd...'
    httpd.serve_forever()


if __name__ == "__main__":
    from sys import argv
    if len(argv) == 2:
        run(port=int(argv[1]))
    else:
        run()
This lets me download a video and store it as mp3 while accepting another download request, but I do not know how to perform further operations on the file after it has been downloaded/converted while still accepting and starting new downloads/converts.
Using subprocess.call() and waiting until youtube-dl has finished would break the ability to accept another download in parallel to the current one.
And writing a second script, started with .Popen(), to handle the download/convert/post-download things together does not seem to be the right way ^^
Right now this is a chicken-and-egg situation for me... I hope you can enlighten me...
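For reference, one dependency-free way around the blocking problem could be to give every download its own worker thread: the thread blocks in subprocess.call() until the child exits and then runs the post-download step, while the HTTP handler returns immediately. This is only a minimal sketch of that idea, not what I ended up using. The names download_and_postprocess, start_job, on_success and on_failure are made up for illustration, it is written so that it also runs on Python 3, and the demo uses a trivial Python child process as a stand-in for the real youtube-dl command line.

```python
import subprocess
import sys
import threading


def download_and_postprocess(args, on_success, on_failure=None):
    """Run one command to completion, then call the matching callback."""
    # subprocess.call() blocks, but only this worker thread, not the server.
    returncode = subprocess.call(args)
    if returncode == 0:
        # Exit code 0 is taken to mean the download/convert succeeded.
        on_success(args)
    elif on_failure is not None:
        on_failure(args, returncode)


def start_job(args, on_success, on_failure=None):
    """Start the job in a background thread and return immediately."""
    worker = threading.Thread(
        target=download_and_postprocess,
        args=(args, on_success, on_failure),
    )
    worker.start()
    return worker


if __name__ == "__main__":
    # Stand-in for the youtube-dl command line (assumption for the demo).
    fake_download = [sys.executable, "-c", "pass"]
    finished = []
    job = start_job(fake_download, finished.append)
    job.join()  # only for the demo; do_GET would not wait here
    print("post-download ran for %d job(s)" % len(finished))
```

In do_GET, the subprocess.Popen(args) call would be replaced by something like start_job(args, set_mp3_tags), where set_mp3_tags is a hypothetical function doing the tagging.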
The tip from Sanket Sudake did the trick for me!
I use Celery as a task manager: I define tasks (download, convert and post-download stuff) and call them through an asynchronous chain, so that they run one after another, each in its own dedicated task.
Works pretty well!
Celery has SO MANY MORE functions and techniques to experiment with! But for my self-defined case the async chain of tasks works fine and is the solution for me here.
I added a systemd config so that this runs as a daemonized service manageable with systemd.