I recently saved a model to S3 using joblib.
model_doc is the model object:
import subprocess
import joblib

def save_d2v_to_s3_current_doc2vec_model(model, fname):
    model_name = fname
    joblib.dump(model, model_name)
    s3_base_path = 's3://sd-flikku/datalake/current_doc2vec_model'
    path = s3_base_path + '/' + model_name
    command = "aws s3 cp {} {}".format(model_name, path).split()
    print('saving...' + model_name)
    subprocess.call(command)

save_d2v_to_s3_current_doc2vec_model(model_doc, "doc2vec_model")
That succeeded, but when I then try to load the model back from S3, I get an error:
def load_d2v(fname):
    model_name = fname
    s3_base_path = 's3://sd-flikku/datalake/current_doc2vec_model'
    path = s3_base_path + '/' + model_name
    command = "aws s3 cp {} {}".format(path, model_name).split()
    print('loading...' + model_name)
    subprocess.call(command)
    model = joblib.load(model_name)
    return model

model = load_d2v("doc2vec_model")
This is the error I get:
loading...doc2vec_model
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 7, in load_d2v
  File "C:\Users\prane\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 339, in call
    with Popen(*popenargs, **kwargs) as p:
  File "C:\Users\prane\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 800, in __init__
    restore_signals, start_new_session)
  File "C:\Users\prane\AppData\Local\Programs\Python\Python37\lib\subprocess.py", line 1207, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
I don't understand why it says the file was not found. This is the same path I used to save the model, but now I'm unable to get the model back from S3. Please help!
I suggest that, rather than your generic print() lines showing your intent, you print the actual command you've composed, to verify on inspection that it makes sense. If it does, then also try that exact same aws ... command directly, at the command prompt where you had been launching your Python code, to make sure it runs that way. If it doesn't, you may get a clearer error.
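As a minimal sketch of that first suggestion, the snippet below composes the same argument list as the question's load_d2v() and prints it before anything is executed. The helper name build_s3_download_command is hypothetical (it is not in the original code); the S3 path and model name are taken from the question. It deliberately does not run the command, so the printed list can be copied to the command prompt and tried by hand.

```python
def build_s3_download_command(s3_base_path, model_name):
    """Compose the `aws s3 cp` argument list used to download model_name.

    Hypothetical helper: it mirrors the string formatting in the
    question's load_d2v(), but returns the list instead of running it.
    """
    path = s3_base_path + '/' + model_name
    return ["aws", "s3", "cp", path, model_name]


command = build_s3_download_command(
    's3://sd-flikku/datalake/current_doc2vec_model', 'doc2vec_model')

# Print the exact argument list that subprocess.call() would receive,
# plus a copy-pasteable form for the command prompt.
print('argument list:', command)
print('command line: ', ' '.join(command))
```

If the printed command line works when typed at the prompt but fails from Python, the problem is more likely in how Python launches the program than in the S3 path.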
Note that the error you're getting doesn't particularly look like it's coming from the aws command, or from the S3 service, either of which might talk about 'paths' or 'objects'. Rather, it's from the Python subprocess system and its Popen call. I think those are reached via your call to subprocess.call(), but for some reason your line of code isn't shown in the traceback. (How are you running the block of code with load_d2v()?)
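To illustrate where this kind of error originates, here is a small sketch showing that subprocess.call() raises FileNotFoundError by itself when the program to launch cannot be found, before that program has any chance to run. The program name is a made-up placeholder, assumed not to exist on your machine.

```python
import subprocess

# Hypothetical name, chosen on the assumption that no such program exists.
missing_program = "no-such-program-for-this-demo"

try:
    subprocess.call([missing_program, "--version"])
    launched = True
except FileNotFoundError as exc:
    # Raised inside Popen: the operating system could not find the
    # executable, so nothing was ever started.
    launched = False
    print("could not launch:", exc)
```

The exact message text differs by platform, but the exception type is the same one shown in the traceback above.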
That suggests the file that's not found might be the aws command itself. Are you sure it's installed and runnable from the exact working directory and environment that your Python is running in, and from which it is invoked via subprocess.call()?
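One way to check this from inside the same Python session is the standard library's shutil.which(), which searches PATH for an executable and returns its full path, or None if nothing is found. The sketch below is only a diagnostic; the wrapper name find_executable is hypothetical.

```python
import shutil


def find_executable(name):
    """Return the full path that a PATH search finds for name, or None."""
    return shutil.which(name)


aws_path = find_executable("aws")
if aws_path is None:
    print("aws was not found on PATH for this Python process")
else:
    print("aws resolves to:", aws_path)
```

If this prints a path that ends in something other than .exe (for example a .cmd wrapper), it may be worth passing that full path as the first element of the list given to subprocess.call(). That is an assumption about how the AWS CLI might be installed on your machine, not something the traceback confirms.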
(BTW, if my previous answer got you over your sklearn.externals.joblib problem, it'd be good for you to mark the answer as accepted, to save other potential answerers from thinking that's still an unsolved question that's blocking you.)