I was following a YouTube tutorial and wanted to load Llama 3 8B:
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name, token=hugging_face_key)
model = AutoModelForCausalLM.from_pretrained(model_name, token=hugging_face_key)
I got:
Your session crashed because all available RAM was used.
I then tried:
model = AutoModelForCausalLM.from_pretrained(model_name, token=hugging_face_key, low_cpu_mem_usage=True)
but hit the same error again.
Given that it's an 8-billion-parameter model, the fp16 weights alone take roughly 16 GB, while a free Colab runtime provides only about 12.7 GB of system RAM, so the model cannot be loaded at full precision there.
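The arithmetic behind that 16 GB estimate is just parameter count times bytes per parameter (a rough sketch that ignores activations, the KV cache, and loading overhead):

```python
params = 8e9  # ~8 billion parameters

# Bytes each parameter occupies at common precisions.
bytes_per_param = {
    "fp32": 4.0,
    "fp16/bf16": 2.0,   # the dtype the checkpoint ships in
    "int8": 1.0,
    "nf4 (4-bit)": 0.5,
}

for dtype, nbytes in bytes_per_param.items():
    gb = params * nbytes / 1e9
    print(f"{dtype:>12}: ~{gb:.0f} GB for the weights alone")
```

So fp16 needs ~16 GB before any working memory is counted, which is why the session runs out of RAM; at 4-bit the weights drop to ~4 GB.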
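One way people do make it fit is 4-bit quantization. The sketch below assumes a GPU runtime with the bitsandbytes and accelerate packages installed, and reuses the `hugging_face_key` variable from the snippet above; I have not verified it fits within a free Colab T4, so treat it as a starting point rather than a guaranteed fix:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"

# NF4 4-bit quantization: shrinks the ~16 GB of fp16 weights to roughly 5-6 GB,
# with matmuls still computed in fp16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name, token=hugging_face_key)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    token=hugging_face_key,
    quantization_config=bnb_config,
    device_map="auto",  # let accelerate place layers on GPU, spilling to CPU if needed
)
```

If even that fails, a smaller model (e.g. a 1-3B parameter one) or a paid runtime with more memory are the remaining options.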