I am trying to load a Hugging Face model, in this case:
import torch
from transformers import LlavaNextForConditionalGeneration
model = LlavaNextForConditionalGeneration.from_pretrained(
    "llava-hf/llava-v1.6-mistral-7b-hf", torch_dtype=torch.float16,
    low_cpu_mem_usage=True, cache_dir=CACHE_DIR, load_in_8bit=True)
Loading this model when the weights are stored in /scratch1
takes extremely long, about 4-5 minutes, whereas on local compute it doesn't take that long. I'm not sure whether a bug is causing this.
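
To help narrow down whether the delay comes from reading the files on /scratch1 or from the model loading itself, one option is to time a raw read of the cached weight files. The following is only a rough sketch: `measure_read_throughput` is a hypothetical helper written for this post (it is not part of transformers), and it assumes the weights sit as regular files somewhere under the directory passed in. Symlinks are skipped so that each file is counted once if the cache links to the same data from more than one place.

```python
import os
import time


def measure_read_throughput(directory, chunk_size=16 * 1024 * 1024):
    """Read every regular file under `directory` once.

    Returns (total_bytes_read, elapsed_seconds).
    Hypothetical diagnostic helper, not a library function.
    """
    total_bytes = 0
    start = time.perf_counter()
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            # Skip symlinks so linked files are not read twice.
            if os.path.islink(path):
                continue
            with open(path, "rb") as f:
                # Read in large chunks to avoid holding a whole file in memory.
                while True:
                    chunk = f.read(chunk_size)
                    if not chunk:
                        break
                    total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    return total_bytes, elapsed
```

Calling `measure_read_throughput(CACHE_DIR)` once with the cache on /scratch1 and once with a copy on local disk gives two timings to compare. If the raw read on /scratch1 alone accounts for most of the 4-5 minutes, the filesystem is the likely cause and not a bug in the loading code. A second run over the same files may be faster because of OS caching, so the first run on each location is the more meaningful one.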