Huggingface transformers model loading slow

akommine · April 12, 2024, 9:47pm

I'm seeing slow load times for Hugging Face models. In this case:

import torch
from transformers import LlavaNextForConditionalGeneration
model = LlavaNextForConditionalGeneration.from_pretrained(
    "llava-hf/llava-v1.6-mistral-7b-hf",
    torch_dtype=torch.float16, low_cpu_mem_usage=True,
    cache_dir=CACHE_DIR, load_in_8bit=True)

When the weights are stored in /scratch1, loading this model takes about 4-5 minutes, far longer than on local compute. I'm not sure whether a bug is causing this.

dbose · April 12, 2024, 10:00pm

Adding to the previous post: I have also seen delays of 15-16 minutes when loading the same model (weights stored in /scratch1) on A100 instances. Once the weights are loaded, inference works fine.
Is this an IO issue with /scratch1?
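
One rough way to check whether the filesystem, rather than transformers, is the bottleneck is to time a plain sequential read of the cached weight files and compare /scratch1 against local storage. The sketch below is only an illustration: the helper names measure_read_throughput and report are hypothetical, and it assumes the cache directory holds the weights as regular files (symlinks are skipped so the Hugging Face cache layout is not read twice).

```python
import os
import time


def measure_read_throughput(root, chunk_size=16 * 1024 * 1024):
    """Sequentially read every regular file under root.

    Returns (total_bytes, elapsed_seconds). Hypothetical helper, not part
    of transformers.
    """
    total_bytes = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so files are not read twice via cache snapshot links.
            if os.path.islink(path):
                continue
            with open(path, "rb") as f:
                while True:
                    chunk = f.read(chunk_size)
                    if not chunk:
                        break
                    total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    return total_bytes, elapsed


def report(root):
    """Print the read size, time, and throughput for one directory."""
    total_bytes, elapsed = measure_read_throughput(root)
    megabytes = total_bytes / 1e6
    rate = megabytes / elapsed if elapsed > 0 else float("inf")
    print(f"{root}: {megabytes:.1f} MB in {elapsed:.2f} s ({rate:.1f} MB/s)")
```

Calling report(CACHE_DIR) on /scratch1 and again on a local copy of the same files gives a like-for-like comparison. Note that a second run over the same files may be served from the OS page cache, so the first run is the more representative number. If the raw read alone takes minutes, the delay is likely in the filesystem and not in from_pretrained.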

haoji · April 24, 2024, 8:05pm

Hi, there were some issues with the filesystem during that time. The issue should now be resolved.