| Topic | Replies | Views | Activity |
|---|---|---|---|
| About the Discovery Cluster category | 0 | 665 | July 21, 2020 |
| Batch jobs running inside matlab not finished | 2 | 77 | March 6, 2024 |
| Tmux session keep being killed periodically | 17 | 141 | February 29, 2024 |
| Newly created file/directory with wrong group id | 6 | 394 | February 16, 2024 |
| GPUs in Debug Partition Cannot Be Allocated | 2 | 61 | February 14, 2024 |
| Faulty node on Endeavour cluster's ISI partition | 3 | 86 | February 13, 2024 |
| SSH FS suddenly not working | 0 | 41 | February 12, 2024 |
| What is the policy of maximum jobs I can submit? | 1 | 69 | January 29, 2024 |
| Non-GPU tasks keep using `gpu` partition | 1 | 127 | January 4, 2024 |
| Unable to login | 1 | 135 | December 11, 2023 |
| Discovery slurm job logs are missing when using epyc-64 partition | 2 | 354 | December 9, 2023 |
| QOSMaxCpuPerUserLimit? | 0 | 221 | November 13, 2023 |
| Batch job submission failed | 1 | 219 | November 7, 2023 |
| Trouble to login into Discovery | 0 | 146 | October 23, 2023 |
| Cannot allocate GPUs even though GPUs are available | 6 | 622 | October 14, 2023 |
| Creating an anaconda environment | 1 | 205 | September 18, 2023 |
| Slurm jobs are stuck in pending, despite GPUs being idle | 10 | 4723 | September 14, 2023 |
| OSError: [Errno 70] Communication error on send | 1 | 436 | September 11, 2023 |
| Slurmstepd: error: TaskProlog failed status=1 | 0 | 271 | August 22, 2023 |
| Synchronizing local files with CARC using VS-code | 0 | 183 | August 2, 2023 |
| Cannot execute 'cc1' under conda env | 1 | 430 | July 13, 2023 |
| VSCode disconnects every 10 minutes | 1 | 1999 | June 22, 2023 |
| Lammps and i-PI communicating via socket | 2 | 461 | June 3, 2023 |
| Extremely slow to check the content of the directory with ls cmd | 0 | 382 | June 2, 2023 |
| Cannot access compute node during training | 2 | 299 | May 16, 2023 |
| Additional files getting added to folders | 1 | 411 | April 19, 2023 |
| Copy from project to scratch is extremely slow | 0 | 492 | April 18, 2023 |
| Extremely slow to log into the server | 0 | 447 | April 10, 2023 |
| Signalling a job before time limit is reached | 3 | 3683 | March 28, 2023 |
| Slurm Job Not Running Properly | 0 | 265 | March 3, 2023 |