| Topic | Replies | Views | Activity |
|---|---|---|---|
| About the Discovery Cluster category | 0 | 659 | July 21, 2020 |
| Batch jobs running inside MATLAB not finished | 2 | 57 | March 6, 2024 |
| Tmux session keeps being killed periodically | 17 | 110 | February 29, 2024 |
| Newly created file/directory with wrong group ID | 6 | 390 | February 16, 2024 |
| GPUs in Debug Partition Cannot Be Allocated | 2 | 56 | February 14, 2024 |
| Faulty node on Endeavour cluster's ISI partition | 3 | 78 | February 13, 2024 |
| SSH FS suddenly not working | 0 | 37 | February 12, 2024 |
| What is the policy on the maximum number of jobs I can submit? | 1 | 63 | January 29, 2024 |
| Non-GPU tasks keep using `gpu` partition | 1 | 122 | January 4, 2024 |
| Unable to log in | 1 | 130 | December 11, 2023 |
| Discovery Slurm job logs are missing when using epyc-64 partition | 2 | 347 | December 9, 2023 |
| QOSMaxCpuPerUserLimit? | 0 | 202 | November 13, 2023 |
| Batch job submission failed | 1 | 206 | November 7, 2023 |
| Trouble logging in to Discovery | 0 | 140 | October 23, 2023 |
| Cannot allocate GPUs even though GPUs are available | 6 | 600 | October 14, 2023 |
| Creating an anaconda environment | 1 | 198 | September 18, 2023 |
| Slurm jobs are stuck in pending, despite GPUs being idle | 10 | 4501 | September 14, 2023 |
| OSError: [Errno 70] Communication error on send | 1 | 411 | September 11, 2023 |
| Slurmstepd: error: TaskProlog failed status=1 | 0 | 258 | August 22, 2023 |
| Synchronizing local files with CARC using VS Code | 0 | 179 | August 2, 2023 |
| Cannot execute 'cc1' under conda env | 1 | 402 | July 13, 2023 |
| VSCode disconnects every 10 minutes | 1 | 1913 | June 22, 2023 |
| LAMMPS and i-PI communicating via socket | 2 | 453 | June 3, 2023 |
| Extremely slow to check the contents of the directory with the ls command | 0 | 372 | June 2, 2023 |
| Cannot access compute node during training | 2 | 291 | May 16, 2023 |
| Additional files getting added to folders | 1 | 400 | April 19, 2023 |
| Copy from project to scratch is extremely slow | 0 | 481 | April 18, 2023 |
| Extremely slow to log into the server | 0 | 434 | April 10, 2023 |
| Signalling a job before time limit is reached | 3 | 3629 | March 28, 2023 |
| Slurm Job Not Running Properly | 0 | 260 | March 3, 2023 |