Greetings HPC Users,

Here are some quick updates on what will be happening on the HPC cluster in the coming weeks.

NVIDIA End-To-End LLM Bootcamp

This event will be held in person at the Faculty of Computer Science, UM, from 16 July 2024 to 18 July 2024. All DGX A100 nodes will be reserved from 15 July 2024 to 18 July 2024 to accommodate the event, so no other jobs will be able to run on the DGX A100 nodes during that period.

Removal of MIG GPUs after Bootcamp

Over the past two months, we have observed that utilization of the MIG GPUs has been low. After internal discussion, we have decided to disable MIG on all the DGX nodes, so all the A100 GPUs will run without MIG after the Bootcamp event above. This also means that it will then be possible to link up to 16 A100 GPUs for even larger workloads.

Reminder for Resource Usage

We would like to remind all users once again to ensure that submitted jobs use their resources properly and that no resources are wasted.

Thank you.

Categories: HPCNews