2024-11-26

Version 6.3 of the open source software stack aims to provide better performance and scalability for AI and high-performance computing workloads on AMD Instinct GPUs, like the new MI325X.

New additions in ROCm 6.3 include support for SGLang, a framework that helps optimise inference workloads for generative AI models.

AMD suggested the introduction of SGLang to its ROCm software can help customers using its hardware achieve up to 6x higher performance on large language model inferencing compared to existing systems.

“Whether you're building customer-facing AI solutions or scaling AI workloads in the cloud, SGLang delivers the performance and ease-of-use needed to meet enterprise demands,” a company blog post reads.

ROCm 6.3 extends support for FlashAttention-2, a tool capable of significantly speeding up AI training times by optimising work partitioning and parallelism.

According to AMD, ROCm users leveraging the newly added FlashAttention-2 can see 3x speedups on training workloads, which for businesses would reduce time-to-market for enterprise AI solutions.

FlashAttention-2 can be integrated into existing training workflows through ROCm’s PyTorch container with Composable Kernel (CK) as the backend, enabling fast deployments.
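For readers curious what "optimising work partitioning" looks like in practice, the sketch below illustrates the block-wise idea behind FlashAttention-style kernels: attention for one query is computed by visiting keys and values one block at a time while keeping a running softmax, so the full score row never has to be held at once. This is a plain Python illustration, not AMD's or the library's implementation; the function names and the `block_size` parameter are hypothetical, and real kernels run this per tile on the GPU.

```python
import math


def naive_attention(q, keys, values):
    """Reference result: softmax over all scores, then a weighted sum of values."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(dim)]


def blockwise_attention(q, keys, values, block_size=2):
    """Same result, but keys/values are processed block by block (hypothetical helper)."""
    scale = 1.0 / math.sqrt(len(q))
    dim = len(values[0])
    running_max = float("-inf")  # largest score seen so far
    running_sum = 0.0            # softmax denominator, relative to running_max
    acc = [0.0] * dim            # unnormalised weighted sum of values

    for start in range(0, len(keys), block_size):
        k_blk = keys[start:start + block_size]
        v_blk = values[start:start + block_size]
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in k_blk]

        # Rescale earlier partial results to the new maximum (online softmax).
        new_max = max(running_max, max(scores))
        correction = math.exp(running_max - new_max)  # 0.0 on the first block
        running_sum *= correction
        acc = [a * correction for a in acc]

        for s, v in zip(scores, v_blk):
            w = math.exp(s - new_max)
            running_sum += w
            for d in range(dim):
                acc[d] += w * v[d]
        running_max = new_max

    return [a / running_sum for a in acc]
```

Calling `blockwise_attention(q, keys, values, block_size=2)` should agree with `naive_attention(q, keys, values)` up to floating-point rounding, whatever block size is chosen; the block size only changes how the work is split up.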

AMD’s software stack now also contains support for a compiler tool aimed at supporting enterprises running legacy Fortran-based HPC applications.

The new Fortran compiler in ROCm 6.3 lets users of legacy systems run their existing Fortran code on AMD Instinct GPUs.

AMD said it provides users of legacy systems the ability to “realise the power of GPU acceleration without the need for extensive code overhauls previously required.”

Other updates in ROCm 6.3 include enhanced computer vision libraries, such as rocDecode, rocJPEG, and rocAL, and new multi-node FFT support.

