@wooway777
Collaborator

resolves #988

Copilot AI left a comment

Pull request overview

This pull request adds support for Ali PPU accelerators to the InfiniTensor framework, following the integration pattern already used for other vendor accelerators such as NVIDIA, ILUVATAR, QY, and HYGON.

Changes:

  • Added build configuration and compilation support for Ali PPU through the xmake build system
  • Integrated Ali PPU device type (ID: 10) across all framework layers including infinicore, infinirt, infiniop, and infiniccl
  • Extended test infrastructure and utilities to support Ali PPU device testing
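
The integration pattern described above can be sketched roughly as follows: a new device ID is added to the device enum, and a build flag guards a dispatch case that routes the device to an existing CUDA-compatible backend. This is a minimal illustration, not the project's actual code. `DemoDevice`, `backendFor`, and every enum value except 10 for Ali are hypothetical names and placeholders; the flags are defined inline here only so the sketch compiles on its own, whereas the real project would presumably set them from the xmake configuration.

```cpp
#include <string>

// Build flags assumed for this sketch; defined inline only so it compiles alone.
#define ENABLE_NVIDIA_API
#define ENABLE_ALI_API

// Simplified stand-in for the device enum. Only the value 10 for ALI comes
// from the PR summary; the other values are placeholders.
enum DemoDevice {
    DEMO_DEVICE_CPU = 0,    // placeholder value
    DEMO_DEVICE_NVIDIA = 1, // placeholder value
    DEMO_DEVICE_ALI = 10,   // ID stated in the PR summary
};

// Returns the name of the backend that would serve the given device.
std::string backendFor(DemoDevice device) {
    switch (device) {
#ifdef ENABLE_NVIDIA_API
    case DEMO_DEVICE_NVIDIA:
        return "nvidia";
#endif
#ifdef ENABLE_ALI_API
    case DEMO_DEVICE_ALI:
        return "nvidia"; // Ali PPU reuses the CUDA-compatible backend
#endif
    default:
        return "unsupported";
    }
}
```

With both flags defined, `backendFor(DEMO_DEVICE_ALI)` selects the same backend as the NVIDIA case. Removing `ENABLE_ALI_API` makes the Ali case fall through to the default branch, which is why each guarded case has to be present at every dispatch site.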

Reviewed changes

Copilot reviewed 46 out of 46 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| xmake/ali.lua | New build configuration file for Ali PPU compilation using CUDA toolchain |
| xmake.lua | Added ali-ppu build option and integrated ali.lua targets |
| include/infinicore.h | Added INFINI_DEVICE_ALI enum value (10) |
| include/infinicore/device.hpp | Added ALI device type to C++ Device class enum |
| src/infinicore/device.cc | Added string representation for ALI device type |
| src/infinicore/pybind11/device.hpp | Exposed ALI device type to Python bindings |
| src/infinicore/nn/embedding.cc | Extended device check to support Ali PPU for on-device embedding |
| src/infinicore/nn/rmsnorm.cc | Added Ali PPU support for add_rms_norm_inplace operation |
| src/infinirt/infinirt.cc | Added Ali API dispatch case to device API macro |
| src/infinirt/cuda/infinirt_cuda.cuh | Added Ali namespace with ENABLE_ALI_API conditional compilation |
| src/infinirt/cuda/infinirt_cuda.cu | Added Ali namespace definition |
| src/infiniop/devices/handle.cc | Added Ali device handle creation and destruction |
| src/infiniop/devices/nvidia/nvidia_handle.h | Added Ali Handle class declaration |
| src/infiniop/devices/nvidia/nvidia_common.cu | Implemented Ali Handle constructor and create method |
| src/infiniop/devices/nvidia/nvidia_kernel_common.cuh | Excluded long double exp function for Ali (like ILUVATAR, QY, HYGON) |
| src/infiniop/ops/*/operator.cc | Added ENABLE_ALI_API support to 15+ operator implementations |
| src/infiniop/ops/gemm/nvidia/gemm_nvidia.cu | Extended Ali support in GEMM compute type selection |
| src/infiniop/ops/paged_attention_prefill/cuda/kernel_v2.cuh | Added ENABLE_ALI_API to conditional compilation |
| src/infiniccl/infiniccl.cc | Added Ali device cases for CCL operations (InitAll, Destroy, AllReduce) |
| src/infiniccl/cuda/infiniccl_cuda.h | Added ENABLE_ALI_API to CUDA CCL conditional compilation |
| python/infinicore/device.py | Mapped ALI device type to "cuda" for PyTorch compatibility |
| test/infiniop/libinfiniop/devices.py | Added ALI device enum (10) and name mapping |
| test/infiniop/libinfiniop/utils.py | Added --ali command-line argument for testing |
| test/infinicore/framework/devices.py | Added ALI device enum and name mapping |
| test/infinicore/framework/config.py | Added --ali argument and Ali device test setup |
| test/infinicore/test.py | Added ALI device to relationship test list |
| src/infiniop-test/src/main.cpp | Added --ali device option to test CLI |
| src/infinirt-test/main.cc | Added --ali device option to test CLI |
| src/infinicore-test/main.cc | Added --ali device option and usage documentation |
| src/infinicore-test/README.md | Documented --ali command-line option |
| src/infiniccl-test/main.cpp | Added --ali device option to CCL test CLI |
Comments suppressed due to low confidence (1)

src/infiniop/ops/layer_norm/operator.cc:176

  • The DELETE section of this operator is missing ENABLE_ILUVATAR_API support, which is present in the CREATE, GET, and CALCULATE sections (lines 46, 79, 132). While adding ALI support, this existing inconsistency should be fixed to prevent potential bugs when using ILUVATAR devices. The missing block should be added after line 167 following the same pattern as other device types.
#ifdef ENABLE_ALI_API
        DELETE(INFINI_DEVICE_ALI, nvidia);
#endif
#ifdef ENABLE_QY_API
        DELETE(INFINI_DEVICE_QY, nvidia);
#endif
#ifdef ENABLE_METAX_API
        DELETE(INFINI_DEVICE_METAX, metax);
#endif
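
The fix suggested in the suppressed comment could look something like the sketch below, which models a destroy-side dispatch with a guarded ILUVATAR case next to the Ali one. Everything here is illustrative. `DEMO_DELETE` is a simplified stand-in for the operator's `DELETE` macro (it only reports the chosen backend instead of destroying a descriptor), `SketchDevice` and its ILUVATAR value are placeholders, and routing ILUVATAR to the `nvidia` namespace is an assumption based on how the other CUDA-compatible devices in the excerpt are handled.

```cpp
#include <string>

// Build flags assumed for this sketch; defined inline only so it compiles alone.
#define ENABLE_ILUVATAR_API
#define ENABLE_ALI_API

// Placeholder device IDs, except ALI = 10, which the PR summary states.
enum SketchDevice {
    SKETCH_DEVICE_CPU = 0,      // placeholder value
    SKETCH_DEVICE_ILUVATAR = 6, // placeholder value
    SKETCH_DEVICE_ALI = 10,     // ID stated in the PR summary
};

// Simplified stand-in for the operator's DELETE macro: expands to a case label
// that returns the backend namespace as a string.
#define DEMO_DELETE(CASE, NAMESPACE) \
    case CASE:                       \
        return #NAMESPACE

// Models the destroy-side dispatch of an operator.
std::string destroyBackendFor(SketchDevice device) {
    switch (device) {
#ifdef ENABLE_ILUVATAR_API
        DEMO_DELETE(SKETCH_DEVICE_ILUVATAR, nvidia); // the block reported as missing
#endif
#ifdef ENABLE_ALI_API
        DEMO_DELETE(SKETCH_DEVICE_ALI, nvidia);
#endif
    default:
        return "unsupported";
    }
}
```

If the ILUVATAR block were left out of the destroy path while the create path still had it, a descriptor created for that device would reach the default branch on destruction, which is the inconsistency the comment points at.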
