
Copyright © 2026 NVIDIA Corporation

3 results for

    Read and write large cuPyNumeric arrays to HDF5 with Legate's parallel, distributed HDF5 I/O (legate.io.hdf5: to_file, from_file, from_file_batched). Use when a developer needs to save a cuPyNumeric array to an .h5/.hdf5 file, load an HDF5 dataset into a
    Skill
    Developer
    470
    16d
    DGX Station
    2 HRS

    Profiler-Driven Kernel Optimization for Fine-Tuning

    Use torch.profiler to find training bottlenecks, then write custom Triton kernels to optimize LLaMA 8B fine-tuning
    Playbook
    Training
    21d

    Expert cuTile programming assistant. Write high-performance GPU kernels using cuTile's tile-based programming model with proper validation and optimization. Supports deep agent orchestration for complex multi-kernel tasks.
    Skill
    Developer
    240
    4d