Use cases, pain points, and background
Description:
What should we do?
Design:
What files should be touched? What logic should be written?
Out of scope:
What are some items that this issue could be mistaken to cover that this issue should explicitly NOT cover?
Acceptance Criteria:
- Follow the training doc end to end, fixing any problems found along the way so the workflow runs reliably: https://docs.nvidia.com/nemo/gym/latest/tutorials/rl-training-with-nemo-rl.html
- Train each released training environment for at least 150 steps with Qwen 3 4B Instruct