beam
Tool profile

NVlabs/GDPO

stable

Official implementation of GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

agentic-ai · grpo · llm · reasoning · rl · trl · verl
Velocity score: 0.02 / 10
[STARS] 452
[FORKS] 31
[CONTRIBUTORS] 2
[LAST_COMMIT] 2mo ago
Velocity class: stable
30-day stars chart: score 0.02 / 10, last 33d [SIGNAL_TRACE / 33_PT]
Score breakdown: 345 / 1000
agents · NVlabs/GDPO
Weights: Velocity 50%, Adoption 30%, Maintenance 15%, Community 5%
[CODE_GROWTH] 304
[INSTALL_VEL] 497
[ACTIVITY] 197
[COMMUNITY_SIGNAL] 279

Terminal score: 0–1000 raw, weighted across 4 dimensions. Public score: 0–10 normalized (shown in the 30-day stars chart above).
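
The weighting described above can be sketched as a plain weighted sum. This is an illustration only, not the site's actual scoring code: it assumes each dimension maps to the signal listed in the same position (Velocity to CODE_GROWTH, Adoption to INSTALL_VEL, Maintenance to ACTIVITY, Community to COMMUNITY_SIGNAL), and the names `WEIGHTS`, `SIGNALS`, and `terminal_score` are hypothetical. Under that assumption the sum rounds to the 345 shown in the breakdown. How the raw score is normalized to the 0–10 public score is not stated on the page, so it is not modeled here.

```python
# Dimension weights as listed in the score breakdown (fractions of 1.0).
WEIGHTS = {
    "velocity": 0.50,
    "adoption": 0.30,
    "maintenance": 0.15,
    "community": 0.05,
}

# Raw 0-1000 signals from the profile.
# ASSUMPTION: signals are paired with dimensions by display order.
SIGNALS = {
    "velocity": 304,     # [CODE_GROWTH]
    "adoption": 497,     # [INSTALL_VEL]
    "maintenance": 197,  # [ACTIVITY]
    "community": 279,    # [COMMUNITY_SIGNAL]
}


def terminal_score(signals: dict, weights: dict) -> float:
    """Return the weighted sum of the raw signals (hypothetical helper)."""
    return sum(weights[name] * signals[name] for name in weights)


score = terminal_score(SIGNALS, WEIGHTS)
print(round(score))  # 345
```

The unrounded sum is about 344.6 (152 + 149.1 + 29.55 + 13.95), which matches the displayed 345 / 1000 after rounding.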
