DEV Community

# reinforcementlearning

Posts

Four Models in One Training Loop: Architecting SDAR on AWS (Before Renting a Single GPU)

Comments
5 min read
How to Add Live Telemetry and Failure Diagnosis to Isaac Lab, MuJoCo, or Gazebo Training in Under 5 Minutes

Comments
4 min read
Why robotics RL training pipelines fail at scale

Comments
4 min read
ARTIST: RL-Powered Tool Use for LLM Agents Explained

Comments
9 min read
Q-Learning for Games: Teaching an Agent Tic-Tac-Toe Through Self-Play

Comments
14 min read
Your RL Agent Failed a 12-Step Task. Which Step Was Wrong? (The Supervision Problem in Agentic RL)

Comments 2
5 min read
Value Iteration vs Q-Learning: Dynamic Programming Meets RL

Comments
12 min read
Solving CartPole Without Gradients: Simulated Annealing

Comments
13 min read
The Cross-Entropy Method: Solving RL Without Gradients

Comments
12 min read
Self-Learning AI Agents; Architectures and Challenges

Comments 1
3 min read
Policy Gradients: REINFORCE from Scratch with NumPy

Comments
16 min read
Deep Q-Networks: Experience Replay and Target Networks

Comments
18 min read
Q-Learning from Scratch: Navigating the Frozen Lake

Comments
11 min read
Evolution Is Back: A New Way to Fine‑Tune LLMs

Comments
7 min read
Why Most Game NPCs Feel Dead (And How Emotion and Memory Fix It)

Comments
4 min read