Reinforcement Learning

Ai2's new Olmo 3.1 extends reinforcement learning training for stronger reasoning benchmarks

Ai2 updates its Olmo 3 family of models to Olmo 3.1 following additional extended RL training to boost performance.

A look under the hood of DeepSeek’s AI models doesn’t provide all the answers

A peer-reviewed paper about Chinese startup DeepSeek's models explains their training approach but not how they work through ...

4don MSN

New model frames human reinforcement learning in the context of memory and habits

Humans and most other animals are known to be strongly driven by expected rewards or adverse consequences. The process of ...

Baseten Acquires Parsed to Enable Companies to Own Their Intelligence

The acquisition adds world-class reinforcement learning and post-training expertise to deliver superior inference quality and performance for Baseten customers via specialized intelligence SAN ...

The Information

Inference Provider Baseten Acquires Reinforcement Learning Startup Parsed

AI firms are getting more interested in AI that continues to learn even after it’s been trained, otherwise known as continual ...

Forbes

The Rise And Rise Of Reinforcement Learning: AI’s Quiet Revolution

Forbes contributors publish independent expert analyses and insights. Author, Researcher and Speaker on Technology and Business Innovation. Apr 19, 2025, 03:24am EDT Apr 21, 2025, 10:40am EDT ...

Nature

Reinforcement learning improves behaviour from evaluative feedback

Reinforcement-learning algorithms 1,2 are inspired by our understanding of decision making in humans and other animals in which learning is supervised through the use of reward signals in response to ...

Pocket Gamer.biz

How Mo.co used AI to test the player experience

Balancing player experience before a game launches can be done with AI bots, trained to test a title and its content, ...

Some results have been hidden because they may be inaccessible to you

Show inaccessible results