Toots from 2025-06-11
Every time I see this service mentioned, my first thought is that it’s a service that generates deadlines.
https://aws.amazon.com/about-aws/whats-new/2025/06/aws-deadline-cloud-worker-performance-monitoring/
Mastodon Source 🐘
“We question the current evaluation paradigm of LRMs on established math benchmarks and design a controlled experimental testbed by leveraging algorithmic puzzle environments that enable controllable experimentation with respect to problem complexity.”
https://machinelearning.apple.com/research/illusion-of-thinking
Mastodon Source 🐘
Starting with skepticism and a more rigorous experimental setup tempers some of the AGI hype:
“Our findings reveal fundamental limitations in current models: despite sophisticated self-reflection mechanisms, these models fail to develop generalizable reasoning capabilities beyond certain complexity thresholds.”
Mastodon Source 🐘
@thudfactor Guessing it was also known as “project is late”?
Mastodon Source 🐘
What expression do you all use to refer to the notion of “minimum viable workflow”? It’s short of an MVP and I’ve also heard it called Steel Thread or Tracer Bullet. It’s related to the “make it work, make it right, make it fast” mantra.
