<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Recursive Models on Husky&#39;Log</title>
    <link>https://huskydoge.github.io/husky-blog/tags/recursive-models/</link>
    <description>Recent content in Recursive Models on Husky&#39;Log</description>
    <image>
      <title>Husky&#39;Log</title>
      <url>https://huskydoge.github.io/avatar.png</url>
      <link>https://huskydoge.github.io/avatar.png</link>
    </image>
    <generator>Hugo -- 0.144.2</generator>
    <language>en</language>
    <lastBuildDate>Sun, 19 Apr 2026 12:00:00 -0400</lastBuildDate>
    <atom:link href="https://huskydoge.github.io/husky-blog/tags/recursive-models/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Loop-Model FLOPs and Memory in an Ablation Chain</title>
      <link>https://huskydoge.github.io/husky-blog/posts/recursive_models/loop-cost/</link>
      <pubDate>Sun, 19 Apr 2026 12:00:00 -0400</pubDate>
      <guid>https://huskydoge.github.io/husky-blog/posts/recursive_models/loop-cost/</guid>
      <description>&lt;figure class=&#34;align-center&#34;&gt;&lt;img loading=&#34;lazy&#34; src=&#34;https://huskydoge.github.io/husky-blog/posts/recursive_models/loop-cost/teaser.png#center&#34;
             alt=&#34;Loop-model FLOPs and memory teaser schematic&#34;/&gt;
&lt;/figure&gt;

&lt;h2 id=&#34;introduction&#34;&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Loop models have been gaining popularity lately, with exciting results &lt;span class=&#34;hugo-cite-intext&#34; itemprop=&#34;citation&#34;&gt;
  &lt;span class=&#34;hugo-cite-group&#34; style=&#34;display: inline-block;&#34;&gt;
    &lt;a href=&#34;#&#34; style=&#34;text-decoration: none; border-bottom: 1px dotted #ccc;&#34;&gt;
      [1,2,3,4,5]
    &lt;/a&gt;
    &lt;span class=&#34;hugo-cite-citation&#34;&gt;&lt;span class=&#34;hugo-cite-citation-entry&#34;&gt;&lt;strong&gt;Less is More: Recursive Reasoning with Tiny Networks&lt;/strong&gt;&lt;br&gt;A. Jolicoeur-Martineau, (2025)&lt;br&gt;&lt;a href=&#34;https://arxiv.org/abs/2510.04871&#34; target=&#34;_blank&#34;&gt;Link&lt;/a&gt;&lt;/span&gt;&lt;span class=&#34;hugo-cite-citation-entry&#34;&gt;&lt;strong&gt;Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach&lt;/strong&gt;&lt;br&gt;J. Geiping, S. McLeish, N. Jain, J. Kirchenbauer, S. Singh, B. Bartoldson, B. Kailkhura, A. Bhatele, T. Goldstein, (2025)&lt;br&gt;&lt;a href=&#34;https://arxiv.org/abs/2502.05171&#34; target=&#34;_blank&#34;&gt;Link&lt;/a&gt;&lt;/span&gt;&lt;span class=&#34;hugo-cite-citation-entry&#34;&gt;&lt;strong&gt;Parcae: Scaling Laws For Stable Looped Language Models&lt;/strong&gt;&lt;br&gt;H. Prairie, Z. Novack, T. Berg-Kirkpatrick, D. Fu, (2026)&lt;br&gt;&lt;a href=&#34;https://arxiv.org/abs/2604.12946&#34; target=&#34;_blank&#34;&gt;Link&lt;/a&gt;&lt;/span&gt;&lt;span class=&#34;hugo-cite-citation-entry&#34;&gt;&lt;strong&gt;Scaling Latent Reasoning via Looped Language Models&lt;/strong&gt;&lt;br&gt;R. Zhu, Z. Wang, K. Hua, T. Zhang, Z. Li, H. Que, B. Wei, Z. Wen, F. Yin, H. Xing, et al., (2025)&lt;br&gt;&lt;/span&gt;&lt;span class=&#34;hugo-cite-citation-entry&#34;&gt;&lt;strong&gt;Hierarchical Reasoning Model&lt;/strong&gt;&lt;br&gt;G. Wang, J. Li, Y. Sun, X. Chen, C. Liu, Y. Wu, M. Lu, Y. Yadkori, (2025)&lt;br&gt;&lt;a href=&#34;https://arxiv.org/abs/2506.21734&#34; target=&#34;_blank&#34;&gt;Link&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;
  &lt;/span&gt;
&lt;/span&gt;. Once we decide to reuse the same block across multiple layers, however, one practical question becomes unavoidable: &lt;strong&gt;what does the loop cost during training?&lt;/strong&gt;&lt;/p&gt;</description>
    </item>
  </channel>
</rss>
