
The most cited research paper from the lab of Turing awardee Dr. Bengio is about generative adversarial networks (GANs). Who invented them? 

The first neural nets (NNs) that were both generative & adversarial were published in 1990-1991 in Munich [GAN25] https://t.co/Ev6rSvYiY2 

Back then compute was about 10 million times more expensive than today (2025). 

How did these networks work? There are two NNs that fight each other. A so-called controller NN with adaptive stochastic Gaussian units (a generative model) generates output data. This output is fed into a predictor NN (called a "World Model" in 1990 [GAN90]) which learns by gradient descent to predict the effects of the outputs. However, in a minimax game, the generator NN maximizes the error minimized by the predictor NN.

So the controller is motivated to create through its outputs experiments/situations that surprise the predictor. As the predictor improves, these situations become boring. This in turn incentivizes the controller to invent new outputs (or experiments) with still less predictable outcomes, and so forth. This was called Artificial Curiosity [GAN90][GAN91][GAN10][AC].
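
This loop can be made concrete in a few lines. Below is a minimal sketch of the 1990 minimax objective, assuming a toy differentiable one-step environment and tiny feedforward nets (the original system used recurrent NNs acting over sequences in non-stationary environments); all module names and sizes are illustrative, not taken from [GAN90][GAN91].

```python
# Minimal sketch of the 1990 adversarial-curiosity minimax: a stochastic
# generator/controller is rewarded for surprising a predictor/world model.
# The toy environment `world` and all sizes are illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)

def world(action):
    # Toy stand-in for the unknown environment the controller acts in.
    return torch.sin(3.0 * action) + 0.1 * action ** 2

class Controller(nn.Module):
    # Generator: adaptive stochastic Gaussian units produce actions/outputs.
    def __init__(self):
        super().__init__()
        self.mu = nn.Linear(4, 1)
        self.log_std = nn.Parameter(torch.zeros(1))
    def forward(self, ctx):
        mean = self.mu(ctx)
        return mean + torch.randn_like(mean) * self.log_std.exp()

predictor = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))  # world model
controller = Controller()
opt_p = torch.optim.SGD(predictor.parameters(), lr=1e-2)
opt_c = torch.optim.SGD(controller.parameters(), lr=1e-2)

for step in range(2000):
    ctx = torch.randn(64, 4)            # arbitrary context inputs
    action = controller(ctx)
    outcome = world(action)

    # Predictor minimizes its error in predicting the effects of the outputs.
    pred_loss = ((predictor(action.detach()) - outcome.detach()) ** 2).mean()
    opt_p.zero_grad(); pred_loss.backward(); opt_p.step()

    # Controller maximizes the same error: intrinsic reward for surprising
    # the predictor, which fades as the predictor catches up.
    ctrl_loss = -((predictor(action) - outcome) ** 2).mean()
    opt_c.zero_grad(); ctrl_loss.backward(); opt_c.step()
```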

Artificial Curiosity wasn't the first adversarial machine learning setting, but earlier works [S59][H90] were very different: they neither involved self-supervised NNs where one NN sees the output of another generative NN and tries to predict its consequences, nor were they about modeling data, nor did they use gradient descent. (Generative models themselves are much older, e.g., Hidden Markov Models [MM1-3].)

See Section "Implementing Dynamic Curiosity and Boredom" of the 1990 technical report [GAN90] and the 1991 peer-reviewed conference paper [GAN91]. They mention preliminary experiments where (in the absence of external reward) the predictor minimizes a linear function of what the generator maximizes.

So these old papers essentially describe what would become known as a GAN almost a quarter of a century later, in 2014 [GAN14], when compute was about 100,000 times cheaper than in 1990. In 2014, the 1990 neural predictor or world model [GAN90][GAN91] was called a discriminator [GAN14], predicting binary effects of possible outputs of the generator (such as real vs fake) [GAN20]. The 2014 application to image generation [GAN14] was novel. 
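
For comparison, a minimal sketch of the 2014 binary formulation follows, with the predictor's role played by a classifier scoring real vs fake; the 1-D toy data distribution, network sizes, and optimizer settings are illustrative assumptions rather than a reproduction of [GAN14]'s experiments.

```python
# Minimal sketch of the 2014-style GAN: the discriminator predicts the binary
# label real/fake, and the generator is trained to maximize its error.
import torch
import torch.nn as nn

torch.manual_seed(0)
G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))   # generator
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))   # discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(3000):
    real = torch.randn(64, 1) * 0.5 + 2.0          # toy "real" data
    fake = G(torch.randn(64, 8))

    # Discriminator minimizes its classification error on real vs generated data...
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # ...while the generator is trained to fool it (non-saturating form).
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```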

The 1990 GAN was more general than the 2014 GAN: it wasn't limited to single output actions in 1-step trials, but permitted long sequences of actions. More sophisticated generative adversarial systems for artificial curiosity & creativity were published in 1997 [AC97][AC99][AC02][LEC], predicting abstract internal representations instead of raw data. 

The 1990 principle [GAN90-91] has been widely used for exploration in Reinforcement Learning [SIN5][OUD13][PAT17][BUR18] and for synthesis of realistic images such as deepfakes [GAN14-19b], although the latter domain was eventually taken over by Rombach et al.'s Latent Diffusion [DIF1], another method published in Munich, building on Jarzynski's earlier work in physics from the previous millennium [DIF2] and more recent papers [DIF3-5].

APPENDIX 1. The GAN Priority Dispute (1990-91 vs 2014)

The 2014 paper [GAN14] on generative adversarial neural networks (GANs) failed to cite the original 1990 work on generative and adversarial neural networks [GAN90][GAN91][GAN20][R2][DLP].

The 2014 paper [GAN14] also made false claims about another gradient-based 2-network adversarial system called Predictability Minimization (1991) [PM0-1][GAN20][DLP]. One year after the 1990 paper on Artificial Curiosity [GAN90], Predictability Minimization used the fight between two learning NNs to create disentangled internal representations (or factorial codes) of the input data [PM0-1]. The 2014 paper [GAN14] cites Predictability Minimization, but wrongly claims that it is not a minimax game, and thus is different from GANs. However, the Predictability Minimization experiments from 1991 [PM0-1] and 1996 [PM2] (with images) are directly of the minimax type, as sketched below.
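
To make the minimax nature of Predictability Minimization explicit, here is a minimal sketch under simplifying assumptions (tiny dense nets, toy redundant inputs; not the exact 1991 architecture): each code unit is predicted from the other units, the predictors minimize that prediction error, and the encoder producing the code maximizes it, pushing the units toward statistical independence (a factorial code).

```python
# Minimal sketch of Predictability Minimization as a minimax game between an
# encoder and per-unit predictors; all architectures are illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)
code_dim = 4
encoder = nn.Sequential(nn.Linear(8, 16), nn.Tanh(), nn.Linear(16, code_dim), nn.Sigmoid())
# One predictor per code unit, each seeing only the *other* units.
predictors = nn.ModuleList([
    nn.Sequential(nn.Linear(code_dim - 1, 16), nn.Tanh(), nn.Linear(16, 1), nn.Sigmoid())
    for _ in range(code_dim)
])
opt_e = torch.optim.Adam(encoder.parameters(), lr=1e-3)
opt_p = torch.optim.Adam(predictors.parameters(), lr=1e-3)

def prediction_error(code):
    # Total error of predicting each code unit from all the others.
    err = 0.0
    for i, p in enumerate(predictors):
        others = torch.cat([code[:, :i], code[:, i + 1:]], dim=1)
        err = err + ((p(others) - code[:, i:i + 1]) ** 2).mean()
    return err

for step in range(2000):
    x = torch.randn(64, 2).repeat(1, 4)            # toy data with redundant inputs
    code = encoder(x)

    # Predictors minimize the error of predicting each unit from the others...
    opt_p.zero_grad(); prediction_error(code.detach()).backward(); opt_p.step()

    # ...while the encoder maximizes it: a direct minimax objective that drives
    # the code units toward independence.
    opt_e.zero_grad(); (-prediction_error(code)).backward(); opt_e.step()
```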

Even later surveys by the authors [GAN14] failed to cite the original work [DLP]. The authors of [GAN14] have never corrected their 2014 paper, implying an intent to brute-force a novelty narrative, even in the face of contradictory evidence.

The priority dispute was picked up by the popular press, e.g., Bloomberg [AV1], after a particularly notable encounter at the 2016 N(eur)IPS conference between Juergen Schmidhuber (J.S.) and the first author of [GAN14], who gave a talk on GANs, encouraging people to ask questions. J.S. did, addressing problems of the N(eur)IPS 2014 paper [GAN14] and the erroneous claims it made about the prior work on PM [GAN20][DLP]. 

Subsequent efforts to correct these issues in a joint paper dragged on for many months but didn't work out. The first author of [GAN14] eventually admitted that PM is adversarial (his uncorrected NeurIPS paper [GAN14] still claims the opposite), but emphasized that it's not generative. In response, J.S. pointed out that the even earlier Artificial Curiosity [GAN90][GAN91][GAN20][R2][AC][DLP] is both adversarial and generative (its generator NN contains probabilistic units [GAN90], like those in StyleGANs [GAN19]).

Despite the validity of this statement, the authors of [GAN14] have made no attempt to correct their paper or respond to this. That's why a peer-reviewed journal publication on this priority dispute was published in 2020 [GAN20] to set the record straight.

Of course, it is well known that plagiarism can be either "unintentional" or "intentional or reckless" [PLAG1-6], and the more innocent of the two may very well be partially the case here. But science has a well-established way of dealing with "multiple discovery" and plagiarism - be it unintentional [PLAG1-6][CONN21] or not [FAKE1-3] - based on facts such as time stamps of publications and patents [DLP][NOB]. The deontology of science requires that unintentional plagiarists correct their publications through errata and then credit the original sources properly in the future. The authors [GAN14] didn't; instead they kept collecting citations for inventions of other researchers [DLP]. Such behaviour apparently turns even unintentional plagiarism [PLAG1-6] into an intentional form [FAKE2].

REFERENCES

[AC] J.  Schmidhuber  (J.S., AI Blog, 2021, updated 2023). 3 decades of artificial curiosity & creativity. Our artificial scientists not only answer given questions but also invent new questions. They achieve curiosity through: (1990) the principle of generative adversarial networks, (1991) neural nets that maximise learning progress, (1995) neural nets that maximise information gain (optimally since 2011), (1997-2022) adversarial design of surprising computational experiments, (2006) maximizing compression progress like scientists/artists/comedians do, (2011) PowerPlay... Since 2012: applications to real robots.

[AC97] J.S. What's interesting? Technical Report IDSIA-35-97, IDSIA, July 1997. Focus on automatic creation of predictable internal abstractions of complex spatio-temporal events: two competing, intrinsically motivated agents agree on essentially arbitrary algorithmic experiments and bet on their possibly surprising (not yet predictable) outcomes in zero-sum games, each agent potentially profiting from outwitting / surprising the other by inventing experimental protocols where both modules disagree on the predicted outcome. The focus is on exploring the space of general algorithms (as opposed to traditional simple mappings from inputs to outputs); the general system focuses on the interesting things by losing interest in both predictable and unpredictable aspects of the world. Unlike our previous systems with intrinsic motivation, e.g., [AC90], the system also takes into account the computational cost of learning new skills, learning when to learn and what to learn. See later publications [AC99][AC02].

[AC99] J.S. Artificial Curiosity Based on Discovering Novel Algorithmic Predictability Through Coevolution. In P. Angeline, Z. Michalewicz, M. Schoenauer, X. Yao, Z. Zalzala, eds., Congress on Evolutionary Computation, p. 1612-1618, IEEE Press, Piscataway, NJ, 1999.

[AC02] J.S. Exploring the Predictable. In Ghosh, S. Tsutsui, eds., Advances in Evolutionary Computing, p. 579-612, Springer, 2002. 

[AV1] A. Vance. Google, Amazon and Facebook Owe Jürgen Schmidhuber a Fortune - This Man Is the Godfather the AI Community Wants to Forget. Business Week, Bloomberg, May 15, 2018.

[DEC] J.S.  (AI Blog, 02/20/2020, updated 2025). The 2010s: Our Decade of Deep Learning / Outlook on the 2020s. The recent decade's most important developments and industrial applications based on our AI, with an outlook on the 2020s, also addressing privacy and data markets.

[DIF1] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, B. Ommer. High-Resolution Image Synthesis with Latent Diffusion Models. CVPR 2022. Preprint arXiv:2112.10752, LMU Munich, 2021.

[DIF2] C. Jarzynski. Equilibrium free-energy differences from nonequilibrium measurements: A master-equation approach. Physical Review E, 1997.

[DIF3] J. Sohl-Dickstein, E. A. Weiss, N. Maheswaranathan, S. Ganguli. Deep unsupervised learning using nonequilibrium thermodynamics. CoRR, abs/1503.03585, 2015.

[DIF4] O. Ronneberger, P. Fischer, T. Brox. U-Net: Convolutional networks for biomedical image segmentation. In MICCAI (3), vol. 9351 of Lecture Notes in Computer Science, pages 234-241. Springer, 2015.

[DIF5] J. Ho, A. Jain, P. Abbeel. Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems 33:6840-6851, 2020. 

[DL1] J.S., 2015. Deep Learning in neural networks: An overview. Neural Networks, 61, 85-117. More.

[DLH] J.S. (2022). Annotated History of Modern AI and Deep Learning. Technical Report IDSIA-22-22, IDSIA, Lugano, Switzerland, 2022. Preprint arXiv:2212.11279. 

[DLP] J. Schmidhuber (2023). How 3 Turing awardees republished key methods and ideas whose creators they failed to credit. Technical Report IDSIA-23-23, Swiss AI Lab IDSIA, 14 Dec 2023. 

[FAKE1] H. Hopf, A. Krief, G. Mehta, S. A. Matlin. Fake science and the knowledge crisis: ignorance can be fatal. Royal Society Open Science, May 2019. Quote: "Scientists must be willing to speak out when they see false information being presented in social media, traditional print or broadcast press" and "must speak out against false information and fake science in circulation and forcefully contradict public figures who promote it."

[FAKE2] L. Stenflo. Intelligent plagiarists are the most dangerous. Nature, vol. 427, p. 777 (Feb 2004). Quote: "What is worse, in my opinion, ..., are cases where scientists rewrite previous findings in different words, purposely hiding the sources of their ideas, and then during subsequent years forcefully claim that they have discovered new phenomena."

[FAKE3] S. Vazire (2020). A toast to the error detectors. Let 2020 be the year in which we value those who ensure that science is self-correcting. Nature, vol 577, p 9, 2/2/2020.

[GAN90] J. Schmidhuber (J.S.). Making the world differentiable: On using fully recurrent self-supervised neural networks for dynamic reinforcement learning and planning in non-stationary environments. Technical Report FKI-126-90, TUM, 1990. The first paper on planning with reinforcement learning recurrent neural networks (NNs) (more) and on generative adversarial networks where a generator NN is fighting a predictor NN in a minimax game. 

[GAN91] J.S. A possibility for implementing curiosity and boredom in model-building neural controllers. In J. A. Meyer and S. W. Wilson, editors, Proc. of the International Conference on Simulation of Adaptive Behavior: From Animals to Animats, pages 222-227. MIT Press/Bradford Books, 1991. Based on  [GAN90].

[GAN10] J.S. Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990-2010). IEEE Transactions on Autonomous Mental Development, 2(3):230-247, 2010. This well-known 2010 survey summarised the generative adversarial NNs of 1990 as follows: a "neural network as a predictive world model is used to maximize the controller's intrinsic reward, which is proportional to the model's prediction errors" (which are minimized).

[GAN10b] O. Niemitalo. A method for training artificial neural networks to generate missing data within a variable context. Blog post, Internet Archive, 2010. A blog post describing the basic ideas [GAN90-91][GAN20][AC] of GANs.

[GAN14] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio. Generative adversarial nets. NIPS 2014, 2672-2680, Dec 2014. A description of GANs that does not cite J.S.'s original GAN principle of 1990 [GAN90-91][GAN20][AC][R2][DLP] and contains wrong claims about J.S.'s adversarial NNs for Predictability Minimization [PM0-2][GAN20][DLP].

[GAN19] T. Karras, S. Laine, T. Aila. A style-based generator architecture for generative adversarial networks. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages 4401-4410, 2019.

[GAN19b] D. Fallis. The epistemic threat of deepfakes. Philosophy & Technology 34.4 (2021):623-643.

[GAN20] J. Schmidhuber. Generative Adversarial Networks are Special Cases of Artificial Curiosity (1990) and also Closely Related to Predictability Minimization (1991). Neural Networks, Volume 127, p 58-66, 2020. Preprint arXiv:1906.04493.

[GAN25] J. Schmidhuber. Who Invented Generative Adversarial Networks? Technical Note IDSIA-14-25, IDSIA, December 2025. See link above. 

[H90] W. D. Hillis. Co-evolving parasites improve simulated evolution as an optimization procedure. Physica D: Nonlinear Phenomena, 42(1-3):228-234, 1990.

[LEC] J.S. (AI Blog, 2022). LeCun's 2022 paper on autonomous machine intelligence rehashes but does not cite essential work of 1990-2015. Years ago, J.S.'s team published most of what Y. LeCun calls his "main original contributions:" neural nets that learn multiple time scales and levels of abstraction, generate subgoals, use intrinsic motivation to improve world models, and plan (1990); controllers that learn informative predictable representations (1997), etc. This was also discussed on Hacker News, reddit, and in the media.  LeCun also listed the "5 best ideas 2012-2022" without mentioning that most of them are from J.S.'s lab, and older. 

[MIR] J.S. (Oct 2019, updated 2021, 2022, 2025). Deep Learning: Our Miraculous Year 1990-1991. Preprint arXiv:2005.05744. 

[MOST] J.S. (AI Blog, 2021, updated 2025). The most cited neural networks all build on work done in my labs: 1. Long Short-Term Memory (LSTM), the most cited AI of the 20th century. 2. ResNet (open-gated Highway Net), the most cited AI of the 21st century. 3. AlexNet & VGG Net (the similar but earlier DanNet of 2011 won 4 image recognition challenges before them). 4. GAN (an instance of Adversarial Artificial Curiosity of 1990). 5. Transformer variants—see the 1991 unnormalised linear Transformer (ULTRA). Foundations of Generative AI were published in 1991: the principles of GANs (now used for deepfakes), Transformers (the T in ChatGPT), Pre-training for deep NNs (the P in ChatGPT), NN distillation, and the famous DeepSeek.

[NOB] J.S. A Nobel Prize for Plagiarism. Technical Report IDSIA-24-24 (7 Dec 2024, updated Oct 2025). 

[PLAG1] Oxford's guide to types of plagiarism (2021). Quote: "Plagiarism may be intentional or reckless, or unintentional." 

[PLAG2] Jackson State Community College (2022). Unintentional Plagiarism. 

[PLAG3] R. L. Foster. Avoiding Unintentional Plagiarism. Journal for Specialists in Pediatric Nursing; Hoboken Vol. 12, Iss. 1, 2007.

[PLAG4] N. Das. Intentional or unintentional, it is never alright to plagiarize: A note on how Indian universities are advised to handle plagiarism. Perspect Clin Res 9:56-7, 2018.

[PLAG5] InfoSci-OnDemand (2023). What is Unintentional Plagiarism? Copy in the Internet Archive.

[PLAG6] Copyrighted (2022). How to Avoid Accidental and Unintentional Plagiarism (2023). Copy in the Internet Archive. Quote: "May it be accidental or intentional, plagiarism is still plagiarism."

[PLAG7] Cornell Review, 2024. Harvard president resigns in plagiarism scandal. January 2024.

[PLAN] J.S. (AI Blog, 2020). 30-year anniversary of planning & reinforcement learning with recurrent world models and artificial curiosity (1990). This work also introduced high-dimensional reward signals, deterministic policy gradients for RNNs, the GAN principle (widely used today). Agents with adaptive recurrent world models even suggest a simple explanation of consciousness & self-awareness.

[PLAN2] J.S. An on-line algorithm for dynamic reinforcement learning and planning in reactive environments. In Proc. IEEE/INNS International Joint Conference on Neural Networks, San Diego, volume 2, pages 253-258, June 17-21, 1990. Based on [GAN90].

[PLAN3] J.S. Reinforcement learning in Markovian and non-Markovian environments. In R. P. Lippman, J. E. Moody, and D. S. Touretzky, editors, Advances in Neural Information Processing Systems 3, NIPS'3, pages 500-506. San Mateo, CA: Morgan Kaufmann, 1991. Partially based on [GAN90].

[PM0] J. Schmidhuber. Learning factorial codes by predictability minimization. TR CU-CS-565-91, Univ. Colorado at Boulder, 1991.

[PM1] J.S. Learning factorial codes by predictability minimization. Neural Computation, 4(6):863-879, 1992. 

[PM2] J.S., M. Eldracher, B. Foltin. Semilinear predictability minimization produces well-known feature detectors. Neural Computation, 8(4):773-786, 1996.

[R2] Reddit/ML, 2019. J. Schmidhuber really had GANs in 1990.

[S59] A. L. Samuel. Some studies in machine learning using the game of checkers. IBM Journal on Research and Development, 3:210-229, 1959.


Invented principles of meta-learning (1987), GANs (1990), Transformers (1991), very deep learning (1991), etc. Our AI is used many billions of times every day.

Jürgen Schmidhuber
Mon Dec 01 16:00:12
Gen-4.5 achieves unprecedented physical realism and visual precision

Object motion carries realistic weight and momentum, and surfaces behave according to real-world physics

Of course, users can also choose to let the model "ignore" the laws of physics and let creativity take over completely.


To learn AI, look for 小互; find 小互 at https://t.co/4PVaHEr5r3 ... 小互 AI Daily community: https://t.co/LIEXfWUHv1

小互
Mon Dec 01 15:59:48
It's a really, really good model with vibe that's substantially different from V3.2 (which is also amazing) 
you can consider it a V4… preview-lite I guess
also pour one out for Qwen-Max, Doubao-pro and other proprietary champs while you're at it


We're in a race. It's not USA vs China but humans and AGIs vs ape power centralization. @deepseek_ai stan #1, 2023–Deep Time «C’est la guerre.» ®1

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
Mon Dec 01 15:58:55
This is actually a really cool demonstration of the composability + built-in value transfer you get from x402.

The trigger node auto-generating an x402 resource is *chefs kiss*


Building @merit_systems @x402scan, prev investing + eng @a16z. Kardashev-scale accelerationist. Chess enthusiast. Open Source.

Mason Hall
Mon Dec 01 15:54:46
Whoa, it's like New Year came early

Runway has also released a new model

Runway announces Runway Gen-4.5

It's the mystery model we reported on earlier that ranked first on Artificial Analysis: Whisper Thunder (codename David)

Runway says Gen-4.5 reaches a new peak in motion quality, prompt adherence, and visual realism, setting a new industry standard.

It achieves major breakthroughs in pre-training data efficiency and post-training techniques

and becomes Runway's new foundation model for world modeling.


The model excels at understanding and executing complex, sequenced instructions. In a single prompt, users can define detailed shot choreography, complex scene composition, precise event timing, and subtle shifts in atmosphere. You can tell it in one paragraph how the camera should move, how the scene is arranged, how the weather changes, and how characters move, and it can realize all of it accurately.

小互
Mon Dec 01 15:54:22
with ai being so powerful, the only viable way for startups to stay competitive is to have small teams. 

less employees, more founders.


build something people want (to copy)™️

Terry Xu
Mon Dec 01 15:54:10