Thread Easy

Your all-in-one Twitter thread assistant


Explore

Newest first; browse threads as cards.

When enabled, preview images are blurred; when disabled, they display normally.

Cursor's Composer1 model is free for a limited time.
Give it a try; it's been my main model lately. Its advantage is speed: it finishes almost instantly, with no waiting. Accuracy is decent, though it still acts dumb in some scenarios, and switching to the backup, GPT5, sorts those out.


Indie developer, freelancing. Product - 简单简历 https://t.co/xMu5JFIGnr, build a programmer's gold-medal resume in five minutes. Course - elite instructor at 慕课网 https://t.co/NTyFFrvHwL. Story - 不上班的1000天 (1,000 days of not going to work) https://t.co/bonuLQCCsY. Video - https://t.co/aQYLgujIyC

Viking
Fri Nov 07 14:29:44
How much doe?


Founder | Author | Speaker. Building @beltstripe. HealthTech/EdTech/Agric. I'm Not The Man Of Your Dreams. Your Imagination Wasn't This Great.

Sani Yusuf
Fri Nov 07 14:28:57
RT @kimmonismus: The world's smartest AI agent system is
1) open-source and open-weight
2) from China

Let that sink in.


Co-founder & CEO @HuggingFace 🤗, the open and collaborative platform for AI builders

clem 🤗
Fri Nov 07 14:19:19
The paper:

https://t.co/zYNA51w6fv

2/2


Research Scientist @meta (FAIR), Prof. @Unige_en, co-founder @nc_shape. I like reality.

François Fleuret
Fri Nov 07 14:18:03
There is a paper from 2017 that introduced a trick that I love but have never seen used.

Consider two linear layers f and g that you initialize with the same parameters, and then you use

h(x)=f(relu(x))+g(-relu(-x))

Then at initialization, h is linear!

1/2


Research Scientist @meta (FAIR), Prof. @Unige_en, co-founder @nc_shape. I like reality.

François Fleuret
Fri Nov 07 14:18:02
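
A minimal PyTorch sketch of the trick described in the thread above (the class name PairedReLULinear and the layer sizes are my own choices, not from the paper). The reason h is linear at initialization: relu(x) - relu(-x) = x, so if f and g share the same initial weight W and bias b, then h(x) = W·relu(x) + b + W·(-relu(-x)) + b = Wx + 2b, an affine map. The two branches only diverge once training updates f and g separately.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PairedReLULinear(nn.Module):
    # h(x) = f(relu(x)) + g(-relu(-x)); f and g start with identical
    # parameters, so h is exactly the affine map Wx + 2b at init.
    def __init__(self, d_in: int, d_out: int):
        super().__init__()
        self.f = nn.Linear(d_in, d_out)
        self.g = nn.Linear(d_in, d_out)
        # Copy f's initial parameters into g so the identity holds at init.
        with torch.no_grad():
            self.g.weight.copy_(self.f.weight)
            self.g.bias.copy_(self.f.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.f(F.relu(x)) + self.g(-F.relu(-x))

# Sanity check: at initialization, h(x) == x @ W.T + 2b for every input.
layer = PairedReLULinear(8, 4)
x = torch.randn(16, 8)
expected = x @ layer.f.weight.T + 2 * layer.f.bias
assert torch.allclose(layer(x), expected, atol=1e-5)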