AI Browsers

AI Browser Research Report

November 12, 2025

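AI Browsers

**A browser with AI capabilities** 🆚 **an AI browser**: these are two different things.

Any ordinary browser, plus a plugin such as Doubao or Kimi, becomes **a browser with AI capabilities**, but that capability belongs to the AI plugin, not to the browser itself.

Overview

The AI browser space currently falls into three camps (which may also be successive stages of one evolution).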

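| Stage (tentative) | Category | Representative products | Capabilities |
| --- | --- | --- | --- |
| Before evolution | Browser with AI plugged in. The conservative camp that extends AI through plugins. The browser is the main body; AI is a plugin. | QQ Browser with its QBot assistant; any traditional browser with a self-installed AI plugin (Monica, Kimi, Doubao plugin, etc.) | Exists as a browser plugin docked in the sidebar or a floating window, offering page summaries, translation, and Q&A. Can be enabled or disabled on demand. In essence this is the capability of the `AI plugin`, not of an AI browser, so this category is outside the scope of this survey. |
| In transition | Browser with AI capabilities. The reformist camp that deeply integrates AI into a traditional browser. AI improves efficiency but does not act for you. | Edge with Copilot; Chrome with the built-in Gemini assistant; Doubao Browser, Quark Browser, etc. | Keeps the familiar interface and habits while adding conversational Q&A, content summaries, and similar AI features to the sidebar, address bar, and elsewhere. A gentle, incremental upgrade with almost no learning curve. For example, Edge keeps the traditional browser frame and only adds an AI assistant button in the top-right corner that opens a sidebar; Chrome is experimenting with invoking conversational AI directly from the address bar, triggered by the "@Gemini" command. |
| After evolution | AI native browser. The radical camp built on an AI-native architecture; AI completes tasks on your behalf. | ChatGPT Atlas; Perplexity's Comet; The Browser Company's Dia Browser; Fellou, genspark, FlowithOS, etc. | Drops traditional designs such as the address bar and bookmarks and emphasizes "conversation as browsing": the user describes a need in natural language and the AI carries out the operations on the web pages autonomously. |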

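My preferences:

Daily drivers for work: Edge, Chrome

AI browsers: **ChatGPT Atlas** > Comet > Dia

Occasionally, but rarely: genspark (when I use genspark, I treat it as an agent, i.e. a Manus substitute, rather than as a browser)

Probably will not use again: Fellou, Flowith ☹️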

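Core capabilities of an AI browser

AI-integrated search: AI search such as Perplexity
Understanding, summarizing, and answering questions about the page being viewed
Multi-page context: analysis and summaries across tabs
AI Agent execution: whether it can operate the browser to complete complex workflows and carry out tasks:
- Understands page structure
- Can perform actions: click buttons, fill in forms, make purchases, send email, summarize videos, reply on social media, compile a daily news digest (though Manus can also do many of these)
- Acts autonomously: decomposes tasks, makes decisions, executes the flow
- The user's instruction is the end goal to achieve, not a list of steps
Browser memory and personalization

Shortcut commands and quick tasks
Cloud drive and file management
MCP tool connections
Local model support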

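On bookmarks: none of the browsers does anything special for favorites/bookmarks; they all still use a Chrome-style bookmark list. I tried asking ChatGPT Atlas about my bookmarks and it could not answer, so this survey does not include bookmark questions either.

However, Atlas does not currently grant permission to "open internal management pages via API", so:

I can open ordinary web pages, but cannot programmatically navigate to internal management pages (bookmarks, extensions, etc.).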

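Edge (most used) | ⭐️⭐️

The AI assistant lives in the sidebar: clicking the Copilot button in the browser toolbar brings up the AI chat window. After a browser update, though, the button disappeared.

It cannot execute tasks; it only assists with Q&A.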

Chrome

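AI features are embedded in the existing interface. Chrome has not added a separate AI panel; instead it integrates Gemini seamlessly through the address bar, search results, and so on. Users simply type @gemini in the familiar place. (The answer quality is fairly poor, though; it does not seem to search the web and appears to be just a conversation with Gemini.)

No full web-page automation. Chrome's AI capabilities currently show up mainly in search and content assistance; it has not shipped anything like an agent that executes web tasks automatically. For now the AI improves search and browsing rather than replacing user actions.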

ChatGPT Atlas ⭐️⭐️⭐️⭐️☆

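AI search and interface

Deeply integrated chat interface. The browser's home page is the ChatGPT chat box, and the suggested questions below it are accurately tailored to the user's browsing history and recent interests. Clicking the GPT icon in the middle also **gives me a surprise**, different every time, which I really like.

The top bar offers search options for web, images, videos, and so on. Every page has an "Ask GPT" button that opens the sidebar AI. The whole design revolves around AI conversation; the browser is the assistant itself.

It looks great. You can set an accent color; I chose purple, and both the browser theme and the whole interface while the agent is operating turn purple.

It would be even better if the tabs could be moved to a sidebar.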

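Web page content understanding

Very strong. ChatGPT can "see" the current page, so you can ask the AI about any page directly without copying anything by hand. You can optionally enable "browser memory" so that the AI remembers your browsing context and can retrieve previously visited pages from a natural-language query.

Summarization:

Highlight-to-explain (combines the surrounding context with personalized memory):

Checking a document's logic and working out test cases

Inspecting page structure

It can see the whole page structure directly, without my having to send screenshots one by one.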

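It is not a mere Chromium shell; the underlying architecture is different. The Atlas main application is a standalone Swift app, with Chromium running in the background as a separate process, and the two communicate over IPC. This architecture is called OWL (OpenAI's Web Layer).

According to OpenAI, this approach offers:

Fast startup: Chromium loads in the background while the Atlas UI appears instantly
Crash isolation: if Chromium goes down, Atlas is unaffected
Faster development: most engineers do not need to compile Chromium, so builds drop from hours to minutes
The agent can see the whole screen: all popups are forcibly composited back into the main page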

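Special handling for Agent mode

The computer use model needs one complete screenshot.

Here is the problem: some elements in a browser are rendered separately.

<select> dropdowns, color pickers, and date pickers are separate popup windows in Chromium.

Looking only at the main page, the AI cannot see these popup elements.

OpenAI's approach: forcibly composite all popups back into the main page.

Although these popups are separate windows, each has its own RenderWidgetHostView and AcceleratedWidget.

OWL uses the same delegated rendering model as for the main page: it grabs the popups' layers and composites them back into the main page at the correct coordinates.

What the AI receives is a single complete screenshot.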
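
The compositing step described above can be illustrated with a toy model. This is only a sketch under loud assumptions: it is not OWL or Chromium code, the `Layer` type and `composite` function are hypothetical names invented here, and each character stands in for a pixel. It shows the idea of painting separately rendered popup surfaces onto the page at their own coordinates so that a single frame contains everything.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Layer:
    """A separately rendered surface (hypothetical type, not a Chromium class)."""
    x: int           # left offset in page coordinates
    y: int           # top offset in page coordinates
    rows: List[str]  # each string is one row of "pixels"


def composite(page: List[str], popups: List[Layer]) -> List[str]:
    """Paint each popup layer over a copy of the page at its own coordinates."""
    canvas = [list(row) for row in page]
    for layer in popups:
        for dy, row in enumerate(layer.rows):
            for dx, pixel in enumerate(row):
                cy, cx = layer.y + dy, layer.x + dx
                # Clip anything that would fall outside the page bounds.
                if 0 <= cy < len(canvas) and 0 <= cx < len(canvas[cy]):
                    canvas[cy][cx] = pixel
    return ["".join(row) for row in canvas]


page = ["......", "......", "......"]
dropdown = Layer(x=2, y=1, rows=["AB", "CD"])  # e.g. an open <select> menu
frame = composite(page, [dropdown])
for row in frame:
    print(row)
```

Without the compositing step, a screenshot of `page` alone would miss the dropdown entirely, which is the failure mode the text describes for the computer use model.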

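One more detail:

Input events generated by the agent are sent straight to the renderer, bypassing the browser layer.

This preserves the sandbox boundary: the agent cannot trigger privileged browser operations through keyboard shortcuts.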
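
A minimal sketch of why this routing matters, assuming a simplified two-layer model. The class names, the `SHORTCUTS` table, and the event strings are all hypothetical and chosen for illustration; they are not Atlas or Chromium APIs. The point is only that a shortcut has a privileged effect when it passes through the browser layer, and is just an ordinary key event when delivered directly to the renderer.

```python
from typing import List

# Hypothetical shortcut table; a real browser has many more entries.
SHORTCUTS = {"Ctrl+T": "open_tab", "Ctrl+Shift+Delete": "clear_browsing_data"}


class Renderer:
    """Sandboxed page process: it only ever receives page-level input."""

    def __init__(self) -> None:
        self.received: List[str] = []

    def handle(self, event: str) -> None:
        self.received.append(event)


class BrowserLayer:
    """Privileged layer: intercepts shortcuts before anything reaches the page."""

    def __init__(self, renderer: Renderer) -> None:
        self.renderer = renderer
        self.privileged_actions: List[str] = []

    def handle(self, event: str) -> None:
        action = SHORTCUTS.get(event)
        if action is not None:
            self.privileged_actions.append(action)  # browser-level effect
        else:
            self.renderer.handle(event)  # ordinary input goes to the page


def send_user_input(browser: BrowserLayer, event: str) -> None:
    """Human input enters through the browser layer."""
    browser.handle(event)


def send_agent_input(browser: BrowserLayer, event: str) -> None:
    """Agent input skips the browser layer and lands in the renderer."""
    browser.renderer.handle(event)


browser = BrowserLayer(Renderer())
send_user_input(browser, "Ctrl+T")         # handled by the privileged layer
send_agent_input(browser, "Ctrl+T")        # just a key event inside the page
send_agent_input(browser, "click#submit")  # normal page interaction
print(browser.privileged_actions)
print(browser.renderer.received)
```

In this model the agent's `Ctrl+T` never reaches the shortcut table, so it cannot open tabs or clear data no matter what keys it generates.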

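Agent capabilities

Agent mode: once authorized, it autonomously carries out multi-step web operations (automatic search, form filling, placing orders, etc.). Execution is strong.

Available only to paid subscribers such as Plus, Pro, and Business, with a monthly usage cap (probably 30 runs on the Plus plan).

Operating the browser directly to test for bugs

It operates the browser directly and clicks around. GPT cannot reference multiple tabs, but it has global memory, so I can have it read the requirements document first and then click around the test environment according to that document to look for obvious functional bugs.
For bugs where I am not sure whether to file them with frontend or backend, I can also ask GPT right in the browser.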

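Summarizing social media

Non-agent mode:

It only sees the content on the current screen, and I have to scroll down manually; this is where agent mode comes in.

Agent mode:

But in the end it compiled only three items; the count was completely off.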

[20251110120911_rec_.mp4]

[20251110121612_rec_.mp4]

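A bit dumb:

Building a workflow

Half an hour in, it still had not been built:

[20251110122744_rec_.mp4]

In the end it ran for nearly an hour, then timed out and the task stopped automatically. It could create some nodes and set parameters, but it took far too long and failed to connect the nodes; even once they were connected, it threw all sorts of errors and would not run.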

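Browser memory and personalization

**Browser Memory** is opt-in; users can disable or delete it at any time, and clearing browsing history also clears the AI memory. An incognito mode is available that prevents the AI from reading page content.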
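
The rules in the paragraph above can be restated as a small state model. This is a sketch of the described behavior only, not how Atlas stores anything: `BrowserSession` and its fields are hypothetical names, and URLs stand in for whatever the real memory records.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BrowserSession:
    """Toy model of the memory rules described in the text (hypothetical)."""
    memory_enabled: bool = False  # opt-in: off unless the user turns it on
    incognito: bool = False
    history: List[str] = field(default_factory=list)
    ai_memory: List[str] = field(default_factory=list)

    def visit(self, url: str) -> None:
        if self.incognito:
            return  # nothing is recorded and the AI does not read the page
        self.history.append(url)
        if self.memory_enabled:
            self.ai_memory.append(url)

    def clear_history(self) -> None:
        # Clearing browsing history clears the AI memory along with it.
        self.history.clear()
        self.ai_memory.clear()


session = BrowserSession(memory_enabled=True)
session.visit("https://example.com/docs")
remembered = list(session.ai_memory)
session.clear_history()

private = BrowserSession(memory_enabled=True, incognito=True)
private.visit("https://example.com/secret")
print(remembered, session.ai_memory, private.ai_memory)
```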

https://chatgpt.com/share/69109dee-51b0-8006-b6aa-15436e818307

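It fully inherits the personalization from GPT, and the browser records carry personalized memory as well, not just a history log.

Cursor input

No other AI browser has this, and it is done with restraint: unlike other browser plugins, it does not pop up a dialog that disrupts reading.

Connectors: similar to an MCP server

But this is ChatGPT's add-app capability, not a capability of the Atlas browser.

My usage paths

Learning something new

Step 1: Turn on study/thinking mode in GPT and plan a course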

{
    "ai_tutor": {
        "Author": "JushBJJ",
        "name": "Mr. Ranedeer",
        "version": "2.5",
        "features": {
            "personalization": {
                "depth": {
                    "description": "This is the level of depth of the content the student wants to learn. The lowest depth level is 1, and the highest is 10.",
                    "depth_levels": {
                        "1/10": "Elementary (Grade 1-6)",
                        "2/10": "Middle School (Grade 7-9)",
                        "3/10": "High School (Grade 10-12)",
                        "4/10": "College Prep",
                        "5/10": "Undergraduate",
                        "6/10": "Graduate",
                        "7/10": "Master's",
                        "8/10": "Doctoral Candidate",
                        "9/10": "Postdoc",
                        "10/10": "Ph.D"
                    }
                },
                "learning_styles": [
                    "Sensing",
                    "Visual *REQUIRES PLUGINS*",
                    "Inductive",
                    "Active",
                    "Sequential",
                    "Intuitive",
                    "Verbal",
                    "Deductive",
                    "Reflective",
                    "Global"
                ],
                "communication_styles": [
                    "stochastic",
                    "Formal",
                    "Textbook",
                    "Layman",
                    "Story Telling",
                    "Socratic",
                    "Humorous"
                ],
                "tone_styles": [
                    "Debate",
                    "Encouraging",
                    "Neutral",
                    "Informative",
                    "Friendly"
                ],
                "reasoning_frameworks": [
                    "Deductive",
                    "Inductive",
                    "Abductive",
                    "Analogical",
                    "Causal"
                ]
            }
        },
        "commands": {
            "prefix": "/",
            "commands": {
                "test": "Test the student.",
                "config": "Prompt the user through the configuration process, incl. asking for the preferred language.",
                "plan": "Create a lesson plan based on the student's preferences.",
                "search": "Search based on what the student specifies. *REQUIRES PLUGINS*",
                "start": "Start the lesson plan.",
                "continue": "Continue where you left off.",
                "self-eval": "Execute format <self-evaluation>",
                "language": "Change the language yourself. Usage: /language [lang]. E.g: /language Chinese",
                "visualize": "Use plugins to visualize the content. *REQUIRES PLUGINS*"
            }
        },
        "rules": [
            "1. Follow the student's specified learning style, communication style, tone style, reasoning framework, and depth.",
            "2. Be able to create a lesson plan based on the student's preferences.",
            "3. Be decisive, take the lead on the student's learning, and never be unsure of where to continue.",
            "4. Always take into account the configuration as it represents the student's preferences.",
            "5. Allowed to adjust the configuration to emphasize particular elements for a particular lesson, and inform the student about the changes.",
            "6. Allowed to teach content outside of the configuration if requested or deemed necessary.",
            "7. Be engaging and use emojis if the use_emojis configuration is set to true.",
            "8. Obey the student's commands.",
            "9. Double-check your knowledge or answer step-by-step if the student requests it.",
            "10. Mention to the student to say /continue to continue or /test to test at the end of your response.",
            "11. You are allowed to change your language to any language that is configured by the student.",
            "12. In lessons, you must provide solved problem examples for the student to analyze, this is so the student can learn from example.",
            "13. In lessons, if there are existing plugins, you can activate plugins to visualize or search for content. Else, continue."
        ],
        "student preferences": {
            "Description": "This is the student's configuration/preferences for AI Tutor (YOU).",
            "depth": 0,
            "learning_style": [],
            "communication_style": [],
            "tone_style": [],
            "reasoning_framework": [],
            "use_emojis": true,
            "language": "中文 (Default)"
        },
        "formats": {
            "Description": "These are strictly the specific formats you should follow in order. Ignore Desc as they are contextual information.",
            "configuration": [
                "Your current preferences are:",
                "**🎯Depth: <> else None**",
                "**🧠Learning Style: <> else None**",
                "**🗣️Communication Style: <> else None**",
                "**🌟Tone Style: <> else None**",
                "**🔎Reasoning Framework <> else None:**",
                "**😀Emojis: <✅ or ❌>**",
                "**🌐Language: <> Chinese**"
            ],
            "configuration_reminder": [
                "Desc: This is the format to remind yourself the student's configuration. Do not execute <configuration> in this format.",
                "Self-Reminder: [I will teach you in a <> depth, <> learning style, <> communication style, <> tone, <> reasoning framework, <with/without> emojis <✅/❌>, in <language>]"
            ],
            "self-evaluation": [
                "Desc: This is the format for your evaluation of your previous response.",
                "<please strictly execute configuration_reminder>",
                "Response Rating (0-100): <rating>",
                "Self-Feedback: <feedback>",
                "Improved Response: <response>"
            ],
            "Planning": [
                "Desc: This is the format you should respond when planning. Remember, the highest depth levels should be the most specific and highly advanced content. And vice versa.",
                "<please strictly execute configuration_reminder>",
                "Assumptions: Since you are depth level <depth name>, I assume you know: <list of things you expect a <depth level name> student already knows.>",
                "Emoji Usage: <list of emojis you plan to use next> else \"None\"",
                "A <depth name> student lesson plan: <lesson_plan in a list starting from 1>",
                "Please say \"/start\" to start the lesson plan."
            ],
            "Lesson": [
                "Desc: This is the format you respond for every lesson, you shall teach step-by-step so the student can learn. It is necessary to provide examples and exercises for the student to practice.",
                "Emoji Usage: <list of emojis you plan to use next> else \"None\"",
                "<please strictly execute configuration_reminder>",
                "<lesson, and please strictly execute rule 12 and 13>",
                "<execute rule 10>"
            ],
            "test": [
                "Desc: This is the format you respond for every test, you shall test the student's knowledge, understanding, and problem solving.",
                "Example Problem: <create and solve the problem step-by-step so the student can understand the next questions>",
                "Now solve the following problems: <problems>"
            ]
        }
    },
    "init": "As an AI tutor, greet + 👋 + version + author + execute format <configuration> + ask for student's preferences + mention /language"
}

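Step 2: Paste the course into a cloud notes app such as Notion or Feishu Docs
Step 3: Highlight text with the cursor to ask questions and add notes
Step 4: Read the answers in the sidebar
Step 5: Go back to GPT to start the next lesson and repeat steps 1-4

Debating ideas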

Comet ⭐️⭐️⭐️⭐️

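AI search and interface

The chat assistant appears as a sidebar, and the default search is Perplexity.

The overall style is classic, tidy, restrained, and calm, with a researcher's feel. You can pick a color in settings, but it only changes the tab color and button backgrounds; the glowing bar shown in agent mode stays the default blue.

Web page content understanding

Strong. The AI assistant can draw on multiple tabs at once to build a knowledge network. For example, with several documents open, Comet extracts the key points into a summary report. It supports questions that reference multiple pages, enabling cross-page comparison.

Summarization

Cross-tab summarization:

This is something ChatGPT Atlas does not have.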