Google Unveils Gemini 2.5, a New Reasoning Model: How Strong Is It at Advanced Coding?

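On March 25, Google officially launched its next-generation AI model family, Gemini 2.5, and unveiled the first model in the series, Gemini 2.5 Pro (Experimental), a multimodal reasoning model. Notably, the Gemini 2.5 series and all future Google models will have built-in reasoning, meaning they "think before they speak."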

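Google notes that in AI, reasoning means more than classification and prediction: it is the ability to analyze information, draw logical conclusions, and make judgments that take context and nuance into account. According to Google, Gemini 2.5 reaches a new level of performance by combining a significantly enhanced base model with improved post-training.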

Gemini 2.5 Pro Targets Developers and Handles Longer, More Complex Content

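Google says it has kept its focus on coding performance, and that Gemini 2.5 is a clear step up from 2.0 with more advanced coding capabilities. Gemini 2.5 Pro is designed to help developers build visually compelling web apps and to perform better on agentic coding tasks.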

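Gemini 2.5 Pro's context window can take in up to 1 million tokens at a time (roughly 750,000 words), and Google plans to expand it to 2 million tokens. This helps the model understand longer, more complex inputs, including text, audio, images, video, and even entire code repositories.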
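As a rough illustration of these figures, the sketch below estimates whether a document of a given word count would fit in the context window. It assumes the article's ratio of about 0.75 words per token (1 million tokens to roughly 750,000 words). The function names and the sample word count are hypothetical, and real token counts depend on the model's tokenizer and on the language of the text.

```python
import math

# Ratio implied by the article: 1,000,000 tokens ~ 750,000 words.
# This is a rough rule of thumb, not a tokenizer.
WORDS_PER_TOKEN = 0.75
CONTEXT_TOKENS_NOW = 1_000_000      # current limit cited in the article
CONTEXT_TOKENS_PLANNED = 2_000_000  # planned limit cited in the article


def estimate_tokens(word_count: int) -> int:
    """Estimate the token count for a word count using the rough ratio."""
    return math.ceil(word_count / WORDS_PER_TOKEN)


def fits_in_context(word_count: int, context_tokens: int) -> bool:
    """Return True if the estimated token count fits in the window."""
    return estimate_tokens(word_count) <= context_tokens


if __name__ == "__main__":
    words = 900_000  # hypothetical document size, for illustration only
    print(estimate_tokens(words))
    print(fits_in_context(words, CONTEXT_TOKENS_NOW))
    print(fits_in_context(words, CONTEXT_TOKENS_PLANNED))
```

Under this assumed ratio, a 900,000-word input would exceed the current window but fit in the planned 2-million-token window.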

Gemini 2.5 Pro is now available to Gemini Advanced subscribers and to developers in Google AI Studio, and it will come to Vertex AI later. Google also says it will announce pricing in the coming weeks.

How Good Is Gemini 2.5? It Beats o3-mini and DeepSeek R1

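Google says Gemini 2.5 Pro outperforms its own previous leading models, as well as some mainstream competitors, on several math and science benchmarks that require advanced reasoning.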

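On Aider Polyglot, a widely used code-editing benchmark, Gemini 2.5 Pro scored 68.6%, ahead of rival models from OpenAI, Anthropic, and DeepSeek. On SWE-Bench Verified, which measures software development ability, it scored 63.8%, beating OpenAI's o3-mini and DeepSeek R1 but trailing Anthropic's Claude 3.7 Sonnet at 70.3%.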

On Humanity's Last Exam, a cross-disciplinary test, Gemini 2.5 Pro scored 18.8% without tools, ahead of most flagship models.

*The first draft of this article was generated by AI and edited by TechOrange. Sources: 9to5Google, TechCrunch, Google. Lead image: Google.
