Meta's Super AI: Spirit LM, Its First Multimodal Language Model, Upends What You Imagine a Voice Assistant Can Be

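Meta recently released Meta Spirit LM, an openly shared multimodal language model and the company's first model able to seamlessly integrate text and speech as both input and output.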

Model features and versions

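Spirit LM was designed by Meta's Fundamental AI Research (FAIR) team to address the limitations of existing AI voice experiences and to deliver more expressive, natural-sounding speech generation. The model can learn tasks across modalities, including automatic speech recognition (ASR), text-to-speech (TTS), and speech classification.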

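Meta released two versions of Spirit LM:

Spirit LM Base: uses speech tokens to process and generate speech.

Spirit LM Expressive: adds pitch and tone tokens, which let it capture more nuanced emotional states.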
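
The difference between the two versions can be pictured with a minimal sketch. This is an illustration only, not Meta's implementation: the `Token` type, the kind labels, the helper names, and the sample values are all hypothetical. The only thing taken from the article is that Base works with speech tokens and that Expressive carries additional pitch and tone tokens.

```python
from dataclasses import dataclass

# Hypothetical token kinds, named after the article's description.
SPEECH, PITCH, TONE = "speech", "pitch", "tone"


@dataclass(frozen=True)
class Token:
    kind: str   # one of SPEECH, PITCH, TONE (assumed labels)
    value: int  # made-up unit ID, for illustration only


def base_view(tokens):
    """Keep only speech tokens, as a Base-style stream might."""
    return [t for t in tokens if t.kind == SPEECH]


def expressive_view(tokens):
    """Keep speech tokens plus the extra pitch and tone tokens."""
    return [t for t in tokens if t.kind in (SPEECH, PITCH, TONE)]


# A made-up stream mixing all three kinds of token.
sample = [
    Token(TONE, 3),
    Token(SPEECH, 17),
    Token(PITCH, 5),
    Token(SPEECH, 42),
]

print([t.kind for t in base_view(sample)])
print([t.kind for t in expressive_view(sample)])
```

In this toy picture, the Expressive stream keeps the same speech content as Base and simply adds the extra markers that carry emotional cues.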

Openly shared, non-commercial use only

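Spirit LM is fully openly shared, but for now it is available only for non-commercial use. Meta has released the model weights, code, and accompanying documentation, hoping to encourage the AI research community to explore new ways of integrating speech and text into AI systems.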

Potential applications

Spirit LM is designed to learn new tasks across modalities, such as:

Automatic speech recognition (ASR)

Text-to-speech (TTS)

Speech classification

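The Spirit LM Expressive model goes a step further by incorporating emotional cues into speech generation. It can detect and reflect emotional states such as anger, surprise, or joy, bringing AI interactions closer to human conversation.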

A broader research effort

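Spirit LM is part of a series of research tools and models that Meta FAIR has released to the public. Meta's overarching goal is to achieve advanced machine intelligence (AMI), with an emphasis on building AI systems that are both powerful and easy to use.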

Outlook

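By offering a more natural, more expressive approach to AI-generated speech and by sharing the model openly, Meta enables the wider research community to explore new possibilities for multimodal AI applications. Spirit LM is a promising advance in machine learning and could help drive a new generation of more human-like AI interactions.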

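However, Spirit LM is currently limited to non-commercial use, which may restrict its direct use in commercial applications. As research progresses, and if commercial use is eventually permitted, more practical applications built on Spirit LM technology may emerge.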

* This article is open for republication by partners. Source: VentureBeat. Lead image: Unsplash.


