SenseTime SenseNova 5.5: China's first real-time multimodal AI model


About the Author

By Ryan Daws | July 9, 2024 https://twitter.com/gadget_ry


Ryan Daws is a senior editor at TechForge Media with over a decade of experience in crafting compelling narratives and making complex topics accessible. His articles and interviews with industry leaders have earned him recognition as a key influencer by organisations like Onalytica. Under his leadership, publications have been praised by analyst firms such as Forrester for their excellence and performance. Connect with him on X (@gadget_ry) or Mastodon (@gadgetry@techhub.social).

SenseTime has unveiled SenseNova 5.5, an enhanced version of its LLM that includes SenseNova 5o, touted as China’s first real-time multimodal model.

SenseNova 5o represents a leap forward in AI interaction, providing capabilities on par with GPT-4o’s streaming interaction features. This advancement allows users to engage with the model in a manner akin to conversing with a real person, making it particularly suitable for real-time conversation and speech recognition applications.

According to SenseTime, its latest model outperforms rivals across several benchmarks:

At the World Artificial Intelligence Conference (WAIC) in Shanghai this weekend, SenseTime unveiled SenseNova 5.5.

The company claims the model outperforms GPT-4o in 5 out of 8 key metrics.

While I'd take it with a grain of salt, China's AI startups are showing major progress pic.twitter.com/1ZFbojHs3v
— Rowan Cheung (@rowancheung) July 8, 2024

Dr. Xu Li, Chairman of the Board and CEO of SenseTime, commented: “This is a critical year for large models as they evolve from unimodal to multimodal. In line with users’ needs, SenseTime is also focused on boosting interactivity.

“With applications driving the development of models and their capabilities, coupled with technological advancements in multimodal streaming interactions, we will witness unprecedented transformations in human-AI interactions.”

The upgraded SenseNova 5.5 boasts a 30% improvement in overall performance compared to its predecessor, SenseNova 5.0, which was released just two months earlier. Notable enhancements include improved mathematical reasoning, English proficiency, and command-following abilities.

In a move to democratise access to advanced AI capabilities, SenseTime has introduced a cost-effective edge-side large model. This development reduces the cost per device to as low as RMB 9.90 ($1.36) per year, potentially accelerating widespread adoption across various IoT devices.

The company has also launched “Project $0 Go,” a free onboarding package for enterprise users migrating from the OpenAI platform. This initiative includes a 50 million tokens package and API migration consulting services, aimed at lowering entry barriers for businesses looking to leverage SenseNova’s capabilities.

SenseTime’s commitment to edge-side AI is evident in the release of SenseChat Lite-5.5, which features a 40% reduction in inference time compared to its predecessor, now at just 0.19 seconds. The inference speed has also increased by 15%, reaching 90.2 words per second.
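The article gives only the new SenseChat Lite-5.5 figures and the percentage changes, not the predecessor's numbers. As a rough sanity check, the implied baselines can be back-computed. This is a minimal sketch that assumes both percentages are measured relative to the predecessor's values; the function names are illustrative, and the results are derived estimates, not figures reported by SenseTime.

```python
def baseline_before_reduction(new_value: float, reduction_pct: float) -> float:
    """Return the value that, reduced by reduction_pct percent, gives new_value."""
    return new_value / (1 - reduction_pct / 100)


def baseline_before_increase(new_value: float, increase_pct: float) -> float:
    """Return the value that, increased by increase_pct percent, gives new_value."""
    return new_value / (1 + increase_pct / 100)


# Figures quoted in the article (assumed to be relative to the predecessor).
prev_inference_time_s = baseline_before_reduction(0.19, 40)
prev_speed_wps = baseline_before_increase(90.2, 15)

print(f"Implied previous inference time: {prev_inference_time_s:.3f} s")
print(f"Implied previous speed: {prev_speed_wps:.1f} words per second")
```

Under that assumption, the predecessor would have been at roughly 0.32 seconds and about 78 words per second.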

Expanding its suite of AI applications, SenseTime introduced Vimi, a controllable AI avatar video generator. This tool can create short video clips with precise control over facial expressions and upper body movements from a single photo, opening up new possibilities in entertainment and interactive applications.

The company has also upgraded its SenseTime Raccoon Series, a set of AI-native productivity tools. The Code Raccoon now boasts a five-fold improvement in response speed and a 10% increase in coding precision, while the Office Raccoon has expanded to include a consumer-facing webpage and a WeChat mini-app version.

SenseTime’s large model technology is already making waves across various industries. In the financial sector, it’s improving efficiency in compliance, marketing, and investment research. In agriculture, it’s helping to reduce the use of materials by 20% while increasing crop yields by 15%. The cultural tourism industry is seeing significant boosts in travel planning and booking efficiency.

With over 3,000 government and corporate customers already using SenseNova across technology, healthcare, finance, and programming sectors, SenseTime is cementing its position as a key AI player.


