Zhipu AI Unveils GLM-4.6V: A Vision Language Model with 128K Context and Integrated Tool Calling Capabilities

Mistral AI has launched Devstral 2, a new family of coding models designed for software engineering agents. The release arrives alongside Mistral Vibe CLI, an open-source command-line interface for terminal-native development. Both are aimed at developers working with AI-driven coding tools.

In a separate development, Jina AI introduced Jina-VLM, a 2.4-billion-parameter multilingual vision-language model. It focuses on efficient visual question answering and document understanding on limited hardware.

Google has introduced its LiteRT NeuroPilot stack, which targets MediaTek Dimensity NPUs to run large language models directly on devices. This could improve the performance of generative models on smartphones and other portable devices.

Meanwhile, Cisco has unveiled the Cisco Time Series Model, its first open-weights foundation model, built on a decoder-only transformer architecture. The model is designed for analyzing time series data, particularly observability and security metrics.

On the educational front, a new coding guide has been published that teaches how to create a procedural memory agent. This agent can learn, store, retrieve, and reuse skills over time, making it a useful resource for developers interested in building intelligent systems.
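The guide's own code is not reproduced here, but the learn, store, retrieve, and reuse cycle it describes can be illustrated with a minimal sketch. Everything below is hypothetical: the `Skill` and `ProceduralMemory` names, the keyword-overlap retrieval, and the example skills are assumptions chosen for brevity, not the guide's implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Skill:
    """A stored procedure plus the text used to look it up (hypothetical)."""
    name: str
    description: str
    action: Callable[[int], int]
    uses: int = 0  # how many times the skill has been reused


@dataclass
class ProceduralMemory:
    """Toy skill store: learn() saves skills, retrieve() finds one, reuse() runs it."""
    skills: Dict[str, Skill] = field(default_factory=dict)

    def learn(self, name: str, description: str, action: Callable[[int], int]) -> None:
        # Learn and store: keep the callable under its name.
        self.skills[name] = Skill(name, description, action)

    def retrieve(self, query: str) -> Optional[Skill]:
        # Retrieve: pick the skill whose description shares the most words
        # with the query. A real agent would likely use embeddings instead.
        query_words = set(query.lower().split())
        best, best_score = None, 0
        for skill in self.skills.values():
            score = len(query_words & set(skill.description.lower().split()))
            if score > best_score:
                best, best_score = skill, score
        return best

    def reuse(self, query: str, value: int) -> Optional[int]:
        # Reuse: run the best-matching skill and record the use.
        skill = self.retrieve(query)
        if skill is None:
            return None
        skill.uses += 1
        return skill.action(value)


if __name__ == "__main__":
    memory = ProceduralMemory()
    memory.learn("double", "double a number", lambda x: x * 2)
    memory.learn("square", "square a number", lambda x: x * x)
    print(memory.reuse("please double this", 4))
    print(memory.reuse("square it", 3))
```

Tracking `uses` per skill is one simple way to let such an agent favor or prune skills over time; the guide may take a different approach.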

Additionally, Google Colab has integrated KaggleHub, giving users one-click access to Kaggle datasets and models. The feature is expected to streamline workflows for data scientists and machine learning practitioners.

Lastly, Microsoft has released VibeVoice-Realtime, a lightweight text-to-speech model that supports streaming text input and generates long-form speech. The model targets real-time speech applications and is intended to make voice features easier for developers to implement.

Taken together, these releases show how quickly AI tooling for software development and data analysis is advancing, with more capable and more accessible tools reaching developers.