
Google Updates Google AI Studio with Advanced Features 


Google has introduced “vibe coding” as a major feature in Google AI Studio. This update rolled out in late 2025, and it has continued to evolve into 2026 with refinements, codelabs, and related tools.

Vibe coding is an AI-driven approach to software development in which you describe your app idea in natural language (a “vibe,” or high-level prompt) and Gemini generates a fully functional, runnable application, often including frontend, backend logic, and AI integrations, without you writing traditional code.

You can iterate conversationally by refining prompts, preview the app live, and deploy directly to Cloud Run for production-ready hosting. It’s designed to make app building accessible to non-coders while speeding up prototyping for developers. The term “vibe coding” draws on earlier concepts popularized by figures like Andrej Karpathy in 2025, but Google has built a dedicated experience around it in AI Studio.

Key Features in Google AI Studio’s Vibe Coding Mode

Prompt-to-app generation: Start with a description like “Build a retro Snake game with a music player and AI-powered beat detection,” and it creates a working web app.

Live previews and iteration: Edit via follow-up prompts, voice input in some cases, or direct tweaks.

AI integrations: Easily add Gemini-powered features like image generation and editing, video analysis, Google Search grounding, or external platform connections.

Deployment: One-click or prompt-based publishing to scalable hosting.

App gallery and remixing: Browse, remix, and build on community or example apps.

Advanced models: Powered by Gemini’s latest coding-optimized versions.

Early 2026 additions: Google added codelabs for building games or apps deployable to Cloud Run, better prompting guides, database support, and integrations.

Related tools: Google Labs introduced “vibe design” with Stitch for UI-focused prompting, including voice, which complements vibe coding for end-to-end app creation.

This has been praised for democratizing app development — turning ideas into live, shareable, AI-powered software in minutes — though some users note it works best with clear, iterative prompting to handle complex or production-grade needs.

Vibe Coding (in Google AI Studio) and Cursor AI represent two prominent approaches to AI-assisted software development in 2026, but they serve somewhat different users and workflows. Both fall under the broad “vibe coding” umbrella—where you describe ideas in natural language and let AI handle much of the heavy lifting—but their philosophies, strengths, and ideal use cases diverge significantly.

Best for
AI Studio: Non-coders, designers, rapid prototyping, quick AI-powered web apps, idea-to-live demos.
Cursor: Experienced developers, serious or local projects, large codebases, refactoring, production code.

Multimodal and integrations
AI Studio: Native Gemini strengths such as image generation and editing, video analysis, Google Search grounding, and easy external integrations.
Cursor: Strongly code-focused; multimodal support is improving but is less native than in the Gemini ecosystem.

UI and design
AI Studio: Excellent for UI/UX consistency; annotation mode and screenshot-to-app remixing.
Cursor: Good, but more code-oriented; can lose the “vibe” in complex or multi-page designs.

Limitations
AI Studio: Less control for complex and large-scale production code; browser-only; occasional rough edges on edge cases.
Cursor: Requires more technical knowledge; steeper learning curve for non-developers; no instant full-app deploy.

Community verdict
AI Studio: Wins for quick prototypes, designers, and non-technical founders; “unrivaled for visual fidelity and rapid ideation.”
Cursor: Often tops lists for coders; “best AI IDE” for deep work, refactors, and real projects.

Vibe Coding shines when you want to go from “Build a retro Snake game with AI beat detection and music player” to a live, shareable web app in minutes—without touching code. It’s optimized for prompt-to-production in the browser, with strong Gemini multimodal capabilities. Many call it the go-to for designers or non-coders prototyping AI-infused apps.

Cursor excels when you’re already in code: it understands your entire repo, suggests/edits across files, debugs intelligently, and handles complex refactors via agents. Experienced devs often prefer it for serious work because it augments traditional coding rather than replacing it. It’s frequently ranked higher for “commercial-grade” or large projects.

Pick Cursor AI if you’re a developer working on real apps who needs precise control and repo-wide edits, or who prefers a VS Code-like environment with supercharged AI. Many developers in 2026 actually use both: they vibe code a rough prototype in AI Studio, then export, refine, and polish it in Cursor for production.

Google has also rolled out related tools like Firebase Studio and Antigravity IDE to bridge some gaps, but as of March 2026, pure vibe coding in AI Studio remains more prompt and agent-driven, while Cursor stays the powerhouse for hands-on coding acceleration.

Alibaba’s Workforce Shrinks 34% in 2025 to 128,197 as Company Sheds Offline Retail Assets and Doubles Down on AI Ambitions


Alibaba Group Holding Ltd. disclosed Thursday that its global headcount fell sharply to 128,197 employees as of December 31, 2025, a 34% reduction from 194,320 a year earlier.
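As a quick sanity check on the headline figure, the reported percentage can be reproduced from the two headcounts quoted above. This is only a minimal sketch; the function name `percent_change` is introduced here for illustration.

```python
def percent_change(previous: int, current: int) -> float:
    """Return the percentage change from `previous` to `current`."""
    return (current - previous) / previous * 100


# Both headcounts are taken from the article.
headcount_prior_year = 194_320
headcount_end_2025 = 128_197

net_change = headcount_end_2025 - headcount_prior_year
change_pct = percent_change(headcount_prior_year, headcount_end_2025)

# Prints: Net change: -66,123 employees (-34.0%)
print(f"Net change: {net_change:,} employees ({change_pct:.1f}%)")
```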

The development underlines aggressive divestitures of labor-intensive offline retail businesses and a strategic pivot toward artificial intelligence as the company’s primary growth engine.

The headcount drop, one of the largest percentage declines among major global tech firms in recent years, was driven primarily by the sale of hypermarket chain Sun Art Retail Group at the end of 2024 and the earlier exit from its stake in department store operator Intime. Those transactions removed tens of thousands of employees from Alibaba’s consolidated numbers and marked the culmination of a multi-year effort to streamline non-core, capital-heavy retail operations.

Alibaba’s latest quarterly earnings report, covering the December 2025 quarter, showed revenue slightly missing analyst expectations while net profit plunged 67% year-over-year, underscoring the financial strain from restructuring costs, competitive pressures in core e-commerce, and heavy investment in cloud and AI infrastructure. Shares in Hong Kong fell 6% on Friday, reflecting investor disappointment with the profit decline and cautious near-term outlook.

Alibaba CEO Eddie Wu used the earnings call to reiterate the company’s ambition to evolve into a full-stack AI enterprise, spanning semiconductor design and manufacturing, cloud computing infrastructure, foundational models, and agentic AI applications. Wu set an explicit target of growing combined cloud and AI revenue to more than $100 billion annually within five years, a roughly fourfold increase from current levels, positioning the unit as the principal driver of future profitability.
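To put the target in perspective, the implied starting point and annual growth rate can be worked out from the figures above. The sketch below treats “roughly fourfold” as exactly 4x and “within five years” as exactly five years; both are simplifying assumptions, so the result is indicative only.

```python
def implied_cagr(growth_multiple: float, years: float) -> float:
    """Compound annual growth rate needed to grow by `growth_multiple` in `years`."""
    return growth_multiple ** (1 / years) - 1


target_revenue_usd_bn = 100.0  # target quoted in the article
growth_multiple = 4.0          # assumption: "roughly fourfold" read as exactly 4x
years = 5.0                    # assumption: "within five years" read as exactly 5

implied_current_usd_bn = target_revenue_usd_bn / growth_multiple
cagr = implied_cagr(growth_multiple, years)

# Prints: Implied current revenue: ~$25 billion
print(f"Implied current revenue: ~${implied_current_usd_bn:.0f} billion")
# Prints: Implied annual growth rate: 32.0%
print(f"Implied annual growth rate: {cagr:.1%}")
```

Under these assumptions, combined cloud and AI revenue would need to grow by roughly a third every year for five consecutive years.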

This week, Alibaba launched Wukong, an agentic AI service tailored for businesses that enables autonomous, multi-step task execution across enterprise workflows. The company also announced price increases of up to 34% for certain cloud and storage services, citing rising demand and higher supply-chain costs for advanced compute resources.

The workforce reduction aligns with this pivot. By offloading asset-heavy retail operations, Alibaba has freed up capital and management bandwidth to fund massive AI R&D and infrastructure build-out, including domestic GPU alternatives, large-scale model training, and agentic platforms. The company has also aggressively recruited AI talent in recent quarters, offsetting some of the broader headcount decline.

Alibaba’s 34% staff reduction in 2025 is among the most dramatic of any major global tech company over the past year. It follows a pattern seen across the sector, from Silicon Valley to Hangzhou, where firms have shed jobs to improve efficiency, refocus on core growth areas (particularly AI), and respond to slower revenue growth and margin pressure.

The cuts were far larger than the 11% reduction reported for December 2024 compared with the prior year, indicating acceleration in 2025 as divestitures closed and AI investment ramped up. Alibaba’s remaining workforce continues to support its dominant e-commerce platforms (Taobao, Tmall), cloud business (Alibaba Cloud), logistics arm (Cainiao), and emerging AI initiatives.

Alibaba’s Hong Kong-listed shares declined 6% on Friday, extending year-to-date losses amid investor caution over the profit plunge, ongoing competitive intensity in e-commerce, and uncertainty surrounding the pace of AI monetization. The sharp workforce reduction was viewed as both a positive signal of cost discipline and a reminder of the challenges in transitioning from a consumer-internet giant to an AI-first enterprise.

Analysts noted that while the headcount drop improves operating leverage in the near term, the success of Alibaba’s AI strategy, including Wukong, cloud price adjustments, and semiconductor efforts, will be critical to reversing margin compression and driving sustainable growth.

Alibaba’s 2025 results and workforce disclosure show that the company is shedding legacy retail assets to fuel an all-in bet on AI across the stack. The company is positioning itself as a full-spectrum AI player, from chips to models to agentic applications, in direct competition with global leaders like Microsoft, Google, and Amazon, and domestic rivals including Tencent and Baidu.

Some analysts have described the $100 billion cloud-and-AI revenue target over five years as one of the most ambitious growth projections in global tech. However, they warn that execution will depend on scaling compute capacity amid U.S. export restrictions, achieving rapid enterprise adoption of agentic tools like Wukong, and maintaining pricing power in a competitive cloud market.

OpenAI’s Agreement to Acquire Astral is a Strategic Investment Pivot


OpenAI has announced an agreement to acquire Astral, a startup known for building high-performance, open-source developer tools in the Python ecosystem.

Astral is behind several popular tools:

uv: An extremely fast Python package and project manager (an alternative to pip and similar tools).

Ruff: A very fast Python linter and formatter written in Rust.

ty: A high-speed Python type checker and language server.

These tools are widely adopted, powering millions of developer workflows and forming a key part of modern Python development.

The deal will bring Astral’s team and expertise into OpenAI’s Codex group. Codex is OpenAI’s AI-powered coding assistant and system, which has seen rapid growth and has millions of users. OpenAI aims to accelerate Codex development, enabling deeper integrations so AI agents can interact more seamlessly with real developer tools across the full software development lifecycle.

Both companies emphasized a continued commitment to open source: OpenAI plans to support and maintain Astral’s projects post-closing, with ongoing community-focused development. Financial terms were not disclosed. The acquisition is not yet finalized—it remains subject to customary closing conditions and regulatory approval.

This move fits into OpenAI’s broader push to strengthen its position in AI-driven coding tools, especially amid competition from rivals like Anthropic, which made a similar move by acquiring Bun in late 2025. It also aligns with OpenAI’s recent pattern of acquisitions to bolster developer ecosystems and agentic AI capabilities.

This is a strategic step toward making AI more deeply embedded in everyday developer workflows rather than just code generation. Anthropic acquired Bun in late 2025, marking its first public acquisition and a strategic move to bolster its AI-powered coding tools. Bun, the high-performance JavaScript and TypeScript runtime, toolkit, bundler, package manager, and test runner created by Jarred Sumner, joined Anthropic to power and accelerate Claude Code — Anthropic’s AI coding agent.

Claude Code, which became generally available in May 2025, had already adopted Bun earlier in the year and shipped as a Bun executable to millions of users. The acquisition ensures stability, faster performance, and deeper integration for Claude Code, the Claude Agent SDK, and future AI coding products.

Bun remains fully open source under the MIT license, with ongoing public development on GitHub. Anthropic committed to continued investment in Bun’s core features for the broader JS/TS developer community. Financial terms were not disclosed. The move coincided with Anthropic revealing that Claude Code had hit a $1 billion annualized revenue run rate, highlighting the explosive growth of its developer-focused AI tools.

This acquisition reflects the intensifying competition in AI-driven developer workflows. AI companies are moving beyond pure model capabilities to vertically integrate the full software development stack — from runtime environments and tooling to agentic interactions. For context, Claude Code relies heavily on Bun for speed and reliability.

Post-acquisition, Claude Code saw notable performance gains attributed to Bun’s team led by Jarred Sumner working directly on optimizations. It parallels recent moves like OpenAI’s acquisition of Astral to enhance Codex with Python ecosystem tools (uv, Ruff, ty).

This fits Anthropic’s broader strategy: disciplined acquisitions that align with technical excellence, enterprise strength, and safe AI development — while keeping core tools open and community-driven. It’s a clear signal that winning in AI coding means owning or deeply controlling the infrastructure developers actually use.

Nvidia CEO Says He Would Be Alarmed If $500k Engineers Do Not Spend at Least $250k Yearly on Tokens


Nvidia CEO Jensen Huang delivered a striking message for engineering talent during an appearance on the “All-In Podcast” episode published Thursday: top engineers who fail to consume hundreds of thousands of dollars worth of AI tokens annually are a cause for serious concern.

Huang stated he would be “deeply alarmed” if one of Nvidia’s $500,000-a-year engineers spent less than half that amount — $250,000 — on AI tokens over the course of a year.

“That $500,000 engineer at the end of the year, I’m going to ask them how much did you spend in tokens? If that person said $5,000, I will go ape something else,” he said.

When asked whether Nvidia itself is spending $2 billion annually on tokens for its engineering team, Huang replied: “We’re trying to.”

He drew a direct analogy to outdated methods: “This is no different than one of our chip designers who says, ‘Guess what? I’m just going to use paper and pencil.’”

Huang argued that engineers who underutilize AI tokens are effectively limiting their own productivity and impact.

Tokens as a Recruiting and Productivity Tool

Huang went further, revealing that AI token budgets are already becoming a competitive recruiting lever in Silicon Valley.

“They’re going to make a few hundred thousand dollars a year, their base pay,” he said of engineers. “I’m going to give them probably half of that on top of it as tokens so that they could be amplified 10X.”

“It is now one of the recruiting tools in Silicon Valley: How many tokens comes along with my job?” Huang added. “And the reason for that is very clear, because every engineer that has access to tokens will be more productive.”

Tokens, the basic unit used by large language models to process and generate text, are typically charged on a per-thousand or per-million basis by providers such as OpenAI, Anthropic, Google, and others. Heavy usage can quickly become expensive, especially for engineers running large-scale experiments, fine-tuning models, or building complex agentic workflows.
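To make per-million billing concrete, here is a minimal sketch of how a usage bill and a yearly budget translate into token counts. The prices used ($3 and $15 per million tokens, and a flat $5 per million for the budget example) are hypothetical placeholders and not any provider’s actual rates; the function names are also introduced here for illustration.

```python
def usage_cost(input_tokens: int, output_tokens: int,
               input_price_per_million: float,
               output_price_per_million: float) -> float:
    """Cost in USD when input and output tokens are billed per million."""
    return (input_tokens / 1_000_000 * input_price_per_million
            + output_tokens / 1_000_000 * output_price_per_million)


def tokens_for_budget(budget_usd: float, price_per_million: float) -> float:
    """Number of tokens a budget buys at a single flat per-million price."""
    return budget_usd / price_per_million * 1_000_000


# Hypothetical prices: $3 per million input tokens, $15 per million output tokens.
session_cost = usage_cost(2_000_000, 500_000, 3.0, 15.0)
print(f"One heavy session: ${session_cost:.2f}")  # One heavy session: $13.50

# The $250,000 yearly figure from the article, at a hypothetical flat $5 per million.
yearly_tokens = tokens_for_budget(250_000, 5.0)
print(f"Tokens per year: {yearly_tokens:,.0f}")  # Tokens per year: 50,000,000,000
```

At these placeholder prices, a $250,000 budget corresponds to tens of billions of tokens a year, which is why such spend is more plausible for automated agentic workloads than for manual prompting.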

Tokens as the “Fourth Component” of Compensation

Huang is not alone in viewing generous AI compute access as a critical talent differentiator. Business Insider reported earlier in March 2026 that tech companies are experimenting with offering token budgets alongside traditional salary, bonuses, and equity — effectively treating inference power as a new form of compensation.

Tomasz Tunguz of Theory Ventures described tokens as a potential “fourth component” of pay packages. Peter Gostev, AI capability lead at Arena (a startup focused on model performance benchmarking), suggested that frontier labs like OpenAI and Anthropic could create recruitment marketplaces listing token budgets alongside salary ranges.

Thibault Sottiaux, an engineering lead on OpenAI’s Codex team, noted on X that candidates increasingly ask how much compute they will receive.

Even OpenAI CEO Sam Altman has speculated about a future where compute access replaces traditional income support. In a May 2024 appearance on the same “All-In Podcast,” Altman mused: “I wonder if the future looks something more like Universal Basic Compute than Universal Basic Income, and everybody gets a slice of GPT-7’s compute. And they can use it, they can resell it, they can donate it to somebody to use for cancer research, but what you get is not dollars but this like slice, you own part of the productivity.”

As Engineers Shift to Tokens

Huang’s comments reflect Nvidia’s unique position at the center of the AI boom. As the dominant supplier of GPUs for training and inference, Nvidia benefits directly from skyrocketing token consumption across the industry. By framing heavy token usage as a productivity imperative — and even a recruiting tool — Huang is reinforcing the narrative that access to advanced AI compute is now a core requirement for top engineering talent.

The remarks also highlight a shift in how companies measure engineering productivity. Traditional metrics (lines of code, features shipped) are giving way to compute-intensive workflows: model experimentation, agent orchestration, large-scale data processing, and real-time inference. Engineers who underuse tokens may be seen as operating below their potential in an AI-native environment.

For talent acquisition, token budgets could become a powerful differentiator — especially as frontier models grow more expensive to run at scale. Startups and large tech firms alike may increasingly compete not just on salary and equity, but on how much high-quality inference capacity they can provide.

Huang’s stance is likely to accelerate the trend toward compute-inclusive compensation packages across Silicon Valley and beyond. As AI agents and multimodal models become central to software development, engineering roles will demand ever-larger token allocations — turning inference spend into a visible line item in hiring negotiations.

Nvidia itself stands to benefit disproportionately: more engineers consuming more tokens means more demand for Nvidia GPUs and cloud capacity. The company’s push to make heavy token usage a performance expectation — and even a hiring criterion — further cements its central role in the AI talent and productivity ecosystem.

The notion also underscores a broader philosophical shift: in the AI era, raw human intelligence is increasingly amplified (and measured) by access to compute. Engineers who maximize their token spend aren’t just more productive — they’re demonstrating mastery of the new tools defining the profession. For top talent, the question may soon be less “how much equity?” and more “how many tokens come with the job?”

Cursor Releases “Composer” which Outperforms Existing Coding Models


Cursor, the AI-powered code editor from Anysphere, has released Composer 2, its latest in-house “agentic” coding model. The Composer models are proprietary and optimized for low-latency, agentic coding, meaning the AI can autonomously plan, edit multiple files, test, and iterate on code within your codebase rather than just suggesting snippets.

The focus has always been on combining strong coding intelligence with exceptional speed (earlier versions were often described as 4x faster than comparable frontier models) and, now, very competitive pricing. Cursor positions Composer 2 as achieving frontier-level coding performance at a dramatically lower cost.

Reported benchmark results for Composer 2:

CursorBench: 61.3% (Cursor’s internal benchmark of real-world coding tasks).

Terminal-Bench 2.0: 61.7%, ahead of Anthropic’s Claude Opus 4.6 at 58.0% but behind OpenAI’s GPT-5.4 at 75.1%.

SWE-bench Multilingual: 73.7%.

This represents a significant jump (about 17 points on some metrics) from Composer 1.5 in a short cycle.

It’s described as on par with or beating models like Claude Opus 4.6 in practical coding scenarios, while being much more affordable:

Composer 2 Standard: $0.50 per million input tokens and $2.50 per million output tokens.

Composer 2 Fast (higher-speed variant, now the default): $1.50 per million input tokens and $7.50 per million output tokens.

For comparison, Claude Opus 4.6 is around $5/$25, and GPT-5.4 is higher—making Composer 2 roughly 3-10x cheaper depending on the competitor. Technical edges include scaled reinforcement learning (RL) on real Cursor usage data, self-summarization for better long-context handling in multi-step tasks, and optimization for interactive agentic workflows in the IDE.
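The “3-10x cheaper” range against Claude Opus 4.6 can be checked directly from the prices quoted above. The sketch below uses only those figures; the workload size (3 million input tokens and 1 million output tokens) is a hypothetical example, and GPT-5.4 is left out because no price is given for it.

```python
# Prices in USD per million tokens (input, output), as quoted in the article.
PRICES = {
    "Composer 2 Standard": (0.50, 2.50),
    "Composer 2 Fast": (1.50, 7.50),
    "Claude Opus 4.6": (5.00, 25.00),
}


def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one workload for the given model."""
    input_price, output_price = PRICES[model]
    return (input_tokens / 1_000_000 * input_price
            + output_tokens / 1_000_000 * output_price)


# Hypothetical workload: 3M input tokens and 1M output tokens.
workload = (3_000_000, 1_000_000)
baseline = job_cost("Claude Opus 4.6", *workload)

ratios = {}
for model in ("Composer 2 Standard", "Composer 2 Fast"):
    cost = job_cost(model, *workload)
    ratios[model] = baseline / cost
    print(f"{model}: ${cost:.2f} vs ${baseline:.2f} ({ratios[model]:.1f}x cheaper)")
```

Because each Composer tier is cheaper by the same factor on both input and output tokens, the ratio does not depend on the input/output mix: about 10x for Standard and about 3.3x for Fast.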

Users and early testers on X are calling it “legit,” with reports of it catching subtle bugs that other models including Claude and various GPT variants missed, and handling complex refactors efficiently. There are also leaks/claims that the base might build on Moonshot AI’s Kimi K2.5 with heavy continued pretraining + RL on Cursor’s proprietary coding data, which would explain the speed/cost advantages over fully proprietary frontier models from OpenAI or Anthropic.

Whether Composer 2 outperforms existing models depends on the metric. Yes, on speed, cost, and practical agentic coding in Cursor’s environment, especially versus similarly priced or even higher-priced options like recent Claude versions.

Partially, on raw intelligence: it is frontier-level and beats some models (Opus 4.6 on certain benchmarks), but top models like GPT-5.4 still lead on the hardest tasks. The real win is the value: high performance at a fraction of the cost, which makes it feel like it outperforms in real developer workflows.

SWE-bench is one of the most widely used and respected benchmarks for evaluating how well large language models (LLMs) and AI coding agents can handle real-world software engineering tasks. Introduced in late 2023, it stands out because it uses actual problems from GitHub rather than synthetic or toy coding exercises.

SWE-bench tasks an AI with resolving real GitHub issues from popular open-source repositories. For each task, the model receives:

The full codebase at the state before the issue was fixed.

The issue description (title and body from GitHub).

Sometimes additional context, such as comments.

The goal is to generate a code patch (diff) that fixes the problem. Success is measured automatically: the patch is applied in a clean Docker environment, and the model’s change must make the relevant unit tests pass. The original full SWE-bench contains 2,294 tasks, all from 12 popular Python repositories.
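The pass/fail scoring described above can be sketched as a small aggregation step. This is an illustrative model rather than the official harness: it assumes each task records two groups of tests (ones that must go from failing to passing, and ones that must keep passing), and the class and field names are hypothetical. Applying patches and running tests in Docker is out of scope here; the sketch only turns recorded test outcomes into a resolved percentage.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    """Test outcomes for one task after the model's patch was applied (hypothetical shape)."""
    task_id: str
    fail_to_pass: dict[str, bool]  # tests the fix is expected to turn green
    pass_to_pass: dict[str, bool]  # tests that must not regress


def is_resolved(result: TaskResult) -> bool:
    """A task counts as resolved only if every relevant test passes."""
    return (bool(result.fail_to_pass)
            and all(result.fail_to_pass.values())
            and all(result.pass_to_pass.values()))


def resolved_rate(results: list[TaskResult]) -> float:
    """Percentage of tasks resolved."""
    return 100 * sum(is_resolved(r) for r in results) / len(results)


# Toy data, not real benchmark output.
results = [
    TaskResult("task-1", {"test_bug": True}, {"test_other": True}),
    TaskResult("task-2", {"test_bug": True}, {"test_other": False}),  # regression
    TaskResult("task-3", {"test_bug": False}, {"test_other": True}),  # bug not fixed
    TaskResult("task-4", {"test_a": True, "test_b": True}, {}),
]

print(f"Resolved: {resolved_rate(results):.1f}%")  # Resolved: 50.0%
```

Counting a regression as a failure is what separates this kind of evaluation from function-level benchmarks: a patch that fixes the issue but breaks existing behavior scores zero for that task.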

Tasks include bug fixes, small features, refactors, and more, reflecting genuine developer work. This makes it much harder and more realistic than older benchmarks like HumanEval, which tests isolated function completion, because it requires:

Understanding large, complex codebases, often tens of thousands of lines.

Navigating dependencies and repo structure.

Interpreting sometimes ambiguous or poorly written issue reports.

Generating multi-file edits that don’t break existing functionality.

SWE-bench Verified: A cleaned, human-validated subset of 500 tasks. Annotators checked that issues are clear, tests are correct, tasks are solvable from the given info, and no data leaks or memorization artifacts exist. This version is more reliable for comparing models because there is less noise from bad tasks. Top models in early 2026 reach ~75-82% on Verified.

SWE-bench Lite: A smaller, easier subset (about 300 tasks) used for faster evaluation or when full runs are too expensive.

SWE-bench Multilingual: Extends the idea beyond Python. It includes 300+ curated tasks from repositories in 9 languages, including Java, TypeScript/JavaScript, Go, Rust, and C/C++. This tests cross-language understanding and generalization. Performance is noticeably lower than on Python-only versions because most frontier models are still heavily Python-biased in training data.

There are also community forks and extensions like SWE-bench Pro, SWE-bench Live, and others that add multi-language depth, harder tasks, or anti-contamination measures.

Scores are usually reported as % Resolved. On SWE-bench Verified, scores are often higher, e.g., 76-82% for top models like Claude Opus 4.6 or newer Sonnet/Opus variants. On SWE-bench Multilingual, scores are lower overall, highlighting gaps in non-Python performance. Cursor reported 73.7% on SWE-bench Multilingual for Composer 2, a very strong result, especially at its price point, showing it’s competitive even on the harder cross-language version.

The benchmark’s main strength is that it tests agentic capabilities: planning, exploration, multi-file editing, and debugging loops. It also has limitations. Many tasks are relatively “simple” bug fixes that take hours of human work, not days or weeks. There are also potential data contamination and memorization risks: some papers argue that top scores partly come from models “remembering” popular repos.

Overall, SWE-bench (especially the Verified and Multilingual variants) remains the de facto standard for agentic coding evaluation in 2026, and is far more indicative of real usefulness in tools like Cursor, Devin-style agents, or GitHub Copilot Workspace than function-level benchmarks.