The Generative AI Meetup Podcast

Hosted by Mark and Shashank, software engineers and organizers in Silicon Valley. Get their grounded perspective each week as they explore the generative AI landscape through news analysis, tech discussions, hands-on experiments, and clear explanations.

Dive into the latest language models, AI agent capabilities, and RAG techniques. Understand the hardware race, key research, startup trends, benchmarks, and the real-world impact of AI across industries like healthcare, robotics, and creative work. We also test AI limits, explain core concepts, discuss ethics, and interview builders shaping the field.

For engineers, developers, researchers, and anyone seeking a practical understanding of AI’s rapid evolution and its applications.

Episodes

4 days ago

The Best Open Source US Model (Right behind China)

https://novacut.ai/
https://genaimeetup.com/
Anthropic has officially closed a $65 billion Series H at a $965 billion valuation, roughly 2.5x its valuation from just 100 days ago. Meanwhile, funding is flowing across the ecosystem: Fireworks AI at $15B, Baseten at $11B, OpenRouter's $113M Series B, and Cognition AI's $1B Series D.
NVIDIA went on an open-source super week with Nemotron 3 Ultra, Cosmos 3, and Nemotron 3.5 ASR. Microsoft dropped 5 new MAI models. Google released Gemma 4 12B, and Anthropic shipped Opus 4.8.
On the benchmarks front, DeepSWE crowns GPT-5.5 as the leader in long-horizon coding tasks, while ITBench shows even frontier models struggle with real-world SRE incidents — Claude Opus 4.7 tops out at just 47%.
Plus: Cloudflare acquires VoidZero to build the future of AI-native edge development, and Google is paying SpaceX $920M/month for compute.
Topics covered: • Anthropic's $65B Series H and path to $1T • Fireworks AI, Baseten, OpenRouter & Cognition funding rounds • Microsoft's 5 new MAI models • NVIDIA's open-source super week (Nemotron, Cosmos 3) • MiniMax M3, Gemma 4 12B, JetBrains Mellum2, Opus 4.8 • DeepSWE benchmark: GPT-5.5 leads long-horizon coding • ITBench: Frontier models under 50% on real SRE tasks • Cloudflare + VoidZero for AI-native edge dev • Google's $920M/month SpaceX compute deal
#AI #Anthropic #NVIDIA #OpenAI #AInews #TechNews #LLM

Funding rounds
Anthropic formally confirmed the close of its $65 billion Series H funding round at a post-money valuation of $965 billion. This is roughly a 2.5-fold increase over its $380 billion Series G valuation from February 2026, adding $585 billion in value in approximately 100 days.
https://www.anthropic.com/news/series-h
Fireworks AI is raising at a $15 billion valuation, a near fourfold increase from its $4 billion Series C valuation recorded in October 2025.
It is processing 15 trillion tokens daily for major production clients including Cursor, Notion, and Perplexity.
https://finance.yahoo.com/sectors/technology/articles/fireworks-ai-eyes-15-billion-174609357.html
Baseten is raising $1 billion at an $11 billion valuation.
Its annualized revenue skyrocketed from $200 million to $600 million over a single quarter.
https://techstartups.com/2026/05/26/ai-inference-startup-baseten-in-talks-to-raise-1-billion-at-11-billion-valuation/
OpenRouter has secured a $113 million Series B funding round.
Its traffic has grown sharply, with weekly production throughput expanding fivefold from 5 trillion to 25 trillion tokens over six months.
https://www.businesswire.com/news/home/20260526953416/en/OpenRouter-Raises-%24113-Million-CapitalG-led-Series-B-as-Weekly-Volume-Explodes-to-25T-Tokens
Further up the stack, Cognition AI secured a $1 billion Series D round led by Lux Capital and 8VC.
https://cognition.ai/blog/series-d
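
The growth multiples quoted in these funding notes can be sanity-checked with a few lines of Python. This is only a rough sketch built from the figures stated above. The helper name growth_multiple and the dictionary labels are made up for illustration and do not come from any of the linked sources.

```python
def growth_multiple(before: float, after: float) -> float:
    """Return how many times larger `after` is than `before`."""
    return after / before


# (before, after) pairs as quoted in the show notes.
# Units differ per row and are noted in each label.
quoted_figures = {
    "Anthropic valuation, $B (Series G -> Series H)": (380, 965),
    "Fireworks AI valuation, $B (Series C -> new round)": (4, 15),
    "Baseten annualized revenue, $M (one quarter)": (200, 600),
    "OpenRouter weekly tokens, trillions (six months)": (5, 25),
}

multiples = {
    label: growth_multiple(before, after)
    for label, (before, after) in quoted_figures.items()
}

# Absolute value added between Anthropic's two rounds, in $ billions.
anthropic_value_added = 965 - 380

for label, multiple in multiples.items():
    print(f"{label}: {multiple:.2f}x")
print(f"Anthropic value added: ${anthropic_value_added}B")
```

Run as written, the Anthropic multiple comes out slightly above 2.5x and the Fireworks AI multiple slightly below 4x, which matches the "2.5-fold" and "near fourfold" wording in the notes.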

Model Releases
MAI models:
MAI-Code-1-Flash: A 5-billion active parameter model optimized for ultra-low latency within GitHub Copilot and VS Code.
MAI-Image-2.5: A high-fidelity image generation model ranking third on global image evaluation arenas, outperforming competing architectures like Nano Banana Pro.
MAI-Transcribe-1.5: A multi-lingual speech processing engine offering fivefold speed improvements across 43 languages.
MAI-Voice-2: Natural audio and voice generation across 15 languages, available at a highly competitive price point.
Web IQ: A search-grounding API engineered to directly compete with Perplexity.
https://microsoft.ai/models/

https://www.peoplematters.in/news/ai-and-emerging-tech/uber-imposes-dollar1500-monthly-ai-spending-limit-on-employees-amid-rising-costs-50073

NVIDIA has executed an "Open-Source Super Week," positioning itself as a dominant software and model publisher:
Nemotron 3 Ultra (the best US open-source, open-weights model, but still behind China's): A massive 550-billion parameter MoE (55 billion active) designed with a 1-million token context window, optimized specifically for high-throughput, cyclical agent loops. It achieved peak throughput rates of 400 tokens per second on day-zero optimized clusters.
Cosmos 3: A physical AI world-modeling framework comprising 16-billion Nano and 64-billion Super variants. Built on a Mixture-of-Transformers (MoT) architecture, Cosmos 3 natively binds textual, visual, auditory, and physical kinetic vectors.
Nemotron 3.5 ASR: A highly compact 0.6-billion parameter streaming speech recognition model pushing sub-100 millisecond latencies across 40 language locales.

https://www.minimax.io/models/text/m3
MiniMax M3: A 1-million token context model hitting 59.0% on SWE-Bench Pro and 74.2% on MCP Atlas, though noted for high token consumption due to intensive internal self-validation loops.

https://blog.google/innovation-and-ai/technology/developers-tools/introducing-gemma-4-12b/
Gemma 4 12B: Google's Apache 2.0 on-device model, which utilizes an encoder-free architecture that projects vision and audio vectors directly into the text-token space, bypassing separate CLIP-style encoders to minimize local memory footprints.
https://www.jetbrains.com/mellum/
JetBrains Mellum2: A compact 12-billion parameter MoE (2.5 billion active) engineered for ultra-low latency routing and retrieval-augmented generation (RAG) sub-agents within developer IDEs.
Opus 4.8
https://www.anthropic.com/news/claude-opus-4-8
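
Two of the releases above are described as MoE models with a total and an "active" parameter count. As a rough illustration of what those numbers imply, the sketch below computes the share of parameters used per token from the figures quoted in these notes. The function name active_fraction is hypothetical, and the ratio says nothing about memory footprint, since an MoE model typically still has to keep all of its weights loaded.

```python
def active_fraction(total_billion: float, active_billion: float) -> float:
    """Share of a model's parameters that are active for a single token."""
    return active_billion / total_billion


# (total, active) parameter counts in billions, as quoted in the show notes.
moe_models = {
    "Nemotron 3 Ultra": (550, 55),
    "JetBrains Mellum2": (12, 2.5),
}

fractions = {
    name: active_fraction(total, active)
    for name, (total, active) in moe_models.items()
}

for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%} of parameters active per token")
```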

https://www.cnbc.com/2026/06/05/google-to-pay-spacex-920-million-a-month-for-xai-compute-capacity.html

Benchmarks:
https://deepswe.datacurve.ai/blog
https://venturebeat.com/technology/deepswe-blows-up-the-ai-coding-leaderboard-crowns-gpt-5-5-and-finds-claude-opus-exploiting-a-benchmark-loophole (GPT 5.5 the winner in long horizon tasks)
DeepSWE is a highly complex software engineering benchmark focused on original, long-horizon tasks across five distinct programming languages. It comprises 113 chaotic tasks across 91 live, production-grade repositories, and it forces agents to generate 5.5 times more code and modify an average of 7 separate files per task compared to standard evaluations. On this challenging leaderboard, GPT-5.5 leads with a score of 70%, a 16-percentage-point lead over contemporary alternatives.
I think older benchmarks where models reach ~90% accuracy can be considered saturated. At that level, a difference of a few percentage points doesn’t give us much signal.
https://research.ibm.com/publications/developing-ai-agents-for-it-automation-tasks-with-itbench
ITBench-AA is an evaluation framework focusing on live Kubernetes incident response and Site Reliability Engineering (SRE) operations. It comprises 59 live, containerized SRE incident snapshots, and the results are sobering: every frontier model scored under 50% on successful incident resolution, with Claude Opus 4.7 leading at 47% and GPT-5.5 following closely at 46%.

Edge AI announcements:
https://www.cloudflare.com/press/press-releases/2026/cloudflare-acquires-voidzero-to-build-the-future-of-the-ai-native-web/
The consolidation of the AI-native developer stack has reached the runtime virtualization layer. Cloudflare recently completed the acquisition of VoidZero, the development group responsible for Vite, Vitest, Rolldown, and Oxc, backing the transaction with a $1 million open-source ecosystem fund. This acquisition is highly strategic; as autonomous agents write an increasing proportion of production software, local development environments, compilation pipelines, and bundlers must be optimized for execution speeds that match agent speeds.
Cloudflare's goal is to construct a localized, full-stack edge playground. In this sandbox, AI agents can generate, test, bundle (utilizing the highly parallelized, Rust-based Oxc and Rolldown engines), and deploy entire web applications end-to-end within milliseconds. This architecture completely bypasses traditional local machine container bottlenecks, enabling high-velocity agent loops to execute in a fully sandboxed, web-scale edge runtime.

Sunday May 24, 2026

Karpathy Joins Anthropic and the AI Compute Gold Rush

This week on AI Meta, we break down Andrej Karpathy’s move to Anthropic, Claude’s growing developer mindshare, and why recursive self-improvement may be the next major frontier in AI. We also cover Google’s latest Gemini announcements, Anthropic’s reported compute deal with xAI/SpaceX, the rise of gray-market Claude API access in China, OpenAI’s ongoing drama, Cerebras, Nvidia, Intel, and Leopold Aschenbrenner’s massive AI infrastructure bets.
Plus: SpaceX IPO speculation, Cursor, Grok, and why the AI economy increasingly looks like a global casino. Not financial advice.

https://novacut.ai

Thursday Apr 30, 2026

How the AI bubble will pop

https://novacut.ai/

Shashank and Mark break down a packed few weeks in AI: new open-source and local models, MCP and tool use, Chinese labs closing the frontier-model gap, Google's full-stack advantage, and the exploding cost of data-center buildouts.
They debate whether AI spending keeps compounding as models get cheaper and demand rises, or whether smaller, local, good-enough models eventually puncture today's valuations.

Thursday Apr 09, 2026

Local AI Models Are Here, Mythos Rumors, and Building an AI Agent Company

In this episode, recorded across two continents with Mark in Fukuoka, Japan and Shashank by the Ganges River in Rishikesh, India, the hosts dive into the rapidly evolving AI landscape. They explore how ChatGPT has reached mass adoption even among yoga practitioners and Ayurvedic doctors in India, discuss the newly released Gemma 4 and its surprising quality as a local model, debate the rumors around Anthropic's powerful Mythos model, and examine the recent Axios npm supply chain attack. Most notably, Mark shares his experience building a largely autonomous software development workflow using AI coding agents and Playwright MCP — burning through $350/month in AI subscriptions but achieving the output of a 50-person engineering team. They discuss the future of AI agent-driven companies and why human taste still matters in product design.

Thursday Mar 05, 2026

AI Matches Human Intelligence, Pentagon Drama, and the Rise of Agent Swarms

YouTube Channel: https://www.youtube.com/@GenerativeAIMeetup
Mark's Travel Vlog: https://www.youtube.com/@kumajourney11
Mark's Personal YouTube Channel: https://www.youtube.com/@markkuczmarski896
Attend a live event: https://genaimeetup.com/
Shashank LinkedIn: https://www.linkedin.com/in/shashu10/
Novacut: https://novacut.ai

Mark and Shashank break down the latest developments in AI from their travels in Fukuoka and Seychelles. They cover Gemini 3.1 Pro matching human performance on the ARC-AGI-1 benchmark at a fraction of the cost, the upcoming ARC-AGI-3 video game-style test, and why only three US companies (OpenAI, Anthropic, Google) seem to be pushing state-of-the-art right now while Meta and xAI deal with leadership shakeups.
The conversation moves to OpenAI's GPT 5.3 Codex Spark model running on Cerebras hardware for lightning-fast inference, Abu Dhabi's M42 initiative sequencing 700,000+ genomes and centralizing health records for AI-driven healthcare, and the viral OpenClaw incident where an AI agent wrote a hit piece on a human open-source maintainer who rejected its pull request.
They also discuss the Anthropic vs. Pentagon drama over autonomous weapons and mass surveillance restrictions, an ex-Google Maps PM who vibe-coded a Palantir-style intelligence dashboard in a weekend, and their hands-on experiences with Claude Code, Codex, Cursor, and MCP integrations. The episode wraps with thoughts on agent swarms, the human-in-the-loop problem for taste-driven tasks, and whether we're close to the first solo-founder billion-dollar company powered entirely by AI agents.

Monday Feb 02, 2026

Kimi K2.5, Genie 3, Space Data Centers & The Rise of Moltbook

After a brief hiatus, Mark and Shashank dive into the whirlwind of AI developments from recent weeks. They explore Kimi K2.5's impressive open-source capabilities, Google's groundbreaking Project Genie world model, and AI solving previously unsolved mathematical problems. The conversation shifts to the Davos discussions between Demis Hassabis and Dario Amodei on AGI timelines, before taking a fascinating detour into space-based data centers. The episode culminates with an in-depth look at OpenClaw (formerly ClawdBot) and Moltbook, a Reddit-like social network for AI agents that's spawning everything from cryptocurrency to manifestos. The hosts grapple with both the exciting possibilities and unsettling implications of autonomous AI agents collaborating at scale.

Tuesday Jan 06, 2026

Groq, Hotel Delivery Robots, and Mark Launches a Company

It’s been a travel-heavy hiatus: Mark’s been living in Spain and Shashank’s been bouncing across Asia (including a month in China). Now they’re back to unpack a packed week of AI news. They start with the headline hardware story: the Groq (GROQ) deal/partnership dynamics and why ultra-fast inference is becoming the next battleground, plus how this could reshape access to cutting-edge serving across the ecosystem. From there, they pivot to NVIDIA’s CES announcements and what “Vera Rubin” implies for data center upgrades, cost-per-token curves, and the messy real-world math of rolling hardware generations.
Shashank then brings the future to life with on-the-ground stories from China: a Huawei “everything store” that feels like an Apple Store meets a luxury dealership, folding devices that look straight out of sci-fi, and a parade of robots, from coffee bots to delivery robots that can ride elevators and deliver to your hotel room. They also touch on companion-style consumer robots and why “cute” might be a serious product strategy.
Finally, Mark announces the launch of Novacut, a long-form AI video editor built to turn hours of travel footage into a coherent vlog draft, plus export workflows for Premiere, DaVinci Resolve, and Final Cut. They close by talking about the 2026 shift from single model calls to “agentic” systems, including a fun (and slightly alarming) lesson from LLM outcome bias using poker hand reviews.
Topics include: Groq inference, NVIDIA + CES, Vera Rubin GPUs, GPU depreciation math, China robotics, Huawei ecosystem, hotel delivery bots, companion robots, Novacut launch, Cursor vs agent workflows, and why agents still struggle with sparse feedback loops.
Link mentioned: Novacut (https://novacut.ai)

Friday Nov 21, 2025

Gemini 3, GPT-5.1, Anti-Gravity & Yann LeCun’s Exit: Are We Near AGI or Just in a Bubble?

YouTube Channel: https://www.youtube.com/@GenerativeAIMeetup
Mark's Travel Vlog: https://www.youtube.com/@kumajourney11
Mark's Personal YouTube Channel: https://www.youtube.com/@markkuczmarski896
Attend a live event: https://genaimeetup.com/
Shashank LinkedIn: https://www.linkedin.com/in/shashu10/

In this episode of the Generative AI Meetup Podcast, Mark (in Ohio) and Shashank (in India) finally sit down after a month of travel to unpack a very eventful stretch in AI. They dive into Google’s new Gemini 3 Pro, its standout scores on Humanity’s Last Exam and ARC-AGI, and why these reasoning benchmarks matter more than yet another near-perfect standardized test score. Mark also makes a public feature request to DeepMind: please increase Gemini’s max output tokens.
From there they get hands-on with the developer experience:
Google’s new Anti-Gravity coding IDE (and how it compares to Cursor)
Using GPT-5.1 Codex High in Cursor’s autonomous “plan mode”
Why long context and long output windows are critical for deep research and book-length projects
The conversation then shifts to the bigger picture:
LLMs as therapists, sycophancy, safety, and the danger of AI always agreeing with you
Mark’s rant on robotics, humanoid robots, and a coming age of extreme abundance where robots handle most physical and intellectual work
Why learning to code may become the mental equivalent of going to the gym—a “brain gym” in a world where AI can do most practical tasks
They also cover the latest AI industry drama and milestones:
Yann LeCun leaving Meta, what that might signal about Big Tech AI labs, and how godfathers like Hinton, LeCun, and Bengio see the road to AGI
DeepMind’s new game-playing agent and why world models in 3D environments matter for real-world robotics
Genspark hitting unicorn status and what it means for “ChatGPT wrapper” startups
Co-inventing a new term on air: a “narwhal” = a trillion-dollar private company
If you’re curious about where frontier models, coding agents, robotics, and AGI trajectories all intersect—plus some philosophical musing on jobs, meaning, and abundance—this episode is for you.

Thursday Oct 30, 2025

Neo Arrives: Tele-Ops Today, AGI Tomorrow?

From a tiny island in Seychelles to the heartland of Ohio, we unpack a wild week in AI. First up: 1X’s “Neo” humanoid—$20k to buy or $500/month to rent—promising laundry, dishes, and errands soon…with a lot of teleoperation today. We debate whether tele-ops is a feature (not a bug), who it employs, and how quickly autonomy could follow. Then we zoom out to the money: Nvidia touches a $5T valuation, OpenAI reportedly eyes a $1T IPO, and the industry’s circular funding loops raise both eyebrows and opportunity. We also test-drive OpenAI’s Atlas browser (a Chromium fork with action-taking ambitions), and dig into Cursor’s agentic coding push, new in-house model, and blistering growth—plus the eternal “moat vs. momentum” question. Along the way: a live Neo preorder, enterprise ROI reality checks, and why agents may turn devs into project managers. If you’re curious where robotics, chips, and agentic software collide, this one’s for you. Ask a question on our
YouTube Channel: https://www.youtube.com/@GenerativeAIMeetup
Mark's Travel Vlog: https://www.youtube.com/@kumajourney11
Mark's Personal YouTube Channel: https://www.youtube.com/@markkuczmarski896
Attend a live event: https://genaimeetup.com/
Shashank LinkedIn: https://www.linkedin.com/in/shashu10/

Monday Oct 06, 2025

Sora 2, Claude 4.5 Sonnet, and the AI Browser Wars: Is NVIDIA Unstoppable?

In this episode of the Gen.AI Meetup Podcast, hosts Shashank and Mark dive into the latest AI developments that are reshaping how we create, code, and browse. They explore OpenAI's impressive Sora 2 video generation model and its built-in social network, compare it with Google's Veo 3, and discuss whether AI-generated content will become mainstream entertainment.
The conversation shifts to the newest coding models, including Anthropic's Claude 4.5 Sonnet and Grok 4 Fast, examining their performance, pricing, and whether they're worth the cost for developers. Mark shares his experience vibe coding with Cursor and why faster, cheaper models might be better than the most powerful ones.
The hosts also explore the maturing AI browser space, discussing Perplexity's Comet browser, Dia from the Browser Company, and Google's Gemini integration in Chrome. They debate whether these AI-native browsers can convince users to switch from Chrome and what features would actually make them indispensable.
Finally, they tackle the big question: Is NVIDIA's $4.5 trillion valuation justified? They discuss the company's dominance in AI chips, the circular investment patterns in the industry, and whether specialized compute chips can compete with NVIDIA's end-to-end ecosystem.
Timestamps:
0:00 - Intro & OpenAI's Sora 2 announcement
8:30 - Sora 2 vs Google Veo 3: The new video generation king
15:45 - Claude 4.5 Sonnet: Worth the premium price?
25:20 - Grok 4 Fast: Crazy cheap, crazy fast
35:15 - NVIDIA's dominance: Bubble or justified?
50:40 - AI browsers: Comet, Dia, and the future of browsing
1:02:15 - Ambient computing and what's next
Mentioned Resources:
OpenRouter - Multi-model API aggregator
Cursor - AI-powered code editor
Perplexity Comet - AI-native browser
Upcoming event: Coding Agents Showcase - Jan 9th, Palo Alto https://partiful.com/e/joRDIOYMqpogKjNtvlHY
Don't forget to RSVP for our Coding Agents event featuring Zed, Augment Code, Code Flash, Factory AI, and more! Spots are limited and filling fast.
Have questions? Drop them in the YouTube comments and we'll answer them in future episodes!