Mitigating chat failures in AI code development
Duncan Carlsmith
Department of Physics, University of Wisconsin-Madison
Tidal Disruption Explorer (MATLAB File Exchange 183760). The process of porting this Live Script into HTML is described in this post.
Introduction
An agentic AI session ended for me this week with the message: "Claude is unable to respond to this request, which appears to violate our Usage Policy. Please start a new chat." Gulp. The substance of the conversation was completely benign — porting my MATLAB Live Script Tidal Disruption Explorer that simulates a self-gravitating cluster of particles being shredded by tidal forces near a massive object, like Comet Shoemaker-Levy 9 was shredded by Jupiter in 1992. The next chat picked up the work and finished it in seven turns.
Why nothing was lost is the subject of this post. The new product is the HTML5 port of Tidal Disruption Explorer, deployed at duncancarlsmith.github.io/TidalDisruptionExplorer-HTML5. But the more transferable product may be a set of practices that help make AI-assisted code development resilient to chat failures, connection drops, sandbox losses, and content-policy false positives. Two prior posts set my context: Live Script deployed as a 3D web application with AI introduced the workflow, and Giving All Your Claudes the Keys to Everything introduced the ngrok command server that makes the Mac controllable from any AI client. This post is about how to use such tools without losing your work when the chat dies.
Failure modes worth designing for
Long agentic sessions can fail in many ways, and most are out of the user's control. The bash_tool connection in the cloud container can go unresponsive mid-task. A stray Python process can mask a real command server on the same port. A lost development sandbox can vaporize generated artifacts — in an earlier turn of this same project, an entire test-harness directory disappeared with the sandbox and had to be reconstructed from the conversation log. Persistent context is not in fact persistent. Skills are forgotten. The user closes the laptop, the WiFi blinks off, or the chat hits a length limit. This project used Claude, but the problems are not AI-specific in my experience with 5-6 leading vendors. Without preparation, each of these is a real setback.
Best practices to consider
1. Externalize project state in a committed PROGRESS journal
A single file, committed in a repo, names every milestone, the test-pass count for each, the current state in prose, and an explicit "Recovery instructions for a fresh session" section that lists the source files, the test harness names, and the toolchain assumptions. When the previous chat failed, the next one resumed from this file alone, without needing the failed conversation. When the dev sandbox loss took out 10 test harnesses, they were rebuilt from the conversation log because the journal had recorded exactly what each harness checked and the expected pass count for each. These harnesses are also stored locally when complete and successful.
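As an illustration of the kind of file described above, here is a minimal sketch that renders such a journal from structured data. The post does not show the real file, so the layout, the `Milestone` fields, and the function names are all hypothetical.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Milestone:
    name: str            # milestone label (illustrative)
    harness: str         # test harness file that checks this milestone
    expected_pass: int   # documented number of sub-checks that must pass
    passed: int          # number that actually passed on the last run


def render_journal(milestones, state, recovery_steps):
    """Build the journal text: milestones, current state, recovery section."""
    lines = ["PROGRESS", "", "Milestones"]
    for m in milestones:
        status = "DONE" if m.passed == m.expected_pass else "OPEN"
        lines.append(
            f"- [{status}] {m.name}: {m.passed}/{m.expected_pass} ({m.harness})"
        )
    lines += ["", "Current state", state, ""]
    lines.append("Recovery instructions for a fresh session")
    lines += [f"{i}. {step}" for i, step in enumerate(recovery_steps, 1)]
    return "\n".join(lines) + "\n"


def write_journal(path, text):
    """Write the journal and return the text read back from disk."""
    path = Path(path)
    path.write_text(text, encoding="utf-8")
    return path.read_text(encoding="utf-8")
```

Recording the expected pass count next to the harness name is what lets a fresh session tell a rebuilt harness from a correct one.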
2. Keep two external locations
The container contents are fragile even without a chat failure, because of context compaction and hidden file management. I chose a local working directory as the editable source of truth. A GitHub repository held the final product and could have served as the source of truth instead; choosing local storage was a matter of familiarity and trust. Each change was written locally first via the command server, verified on disk by reading it back, then committed and pushed to GitHub.
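The write, read-back, commit sequence can be sketched as follows. This is an assumption-laden illustration, not the command server's actual interface: `write_verified` and `commit_and_push_commands` are hypothetical names, and using a SHA-256 digest for the read-back check is one possible choice.

```python
import hashlib
from pathlib import Path


def write_verified(path, content):
    """Write content to disk, read it back, and confirm the bytes match."""
    path = Path(path)
    data = content.encode("utf-8")
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    expected = hashlib.sha256(data).hexdigest()
    actual = hashlib.sha256(path.read_bytes()).hexdigest()
    if actual != expected:
        raise IOError(f"read-back mismatch for {path}")
    return actual  # the digest can be recorded in the journal


def commit_and_push_commands(path, message):
    """Return the git commands to run once the write is verified.

    They are returned rather than executed so that each one can be
    approved and run as a separate step.
    """
    return [
        ["git", "add", str(path)],
        ["git", "commit", "-m", message],
        ["git", "push"],
    ]
```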
3. Run browser tests in the AI's container, not on the user's machine
For this project, the final product was a web app. In prior work, I used a local Chromium to view and test the product. It turns out that Claude's container ships with Node and Playwright preinstalled, and Chromium may be available from the Puppeteer install. Browser regression tests for the HTML5 application were run entirely there, and I only viewed staged intermediate products. Keeping development inside the container this way is not possible for a MATLAB product, short of taking on the added burden of running MATLAB in the cloud. The idea was to do as much as possible in the AI container, without extra overhead.
4. Multistep plan with explicit approval gates
Decompose the work into milestones with sub-milestones. Each has a test harness with a documented expected pass count and a concrete deliverable. Don't merge "running a test" with "uploading the result" with "committing the change": these are separate decisions, and each has its own approval and verification. If the chat dies between any two of them or something else goes awry, the user can stop without leaving anything dangling. This project: 8 milestones, 27 sub-milestones, 260 documented sub-checks.
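A minimal sketch of the gating idea, assuming each step is a triple of a name, an action, and a verification check. The `run_gated` helper and its callback signatures are hypothetical; in the actual project the approvals were given by the user in the chat.

```python
def run_gated(steps, approve):
    """Run (name, action, verify) steps one at a time.

    approve(name) is asked before each step, and verify(result) must
    return True before the next step is offered. Returns the names of
    the steps that completed, so a later session knows where to resume.
    """
    completed = []
    for name, action, verify in steps:
        if not approve(name):
            break                      # user declined: stop cleanly
        result = action()
        if not verify(result):
            break                      # verification failed: stop here
        completed.append(name)
    return completed
```

Because nothing runs without its own approval, stopping after any step leaves the remaining steps simply not started.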
5. Versioned backups before any destructive write
Every PROGRESS edit got a timestamped pre-edit copy first in the local project repo, one per milestone.
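One way to take such a pre-edit copy is sketched below. The naming scheme (a timestamp inserted before the extension, plus a `.bak` suffix) and the `backup_before_edit` helper are assumptions; the post does not say how its backups were named.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_before_edit(path, now=None):
    """Copy path to a timestamped sibling before it is overwritten."""
    path = Path(path)
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    backup = path.with_name(f"{path.stem}.{stamp}{path.suffix}.bak")
    shutil.copy2(path, backup)   # copy2 also preserves modification time
    return backup
```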
Result
Recovery from the failed chat only cost me one turn. Six more turns to finish the project. The final result: 260 of 260 sub-checks pass across all milestones, live deployment verified. Many hairs pulled (the violation of usage policy issue was not the only one encountered!), but no utter despair experienced!
Links
Live HTML5 application: https://duncancarlsmith.github.io/TidalDisruptionExplorer-HTML5/
MATLAB Live Script (File Exchange 183760): https://www.mathworks.com/matlabcentral/fileexchange/183760-tidal-disruption-explorer
Source repository (GitHub): https://github.com/DuncanCarlsmith/TidalDisruptionExplorer-HTML5
Aditya
Last activity on 23 Apr 2026 at 0:11

Hi,
I am trying to use an ESP32 board with a Quectel EC200U LTE modem to send sensor data to ThingSpeak. The board can process the sensor data; however, I am unable to send the data to ThingSpeak. I have used the same process before, but with a different modem from SIMCom.
Can someone help me with the specific commands for achieving this? I can share the code I am trying to use.
Regards
Aditya
Julio
Last activity on 20 Apr 2026 at 19:28

Good morning everyone. I’m having a problem with ThingSpeak. I’m sending data from an ESP LoRa with the RTC set to the Brasília time zone (GMT-3).
Previously, when I exported the data to CSV, it used the ThingSpeak time, which appeared 3 hours ahead. Now that I’m sending the timestamp from the ESP, the graphs are showing the data 3 hours behind. Is there a way to align the graph times while keeping the Brazilian time zone?
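The three-hour shift in each direction is consistent with a GMT-3 wall-clock reading being treated as if it were UTC. The sketch below only illustrates that arithmetic with Python's standard library; it makes no claim about which timestamp formats ThingSpeak accepts, which should be checked against its documentation. The function names and the sample date are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Brasília time as a fixed GMT-3 offset (assumed: no daylight saving time).
BRT = timezone(timedelta(hours=-3))


def to_iso_with_offset(naive_local):
    """Attach the GMT-3 offset to a naive RTC reading."""
    return naive_local.replace(tzinfo=BRT).isoformat()


def to_utc_iso(naive_local):
    """Convert a naive GMT-3 RTC reading to the equivalent UTC instant."""
    aware = naive_local.replace(tzinfo=BRT)
    return aware.astimezone(timezone.utc).isoformat()


rtc = datetime(2026, 4, 20, 9, 0, 0)   # 09:00 on the device clock
print(to_iso_with_offset(rtc))         # 2026-04-20T09:00:00-03:00
print(to_utc_iso(rtc))                 # 2026-04-20T12:00:00+00:00
```

Both strings name the same instant; a timestamp sent without any offset leaves the receiver to guess, and that guess is where a three-hour error can enter.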
Mike Croucher
Last activity on 30 Apr 2026 at 17:30

Short version: MathWorks have released the MATLAB Agentic Toolkit which will significantly improve the life of anyone who is using MATLAB and Simulink with agentic AI systems such as Claude Code or OpenAI Codex. Go and get it from here: https://github.com/matlab/matlab-agentic-toolkit
Pooja
Last activity on 15 Apr 2026 at 14:01

MATLAB EXPO India | 7 May | Bengaluru
Get inspired by the latest trends and real-world customer success stories transforming industries. Learn from trusted experts across 4 tracks.
  • AI & Autonomous Systems
  • Electrification
  • Systems & Software Engineering
  • Radar, Wireless & HDL
Digital Twin Development of PEARL Autonomous Surface System Thermal Management
The top session of the countdown showcases how the PEARL engineering team used a digital twin to solve real‑world thermal challenges in a solar‑powered autonomous marine platform operating in extreme environments. After thermal shutdown events in the field, the team built a model that predicts temperatures at multiple locations with ~1% accuracy, while balancing accuracy with model complexity.
Beyond the technology, this keynote delivers practical lessons for predictive modeling and digital twins that apply well beyond marine systems.
We hope you’ve enjoyed the Top 10 countdown series, and a big thank‑you to Olivier de Weck at Massachusetts Institute of Technology for delivering such a compelling and insightful keynote.
🎥 If you missed it live, be sure to watch the recording to see why it earned the #1 spot at MATLAB EXPO 2026.
Pooja
Last activity on 8 Apr 2026 at 15:00

MATLAB EXPO India is Back!
This in-person event brings together engineers, scientists, and researchers to explore the latest trends in engineering and science, and discover new MATLAB and Simulink capabilities to apply to your work.
May 7, 2026 | Bengaluru
Missed the Cody World Cup Watch Party on March 27—or want to relive the glory?
The full recording is now available, and it’s every bit as entertaining as it was live.
What you’ll see in the video:
🔥 Top MATLAB users in action
Watch expert solvers think, debug, strategize—and occasionally panic.
Which functions do they reach for? How do they break down the problem?
BEHOLD the power moves… and the 3D arrays.
🏆 Three teams. Six champions. One viciously clever problem.
There may have been NaN traps.
There may have been nested for‑loops.
There may have been… emotions.
🎙️ Professional‑grade commentary by:
@Ned Gulley – Capricious dictator, Lord Ned
@Matt Tearle – Architect of Diabolical Challenges
Their line‑by‑line play‑by‑play turns MATLAB into a true spectator sport.
👉Watch the recording here and take a shot at the Champion-level problem yourself.
Finally, tell us what you want to see next—head‑to‑head contests? Team battles? Drop your ideas in the comments. All suggestions welcome!
It’s no surprise this keynote landed at #2. MaryAnn Freeman, Senior Director of Engineering, AI, and Data Science explores how AI, especially generative AI, is transforming the way engineers design, build, and innovate. From accelerating the design loop with faster, data‑driven solutions, to blending human creativity with AI insights, to evolving engineering tools that turn ideas into build‑ready systems. This keynote shows how embedded intelligence helps engineers push past traditional limits and bridge imagination with real‑world impact.
If you’re curious about how AI is reshaping engineering workflows today (and what that means for the future of design), this is a must‑watch.
👉 Watch the keynote recording and see why it was one of the most popular sessions of MATLAB EXPO Online 2025.
This is a reminder that the Cody World Cup Watch Party takes place on March 27 at 10:00 AM ET.
We’ll watch how top MATLAB minds solve a fun‑but‑challenging Cody championship‑round problem, followed by a live open discussion with the players.
📅 To join, download the ics calendar file (link updated and no sign‑in required) or copy the meeting link and add it to your calendar!
📌 Full details, agenda, and prep suggestions are in the original announcement.
Software‑defined vehicles are becoming reality—and this #3 ranked session shows how. In this keynote, Daniel Scurtu (NXP) demonstrates how MathWorks and NXP are working together to accelerate system‑level embedded development.
🔋 Using a vehicle electrification demo that runs across multiple NXP processors, you’ll see:
  • Model‑Based Design workflows from concept to deployment
  • Intelligent battery management and motor control
  • Automatic code generation and hardware deployment
  • ☁️ Real‑time cloud analytics and over‑the‑air updates
🛠️ Featuring MATLAB and Simulink products alongside NXP tools like Model-based Design Toolbox (MBDT), S32 Design Studio IDE, and Real-Time Drivers (RTD), this session highlights an end‑to‑end approach that reduces complexity and speeds the transition to software‑defined vehicles.
👉 Watch the session on demand and catch what you missed
Hi everyone,
Some of you may remember my earlier post. Quick version: I'm a biomed PhD student, I use MATLAB daily, and I noticed that AI coding tools often suggest functions that don't exist in R2025b or use deprecated ones. So I built skills that teach them what actually works.
v2.0 adds 54 template `.m` scripts, rewrites all knowledge cards based on blind testing, and verifies every function call against live MATLAB. I tested each skill on 17 prompts and caught 8 hallucinated functions across 5 toolboxes (Medical Imaging, Deep Learning, Image Processing, Stats-ML, Wavelet).
Give it a spin!
The skills follow the Agent Skills open standard, so they also work with Codex, Gemini CLI, Claude Code and others. If you use the official MATLAB MCP Server from MathWorks, these skills complement it: the MCP server executes your code, the skills help the AI write good code to begin with.
One ask
How do we measure performance and evaluate agent skills? We can run blind tests and catch hallucinated functions, but that only covers what we thought to test. The honest answer is that the best way to evaluate these is community consensus and real-world testimonials. How are you using them? What worked? What still broke?
Your use cases and feedback are the most reliable eval I can get, and as a student building this, they're also the real motivation for me to keep going. If a skill saved you from a hallucinated function or pointed you to the right function call, I'd love to hear about it. If something is still wrong, I need to hear about it.
Issues, PRs, or just a reply here. Star the repo if it saved you time.
Thanks!
Happy Spring! And happy coding in MATLAB!
Best,
Ritish
What’s New in MATLAB and Simulink in 2025
If you missed this session live, this is one of those “everyone’s talking about it” updates you’ll want to catch up on. 👀
This session is packed with the kinds of enhancements that quietly (and not so quietly) change how you work every day.
Here’s why it earned a spot in our Top 4:
  • A redesigned MATLAB desktop with customizable sidebars and light/dark themes—built to adapt to how you work
  • New side panels for coding and development tasks, plus more control over organizing and customizing figures
  • MATLAB Copilot, a generative AI assistant optimized for MATLAB to help you explore ideas, learn techniques, and boost productivity directly in the desktop
  • Simulink workflow improvements like a redesigned Simulink scope, more detailed info in quick insert, and automatic signal line straightening
  • Enhanced Python integration across MATLAB and Simulink
  • New AI deployment options optimized for Qualcomm and Infineon hardware targets
If staying current with MATLAB and Simulink is part of your role—or your edge—this session is a must‑watch. Missing it means missing context for features that will shape how you work in 2026 and beyond.
🎥 Watch the recording and see what’s new.
💬 Discussion topic:
Which single update from this release do you think will most improve your day‑to‑day workflow, and why?
View of Classical Stark Effect web application
A Live Script can be converted to an HTML5 framework web application with AI as described in Double Pendulum Chaos Explorer: From HTML5 Prototype to MATLAB interactive application with AI. I have recently converted the Live Script Classical Stark Effect to a web application supporting a 3D twirlable display of the motion of a particle subject to an inverse square law force plus an additional constant force - the problem known as the classical Stark effect.
The web application deployed to GitHub may be launched here and documents its dependencies below the interactive application. The files are available at Classical Stark Effect — Interactive Web Simulation. One gotcha was the need to enable hardware acceleration in Chrome (no problem in Safari) to support a 3D twirlable display. If hardware acceleration is disabled in Chrome, the application provides a warning and replaces the 3D twirlable display with a 2D alternate.
The conversion of the script to a web application was performed with Perplexity.ai. The GitHub deployment was accomplished with Anthropic's Claude using the open source GitHub CLI. With the gh CLI already installed and authenticated on my Mac and invoked via osascript, and with Claude connected to my file system via MCP and an ngrok server, Claude executed the following sequence of steps on my Mac:
1. git init
Creates a hidden .git/ directory in the staging folder, initializing it as a local git repository. Before this command the folder is just a plain directory; after it, git can track files there. Run once per new project.
2. git branch -M main
Renames the default branch to main. Older git versions default to master; GitHub now defaults to main. The -M flag forces the rename even if a branch named main already exists. Run it after git init and before the push in step 5.
3. git add -A
Stages all files in the directory tree for the next commit. The -A flag means "all" -- new files, modified files, and deleted files are all included. This does not write anything to GitHub; it only updates git's internal index (the staging area) on your local machine.
4. git commit -m 'Initial release: Classical Stark Effect Interactive Simulation'
Takes everything in the staging area and freezes it into a permanent commit object stored in .git/. This is the snapshot that will be pushed. The -m flag provides the commit message inline. After this command, git knows exactly what files exist and what their contents are -- gh repo create --push will send exactly this snapshot.
5. gh repo create ClassicalStarkEffect --public --source=. --push
Three things happen in sequence inside this one command:
  • gh repo create ClassicalStarkEffect --public -- calls the GitHub API to create a new empty public repository named ClassicalStarkEffect under the authenticated account (DuncanCarlsmith).
  • --source=. -- tells gh to treat the current directory as the local git repo. It reads .git/ to find the commits and configures the remote.
  • --push -- sets the new GitHub repo as origin and runs the equivalent of git push origin main, sending the commit from step 4 up to GitHub.
Without steps 1-4 having run first, --push would have nothing to send and the repo would land empty.
6. gh api repos/DuncanCarlsmith/ClassicalStarkEffect/pages --method POST -f build_type=legacy -f 'source[branch]=main' -f 'source[path]=/'
Calls the GitHub REST API directly to enable GitHub Pages on the repo. Breaking down the flags:
  • --method POST -- this is a create operation (not a read), so it uses HTTP POST.
  • -f build_type=legacy -- critical flag. Tells GitHub to serve files directly from the branch. The alternative (workflow) would expect a .github/workflows/ Actions file to build and deploy the site, which doesn't exist here, and would produce a permanent 404.
  • -f 'source[branch]=main' -- serve from the main branch. The quotes keep the shell from treating the square brackets as a glob pattern.
  • -f 'source[path]=/' -- serve from the root of the branch (as opposed to a /docs subdirectory).
This is the API equivalent of going to Settings > Pages in the GitHub web UI and setting Branch: main, Folder: / (root), clicking Save.
7. curl -s -o /dev/null -w "%{http_code}" https://duncancarlsmith.github.io/ClassicalStarkEffect/
Not a git or gh command, but the verification step. GitHub Pages typically takes about a minute to build after step 6. This curl fetches the live URL and prints only the HTTP status code (-w "%{http_code}"), discarding the body (-o /dev/null) and suppressing progress output (-s). 200 means live; a 404 shortly after step 6 usually means the site is still building.
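Since the build time varies, the check in step 7 is naturally repeated until the site answers. The sketch below shows one way to poll, using only the Python standard library; the helper names, the attempt count, and the delay are assumptions rather than part of the original workflow.

```python
import time
import urllib.error
import urllib.request


def http_status(url):
    """Return the HTTP status code for url (like curl -w '%{http_code}')."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code          # e.g. 404 while the site is still building


def wait_until_live(url, fetch=http_status, attempts=12, delay=10.0,
                    sleep=time.sleep):
    """Poll url until it returns 200; give up after `attempts` tries."""
    for attempt in range(1, attempts + 1):
        if fetch(url) == 200:
            return attempt       # number of tries it took
        if attempt < attempts:
            sleep(delay)
    return None                  # never went live within the budget
```

Network-level failures (DNS, refused connections) are deliberately not caught here, so they surface as exceptions instead of being mistaken for a slow build.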
You’re invited to the Cody World Cup Watch Party! Six of the world’s best MATLAB users have advanced to the Cody Contest 2025 Bonus Round to tackle a championship-level Cody problem. Now it’s your chance to watch, learn, and interact with those pros!
📅When & How to Join
Date: March 27, 2026
Time: 10:00 AM Eastern Time
Where: Microsoft Teams (download the ics calendar file or copy the meeting link and add it to your calendar!)
📽 Agenda
Part 1 – Watch Together (25 min)
Watch how those top MATLAB users think, debug, strategize, and occasionally panic😅. Enjoy professional-grade commentary from MathWorks experts as the action unfolds.
Part 2 – Live Discussion (35 min)
Chat directly with those top minds and the problem creator, @Matt Tearle! Reply in the comments with questions you’d like us to ask them.
🧩 Solve the Problem Yourself!
For the best experience, try that Cody problem yourself before the event. Trust us — the discussions are way more fun after you’ve wrestled with it.
Whether you are a beginner or a seasoned expert, this is your chance to see the best in action, pick up MATLAB tips, and have some fun. See you there!
🤖 What does it take to make robotic motion feel… human?
In this session, Tetsushi Sotowa shares how NSK is combining advanced control techniques with deep learning to enable human‑like grasping in electric grippers.
You’ll see a real‑world case study featuring:
  • Bilateral and force control systems developed in‑house
  • MATLAB and Simulink–based control workflows
  • Deep learning integration using Deep Learning Toolbox
  • A practical path from mechatronics research to intelligent actuation
The result: an AI‑enhanced actuator capable of more natural, responsive grasping—bringing robotics one step closer to human motion.
👉 Interested in AI‑driven robotics and advanced control? Check out the session now from MATLAB EXPO 2025.
MATLAB MCP Core Server v0.6.0 has been released on GitHub: https://github.com/matlab/matlab-mcp-core-server/releases/tag/v0.6.0
Release highlights:
  • New cross-platform MCP Bundle; one-click installation in Claude Desktop
Enhancements:
  • Provide structured output from check_matlab_code and additional information for MATLAB R2022b onwards
  • Made project_path optional in evaluate_matlab_code tool for simpler tool calls
  • Enhanced detect_matlab_toolboxes output to include product version
Bug fixes:
  • Updated MCP Go SDK dependency to address CVE.
We encourage you to try this repository and provide feedback. If you encounter a technical issue or have an enhancement request, create an issue at https://github.com/matlab/matlab-mcp-core-server/issues
Missed a crowd‑favorite session featuring Marko Gecic at Infineon and Lucas Garcia at MathWorks?
This talk shows how to verify and test AI for real‑time, safety‑critical systems using an AI virtual sensor that estimates motor rotor position on an Infineon AURIX TC4x microcontroller. Built with MATLAB and Simulink, the demo covers training, verification, and real‑time control across a wide range of operating conditions.
You’ll see practical techniques to test robustness, measure sensitivity to input perturbations, and detect out‑of‑distribution behavior—critical steps for meeting standards like ISO 26262 and ISO 8800. The session also highlights how Model‑Based Design leverages AURIX TC4x features such as the PPU and CDSP to deploy AI with confidence.
Featuring: Dr. Arthur Clavière, Collins Aerospace
How can we be confident that a machine learning model will behave safely on data it’s never seen, especially in avionics? In this session, Dr. Arthur Clavière introduces a formal methods approach to verifying machine learning generalization. The talk highlights how formal verification can be applied to neural networks in safety-critical avionics systems.
💬 Discussion question:
Where do you see formal verification having the biggest impact on deploying ML in safety‑critical systems—and what challenges still stand in the way?
Join the conversation below 👇
🚀 Unlock Smarter Control Design with AI
What if AI could help you design better controllers—faster and with confidence?
In this session, Naren Srivaths Raman and Arkadiy Turevskiy (MathWorks) show how control engineers are using MATLAB and Simulink to integrate AI into real-world control design and implementation.
You’ll see how AI is being applied to:
🧠 Advanced plant modeling using nonlinear system identification and reduced order modeling
📡 Virtual sensors and anomaly detection to estimate hard-to-measure signals
🎯 Data-driven control design, including nonlinear MPC with neural state-space models and reinforcement learning
Productivity gains with generative AI, powered by MATLAB Copilot