Craftset
Apply Now

You Can't Scale What You Can't See

Learn to build durable, reliable industrial systems with proven instrumentation techniques. Reduce downtime, diagnose failures faster, and build the visibility your teams need to scale.

Only 5 spots available at no cost. Once they're gone, the next cohort starts at $3,000/seat.

Here's what scaling without observability actually costs:

It's 2am and the phone rings. “The server has crashed, and we can't get the AGVs to reconnect.” This was the phone call I had dreaded for months.

I had been helping a system integrator develop a material handling solution using AGVs, PLCs, and multiple software services to coordinate everything. It was an elegant system, with one problem: AGVs would sometimes disconnect from the system and come to a complete halt, seemingly at random. We had no idea what was causing it, and troubleshooting was a nightmare.

To diagnose the problem, we had to physically send engineers to the site to watch the system in real time, wait for a disconnect to happen, then manually trace network requests across multiple fragmented logs. We had a 12-hour window before the logs were overwritten and the evidence was gone.

Eventually, the customer had enough. No further projects until this issue was resolved. The result: $40M worth of revenue gone.

The painful part? We had all the data we needed to solve this — it was just scattered, fragmented, and disappearing faster than we could chase it. We didn't have a technology problem. We had a visibility problem.

Build reliable, scalable systems with industrial observability

Observability is the ability to understand the internal state of your systems solely by examining their outputs — without needing to predict how your system might fail in advance.

Industrial systems are increasingly complex, distributed, and software-driven. This complexity makes them powerful — but also opaque. When something goes wrong, it can be difficult or impossible to understand why, much less how to fix it. To make matters worse, as the complexity of applications has grown, the tools we use to understand and troubleshoot them have not kept pace. Traditional monitoring tools that rely on predefined metrics and alerts are no longer sufficient to provide the visibility needed to maintain reliability and performance.

You can build observable systems by learning a set of proven principles, tools, and instrumentation patterns. Engineers and operations teams who do this consistently diagnose issues faster, reduce unplanned downtime, and scale their operations without reliability becoming a bottleneck.

A common misconception about observability is that it's only for software companies with large engineering teams. In reality, the principles of observability apply just as powerfully to industrial systems — and the teams that apply them stop chasing failures and start preventing them.

In an industry where downtime is measured in dollars per minute, nothing matters more than visibility.

What You'll Learn to Do

How to determine the root cause of any failure by interrogating your system directly — without being limited by disconnected logs or missing metrics

  • Ask questions about a failure you've never seen before and get answers, even if you didn't predict you'd need to ask them
  • Follow an investigation iteratively — one question leading to the next — until you reach the source, without hitting a dead end where your tools can go no further
  • Understand what your system was doing at the exact moment a failure occurred, not just that a threshold was crossed

How to quickly find deeply hidden, elusive failures that only happen under rare combinations of conditions

  • Capture the full context of every operation as a structured event so that when a rare failure occurs, the evidence is already there waiting for you
  • Reproduce elusive failures by identifying the exact combination of conditions — timing, load, sequence, state — that caused them
  • Stop relying on an engineer being present at the right moment; let your instrumentation witness failures on your behalf

How to resolve conflicting interpretations of a failure when multiple systems and vendors are involved

  • Produce a single, unified view of what every part of your system was doing during a failure — one that all parties are looking at simultaneously
  • Replace disconnected logs from different systems with a correlated trace that shows the sequence of events across every boundary in your stack
  • Verify proposed fixes with objective before-and-after data rather than accepting a vendor's assurance that the problem is resolved

How to understand any state your system has gotten into, even ones you've never seen before and couldn't have predicted

  • Move beyond dashboards that only confirm what you already suspected and build the capacity to discover what you didn't know to look for
  • Understand the internal state of your system solely by observing and interrogating its outputs — without needing access to the system itself
  • Stop hitting investigative dead ends where you can narrow a problem to a region of your system but lack the resolution to go any further

How to scale your systems with confidence that your tools will keep up with your complexity rather than being outpaced by it

  • Ensure that each new system you add is as observable as the first — so complexity increases without a corresponding increase in diagnostic effort
  • Replace undocumented workarounds and accumulated tribal knowledge with instrumentation that makes system behavior legible to anyone on your team
  • Build the foundation that lets you answer questions about your 50th system as quickly and confidently as you answer them about your first

Sam Prescott

Senior Product Manager, FORTNA · Engineer & Observability Educator

Sam has spent his career at the intersection of software and industrial systems, working with teams to build reliable, observable systems in some of the world's most demanding environments.

Sam began his career as a software engineer at GE, where he worked on industrial control systems that run everything from power plants to wind farms. He then moved into product management at Emerson, where he led the development of software and hardware products that help teams monitor and maintain those systems in the field.

Currently a Senior Product Manager at FORTNA, Sam is focused on delivering automated solutions for material handling systems — the kind of complex, high-stakes environments where reliability isn't optional.

Connect on LinkedIn

Is this for you?

This is for you if:

  • Systems integrators and OEMs

    You're building and maintaining industrial systems — and tired of diagnosing failures by hand with fragmented logs.

  • Operations and reliability managers

    You're responsible for uptime across a facility or fleet. You don't need to write the code — but you need to understand what your systems are doing and why they fail. This workshop is built for mixed technical and non-technical teams in the same room.

  • Engineering leads and CTOs at industrial companies

    You've scaled the equipment. Now reliability is becoming a bottleneck on growth. You need your team to stop firefighting and start preventing — and you need a system that survives beyond any one engineer.

This is NOT for you if:

  • Teams that lack a curious, investigative mindset

    This workshop is not for teams that are content to just run their systems and react to failures — it's for teams that want to understand what's happening in their systems and why.

  • Teams with zero appetite for change

    Observability requires instrumenting your systems. If your organization can't touch production configurations or runs entirely on legacy closed systems with no egress, this isn't the right fit — yet.

  • People looking for passive video content

    This is a live, collaborative workshop. You'll be building in real time, asking questions, and adapting to your own environment. If you want something to watch on the couch, this isn't it.

The Curriculum

Three weeks. Six sessions. One working observability system built on your own environment.

Week 1
April 1, 2026

Lecture — 1 Hour

Observability: Origins, Principles, and Industrial Application

  • The history of observability in mechanical and control systems and its evolution into modern software engineering
  • What observability is and how it differs fundamentally from monitoring and metrics-based approaches
  • Structured events as the core building block of observable systems — what they are, what they contain, and why they work
  • How to adapt modern observability concepts to industrial systems: PLCs, OPC UA, and edge devices
  • The OT-IT instrumentation gap and practical strategies for bridging it

Lab — 1 Hour

Instrumenting Your First OPC UA Application

  • Set up a simple OPC UA application as your instrumentation target
  • Define the meaningful operations in your system and model them as structured events
  • Instrument the application to capture and emit structured events with rich contextual data
  • Verify your events are well-formed and contain the attributes needed for effective debugging
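
The lab's first step can be sketched in plain Python, with no OPC UA library at all. The endpoint, node ID, and field names below are illustrative, not part of the course material; what matters is the shape of a structured event: one wide record per operation, carrying every attribute you might later need to slice on.

```python
import json
import time
import uuid

def emit_event(operation: str, **attributes) -> dict:
    """Build and emit one structured event as a single JSON line.

    One wide record per meaningful operation, holding all the context
    you might later query on (device, node, status, duration).
    """
    event = {
        "event_id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "operation": operation,
        **attributes,
    }
    print(json.dumps(event))  # in production, ship this to your pipeline instead
    return event

# A hypothetical OPC UA read, modeled as one structured event.
emit_event(
    "opcua.read",
    endpoint="opc.tcp://plc-01:4840",   # assumed endpoint
    node_id="ns=2;s=Conveyor.Speed",    # assumed node
    value=1.42,
    status="Good",
    duration_ms=12.7,
)
```

The verification step in the lab amounts to checking that every emitted record is well-formed JSON and carries the attributes your later questions will depend on.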

Office hours held each week — bring your questions, your environment, and your toughest problems.

Week 2
April 8, 2026

Lecture — 1 Hour

Distributed Tracing and OpenTelemetry

  • What a trace is and how it differs from a log or a metric
  • How to correlate events across multiple systems and services without native header propagation — the industrial correlation problem
  • Introduction to OpenTelemetry: the vendor-neutral framework for capturing and exporting observability data from any system
  • How to connect your observability pipeline to any storage backend or visualization tool using OpenTelemetry's collector architecture
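
The "industrial correlation problem" bullet can be made concrete with a minimal sketch. A PLC command or message-bus payload has no HTTP headers to carry trace context, so the IDs ride inside the message body instead. The span fields loosely follow W3C Trace Context ID lengths; the component and operation names are hypothetical, and a real implementation would use the OpenTelemetry SDK rather than this hand-rolled class.

```python
import secrets
import time

def new_id(nbytes: int) -> str:
    return secrets.token_hex(nbytes)

class Span:
    """A minimal span record: just enough to stitch events into a trace."""
    def __init__(self, name, trace_id=None, parent_span_id=None):
        self.name = name
        self.trace_id = trace_id or new_id(16)  # 32 hex chars, W3C-style
        self.span_id = new_id(8)                # 16 hex chars
        self.parent_span_id = parent_span_id
        self.start = time.time()
        self.end = None

    def finish(self):
        self.end = time.time()

# Component A starts the trace and embeds the context in its message body.
root = Span("scheduler.dispatch")
message = {"payload": "move AGV-7 to dock 3",
           "trace_id": root.trace_id,
           "parent_span_id": root.span_id}

# Component B (say, an AGV controller) has no headers to propagate,
# so it lifts the trace context out of the message body instead.
child = Span("agv.execute",
             trace_id=message["trace_id"],
             parent_span_id=message["parent_span_id"])
child.finish()
root.finish()

# Both spans now share a trace_id, so a backend can stitch them together.
assert child.trace_id == root.trace_id
assert child.parent_span_id == root.span_id
```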

Lab — 1 Hour

Building Your First Trace with OpenTelemetry

  • Instrument a multi-component application to emit correlated structured events using OpenTelemetry
  • Stitch events together into a complete distributed trace and verify the parent-child span relationships
  • Visualize your traces using Jaeger and interpret waterfall diagrams to identify latency, gaps, and anomalies
  • Query your trace data to answer specific questions about system behavior during a simulated failure
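
As a taste of what interpreting a waterfall means, here is a toy trace reduced to numbers. The span names and timings are invented; the technique, sorting child spans by start time and measuring the gaps between them, is the same one you practice visually in Jaeger.

```python
# Spans from one trace; times in seconds, names illustrative.
spans = [
    {"name": "wms.order",     "start": 0.000, "end": 0.950, "parent": None},
    {"name": "plc.command",   "start": 0.010, "end": 0.120, "parent": "wms.order"},
    {"name": "agv.transport", "start": 0.600, "end": 0.940, "parent": "wms.order"},
]

def duration(span):
    return span["end"] - span["start"]

# Where did the time go? Sort the children by start time and look at
# the dead space between one span ending and the next beginning:
# that untraced gap is exactly what shows up as empty width in a
# waterfall diagram, and it points at what to instrument next.
children = sorted((s for s in spans if s["parent"] == "wms.order"),
                  key=lambda s: s["start"])
gaps = [b["start"] - a["end"] for a, b in zip(children, children[1:])]
print(max(gaps))  # roughly 0.48 s of unexplained latency
```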

Office hours held each week — bring your questions, your environment, and your toughest problems.

Week 3
April 15, 2026

Lecture — 1 Hour

The Core Analysis Loop, Monitoring Coexistence, and Team Practices

  • The core analysis loop: how to interrogate event data iteratively to find anomalies and identify root cause without hitting investigative dead ends
  • How observability and monitoring coexist — using metrics and dashboards as signals that trigger deeper investigation, not as a replacement for it
  • How to integrate your observability stack with existing tools like Grafana without duplicating effort or creating conflicting sources of truth
  • Team philosophies and engineering practices for building and sustaining an observability culture in your organization
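
One turn of the core analysis loop can be sketched in a few lines of Python: filter to the anomalous events, then group them by candidate dimensions until one of them explains the failures. The events and field names below are invented for illustration.

```python
from collections import Counter

# A toy batch of structured events; field names are illustrative.
events = [
    {"operation": "agv.heartbeat", "agv": "AGV-3", "firmware": "2.1", "status": "ok"},
    {"operation": "agv.heartbeat", "agv": "AGV-7", "firmware": "2.2", "status": "timeout"},
    {"operation": "agv.heartbeat", "agv": "AGV-7", "firmware": "2.2", "status": "timeout"},
    {"operation": "agv.heartbeat", "agv": "AGV-5", "firmware": "2.2", "status": "timeout"},
    {"operation": "agv.heartbeat", "agv": "AGV-2", "firmware": "2.1", "status": "ok"},
]

def slice_by(events, field):
    """One turn of the loop: group events by a candidate dimension."""
    return Counter(e[field] for e in events)

# Question 1: which events are anomalous?
failures = [e for e in events if e["status"] == "timeout"]

# Question 2: do the failures cluster on any dimension?
print(slice_by(failures, "firmware"))  # every timeout is on firmware 2.2
print(slice_by(failures, "agv"))       # spread across multiple vehicles

# The firmware dimension explains the failures; the vehicle dimension
# doesn't. The next iteration slices firmware-2.2 events by another
# field, and the loop repeats until the root cause is confirmed.
```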

Lab — 1 Hour

Debugging a Real Issue Using Observability

  • Work through a realistic failure scenario in a sample industrial application using structured events and trace data
  • Apply the core analysis loop to move from anomaly detection to confirmed root cause without being told in advance what the problem is
  • Integrate your observability data with Grafana and build a dashboard that uses metrics and traces together
  • Walk away with a repeatable debugging framework you can apply to your own systems immediately after the workshop

Office hours held each week — bring your questions, your environment, and your toughest problems.

What's Included?

  • Live sessions

    Learn in real-time, interactive sessions. Ask questions in context, work through problems as they come up, and adapt the material to your team's specific environment. This is not a pre-recorded course you watch alone.

  • Access to session recordings

    Every session is recorded so you can revisit at your own pace. The recordings don't expire, and you'll keep access to all course materials forever.

  • Private cohort community

    A dedicated Slack community with your cohort of industrial engineers, operations managers, and automation professionals. Share progress, ask questions between sessions, and build connections with peers who are solving the same problems in the same kinds of environments.

  • Pre-configured Docker images

    A ready-to-run observability stack — OpenTelemetry Collector, storage backend, and visualization tool — preconfigured and ready to deploy. Your team spends the workshop learning and building, not fighting infrastructure setup.

  • Instrumentation code examples

    Working code examples for common industrial protocols and system types, so you're not starting from scratch when adapting the patterns to your own environment. Use them as a starting point or a reference — they're yours to keep.

  • Written course materials

    Reference guides, architecture diagrams, and implementation checklists your team can keep using after the workshop ends. Not just slides — structured material designed to be useful the day after the final session, when you're extending the system on your own.


Apply for Free Access

Fill this out and we'll be in touch within 24 hours.

5 founding spots remaining — free for early applicants