Technology

Physical intelligence

for manufacturing.

We are factory people who know AI

not the other way around.

Almetra builds AI systems that learn from real shopfloor behaviour — turning factory video into structured process understanding, operational insight, and automation readiness.

The technical core is not classic computer vision. It is long-horizon activity understanding, process mining from video, typed process graphs, large temporal-aware teacher models, teacher-student distillation, and edge-deployed intelligence in real factories.

THE TECHNICAL PIPELINE

From pixels to process graph

six transformation layers.

Manufacturing work unfolds over minutes. Almetra's systems model this temporal structure and convert it into a ground-truth representation of what is happening on the shopfloor.

LAYER 01

Pixels

Raw video

LAYER 02

Entities

People · tools · parts

LAYER 03

Actions

Steps · motions · waits

LAYER 04

Process Events

Cycles · bottlenecks

LAYER 05

Process Graph

Typed · auditable

LAYER 06

Insight & Action

CI · automation

Vision AI

Detection

Temporal model

Event extraction

Graph construction

Improvement & robotics

TECHNICAL STACK

Six transformation layers

from raw video to operational decisions.

LAYER

WHAT IT DOES

TECHNICAL SIGNAL

Pixels → Entities

Video becomes robust representations of people, hands, tools, parts, workstations, and scene regions.

Messy real-world perception under occlusion, lighting variation, and factory noise.

Entities → Actions

Temporal-aware models infer steps, motions, interactions, waiting, rework, and deviations over minutes.

Long-horizon video understanding beyond frame-level detection.

Actions → Events

Activity is mapped into cycle starts, value-add work, idle time, bottlenecks, and standard-work deviations.

ML outputs become manufacturing events — not labels for their own sake.

Events → Process Graph

A typed graph represents states, transitions, variants, dependencies, loops, SOP constraints, and failure modes.

Process intelligence, ontology design, auditability, and operational reasoning.

Graph → Insight

The system explains where time is lost, why output varies, which steps constrain capacity, and what to improve.

AI that directly affects production decisions.

Insight → Automation

Process context identifies automation candidates and gives robotic systems the task context they need.

Credible bridge from observation to robotic execution.

TECHNICAL CHALLENGES

The problems

our team works on.

LONG-HORIZON VIDEO

Modelling work over minutes, not frames

Manufacturing processes unfold over minutes with temporal dependencies, action segmentation, and process state that frame-level detection cannot capture.
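To illustrate why segments, rather than individual frames, are the useful unit here, the following is a minimal sketch of turning per-frame action labels into temporal segments. It is a simple post-processing illustration and not Almetra's temporal model; the names `Segment`, `to_segments`, and `absorb_short_segments`, and the `min_len` threshold, are assumptions made for this example.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Segment:
    label: str
    start: int  # first frame index, inclusive
    end: int    # last frame index, exclusive


def to_segments(frame_labels):
    """Collapse runs of identical per-frame labels into segments."""
    segments = []
    pos = 0
    for label, run in groupby(frame_labels):
        length = sum(1 for _ in run)
        segments.append(Segment(label, pos, pos + length))
        pos += length
    return segments


def absorb_short_segments(segments, min_len):
    """Merge segments shorter than min_len frames into the preceding segment,
    then rejoin neighbours that end up sharing a label.

    A short segment at the very start is kept, since it has no predecessor.
    """
    merged = []
    for seg in segments:
        is_short = (seg.end - seg.start) < min_len
        if merged and (is_short or merged[-1].label == seg.label):
            prev = merged[-1]
            merged[-1] = Segment(prev.label, prev.start, seg.end)
        else:
            merged.append(seg)
    return merged


# Hypothetical frame-level output with a one-frame "idle" flicker.
frames = ["pick"] * 5 + ["idle"] + ["pick"] * 4 + ["fasten"] * 6
steps = absorb_short_segments(to_segments(frames), min_len=3)
```

In this toy input the single "idle" frame is absorbed, leaving one "pick" segment followed by one "fasten" segment. A learned temporal model would make this decision from context over minutes rather than from a fixed frame threshold.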

TEACHER-STUDENT DISTILLATION

Compressing large models for edge deployment

Transferring the capabilities of large temporal teacher models into compact student models that run close to the production line — under real factory latency and bandwidth constraints.
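As a rough illustration of the general technique, the sketch below computes a standard temperature-softened distillation loss: the KL divergence between the teacher's and the student's softened class distributions, scaled by the squared temperature as is common practice. It is written in plain Python for readability and is not a description of Almetra's training setup; the function names, the example logits, and the temperature value are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    A higher temperature flattens both distributions, so the student also
    learns from the teacher's relative ranking of the less likely classes.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


# Hypothetical logits over three action classes for a single clip.
teacher = [2.0, 0.5, -1.0]
student = [0.1, 0.2, 0.3]
loss = distillation_loss(teacher, student)
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge. In practice it is usually combined with a supervised loss on labelled data.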

VIDEO-TO-PROCESS GRAPHS

Reliable process events from noisy shopfloor video

Converting raw factory video into typed process graphs with auditable transitions, dependencies, and failure modes — structured enough for operational reasoning and automation decisions.
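One way to picture a typed, auditable process graph is sketched below. Every state carries a type, every transition carries the video time span that supports it, and invalid edges are rejected when they are added. The schema is hypothetical: `StateType`, `State`, `Transition`, `ProcessGraph`, and the rule that failure states are terminal are all assumptions chosen for the example, not Almetra's ontology.

```python
from dataclasses import dataclass, field
from enum import Enum


class StateType(Enum):
    WORK_STEP = "work_step"
    WAIT = "wait"
    REWORK = "rework"
    FAILURE = "failure"


@dataclass(frozen=True)
class State:
    name: str
    type: StateType


@dataclass(frozen=True)
class Transition:
    source: str
    target: str
    evidence: tuple  # (start_s, end_s) span of video supporting this edge


@dataclass
class ProcessGraph:
    states: dict = field(default_factory=dict)
    transitions: list = field(default_factory=list)

    def add_state(self, state):
        self.states[state.name] = state

    def add_transition(self, source, target, evidence):
        # Reject edges that reference states the graph does not know about.
        for name in (source, target):
            if name not in self.states:
                raise KeyError(f"unknown state: {name}")
        # Assumed typing rule for this sketch: failure states are terminal.
        if self.states[source].type is StateType.FAILURE:
            raise ValueError("failure states have no outgoing transitions")
        self.transitions.append(Transition(source, target, evidence))

    def successors(self, name):
        return sorted({t.target for t in self.transitions if t.source == name})


graph = ProcessGraph()
for state in (
    State("pick", StateType.WORK_STEP),
    State("fasten", StateType.WORK_STEP),
    State("wait_for_part", StateType.WAIT),
    State("jam", StateType.FAILURE),
):
    graph.add_state(state)

graph.add_transition("pick", "fasten", evidence=(0.0, 4.2))
graph.add_transition("pick", "wait_for_part", evidence=(30.0, 31.5))
graph.add_transition("fasten", "jam", evidence=(52.0, 55.0))
```

Because each edge keeps its evidence span, any transition can be traced back to the footage that produced it.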

CROSS-FACTORY ADAPTATION

Adapting across factories with limited labels

Generalising across factories, workstations, operators, camera viewpoints, and product variants without requiring exhaustive annotation at each new deployment.

CYCLE MATERIALISATION

Deriving cycles from dense workstep streams

Extracting clean cycles and per-product bookkeeping from continuous, dense workstep streams — replacing brittle counting heuristics with reliable structured event data.
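A minimal sketch of the idea, assuming each cycle opens with one known step and closes with another: the function below groups a time-ordered workstep stream into cycles and drops anything incomplete. The names `WorkStep`, `Cycle`, and `materialise_cycles`, the step labels, and the open/close rule are assumptions for illustration. Real lines with overlapping products or variable step orders need more than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkStep:
    name: str
    start_s: float
    end_s: float


@dataclass(frozen=True)
class Cycle:
    start_s: float
    end_s: float
    steps: tuple

    @property
    def duration_s(self):
        return self.end_s - self.start_s


def materialise_cycles(steps, first_step, last_step):
    """Group a time-ordered step stream into cycles.

    A cycle opens at first_step and closes at the next last_step. A repeated
    first_step restarts the open cycle, steps seen outside an open cycle are
    ignored, and a cycle still open at the end of the stream is dropped.
    Assumes first_step and last_step are different step names.
    """
    cycles = []
    current = None
    for step in steps:
        if step.name == first_step:
            current = [step]
        elif current is not None:
            current.append(step)
            if step.name == last_step:
                names = tuple(s.name for s in current)
                cycles.append(Cycle(current[0].start_s, step.end_s, names))
                current = None
    return cycles


# Hypothetical stream: one clean cycle, one aborted start, one clean cycle,
# and a final cycle that is still open when the stream ends.
stream = [
    WorkStep("pick", 0.0, 4.0),
    WorkStep("fasten", 4.0, 10.0),
    WorkStep("inspect", 10.0, 12.0),
    WorkStep("pick", 15.0, 18.0),
    WorkStep("pick", 20.0, 23.0),
    WorkStep("fasten", 23.0, 30.0),
    WorkStep("inspect", 30.0, 33.0),
    WorkStep("pick", 40.0, 44.0),
]
cycles = materialise_cycles(stream, first_step="pick", last_step="inspect")
```

Here the stream yields two complete cycles. The aborted pick at 15 s and the unfinished cycle at 40 s are excluded rather than counted, which is the kind of case a simple counter tends to get wrong.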

AUTOMATION READINESS

Which tasks are worth automating — and when

Determining which human tasks are economically meaningful, technically feasible, and operationally safe to automate — grounded in real process data, not lab conditions.

INFRASTRUCTURE & RESEARCH

Credibility grounded

in systems, research, and production deployment.

TRAINING INFRASTRUCTURE

NVIDIA B300 Blackwell-class accelerators

Self-hosted, high-performance infrastructure for training and adapting large temporal models on factory video at scale.

ACADEMIC RESEARCH

Prof. Jürgen Gall, University of Bonn · Prof. Rudolph Lioutikov, KIT

Academic advisors and collaborators in computer vision and robotics — bringing frontier research depth to production deployment challenges.

ROBOTICS HARDWARE

Universal Robots · Franka arms · GELLO teleoperation

Industrial automation work on UR platforms and manipulation research on Franka arms with GELLO-style teleoperation for skill collection and task learning.

DATA ADVANTAGE

Thousands of hours of real factory video at scale

Almetra observes real manufacturing work across hundreds of deployed processes — creating a rare dataset for temporal video understanding, process graph construction, and automation-readiness analysis.

Temporal video, process graphs, edge AI, and robotics

in production, not a lab.

We are building physical intelligence infrastructure for manufacturing. If temporal modelling, teacher-student distillation, or video-to-process-graph problems sound like the right frontier, we'd like to hear from you.

See what's happening on your manual line

from day one.

Request a demo or start with a factory assessment. Most teams see their first real-time insights within two weeks.
