Product architecture

Three tiers, one stack · SDK + decision brain + domestic edition

An inference SDK at the bottom, a decision brain in the middle, compliance audit on top — one codebase, four silicon backends, two delivery editions.

Three tiers, each in its lane

Hardware abstraction · write once, run anywhere

Inference SDK

We collapse the inference backend behind one interface. Your business code only calls predict; underneath we switch between NVIDIA, Huawei Ascend, Hygon, and CPU with a single environment variable.

  • Vision detection in 50 ms
  • Vision LLM review in 200 ms
  • Zero-edit adaptation to domestic chips
  • CPU runs in CI environments
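The single-variable backend switch described above might look like the sketch below. The variable name FACTORYOS_BACKEND and the backend identifiers are illustrative assumptions, not the SDK's confirmed interface:

```python
import os

# Backends the SDK is described as supporting (identifiers are assumptions)
SUPPORTED = ("cuda", "ascend", "hygon", "cpu")

def pick_backend() -> str:
    """Return the backend named by the environment, falling back to CPU.

    CPU fallback is what lets the same code run in CI environments
    with no accelerator present.
    """
    choice = os.environ.get("FACTORYOS_BACKEND", "cpu").lower()
    return choice if choice in SUPPORTED else "cpu"
```

Business code never branches on the backend itself; only the deployment environment changes.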

See, judge, record — all in one closed loop

Decision brain

High-confidence samples take the fast path; low-confidence ones go to vision LLM review; compliance rules act as a hard gate; humans can step in when needed; outcomes write to audit storage in milliseconds.

  • A reject / recheck / release verdict in under 300 ms
  • Every decision carries a written rationale
  • GMP allow/deny rules as a hard gate
  • Human-in-the-loop pause and resume
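The routing above can be sketched as a simple decision ladder. Thresholds, the deny-list labels, and the function name are illustrative assumptions, not the product's actual API:

```python
# Hedged sketch of the decision brain's routing; all names and
# numbers below are examples, not values from the product.
FAST_PATH_THRESHOLD = 0.90   # high confidence: release directly
REVIEW_THRESHOLD = 0.50      # below this: pause for a human

# GMP deny rules act as a hard gate regardless of model confidence
GMP_DENYLIST = {"foreign_particle", "cracked_vial"}

def decide(label: str, score: float) -> str:
    # 1. Compliance hard gate comes first, overriding any confidence score.
    if label in GMP_DENYLIST:
        return "reject"
    # 2. High-confidence samples take the fast path.
    if score >= FAST_PATH_THRESHOLD:
        return "release"
    # 3. Mid-confidence samples go to vision-LLM recheck.
    if score >= REVIEW_THRESHOLD:
        return "recheck"
    # 4. Everything else pauses for human-in-the-loop review.
    return "human_review"
```

In the real system each verdict would also be written to audit storage with its rationale; the sketch shows only the routing order.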

China-native · MLPS Level 3 ready

Domestic stack

Ships with Kylin V10 SP3 ARM64 images, Huawei Ascend 910B / 310B inference, Dameng / OceanBase databases, and one-way diode friendliness — no outbound dependencies.

  • Native Kylin OS image
  • One-flag switch to Ascend inference
  • Domestic database adaptation
  • MLPS Level 3 documentation included

Integrate in two lines of code

Switch backends between NVIDIA, Ascend, Hygon, and CPU without touching business logic.

from factoryos_inference import InferenceClient, InferenceInput
import numpy as np

# Auto-detect the best backend (NVIDIA / Huawei Ascend / Hygon DCU / generic CPU)
client = InferenceClient()
client.load_model("yolo_defect_v3")

# Build the inference input
image = np.random.rand(1, 3, 640, 640).astype(np.float32)
inp = InferenceInput(data=image)

# Run inference
out = client.predict("yolo_defect_v3", inp)
print(f"backend={out.backend.value} latency={out.latency_ms:.1f}ms")
for det in out.detections:
    print(f"  {det.label}: {det.score:.3f} @ {det.bbox}")

Electronic batch records · one printable page

Inspectors can pull any vial and trace the AI evidence chain end to end.

  • SHA-256 digital signatures, tamper-evident
  • Audit trail with end-to-end timestamps
  • Electronic signature, dual compliance
  • Written rationale plus original-image evidence
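The tamper-evident seal on a batch record can be illustrated with the standard library alone. This is a minimal sketch assuming a JSON-serializable record; the field names (vial, verdict, sealed_at) are hypothetical, not the product's schema:

```python
import hashlib
import json
from datetime import datetime, timezone

def seal_record(record: dict) -> dict:
    """Timestamp a record and attach a SHA-256 digest over its contents."""
    record = dict(record, sealed_at=datetime.now(timezone.utc).isoformat())
    payload = json.dumps(record, sort_keys=True).encode()
    record["sha256"] = hashlib.sha256(payload).hexdigest()
    return record

def verify_record(record: dict) -> bool:
    """Recompute the digest; any edit after sealing makes this return False."""
    claimed = record.get("sha256")
    body = {k: v for k, v in record.items() if k != "sha256"}
    payload = json.dumps(body, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest() == claimed
```

An inspector pulling a vial would recompute the digest the same way; a mismatch proves the record was altered after signing.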

Deployment · isolated OT network

Data never leaves the plant. The IT and OT networks are physically separated, and a one-way data diode permits inbound traffic only.

  • 50 ms · Vision inference
  • 200 ms · VLM recheck
  • 300 ms · Decision output
  • 0.05% · Miss rate