Product architecture
An inference SDK at the bottom, a decision brain in the middle, compliance audit on top — one codebase, four silicon backends, two delivery editions.
Hardware abstraction · write once, run anywhere
We collapse the inference backend behind a single interface. Your business code only calls `predict`; underneath we switch among NVIDIA, Huawei Ascend, Hygon, and CPU backends with a single environment variable.
See, judge, record — all in one closed loop
High-confidence samples take the fast path; low-confidence samples are escalated to vision-LLM review; compliance rules act as a hard gate; humans can step in when needed; and every outcome is written to audit storage within milliseconds.
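The routing above can be sketched as a small dispatcher. This is an illustrative sketch only: the confidence threshold, the `vlm_review` hook, and the audit-record layout are assumptions for the example, not the product API.

```python
from dataclasses import dataclass, field
import time

@dataclass
class Decision:
    verdict: str          # "pass", "reject", or "escalate"
    path: str             # which branch produced the verdict
    audit: dict = field(default_factory=dict)

# Hypothetical threshold -- not part of the real SDK.
FAST_PATH_CONF = 0.90

def compliance_gate(label: str) -> bool:
    # Hard gate: certain defect classes are always rejected outright.
    return label not in {"critical_contamination"}

def route(label: str, score: float, vlm_review=None) -> Decision:
    """See -> judge -> record: route one sample through the closed loop."""
    ts = time.time()
    if not compliance_gate(label):
        d = Decision("reject", "compliance_gate")
    elif score >= FAST_PATH_CONF:
        d = Decision("pass" if label == "ok" else "reject", "fast_path")
    elif vlm_review is not None:
        d = Decision(vlm_review(label, score), "vlm_review")
    else:
        d = Decision("escalate", "human_review")
    # "Record": every outcome carries its own audit trail.
    d.audit = {"label": label, "score": score, "ts": ts, "path": d.path}
    return d
```

A 0.97-confidence "ok" sample passes on the fast path; a 0.6-confidence sample with no VLM hook wired in escalates to a human; a hard-gated defect class is rejected regardless of score.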
China-native · MLPS Level 3 ready
Ships with Kylin V10 SP3 ARM64 images, Huawei Ascend 910B / 310B inference, Dameng / OceanBase database support, and one-way data-diode compatibility, with no outbound network dependencies.
Switch backends between NVIDIA, Ascend, Hygon, and CPU without touching business logic.
```python
from factoryos_inference import InferenceClient, InferenceInput
import numpy as np

# Auto-detect the best backend (NVIDIA / Huawei Ascend / Hygon DCU / generic CPU)
client = InferenceClient()
client.load_model("yolo_defect_v3")

# Build the inference input
image = np.random.rand(1, 3, 640, 640).astype(np.float32)
inp = InferenceInput(data=image)

# Run inference
out = client.predict("yolo_defect_v3", inp)
print(f"backend={out.backend.value} latency={out.latency_ms:.1f}ms")
for det in out.detections:
    print(f"  {det.label}: {det.score:.3f} @ {det.bbox}")
```
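The single-environment-variable switch can be sketched as a tiny selection routine: honor an explicit override if one is set, otherwise probe backends in priority order. The variable name `FACTORYOS_BACKEND`, the probe order, and the `_is_available` stand-in are assumptions for illustration, not the SDK's actual internals.

```python
import os

# Hypothetical probe order -- NVIDIA first, CPU as the universal fallback.
_PROBE_ORDER = ["nvidia", "ascend", "hygon", "cpu"]

def _is_available(name: str) -> bool:
    # Stand-in for real device probing (driver / runtime checks).
    # In this sketch only the CPU backend reports as present.
    return name == "cpu"

def select_backend() -> str:
    """Honor an explicit override, otherwise probe in priority order."""
    override = os.environ.get("FACTORYOS_BACKEND")
    if override:
        if override not in _PROBE_ORDER:
            raise ValueError(f"unknown backend: {override}")
        return override
    for name in _PROBE_ORDER:
        if _is_available(name):
            return name
    raise RuntimeError("no inference backend available")
```

Because business code only ever sees the selected backend's shared interface, flipping the environment variable is the entire migration story between silicon vendors.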
Inspectors can pull any vial and trace the AI evidence chain end to end.
Data never leaves the plant: IT and OT networks are physically separated, and a one-way data diode permits inbound traffic only.
50 ms · Vision inference
200 ms · VLM recheck
300 ms · Decision output
0.05% · Miss rate