Open-source Patent
DigiMorph is building a Decentralized, Open Innovation Ecosystem
"A Computing Task Allocation System and Method for Distributed Inference in Large Language Models."
This patent focuses on optimizing the inference process of large language models (LLMs) on mobile devices using distributed task allocation, model segmentation, and linear optimization techniques. The goal is to enable efficient, low-latency, and cost-effective inference across multiple edge devices.
This patent is based on the February 2024 prototype and experimental results from the DigiMorph Lab team, demonstrating real-world feasibility and performance improvements in decentralized AI inference. The core content of the patent is open-sourced in this repository, while the full patent was filed in Australia, the United States, and China in 2024.
The Value of This Patent for DigiMorph
As one of the foundational technologies for DigiMorph, this patent plays a critical role in shaping the platform’s approach to AI-driven digital embodiment and decentralized intelligence.
🔹 Empowering Personal Digital Embodiments
Enables low-latency, personalized AI agents to operate on distributed devices without cloud dependency.
🔹 Decentralized AI Infrastructure
Aligns with DigiMorph’s Web3 and decentralized AI vision, reducing costs and improving scalability.
🔹 Integration with Web2 & Web3 Ecosystems
Supports AI agents functioning across social media, gaming, blockchain-based identities, and dApps.
🔹 AI Computation Monetization
With DigiMorph’s Proof of Interaction mechanism, distributed AI inference could be monetized via decentralized staking, AI compute credits, or token rewards.
This patent serves as a cornerstone for DigiMorph’s AI Agent evolution, offering a scalable, decentralized, and intelligent infrastructure to support multi-agent collaboration, real-world AI autonomy, and next-gen digital intelligence.
Key Innovations
✅ Distributed Task Allocation for LLMs
Instead of executing LLMs on a single device, this method distributes inference tasks across multiple mobile devices, improving efficiency and reducing latency.
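As a minimal sketch of the idea (not the patented allocator; the `Device` class and `assign_layers` helper are hypothetical names), consecutive layers can be packed onto devices as pipeline stages, with the MILP formulation below replacing this greedy pass with an optimal placement:

```python
from dataclasses import dataclass

@dataclass
class Device:
    name: str
    memory_bytes: int  # RAM available for weights and activations

def assign_layers(layer_bytes, devices):
    """Greedily pack consecutive layers onto devices as pipeline stages,
    opening a new stage whenever the current device's memory would overflow."""
    assignment, dev_idx, used = [], 0, 0
    for size in layer_bytes:
        if used + size > devices[dev_idx].memory_bytes:
            dev_idx += 1  # current device is full: move to the next stage
            used = 0
            if dev_idx >= len(devices):
                raise RuntimeError("model does not fit on the available devices")
        assignment.append(devices[dev_idx].name)
        used += size
    return assignment

# Illustrative: six 700 MiB layers split across two phones
devices = [Device("phone-a", 3 * 2**30), Device("phone-b", 2 * 2**30)]
print(assign_layers([700 * 2**20] * 6, devices))
# -> ['phone-a', 'phone-a', 'phone-a', 'phone-a', 'phone-b', 'phone-b']
```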
✅ Advanced Network-Aware Task Scheduling
Implements Mixed-Integer Linear Programming (MILP) for optimal model partitioning (see the sketch after this list), considering:
Model FLOPs, memory constraints, and device processing capabilities
Network conditions such as bandwidth, jitter, and packet loss
Binomial distribution modeling for packet loss rate
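One plausible rendering of such a formulation, sketched with PuLP below: binary variables place each layer on a device, memory is a hard constraint, a makespan variable bounds per-device compute time, and cut variables charge a communication penalty whenever consecutive layers land on different devices. The four-layer toy data, device names, and single effective-bandwidth link are illustrative assumptions rather than the filed claims; jitter and binomially modeled packet loss would enter through the effective bandwidth, as in the loss-model sketch further below.

```python
import pulp

# Illustrative problem data (hypothetical numbers, not from the patent)
flops = [4e9, 4e9, 6e9, 6e9]             # per-layer compute cost (FLOPs)
mem   = [0.7e9, 0.7e9, 0.9e9, 0.9e9]     # per-layer weight size (bytes)
act   = [2e6, 2e6, 2e6]                  # activation bytes at each layer boundary
speed = {"phone-a": 2e11, "phone-b": 1e11}    # device throughput (FLOP/s)
ram   = {"phone-a": 2.0e9, "phone-b": 1.5e9}  # device memory (bytes)
bw    = 1e7  # effective link bandwidth (bytes/s), already discounted for loss

L, D = range(len(flops)), list(speed)
prob = pulp.LpProblem("llm_partition", pulp.LpMinimize)
x = pulp.LpVariable.dicts("x", (L, D), cat="Binary")       # layer -> device
cut = pulp.LpVariable.dicts("cut", range(len(act)), cat="Binary")
T = pulp.LpVariable("makespan", lowBound=0)

for l in L:  # each layer is placed on exactly one device
    prob += pulp.lpSum(x[l][d] for d in D) == 1
for d in D:  # hard memory limit; compute time bounded by the makespan
    prob += pulp.lpSum(x[l][d] * mem[l] for l in L) <= ram[d]
    prob += pulp.lpSum(x[l][d] * flops[l] / speed[d] for l in L) <= T
for l in range(len(act)):  # cut = 1 iff a boundary crosses devices
    for d in D:
        prob += cut[l] >= x[l][d] - x[l + 1][d]
        prob += cut[l] >= x[l + 1][d] - x[l][d]

# Objective: slowest device's compute time plus link-transfer penalty
prob += T + pulp.lpSum(cut[l] * act[l] / bw for l in range(len(act)))

prob.solve(pulp.PULP_CBC_CMD(msg=False))
for l in L:
    print(l, [d for d in D if pulp.value(x[l][d]) > 0.5])
```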
✅ Enhanced ONNX-Based Adaptive Model Deployment
Provides an end-to-end process for model segmentation, conversion, and execution on Android devices without relying on high-performance GPUs.
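A hedged sketch of the segmentation step using the stock `onnx.utils.extract_model` API (file paths and tensor names such as `input_ids`, `logits`, and the cut tensor are placeholders to adapt to your own exported graph):

```python
import onnx
from onnx.utils import extract_model

# Validate the exported model before cutting it
model = onnx.load("llm.onnx")
onnx.checker.check_model(model)

# Cut the graph at an internal activation tensor; the name below is a
# placeholder, so inspect your own graph (e.g. with Netron) for real names
CUT = "hidden_states_4"
extract_model("llm.onnx", "llm_part1.onnx", ["input_ids"], [CUT])
extract_model("llm.onnx", "llm_part2.onnx", [CUT], ["logits"])
```

Each sub-model can then run on-device through ONNX Runtime's Android package, with the cut tensor serialized between devices.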
✅ Robustness in Weak Network Conditions
Unlike traditional approaches that assume high-bandwidth, low-latency environments, this method incorporates packet loss modeling and jitter adaptation, ensuring reliable performance even in constrained network settings.
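As a numeric sketch of why binomial loss modeling matters for scheduling (the retransmission scheme and all constants below are illustrative assumptions, not the patent's exact formulas):

```python
from math import comb

def p_delivery(n_packets: int, loss: float, max_retx: int) -> float:
    """Probability a burst of n packets is fully delivered when each packet
    is lost independently with probability `loss` and may be resent up to
    `max_retx` times (a packet fails only if every attempt is lost)."""
    p_pkt_fail = loss ** (1 + max_retx)
    return (1 - p_pkt_fail) ** n_packets

def expected_transmissions(n_packets: int, loss: float) -> float:
    """Expected total sends with unlimited retransmission: each packet is a
    geometric trial with success probability (1 - loss)."""
    return n_packets / (1 - loss)

def p_at_most_k_lost(n: int, k: int, loss: float) -> float:
    """Binomial tail: P(at most k of n packets are lost in a single round)."""
    return sum(comb(n, i) * loss**i * (1 - loss)**(n - i) for i in range(k + 1))

# Illustrative: a 64-packet activation transfer over a 2% loss link
print(p_delivery(64, 0.02, max_retx=2))    # ~0.9995
print(expected_transmissions(64, 0.02))    # ~65.3
print(p_at_most_k_lost(64, 1, 0.02))       # ~0.63
```

A scheduler can use such probabilities to discount a link's effective bandwidth, or to avoid cutting the model across a lossy hop in the first place.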
✅ Optimized Communication Overhead
Introduces a network link quality penalty term to balance computation and communication trade-offs.
Implements effective payload calculations for communication protocols to enhance transmission efficiency (both ideas are sketched below).
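One way these two ideas could look in code (header sizes, penalty weights, and the penalty's functional form are hypothetical; the exact terms are in the filed text):

```python
def effective_payload_ratio(mtu: int = 1500, headers: int = 40) -> float:
    """Fraction of each packet carrying application data; 40 bytes assumes
    IPv4 + TCP headers (illustrative, not taken from the patent)."""
    return (mtu - headers) / mtu

def goodput(bandwidth_bps: float, loss: float,
            mtu: int = 1500, headers: int = 40) -> float:
    """Usable application throughput after header overhead and packet loss."""
    return bandwidth_bps * effective_payload_ratio(mtu, headers) * (1 - loss)

def link_penalty(latency_s: float, jitter_s: float, loss: float,
                 w_lat: float = 1.0, w_jit: float = 2.0, w_loss: float = 50.0) -> float:
    """One possible link-quality penalty: a weighted sum the MILP objective
    can charge per cut edge (weights are hypothetical)."""
    return w_lat * latency_s + w_jit * jitter_s + w_loss * loss

# Illustrative: a 20 Mbit/s Wi-Fi link with 30 ms latency, 5 ms jitter, 1% loss
print(goodput(20e6, loss=0.01))            # ~1.93e7 bps of usable payload
print(link_penalty(0.03, 0.005, 0.01))     # per-cut penalty for the optimizer
```

The `link_penalty` value is the kind of per-cut cost a MILP objective like the one sketched above can charge, while `goodput` gives the effective bandwidth used in its transfer-time terms.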