Open-source Patent

DigiMorph is building a Decentralized, Open Innovation Ecosystem

"A Computing Task Allocation System and Method for Distributed Inference in Large Language Models."

This patent focuses on optimizing the inference process of large language models (LLMs) on mobile devices by utilizing distributed task allocation, model segmentation, and linear optimization techniques. The goal is to enable efficient, low-latency, and cost-effective inference across multiple edge devices.

This patent is based on the February 2024 prototype and experimental results from the DigiMorph Lab team, demonstrating real-world feasibility and performance improvements in decentralized AI inference. The core content of the patent is open-sourced in this repository, while the full patent was filed in Australia, the United States, and China in 2024.

The Value of This Patent for DigiMorph

As one of the foundational technologies for DigiMorph, this patent plays a critical role in shaping the platform’s approach to AI-driven digital embodiment and decentralized intelligence.

🔹 Empowering Personal Digital Embodiments

  • Enables low-latency, personalized AI agents to operate on distributed devices without cloud dependency.

🔹 Decentralized AI Infrastructure

  • Aligns with DigiMorph’s Web3 and decentralized AI vision, reducing costs and improving scalability.

🔹 Integration with Web2 & Web3 Ecosystems

  • Supports AI agents functioning across social media, gaming, blockchain-based identities, and dApps.

🔹 AI Computation Monetization

  • With DigiMorph’s Proof of Interaction mechanism, distributed AI inference could be monetized via decentralized staking, AI compute credits, or token rewards.

This patent serves as a cornerstone for DigiMorph’s AI Agent evolution, offering a scalable, decentralized, and intelligent infrastructure to support multi-agent collaboration, real-world AI autonomy, and next-gen digital intelligence.

Key Innovations

✅ Distributed Task Allocation for LLMs

  • Instead of executing LLMs on a single device, this method distributes inference tasks across multiple mobile devices, improving efficiency and reducing latency.

✅ Advanced Network-Aware Task Scheduling

  • Implements Mixed-Integer Linear Programming (MILP) for optimal model partitioning, considering:

    • Model FLOPs, memory constraints, device processing capabilities

    • Network conditions such as bandwidth, jitter, and packet loss

    • Binomial distribution modeling for packet loss rate
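
  To illustrate the objective such a MILP minimizes, here is a minimal brute-force sketch of the same trade-off: contiguous layer segments are assigned to devices so that compute time plus loss-adjusted transfer time is minimized, subject to per-device memory budgets. All device, link, and layer numbers below are hypothetical illustrations, not values from the patent, and the exhaustive search stands in for a real MILP solver.

  ```python
  from itertools import combinations

  # Hypothetical per-layer compute cost (GFLOPs) and memory footprint (MB)
  LAYER_FLOPS = [2.0, 2.0, 2.0, 2.0, 2.0, 2.0]   # six transformer blocks
  LAYER_MEM_MB = 400.0
  ACT_SIZE_MB = 4.0                               # activations handed between segments

  # Hypothetical device profiles: (GFLOP/s, memory budget in MB)
  DEVICES = [(8.0, 3000), (4.0, 2000), (6.0, 2500)]

  # Hypothetical links between consecutive devices: (bandwidth MB/s, packet loss prob.)
  LINKS = [(10.0, 0.02), (5.0, 0.05)]

  def expected_transfer_s(size_mb, bandwidth, loss_p):
      """Expected transfer time when each packet is an independent Bernoulli
      trial (binomial loss model): a packet needs 1/(1 - p) sends on average."""
      return size_mb / bandwidth / (1.0 - loss_p)

  def fits_memory(cuts):
      """Check each segment against its device's memory budget."""
      bounds = [0, *cuts, len(LAYER_FLOPS)]
      return all((bounds[i + 1] - bounds[i]) * LAYER_MEM_MB <= DEVICES[i][1]
                 for i in range(len(DEVICES)))

  def plan_latency(cuts):
      """Latency of one pipeline pass for a given pair of cut points."""
      bounds = [0, *cuts, len(LAYER_FLOPS)]
      total = 0.0
      for i, (speed, _mem) in enumerate(DEVICES):
          total += sum(LAYER_FLOPS[bounds[i]:bounds[i + 1]]) / speed  # compute time
          if i < len(LINKS):                                          # hand-off cost
              bw, p = LINKS[i]
              total += expected_transfer_s(ACT_SIZE_MB, bw, p)
      return total

  # Exhaustive search over contiguous 3-way splits (stand-in for the MILP solver)
  best = min((c for c in combinations(range(1, len(LAYER_FLOPS)), len(DEVICES) - 1)
              if fits_memory(c)),
             key=plan_latency)
  print(best, round(plan_latency(best), 3))
  ```

  The search keeps most layers on the fastest device because transfer costs here are independent of the cut; with heterogeneous activation sizes the MILP's link-quality terms would shift the cuts toward the cheaper links.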

✅ Enhanced ONNX-Based Adaptive Model Deployment

  • Provides an end-to-end process for model segmentation, conversion, and execution on Android devices without relying on high-performance GPUs.
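
  One way to picture the segmentation step: the scheduler's cut points must be mapped to sub-model input/output tensor boundaries before each segment can be extracted (e.g. with `onnx.utils.extract_model`) and shipped to a device. The tensor naming scheme (`block_i_out`, `tokens`, `logits`) below is a hypothetical convention, not the patent's actual graph names.

  ```python
  def segment_boundaries(num_layers, cuts):
      """Map contiguous cut points to (input_tensors, output_tensors) per segment,
      assuming each block i exposes one activation tensor named 'block_i_out'."""
      bounds = [0, *cuts, num_layers]
      segments = []
      for i in range(len(bounds) - 1):
          inp = "tokens" if bounds[i] == 0 else f"block_{bounds[i] - 1}_out"
          out = "logits" if bounds[i + 1] == num_layers else f"block_{bounds[i + 1] - 1}_out"
          segments.append(([inp], [out]))
      return segments

  segs = segment_boundaries(6, (4, 5))
  # Each segment could then be extracted from the full ONNX graph, e.g.:
  # import onnx.utils
  # for i, (ins, outs) in enumerate(segs):
  #     onnx.utils.extract_model("model.onnx", f"segment_{i}.onnx", ins, outs)
  print(segs)
  ```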

✅ Robustness in Weak Network Conditions

  • Unlike traditional approaches that assume high-bandwidth, low-latency environments, this method incorporates packet loss modeling and jitter adaptation, ensuring reliable performance even in constrained network settings.

✅ Optimized Communication Overhead

  • Introduces a network link quality penalty term to balance computation and communication trade-offs.

  • Implements effective payload calculations for communication protocols to enhance transmission efficiency.
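
  A minimal sketch of the kind of calculation these two bullets describe, under assumed IPv4/UDP framing: usable goodput strips per-packet header overhead and discounts for binomially modeled packet loss, and a per-link cost combines transfer time with a jitter term. The header sizes, transport choice, and cost formula are illustrative assumptions, not the patent's exact model.

  ```python
  MTU = 1500            # bytes on the wire per packet
  HEADERS = 20 + 8      # assumed IPv4 (20 B) + UDP (8 B) headers

  def effective_goodput(bandwidth_bps, loss_p, mtu=MTU, headers=HEADERS):
      """Usable payload rate: remove per-packet header overhead, then discount
      by the fraction of packets surviving independent (binomial) loss."""
      payload_ratio = (mtu - headers) / mtu
      return bandwidth_bps * payload_ratio * (1.0 - loss_p)

  def link_cost_s(size_bytes, bandwidth_bps, loss_p, jitter_ms):
      """Hypothetical link-quality penalty: time to move the activations at the
      effective goodput, plus a jitter allowance."""
      return size_bytes * 8 / effective_goodput(bandwidth_bps, loss_p) + jitter_ms / 1000.0

  # Example: 10 Mbps link, 2% loss, 5 ms jitter, 1 MB of activations
  print(round(link_cost_s(1_000_000, 10e6, 0.02, 5.0), 4))
  ```

  Penalizing cuts by this cost lets the scheduler keep a model segment local when the link is lossy or jittery, even if a remote device has spare compute.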

More Information: https://github.com/DigiMorphLab/distributed-llm-inference
