Aegaeon: Effective GPU Pooling for Concurrent LLM Serving on the Market

Beida and Alibaba Cloud

Aegaeon has been beta-deployed in Alibaba Cloud Model Studio for over three months, and currently serves tens of models ranging from 1.8B to 72B parameters. It reduces the number of GPUs required to serve these models from 1,192 to 213, an 82% saving in GPU resources.