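Fixing intermittent 502s for a LitServe service behind ALB by setting a suitable idle (keep-alive) timeout
Problem
An ALB on Alibaba Cloud occasionally logs a 502 from the backend, even though request_time and upstream_response_time are both very short.
Call chain: client -> alb -> backend_server
Analysis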
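## 1. By default, uvicorn uses timeout_keep_alive=5
LitServe serves requests through uvicorn, which closes an idle keep-alive connection after 5 seconds by default. ALB keeps its backend connections open for longer than that, so it can forward a request on a connection the backend has already closed. The request fails immediately, which matches a 502 with very short timings.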
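import litserve as ls

if __name__ == "__main__":
    # 12+ features like batching, streaming, etc...
    # InferencePipeline, MODEL_NAME and WORKERS_PER_DEVICE are defined elsewhere in the service
    server = ls.LitServer(InferencePipeline(max_batch_size=1, api_path=f'/{MODEL_NAME}'), accelerator="gpu", workers_per_device=WORKERS_PER_DEVICE, devices=[0])
    server.run(port=5001)  # no timeout_keep_alive passed, so uvicorn's default of 5 applies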
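
The failure window can be modeled with a small sketch. The function name `reuse_hits_closed_connection` and its parameters are hypothetical, and the model is deliberately simplified: it ignores network latency and only compares how long a connection has been idle against the two timeouts.

```python
def reuse_hits_closed_connection(idle_s: float,
                                 backend_keep_alive_s: float,
                                 lb_idle_timeout_s: float) -> bool:
    """Return True if the load balancer would reuse a connection that the
    backend has already closed (simplified model, no network latency)."""
    backend_closed = idle_s >= backend_keep_alive_s   # backend gave up on the idle connection
    lb_still_reuses = idle_s < lb_idle_timeout_s      # load balancer still considers it usable
    return backend_closed and lb_still_reuses


# uvicorn default (5 s) behind a 15 s load-balancer idle timeout, connection idle for 8 s
print(reuse_hits_closed_connection(8, backend_keep_alive_s=5, lb_idle_timeout_s=15))   # prints True
# backend raised to 25 s: the load balancer always closes first
print(reuse_hits_closed_connection(8, backend_keep_alive_s=25, lb_idle_timeout_s=15))  # prints False
```

With the backend timeout below the load balancer's, any idle period between the two values is a candidate for a 502; with the backend timeout above it, that window is empty.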
## 2. Set the idle keep-alive timeout: ALB defaults to 15s, so the backend must use a value greater than 15 (timeout_keep_alive=25)
server.run(port=5001, timeout_keep_alive=25)
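
To keep the two values from drifting apart later, the rule "backend keep-alive must exceed the ALB idle timeout" can be checked at startup. This is a minimal sketch: `ALB_IDLE_TIMEOUT_S` and `checked_keep_alive` are hypothetical names, and the 15 s value is the one quoted in this note, so confirm it against the actual ALB configuration.

```python
ALB_IDLE_TIMEOUT_S = 15  # assumption: value quoted in this note, verify on your ALB


def checked_keep_alive(backend_keep_alive_s: int,
                       lb_idle_timeout_s: int = ALB_IDLE_TIMEOUT_S) -> int:
    """Return the backend keep-alive value, or raise if it does not
    strictly exceed the load balancer's idle timeout."""
    if backend_keep_alive_s <= lb_idle_timeout_s:
        raise ValueError(
            f"timeout_keep_alive={backend_keep_alive_s} must be greater than "
            f"the load balancer idle timeout ({lb_idle_timeout_s})"
        )
    return backend_keep_alive_s


timeout_keep_alive = checked_keep_alive(25)
# The validated value is then passed exactly as in the line above:
# server.run(port=5001, timeout_keep_alive=timeout_keep_alive)
```

Failing fast at startup is a design choice here: a misconfigured timeout otherwise only surfaces as rare 502s in the ALB logs.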