Fixing intermittent 502s for a litserve service behind ALB by setting a sensible idle (keep-alive) timeout

Problem

The ALB on Alibaba Cloud occasionally logs a 502 from the backend, even though both request_time and upstream_response_time are very short.

Call chain: client -> alb -> backend_server

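Analysis

The likely cause is a keep-alive mismatch: the backend closes idle connections sooner than the ALB does, so the ALB can occasionally reuse a connection that the backend has just closed, and it records the failed request as a 502.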

## 1. Default uvicorn: timeout_keep_alive=5
import litserve as ls

if __name__ == "__main__":
    # 12+ features like batching, streaming, etc...
    server = ls.LitServer(InferencePipeline(max_batch_size=1, api_path=f'/{MODEL_NAME}'), accelerator="gpu", workers_per_device=WORKERS_PER_DEVICE, devices=[0])
    # No timeout_keep_alive is passed, so uvicorn's default (5 seconds) applies.
    server.run(port=5001)
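
To illustrate why a short backend keep-alive can surface as an error at the load balancer, here is a minimal stdlib-only sketch. It is not ALB or uvicorn code: `KEEP_ALIVE_S`, `serve_one_connection`, and `reuse_after_idle` are hypothetical names, and the timeouts are scaled down to fractions of a second. The "backend" closes a connection once it has been idle for `KEEP_ALIVE_S`, and the "client" plays the role of a load balancer reusing a pooled connection after some idle time.

```python
import socket
import threading
import time

KEEP_ALIVE_S = 0.3  # assumption: stand-in for the backend's timeout_keep_alive
RESPONSE = b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\nConnection: keep-alive\r\n\r\nok"


def serve_one_connection(listener: socket.socket) -> None:
    """Answer requests on one connection; close it once it idles too long."""
    conn, _ = listener.accept()
    with conn:
        conn.settimeout(KEEP_ALIVE_S)
        while True:
            try:
                data = conn.recv(4096)
            except socket.timeout:
                return  # idle timeout reached: the backend closes the connection
            if not data:
                return  # peer closed first
            conn.sendall(RESPONSE)


def reuse_after_idle(idle_s: float) -> bool:
    """Return True if a second request on the same connection gets a reply."""
    request = b"GET / HTTP/1.1\r\nHost: demo\r\n\r\n"
    with socket.create_server(("127.0.0.1", 0)) as listener:
        port = listener.getsockname()[1]
        threading.Thread(
            target=serve_one_connection, args=(listener,), daemon=True
        ).start()
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.settimeout(2)
            client.sendall(request)
            client.recv(4096)  # first response arrives normally
            time.sleep(idle_s)  # connection sits idle, like a pooled LB connection
            try:
                client.sendall(request)
                return bool(client.recv(4096))  # empty read means the peer closed
            except OSError:
                return False  # reset or aborted: the backend had already closed
```

With these values, `reuse_after_idle(0.05)` should succeed because the connection is reused before the backend's idle timeout, while `reuse_after_idle(0.8)` should fail because the backend has already closed it. A load balancer in the second situation has no valid response to forward, which is consistent with a 502 logged alongside very short request times.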

## 2. Set the idle keep-alive timeout: the ALB default is 15s, so the backend must be > 15 (timeout_keep_alive=25)
server.run(port=5001, timeout_keep_alive=25)
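
The rule above can be written as a small sanity check to run at startup or in CI. This is only a sketch: `keep_alive_is_safe` and `ALB_IDLE_TIMEOUT_S` are hypothetical names, the 15s figure is the one quoted in this note (verify it against your own ALB settings), and the 5s safety margin is an assumption, not an ALB or uvicorn requirement.

```python
ALB_IDLE_TIMEOUT_S = 15  # value quoted in this note; check your ALB configuration


def keep_alive_is_safe(
    backend_keep_alive_s: float,
    lb_idle_timeout_s: float = ALB_IDLE_TIMEOUT_S,
    margin_s: float = 5.0,  # assumed safety margin, adjust as needed
) -> bool:
    """Return True if the backend keeps idle connections open longer than the LB."""
    return backend_keep_alive_s >= lb_idle_timeout_s + margin_s


if __name__ == "__main__":
    for candidate in (5, 25):
        print(candidate, keep_alive_is_safe(candidate))
```

Under these assumptions the uvicorn default of 5 fails the check and the value 25 used above passes it.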