[model support] please support mamba-codestral-7B-v0.1 #1968
Comments
It already supports it. Use the mamba conv1d plugin.
Now we can support Mamba2 models with the HF Mamba2 config format: https://huggingface.co/state-spaces/mamba2-2.7b/blob/main/config.json. For mamba-codestral-7B-v0.1, you can create a new config.json from the existing params.json, matching the HF Mamba2 config format, and rename the tensors in the Codestral checkpoint to align with the HF Mamba2 checkpoints. Then it will work. We will have a fix to directly support the mamba-codestral-7B-v0.1 checkpoint soon.
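As a rough illustration of that comment, here is a minimal Python sketch that builds an HF-Mamba2-style config.json from Codestral's params.json. The params.json field names (dim, n_layers, vocab_size) and the paths are assumptions; compare against your actual params.json and the reference config linked above.

```python
# Hedged sketch: derive an HF-Mamba2-style config.json from Codestral's
# params.json. The params.json keys below are assumptions; verify them
# against your local file and the state-spaces/mamba2-2.7b reference config.
import json

with open("mamba-codestral-7B-v0.1/params.json") as f:
    params = json.load(f)

hf_config = {
    "architectures": ["Mamba2ForCausalLM"],
    "model_type": "mamba2",
    "hidden_size": params["dim"],             # assumed key in params.json
    "num_hidden_layers": params["n_layers"],  # assumed key
    "vocab_size": params["vocab_size"],       # assumed key
    # Copy the remaining Mamba2 hyperparameters (state_size, conv_kernel,
    # expand, n_groups, head_dim, ...) from params.json or the reference
    # config so the converter sees a complete HF Mamba2 config.
}

with open("mamba-codestral-7B-v0.1/config.json", "w") as f:
    json.dump(hf_config, f, indent=2)
```

Renaming the checkpoint tensors to the HF Mamba2 layout is a separate step and depends on the exact tensor names in the Codestral weights.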
We added a mamba-codestral-7B-v0.1 example in today's update. Please refer to https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/mamba and give it a try.
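For reference, the end-to-end flow in examples/mamba is roughly convert-then-build. The sketch below drives it from Python; the script name and flags (--model_dir, --output_dir, --dtype, --checkpoint_dir) follow the usual TensorRT-LLM example conventions but should be checked against examples/mamba/README.md, and all paths are hypothetical.

```python
# Hedged sketch of the examples/mamba workflow, run from the examples/mamba
# directory. Flag names follow common TensorRT-LLM example conventions and
# should be verified against the README; paths are placeholders.
import subprocess

model_dir = "mamba-codestral-7B-v0.1"    # local checkpoint directory (assumption)
ckpt_dir = "codestral_tllm_checkpoint"   # hypothetical intermediate output
engine_dir = "codestral_trt_engine"      # hypothetical engine output

# 1. Convert the checkpoint into the TensorRT-LLM checkpoint format.
subprocess.run(
    ["python", "convert_checkpoint.py",
     "--model_dir", model_dir,
     "--output_dir", ckpt_dir,
     "--dtype", "bfloat16"],
    check=True,
)

# 2. Build the TensorRT engine from the converted checkpoint.
subprocess.run(
    ["trtllm-build",
     "--checkpoint_dir", ckpt_dir,
     "--output_dir", engine_dir],
    check=True,
)
```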
I cannot install tensorrt_llm==0.12.0.dev2024072301.
You need to reinstall tensorrt_llm. |
Conversion works, but trtllm-build fails. [TensorRT-LLM] TensorRT-LLM version: 0.12.0.dev2024072301
I cannot reproduce this error. Can you share your command? |
Sorry, I started a new Python env and it works. Thanks for that; I will close the issue.
Are there plans to support tp>1 @lfr-0531? |
Coming soon. |
https://mistral.ai/news/codestral-mamba/
You can deploy Codestral Mamba using the mistral-inference SDK, which relies on the reference implementations from Mamba’s GitHub repository. The model can also be deployed through TensorRT-LLM. For local inference, keep an eye out for support in llama.cpp. You may download the raw weights from HuggingFace.
Unfortunately, this doesn't work:

```text
File "/home/jet/github/TensorRT-LLM/examples/mamba/convert_checkpoint.py", line 302, in main
    hf_config, mamba_version = load_config_hf(args.model_dir)
File "/home/jet/github/TensorRT-LLM/examples/mamba/convert_checkpoint.py", line 260, in load_config_hf
    config = json.load(open(resolved_archive_file))
TypeError: expected str, bytes or os.PathLike object, not NoneType
```
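The TypeError suggests load_config_hf could not find a config.json in the model directory, so resolved_archive_file came back as None and open(None) failed. That is consistent with the earlier comment: the raw Codestral download ships params.json rather than an HF-style config.json. A hypothetical pre-flight check (not part of the repo) would make the actual problem visible before conversion:

```python
# Hypothetical pre-flight check before convert_checkpoint.py: the raw
# Codestral weights contain params.json but no config.json, which leaves
# resolved_archive_file as None and makes open(None) raise the TypeError above.
import os
import sys

model_dir = "mamba-codestral-7B-v0.1"  # placeholder path (assumption)
config_path = os.path.join(model_dir, "config.json")
if not os.path.isfile(config_path):
    sys.exit(f"{config_path} not found; create an HF-Mamba2-style config.json "
             "from params.json first (see the comments above).")
```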