AI company Sesame has released CSM-1B, the base model behind its viral virtual assistant Maya. The 1-billion-parameter model is available under an Apache 2.0 license, which permits commercial use with minimal restrictions. Sesame is best known for its widely shared AI conversation demo.
Sesame launches CSM-1B: The AI model powering virtual assistant Maya
According to Sesame’s description on the AI development platform Hugging Face, CSM-1B generates “RVQ audio codes” from text and audio inputs. RVQ, or “residual vector quantization,” encodes audio into discrete tokens. The technique is used in several modern AI audio codecs, including Google’s SoundStream and Meta’s EnCodec.
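To make the idea concrete, here is a minimal toy sketch of residual vector quantization: each stage picks the nearest codeword to the residual left by the previous stage, so a vector is encoded as a short sequence of codebook indices. The codebook sizes and random codebooks below are illustrative assumptions; real audio codecs like SoundStream and EnCodec learn their codebooks from data and operate on frames of an encoded audio signal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only; real codecs use learned codebooks.
num_stages, codebook_size, dim = 3, 16, 8
codebooks = rng.normal(size=(num_stages, codebook_size, dim))

def rvq_encode(x, codebooks):
    """Return one codebook index per stage; each stage quantizes
    the residual left over from the previous stage."""
    indices, residual = [], x.copy()
    for cb in codebooks:
        # Pick the codeword nearest to the current residual.
        idx = int(np.argmin(np.linalg.norm(cb - residual, axis=1)))
        indices.append(idx)
        residual = residual - cb[idx]
    return indices

def rvq_decode(indices, codebooks):
    """Sum the selected codewords to reconstruct the vector."""
    return sum(cb[i] for cb, i in zip(codebooks, indices))

x = rng.normal(size=dim)
codes = rvq_encode(x, codebooks)      # e.g. a list of 3 integer tokens
x_hat = rvq_decode(codes, codebooks)  # approximate reconstruction
```

The discrete token lists produced this way are what a generative model like CSM-1B can predict autoregressively; adding stages shrinks the residual and improves reconstruction quality at the cost of more tokens.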
CSM-1B pairs a backbone from Meta’s Llama family with an audio decoder component, and Sesame notes that a fine-tuned variant of CSM powers Maya. In its repositories, Sesame states, “The model open-sourced here is a base generation model. It is capable of producing a variety of voices, but it has not been fine-tuned on any specific voice.” The model has limited capability in non-English languages, which Sesame attributes to data contamination in the training set.
Consumer Reports: AI voice cloning tools have almost no security checks
Sesame has not disclosed the specific data used to train CSM-1B. The company has acknowledged that the model lacks safeguards, relying instead on an honor system: it urges developers not to use the model to imitate a person’s voice without consent, generate misleading content such as fake news, or engage in “harmful” or “malicious” activities.
In a demo on Hugging Face, cloning a voice took under a minute, after which it was possible to generate speech on various topics, including contentious subjects like elections and Russian propaganda. Consumer Reports recently cautioned that many widely used AI voice cloning tools lack “meaningful” safeguards against fraud or abuse.
Co-founded by Brendan Iribe, an Oculus co-creator, Sesame gained attention in late February 2025 for its assistant technologies that approach uncanny realism. Maya, along with Sesame’s other assistant, Miles, incorporates human-like breathing patterns, speech disfluencies, and can be interrupted while speaking, similar to OpenAI’s Voice Mode.
Sesame has secured an undisclosed amount of funding from Andreessen Horowitz, Spark Capital, and Matrix Partners. The company is also prototyping AI glasses “designed to be worn all day,” which will feature its custom voice models.
Featured image credit: Kerem Gülen/Imagen 3