Tutorial - AudioCraft
Let's run Meta's AudioCraft , to produce high-quality audio and music on Jetson!
What you need
-
One of the following Jetson devices:
Jetson AGX Orin (64GB) Jetson AGX Orin (32GB) Jetson Orin Nano (8GB)
-
Running one of the following versions of JetPack :
JetPack 5 (L4T r35.x)
-
Sufficient storage space (preferably with NVMe SSD).
-
10.7 GB
foraudiocraft
container image - Space for checkpoints
-
-
Clone and setup
jetson-containers
:git clone https://github.com/dusty-nv/jetson-containers bash jetson-containers/install.sh
How to start
Use
run.sh
and
autotag
script to automatically pull or build a compatible container image.
jetson-containers run $(autotag audiocraft)
The container has a default run command (
CMD
) that will automatically start the Jupyter Lab server.
Open your browser and access
http://<IP_ADDRESS>:8888
.
The default password for Jupyter Lab is
nvidia
.
Run Jupyter notebooks
AudioCraft repo comes with demo Jupyter notebooks.
On Jupyter Lab navigation pane on the left, double-click
demos
folder.
AudioGen demo
Run cells with
Shift + Enter
, first one will download models, which can take some time.
Info
You may encounter an error message like the following when executing the first cell, but you can keep going.
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
In the Audio Continuation cells, you can generate continuation based on text, while in Text-conditional Generation you can generate audio based just on text descriptions.
You can also use your own audio as prompt, and use text descriptions to generate continuation:
prompt_waveform, prompt_sr = torchaudio.load("../assets/sirens_and_a_humming_engine_approach_and_pass.mp3") # you can upload your own audio
prompt_duration = 2
prompt_waveform = prompt_waveform[..., :int(prompt_duration * prompt_sr)]
output = model.generate_continuation(prompt_waveform.expand(3, -1, -1), prompt_sample_rate=prompt_sr,descriptions=[
'Subway train blowing its horn', # text descriptions for continuation
'Horse neighing furiously',
'Cat hissing'
], progress=True)
display_audio(output, sample_rate=16000)
MusicGen and MAGNeT demos
The two other jupyter notebooks are similar to AuidioGen, where you can generate continuation or generate audio, while using models trained to generate music.