Meta AudioCraft: Powerful Text-to-Music Generation


https://dyss.top/1087

Installation

conda create -n audiocraft python=3.9 -y
conda activate audiocraft

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

python
>>> import torch
>>> torch.cuda.is_available()
True
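
The same check can also be run as a script instead of being typed into the interpreter. The following is a minimal sketch: the function name cuda_status is hypothetical, and it relies only on torch.cuda.is_available() and torch.cuda.get_device_name(). It reports a missing torch install instead of raising.

```python
import importlib.util


def cuda_status():
    """Describe whether PyTorch is installed and whether it can see a CUDA GPU."""
    # find_spec returns None when the package cannot be imported at all.
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"
    import torch

    if torch.cuda.is_available():
        # Device 0 is the first visible GPU.
        return f"CUDA available: {torch.cuda.get_device_name(0)}"
    return "torch is installed, but CUDA is not available (CPU only)"


if __name__ == "__main__":
    print(cuda_status())
```

If the script reports CPU only, a likely cause is that a CPU-only torch wheel was installed instead of the cu118 build from the index URL above.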


  • Configure ffmpeg

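Open the ffmpeg download page at https://ffmpeg.org/download.html. Under Windows, follow "Windows builds from gyan.dev" and download the ffmpeg-git-full.7z file.

Extract the archive on your local machine, open the bin folder inside it, and copy the folder's path.

Right-click This PC, then go to Properties > Advanced system settings > Environment Variables. Select Path, click Edit, click New, paste the bin folder path copied above, and confirm with OK.

Then run the following command in cmd: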

ffmpeg

If ffmpeg prints its version and build information, it is installed correctly.
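
The PATH check can also be done from Python, which is the environment audiocraft will actually run in. This is a minimal sketch using only the standard library; the function name ffmpeg_version is hypothetical.

```python
import shutil
import subprocess


def ffmpeg_version(executable="ffmpeg"):
    """Return the first line of `<executable> -version`, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=False
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    version = ffmpeg_version()
    print(version if version else "ffmpeg was not found on PATH")
```

On Windows, a change to Path only takes effect in terminals opened after the change, so rerun the check in a new cmd window if ffmpeg is not found.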

git clone https://github.com/facebookresearch/audiocraft.git
cd audiocraft

pip install -e .

pip uninstall -y xformers
pip install xformers
pip install gradio
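
To confirm that the packages installed above are visible in the active environment, their versions can be listed with the standard library. This is a sketch under the assumption that the editable install registers its metadata under the distribution name audiocraft; the helper name installed_versions is hypothetical.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    names = ["torch", "audiocraft", "xformers", "gradio"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```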

Using MusicGen

python .\demos\musicgen_app.py

Open http://127.0.0.1:7860/ in a browser.
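
If the page does not load, it helps to check whether anything is listening on that port yet. The following is a small sketch using only the standard library; the function name is_port_open is hypothetical, and the host and port are taken from the URL above.

```python
import socket


def is_port_open(host="127.0.0.1", port=7860, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    if is_port_open():
        print("The demo is reachable at http://127.0.0.1:7860/")
    else:
        print("Nothing is listening on 127.0.0.1:7860 yet")
```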

Using AudioGen

Create a new file named audiogen.py:

import torchaudio
from audiocraft.models import AudioGen
from audiocraft.data.audio import audio_write

model = AudioGen.get_pretrained('facebook/audiogen-medium')

model.set_generation_params(duration=5)  # generate 5 seconds of audio per sample.
descriptions = ['dog barking', 'sirenes of an emergency vehicule', 'footsteps in a corridor']
wav = model.generate(descriptions) # generates 3 samples.

for idx, one_wav in enumerate(wav):
    # Will save under {idx}.wav, with loudness normalization at -14 db LUFS.
    audio_write(f'{idx}', one_wav.cpu(), model.sample_rate, strategy="loudness", loudness_compressor=True)

Running the script writes 0.wav, 1.wav, and 2.wav to the current directory, one file per description.
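
A quick way to sanity-check the output is to read back each file's sample rate and duration. This is a minimal sketch using the standard-library wave module; the helper name wav_info is hypothetical, and it assumes the files are integer PCM WAV, which wave can read. With duration=5, each file should come out at roughly 5 seconds.

```python
import glob
import wave


def wav_info(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        channels = wf.getnchannels()
        duration = wf.getnframes() / rate
    return rate, channels, duration


if __name__ == "__main__":
    # Inspect every WAV file in the current directory.
    for path in sorted(glob.glob("*.wav")):
        rate, channels, duration = wav_info(path)
        print(f"{path}: {rate} Hz, {channels} channel(s), {duration:.2f} s")
```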