Trying Out Speech Recognition and Speaker Diarization with WhisperX

Windows Edition

Installing WhisperX (CPU Processing)

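Run Anaconda Powershell Prompt as administrator. There are several ways to do this; for example, launch Anaconda Powershell Prompt, open "Settings", and set "Run this profile as Administrator" to "On". This guide uses PowerShell 7.x. If the prompt is not run as administrator, the pip install whisperx command executed later is expected to fail with an error.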

First, create a new virtual environment, specifying Python 3.10. If you omit the -y option, a confirmation prompt appears partway through; enter y to continue.

(base) PS C:\Users\...\whisperx> conda create -n whisperx_cpu python=3.10 -y ⏎

To list the virtual environments that have already been created, run the following command.

(base) PS C:\Users\...\whisperx> conda env list ⏎

Activate the newly created virtual environment whisperx_cpu.

(base) PS C:\Users\...\whisperx> conda activate whisperx_cpu ⏎
(whisperx_cpu) PS C:\Users\...\whisperx>
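
As a quick sanity check that the activation took effect, the following Python sketch reports which interpreter and conda environment are in use. The helper name describe_environment is made up for illustration, and the sketch assumes that conda sets the CONDA_DEFAULT_ENV environment variable when an environment is activated.

```python
import os
import sys


def describe_environment():
    """Return basic facts about the running interpreter (hypothetical helper)."""
    return {
        # Major.minor version of the interpreter, e.g. "3.10"
        "python_version": "{}.{}".format(*sys.version_info[:2]),
        # Full path of the python executable currently running
        "executable": sys.executable,
        # Name of the active conda environment, or None outside conda
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If the environment was activated as described, python_version should read 3.10 and conda_env should read whisperx_cpu.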

Here we install whisperx on the assumption that the CPU will be used for speech recognition.

(whisperx_cpu) PS C:\Users\...\whisperx> pip install whisperx ⏎
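
Once the installation has finished, one way to see whether the installed torch is running on the CPU only is to ask torch itself. This is a minimal sketch, not part of the original procedure: the helper name torch_build_info is hypothetical, and whether pip pulls a CPU-only build on a given machine is an assumption to verify rather than a guarantee.

```python
def torch_build_info():
    """Return torch version and CUDA details, or None if torch is not installed."""
    try:
        import torch
    except ImportError:
        return None
    return {
        "version": torch.__version__,
        # CUDA version torch was built against; None for CPU-only builds
        "cuda_build": torch.version.cuda,
        # True only if a usable CUDA device is visible at runtime
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(torch_build_info())
```

For CPU processing, cuda_available is expected to be False.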

Confirm that torch and torchaudio were installed along with whisperx. For details on how to use Select-String, see its reference documentation. If you are using Command Prompt (Anaconda Prompt) instead, use find in place of Select-String.

(whisperx_cpu) PS C:\Users\...\whisperx> pip list | Select-String whisper ⏎

faster-whisper                           1.2.1
whisperx                                 3.8.6

(whisperx_cpu) PS C:\Users\...\whisperx> pip list | Select-String torch ⏎

pytorch-lightning                        2.6.5
pytorch-metric-learning                  2.9.0
torch                                    2.8.0
torch-audiomentations                    0.12.0
torch_pitch_shift                        1.2.5
torchaudio                               2.8.0
torchcodec                               0.7.0
torchmetrics                             1.9.0
torchvision                              0.23.0

(whisperx_cpu) PS C:\Users\...\whisperx>
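
The same check can be done from Python without depending on the shell in use. The sketch below relies on the standard library module importlib.metadata; the helper name installed_versions and the choice of packages to query are illustrative assumptions.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    result = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    packages = ["whisperx", "faster-whisper", "torch", "torchaudio"]
    for name, ver in installed_versions(packages).items():
        print(f"{name:<40} {ver or 'not installed'}")
```

Run inside the whisperx_cpu environment, this prints one line per package in a layout similar to pip list.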
