# Audio Annotation Pipeline

Processing steps for participant recall audio from the Movie 24 task.

## Data Structure

### BOX/Vwani_Movie

> [Download Box Drive](https://www.box.com/resources/downloads) to access the data through Finder. The pipeline code contains paths that assume Box is mounted through Finder; update these paths in your local copy of the code.

## Convert to .wav

1. convert microphone audio to `.wav` (from `.nsx`, `.ncs`, etc.)

   if the recording is `.m4a`, use: `ffmpeg -i <input>.m4a <output>.wav`

2. keep the file naming convention (e.g. `563_exp_10_preSleep_movie_24_audio.wav`)

3. copy `.wav` to `/Box/Vwani_Movie/raw_patient_audio/24 movie audio`
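The `.m4a` conversion above can also be scripted for a whole folder. The following is a minimal sketch, not part of the pipeline code: the function names `ffmpeg_command` and `convert_all` and both directory arguments are hypothetical, and it assumes `ffmpeg` is installed and on your `PATH`. It only covers `.m4a` input; `.nsx` and `.ncs` files need their own readers.

```python
import subprocess
from pathlib import Path


def ffmpeg_command(src: Path, dest_dir: Path) -> list[str]:
    """Build the ffmpeg call that converts one recording to .wav.

    The output keeps the input's base name, so the file naming
    convention (e.g. 563_exp_10_preSleep_movie_24_audio) is preserved.
    """
    dest = dest_dir / (src.stem + ".wav")
    return ["ffmpeg", "-i", str(src), str(dest)]


def convert_all(src_dir: Path, dest_dir: Path) -> None:
    """Convert every .m4a file in src_dir (hypothetical helper)."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(src_dir.glob("*.m4a")):
        # check=True raises CalledProcessError if ffmpeg fails
        subprocess.run(ffmpeg_command(src, dest_dir), check=True)
```

Calling `convert_all(Path("recordings"), Path("wav"))` would write one `.wav` per `.m4a`; both folder names here are placeholders for your own paths.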

## Fried Lab Movie task notes

1. open Google Sheets: [Fried Lab Movie task notes](https://docs.google.com/spreadsheets/d/1wEZkKO_Ah8LCDakEByNJ-OYxreMJAsIdqKloZ0LBCmc/edit#gid=1894807214)

2. go to the `ExpInfo` sheet and enter all available info

3. go to the `24_S06E01` sheet and enter info

4. open the `.wav` and note the exact start and stop times of the free recall (format: `h:mm:ss`)

5. export `24_S06E01` sheet as `.csv` to `/Box/Vwani_Movie/tiro` (replace existing)
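Before exporting, it can help to sanity-check the start and stop times entered in step 4. The sketch below is an illustration only: `hms_to_seconds` and `recall_duration` are hypothetical helpers, not functions from tiro, and it assumes times are always written as `h:mm:ss`.

```python
def hms_to_seconds(stamp: str) -> int:
    """Convert an 'h:mm:ss' string to whole seconds."""
    # Unpacking raises ValueError if there are not exactly three fields
    hours, minutes, seconds = (int(part) for part in stamp.strip().split(":"))
    if hours < 0 or not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError(f"not a valid h:mm:ss time: {stamp!r}")
    return hours * 3600 + minutes * 60 + seconds


def recall_duration(start: str, stop: str) -> int:
    """Length of the free recall in seconds; stop must come after start."""
    duration = hms_to_seconds(stop) - hms_to_seconds(start)
    if duration <= 0:
        raise ValueError(f"stop {stop!r} is not after start {start!r}")
    return duration
```

For example, `recall_duration("0:01:30", "0:12:45")` gives `675` seconds, and a swapped or mistyped pair raises `ValueError` instead of passing silently.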

## Tiro

#### Install

1. copy `/Box/Vwani_Movie/tiro` to your computer and set it up following `README.md`

#### **tiro.py**

1. run `tiro.ipynb`

   Change the paths for your computer:

   - set the local tiro path: `tiro = '/Users/ajrangel/projects/tiro/tiro'`
   - set the Box path: `box = '/Users/ajrangel/Library/CloudStorage/Box-Box/Vwani_Movie/'`
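After editing the paths, a quick check that they exist on your machine can save a failed notebook run. This is a minimal sketch under assumptions: `missing_paths` is a hypothetical helper, not part of tiro, and the two example paths are the ones shown above, which you would replace with your own.

```python
from pathlib import Path


def missing_paths(paths: dict[str, str]) -> list[str]:
    """Return the names of configured paths that do not exist locally."""
    return [
        name
        for name, location in paths.items()
        if not Path(location).expanduser().exists()
    ]


if __name__ == "__main__":
    # Example values copied from the instructions; replace with your own
    configured = {
        "tiro": "/Users/ajrangel/projects/tiro/tiro",
        "box": "/Users/ajrangel/Library/CloudStorage/Box-Box/Vwani_Movie/",
    }
    for name in missing_paths(configured):
        print(f"path not found, check the '{name}' setting")
```

An empty result means every configured path was found; a missing `box` entry usually means Box Drive is not running or is mounted somewhere else.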

## Speech-to-Text

### Whisper (OpenAI)

#### Install

1. install openai-whisper

   `pip install -U openai-whisper`

#### **whisper.sh**

1. go to `Box/Vwani_Movie/audio_annotation/whisper/code`

2. copy `whisper.sh` to local directory

3. open `whisper.sh` and edit the directory variables: `audio_dir`, `whisper`, `out_dir`

audio_dir=/Users/ajrangel/Library/CloudStorage/Box-Box/Vwani_Movie/audio_annotation/Free_recall_sound_files
whisper=/Users/ajrangel/anaconda3/envs/tiro_env/bin/whisper
out_dir=/Users/ajrangel/Library/CloudStorage/Box-Box/Vwani_Movie/audio_annotation/whisper/transcripts

4. run `whisper.sh`

   - in Terminal, type the path to your local `whisper.sh` and press Enter

   - enter the patient number(s) when prompted
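The contents of `whisper.sh` are not reproduced here, so the following is only a rough Python sketch of what a run for one patient might look like. It assumes recordings follow the `<patient>_..._audio.wav` naming convention and that the `whisper` command-line tool from `openai-whisper` is used with its `--model`, `--output_dir`, and `--output_format` options. The helper names and the `base` model choice are hypothetical.

```python
import subprocess
from pathlib import Path


def recordings_for(patient: str, audio_dir: Path) -> list[Path]:
    """Find the .wav files whose name starts with the patient number."""
    return sorted(audio_dir.glob(f"{patient}_*.wav"))


def whisper_command(whisper: str, wav: Path, out_dir: Path,
                    model: str = "base") -> list[str]:
    """Build one whisper CLI call; the model size is an assumption."""
    return [
        whisper, str(wav),
        "--model", model,
        "--output_dir", str(out_dir),
        "--output_format", "txt",
    ]


def transcribe_patient(patient: str, audio_dir: Path, whisper: str,
                       out_dir: Path) -> None:
    """Transcribe every recording for one patient (hypothetical helper)."""
    for wav in recordings_for(patient, audio_dir):
        subprocess.run(whisper_command(whisper, wav, out_dir), check=True)
```

Here `audio_dir`, `whisper`, and `out_dir` play the same roles as the three variables edited in step 3, so the same local values can be passed in.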

## Wordpools

> Complete the human check of the automated transcription before generating the wordpool.

#### **wordpool.ipynb**

1. go to `Box/Vwani_Movie/audio_annotation/Wordpools/code`

2. run `wordpool.ipynb`

## Hoffman Post-Processing

### Automatic Annotations

> Download [h2jupynb](https://www.hoffman2.idre.ucla.edu/Using-H2/Connecting/Connecting.html#connecting-via-jupyter-notebook-lab) and connect to Hoffman via Jupyter Lab

#### **automatic_annotations.py**

1. run `automatic_annotation.py`

## Task Design

### Cued Free Recall

Six cued recall questions were added for participants p566 onward:

1. Where does Chloe work and what upsets her about Jack?

2. Who are these characters and how do they relate to terrorism?

3. Who spoke to the President in the episode? What did they talk about?

4. What was his story?

5. Describe the times Jack was restrained. How did he get there or get free?

6. Where are the various locations that the episode takes place?