Goats?
I needed goat screams. Hundreds of them. High quality. Labeled with musical notes.
This is the story of the tools I built to get there.
The Origin: Build Something Weird Enough to Ship
I wanted to learn API development. Not another todo app. Something fun enough to keep me motivated.
In my hunt for silly APIs, I came across the Owen Wilson Whoa API, and it inspired me. Truly. I remembered the goat scream meme era on YouTube. Those viral compilations of goats screaming like humans. Perfect.
Phase 1: YouTube Scraping
First attempt: find videos, download them, and have agents clip out the screams.
This was slow. And the results were rough:
- Background music (Taylor Swift goat remixes, anyone?)
- People laughing over the screams
- Cuts in weird places
- Duplicate screams across videos
I needed automation.
Tool #1: The Extract-o-matic
I built a Streamlit app that:
- Takes a list of YouTube URLs
- Downloads the audio
- Detects audio patterns that match goat screams (amplitude spikes, frequency ranges)
- Auto-clips them into individual MP3s
- Generates basic metadata (source video, timestamp, duration)
```python
# Simplified detection logic
import librosa

def detect_screams(audio_path):
    y, sr = librosa.load(audio_path)
    # Look for sudden amplitude spikes in goat frequency range
    onset_frames = librosa.onset.onset_detect(y=y, sr=sr)
    # Filter for goat-like frequency characteristics
    # ... (frequency analysis)
    scream_timestamps = librosa.frames_to_time(onset_frames, sr=sr)
    return scream_timestamps
```
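For completeness, the download and clipping steps on either side of that function were thin wrappers. Here's a sketch assuming `yt-dlp` and `pydub`; the function names and the half-second padding are my own choices, not the exact production code:

```python
import yt_dlp
from pydub import AudioSegment

def download_audio(urls, out_dir):
    # Grab bestaudio and let ffmpeg transcode it to MP3
    opts = {
        "format": "bestaudio/best",
        "outtmpl": f"{out_dir}/%(id)s.%(ext)s",
        "postprocessors": [{"key": "FFmpegExtractAudio", "preferredcodec": "mp3"}],
    }
    with yt_dlp.YoutubeDL(opts) as ydl:
        ydl.download(urls)

def clip_screams(audio_path, scream_timestamps, out_dir, pad_s=0.5):
    # Slice a padded window around each detected onset into its own MP3
    audio = AudioSegment.from_file(audio_path)
    for i, t in enumerate(scream_timestamps):
        start_ms = max(0, int((t - pad_s) * 1000))
        clip = audio[start_ms:int((t + pad_s) * 1000)]
        clip.export(f"{out_dir}/scream_{i:03d}.mp3", format="mp3")
```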
This got me from hours of manual work to minutes. But the quality was still inconsistent.
Phase 2: Synthetic Screams
The YouTube clips had too much noise. I needed cleaner source material.
Stock audio sites had some goat sounds, but not enough variety. So I turned to AI.
The ElevenLabs Discovery
I tried several platforms:
- OpenAI voice generation
- Various sound effect models (Adobe SFX Generator, Kling’s AI SFX Generator, PopPop, My Edit, ElevenLabs AI SFX Generator, etc.)
ElevenLabs won. Their sound effects tool generated clips that sounded indistinguishable from the real goats on YouTube. It also worked best with hilarious prompts...
The Prompting Insight
"Goat scream" as a prompt was inconsistent.
What worked better: describing screamable experiences with a goat at the center.
Successful prompts:
- "Goat yells in horror at a spider because it hates spiders"
- "A dramatic goat yelling protest while being ridiculed"
- "A goat stubs its toe and screams like crazy"
Less successful:
- "Goat scream"
- "Goat noise"
- "Goat sound effect"
The context gave the model something to work with. Emotional state + situation = better variety.
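If you want to try this yourself, the generation call is one small HTTP request. Here's a sketch against ElevenLabs' sound-generation endpoint; double-check their current docs for the exact request shape, and the five-second duration is just a guess that fits most screams:

```python
import requests

def generate_scream(prompt, out_path, api_key):
    resp = requests.post(
        "https://api.elevenlabs.io/v1/sound-generation",
        headers={"xi-api-key": api_key},
        json={"text": prompt, "duration_seconds": 5},
    )
    resp.raise_for_status()
    with open(out_path, "wb") as f:
        f.write(resp.content)  # audio bytes

generate_scream(
    "Goat yells in horror at a spider because it hates spiders",
    "spider_scream.mp3",
    api_key="YOUR_KEY",
)
```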
Phase 3: The Sora Hack
Then Sora came out.
I was playing with video generation when I had an idea: what if I could get 15 seconds of goat screams in one generation?
The prompt that worked: "Montage of goats screaming"
Sora would generate a compilation-style video with multiple goats, multiple screams, natural variety. I'd feed the audio through my Extract-o-matic and get 5-10 clean clips per video.
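The only new glue was pulling the audio track out of each generated video before it hit the Extract-o-matic. A sketch shelling out to ffmpeg; the flags are just my usual defaults, nothing Sora-specific:

```python
import subprocess

def extract_audio(video_path, mp3_path):
    # -vn drops the video stream; -q:a 2 is a high-quality VBR MP3 setting
    subprocess.run(
        ["ffmpeg", "-y", "-i", video_path, "-vn", "-q:a", "2", mp3_path],
        check=True,
    )
```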
I tried other video platforms (Veo, Wan, etc.) but Sora's audio quality was best for this use case.
Soon, I had as many (if not more) synthetic screams than real ones. And when I played them back-to-back, I couldn't reliably tell which were which—unless I had a specific memory attached to a real one.
The Quality Problem
I now had hundreds of clips. But many were still rough:
- Weird cutoffs
- Non-goat sounds that slipped through
- Duplicates
- Clips that needed manual trimming
I needed to audit them. Fast.
Tool #2: The Goat Screams Auditor
I needed a faster review workflow, so I built a swipe-style QA interface for audio clips, inspired by a hiring app we built in growth at Uber.
The Goat Screams Auditor:
- Shows one clip at a time with a media player
- Keyboard shortcuts: `→` next, `G` good, `B` bad, `N` not a scream
- Auto-hides after rating
- Optional notes field
- Exports clean lists: good files, bad files, needs-editing
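A minimal sketch of the rating loop, assuming Streamlit, a flat folder of clips, and buttons standing in for the keyboard shortcuts (the real app wires those up separately):

```python
import json
from pathlib import Path
import streamlit as st

clips = sorted(Path("clips").glob("*.mp3"))
if "idx" not in st.session_state:
    st.session_state.idx = 0
    st.session_state.ratings = {}

if st.session_state.idx < len(clips):
    clip = clips[st.session_state.idx]
    st.audio(str(clip))
    note = st.text_input("Notes (optional)", key=f"note_{clip.name}")
    for col, label in zip(st.columns(3), ["good", "bad", "not a scream"]):
        if col.button(label.title(), key=f"{label}_{clip.name}"):
            st.session_state.ratings[clip.name] = {"rating": label, "note": note}
            st.session_state.idx += 1
            st.rerun()
else:
    st.download_button(
        "Export ratings",
        json.dumps(st.session_state.ratings, indent=2),
        file_name="ratings.json",
    )
```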
I breezed through the entire library. The good ones went to the API. The bad ones got deleted. The "needs editing" pile went to Audacity for manual cleanup whenever I find the time (which I haven't yet, and that's fine).
The Music Problem
I wanted to build a beatbox (bleatbox, if you will) where you could make beats with goat screams.
First attempt: random screams on a grid.
Result: absolute chaos. Not fun. Just noise.
The screams needed musical structure. I needed to know what note each goat was hitting.
Tool #3: Wannableat
(If you wannableat my lover...)
Wannableat is a pitch analyzer that:
- Takes an audio file
- Runs pitch detection (using `librosa` and `crepe`)
- Identifies the primary musical note
- Maps the tone sequence throughout the scream
- Outputs metadata for the API
```python
import crepe
import librosa
import numpy as np
from collections import Counter

def analyze_pitch(audio_path):
    y, sr = librosa.load(audio_path)
    # Using CREPE for more accurate pitch detection
    time, frequency, confidence, activation = crepe.predict(y, sr)
    # Convert confident frequency estimates to musical notes
    notes = [
        librosa.hz_to_note(f)
        for f, c in zip(frequency, confidence)
        if c > 0.5
    ]
    primary_note = Counter(notes).most_common(1)[0][0]
    return {
        "primary_note": primary_note,
        "tones_in_order": notes,
        "confidence": float(np.mean(confidence)),
    }
```
Fun finding: Goat screams cluster around G#5 and A5. Something about goat vocal cords, I guess.
With pitch data, I could:
- Query screams by musical note
- Build an actual playable instrument
- Pitch-shift screams to hit specific notes (sketched below)
- Create melodies
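Pitch-shifting is the fun one: a few lines with librosa. A sketch, where `soundfile` for the write and the function shape are my additions:

```python
import librosa
import soundfile as sf

def shift_to_note(audio_path, primary_note, target_note, out_path):
    y, sr = librosa.load(audio_path)
    # Semitone distance between the scream's primary note and the target
    steps = librosa.note_to_midi(target_note) - librosa.note_to_midi(primary_note)
    y_shifted = librosa.effects.pitch_shift(y, sr=sr, n_steps=steps)
    sf.write(out_path, y_shifted, sr)

# e.g. nudge a G#5 scream up a semitone to A5
shift_to_note("scream.mp3", "G#5", "A5", "scream_A5.wav")
```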
The Goat Screams went from chaos to... slightly musical chaos.
The Flywheel
Here's where it gets interesting.
One of the apps I built is Goat I-Scream—it takes a photo of you and generates a video of you transforming into a screaming goat (using Veo).
The audio from those generations? It's a goat scream.
So now:
- User makes a personal Goat Screams video
- Audio gets extracted and stored
- Runs through Wannableat for pitch analysis
- Gets added to the API
- Feeds more beats in Goat Screams
- Attracts more users
- Who make more Goat Screams videos
- Repeat
The API feeds itself. Network effects, but for goat screams. Which probably doesn't need network effects, but it's kinda my jam.
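In code terms, the loop is a toy-sized pipeline. This sketch reuses the `extract_audio` and `analyze_pitch` sketches from above; `store_scream` is a hypothetical stand-in for the API write:

```python
def store_scream(mp3_path, metadata):
    # Hypothetical stand-in for the real API ingest
    print(f"Storing {mp3_path} as {metadata['primary_note']}")

def ingest_generation(video_path):
    mp3_path = video_path.rsplit(".", 1)[0] + ".mp3"
    extract_audio(video_path, mp3_path)   # strip the generated video's audio
    metadata = analyze_pitch(mp3_path)    # Wannableat pitch metadata
    store_scream(mp3_path, metadata)      # add it to the library
```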
What I Learned
Silly projects > serious projects for learning. I actually finished this one, and had a blast building the features and tools along the way. I didn't have a revenue idea for the API (yet), so while this annoyed my family, it delighted me.
Synthetic data is underrated. The AI screams were indistinguishable from the real ones in my own blind listening, and I could generate variety on demand. This is a hot topic, and I'm fully on board.
Context engineering > prompt engineering. "Goat scream" vs. "goat screaming when it accidentally touches a cactus": the second came from combining two completely unrelated contexts I had on hand (goats + things that make me scream). Combining contexts produces more creative results. Think beyond the context in front of you to raise the bar.
Build tools for your tools. The Extract-o-matic, the Auditor, and Wannableat each saved hours. The meta-work was worth it. I'd never really thought about music analysis before, and now I have three more ideas I could fork off of Wannableat, just because I built it for goats.
Steal workflows from other domains. The Tinder-style auditor came from a hiring tool we built at Uber. The pitch analyzer came from music production. Cross-pollination works.
What's Next
I'm cleaning up and will open source the side tools:
- Extract-o-matic (Streamlit audio clipper)
- Goat Screams Auditor (rapid QA interface)
- Wannableat (pitch analyzer)
They're goat-themed, but the patterns work for any audio classification/processing pipeline.
And I'll be launching a more consumer-facing app with lots of fun AI micro-apps I built on top of the GoatScreams API.
Hold your butts...
Links
- API: goatscreams.com/developer
- Docs: api.goatscreams.com/docs
- GitHub: github.com/AIMateyApps/goat-scream-api
- Waitlist (apps launching soon): goatscreams.com/waitlist