Today has been an interesting day. I woke up at 4 am and couldn’t get back to sleep, so I got up at 6 and went for a run. Then I completed my power routine, meditated and didn’t eat anything till late in the morning.
The experiments with Whisper continued today. I recorded a meeting and fed it to GPT with a prelude based on the audience or the end goal. After curating the chunks, I ran them through the summarizer and had a quick TL;DR.
Now I’m trying to figure out the best way to scale this up: capture a conference room full of voice activity, chunk it down, and feed it into contextualization or summarization. I’m also looking for a good design to manage the prompts via the command line.
I’m also working on abstracting the functions from my Discord bot and sharing them between libraries. And I’ve had a realization that to judge the relevance of a transcript, I need more context. So my current strategy is to crawl transcripts from whatever audio/video source, add a context layer, and break it into chunks of 2000 characters or less.
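The chunking step in that strategy could look something like the sketch below. It is only a rough illustration under my own assumptions: the names `chunk_transcript` and `add_context` are hypothetical, the 2000-character limit is applied to the transcript text alone (the context line is added on top), and chunks break on whitespace so words stay intact.

```python
def chunk_transcript(text: str, max_chars: int = 2000) -> list[str]:
    """Split text into chunks of at most max_chars, breaking on whitespace."""
    chunks = []
    current = ""
    for word in text.split():
        # Assumption: a single "word" longer than the limit is hard-split.
        while len(word) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_chars])
            word = word[max_chars:]
        candidate = word if not current else current + " " + word
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # Adding this word would overflow, so close the current chunk.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def add_context(chunks: list[str], context: str) -> list[str]:
    """Prepend the same context line to every chunk (hypothetical context layer)."""
    return [f"{context}\n\n{chunk}" for chunk in chunks]


if __name__ == "__main__":
    transcript = "word " * 1000
    pieces = chunk_transcript(transcript, max_chars=2000)
    labeled = add_context(pieces, "Context: weekly planning meeting")
    print(len(pieces), max(len(p) for p in pieces))
```

Splitting on whitespace is the simplest choice here; splitting on sentence or speaker boundaries would probably give the summarizer cleaner chunks, at the cost of more bookkeeping.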
I’m exploring the concept of semantic search, which starts by vectorizing a string: turning it into an array of numeric weights, for example one weight per word in a dictionary. That array can be treated as a point in a multi-dimensional space, and the semantic search looks for the nearest neighbors of the query.
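To make the nearest-neighbor idea concrete, here is a toy sketch using plain word counts over a small dictionary and cosine similarity. This is an assumption-heavy stand-in: real semantic search would typically use learned embeddings from a model rather than word counts, and all function names below are hypothetical.

```python
import math
from collections import Counter


def build_vocabulary(texts: list[str]) -> dict[str, int]:
    """Map every distinct word to a position in the vector (the 'dictionary')."""
    words = sorted({w for t in texts for w in t.lower().split()})
    return {w: i for i, w in enumerate(words)}


def vectorize(text: str, vocab: dict[str, int]) -> list[float]:
    """Turn a string into an array of word weights (here: raw counts)."""
    counts = Counter(text.lower().split())
    vec = [0.0] * len(vocab)
    for word, count in counts.items():
        if word in vocab:
            vec[vocab[word]] = float(count)
    return vec


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(query: str, texts: list[str], k: int = 2):
    """Return the k texts whose vectors are closest to the query vector."""
    vocab = build_vocabulary(texts)
    q = vectorize(query, vocab)
    scored = [(cosine_similarity(q, vectorize(t, vocab)), t) for t in texts]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    chunks = [
        "budget review for next quarter",
        "team lunch on friday",
        "quarter budget approved",
    ]
    for score, text in nearest_neighbors("budget quarter", chunks):
        print(f"{score:.3f}  {text}")
```

Word counts only capture word overlap, not meaning, so this illustrates the geometry of the search rather than true semantic matching.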
It’s been a busy day and I’m having fun moving forward.