by Max Vornovskykh | January 14, 2026 12:25 pm
In over twenty years of business, long before AI, we operated under a fundamental truth: knowledge is the most valuable asset. Across hundreds of projects, our team has accumulated extensive QA expertise. Yet, as we scaled, we encountered the classic enterprise paradox – the more knowledge you gather, the harder it becomes to access it when you need it most.
For years, the task of collecting, organising, and retrieving knowledge came with a heavy asterisk. It was manual, inconsistent, and often relied on human memory.
Then came the COVID-19 pandemic. It forced a transition to hybrid work that initially felt like a constraint but revealed a hidden opportunity. Suddenly, every architectural decision, every debugging session, and every client negotiation had digital footprints.
Combined with the explosive maturing of LLMs (Large Language Models), we saw a chance to solve a problem that had plagued us (and the entire industry) for years. We set out to build a self-updating Corporate Brain – a system that transforms the information flowing through daily calls into tangible data pieces.
Key numbers: the system currently processes 50–60 recordings/day, and our tiered S3 approach keeps monthly storage around $20 even at ~65 videos/day (~350MB each).
This article details how we engineered a secure, event-driven AI pipeline to automate knowledge management, and how we applied our core QA philosophy to test the “untestable” world of generative AI.
We analysed why corporate knowledge fails to stick, identified four specific friction points, and designed our AI solution to resolve each of them.
A simple ChatGPT wrapper wasn’t enough, as it would introduce privacy risks. We had to design a system architecture that processes internal meeting videos without relying on public APIs.
That’s how we ended up with a local LLM knowledge base that was secure, scalable, and deeply integrated into our ecosystem. Our solution currently processes 50–60 meeting recordings per day. It transforms raw video into searchable, analysed knowledge assets, tagging them and storing them in our secure corporate environment.
When approaching a task like this, the first question is which tech stack to choose for a secure, self-hosted AI transcription service. Here is what we did.
To handle the heavy lifting of video processing, we adopted an Event-Driven Architecture, using RabbitMQ to orchestrate communication between services and keep them loosely coupled. This way, our AI worker services scale independently of the ingestion layer and handle spikes in meeting volume without degraded performance.
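To make that concrete, here is a minimal sketch of what such a worker can look like, using the pika client for RabbitMQ. The queue name, message fields, and handler body are illustrative assumptions, not our production code.

```python
# Minimal event-driven worker sketch using RabbitMQ via the pika client.
# Queue name, message schema, and the handler body are illustrative assumptions.
import json
import pika

def handle_recording(ch, method, properties, body):
    """Process one 'recording ingested' event, then acknowledge it."""
    event = json.loads(body)
    print(f"Processing recording {event['recording_id']} stored at {event['s3_key']}")
    # ... hand the file to the transcription step here ...
    ch.basic_ack(delivery_tag=method.delivery_tag)

connection = pika.BlockingConnection(pika.ConnectionParameters(host="rabbitmq"))
channel = connection.channel()
channel.queue_declare(queue="recordings.ingested", durable=True)
channel.basic_qos(prefetch_count=1)  # each worker takes one message at a time
channel.basic_consume(queue="recordings.ingested", on_message_callback=handle_recording)
channel.start_consuming()
```

Because a worker only acknowledges a message once processing is finished, extra workers can be added during meeting-volume spikes without touching the ingestion layer.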
Here is how the data flows through our Corporate Brain:

The Ingestion Layer (MS Teams Bot): captures each meeting recording as it finishes and hands it off to the pipeline.
The Video Scraper (Whisper): transcribes the raw video locally, so no audio leaves our infrastructure (a sketch of this step follows below).
The Semantic Engine (Tag & Knowledge Extractors): runs our local LLMs to tag the transcript and extract the key knowledge.
The Archive: stores the source videos in tiered AWS S3 storage behind ACL-based access controls.
Notion Database: holds the resulting searchable knowledge assets for the teams authorised to see them.
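As an illustration of the Video Scraper step, here is a minimal local transcription sketch based on the open-source whisper package; the model size and file name are assumptions rather than our actual configuration.

```python
# Local transcription sketch using the open-source openai-whisper package.
# Model size and file name are illustrative; the audio never leaves the host.
import whisper

model = whisper.load_model("medium")                # weights are cached locally after the first run
result = model.transcribe("meeting_recording.mp4")  # ffmpeg extracts the audio track under the hood

transcript_text = result["text"]
segments = result["segments"]                       # timestamped chunks, handy for later tagging
print(transcript_text[:500])
```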
Cost-Efficient Storage Strategy
Video data is heavy. To make the system financially viable, we implemented a tiered storage strategy on AWS S3, moving recordings to cheaper storage classes as they age; this is what keeps our monthly storage bill around the $20 mark.
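As one way to express such tiering, the sketch below configures an S3 lifecycle rule with boto3; the bucket name, prefix, day thresholds, and storage classes are illustrative assumptions, not our actual tiers.

```python
# Hypothetical S3 lifecycle rule: move raw recordings to cheaper tiers as they age.
# Bucket name, prefix, day thresholds, and storage classes are illustrative only.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="corporate-brain-recordings",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-down-raw-videos",
                "Status": "Enabled",
                "Filter": {"Prefix": "raw/"},
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER_IR"},
                ],
            }
        ]
    },
)
```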
Local LLMs vs Public APIs (Security First)
As a QA company handling sensitive client data, we could not send transcripts to public APIs. To ensure that no sensitive data leaves our infrastructure, we rely on local LLMs (LLaMA 3.1) and local transcription models. Additionally, access is enforced via ACL-based controls in the archive UI, so only authorised teams can reach recordings and the related knowledge.
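Purely as an illustration of keeping inference in-house, the snippet below sends a transcript to a locally hosted LLaMA 3.1 model over an Ollama-style HTTP endpoint. The serving layer, endpoint, and prompt are assumptions about one possible setup, not a description of our exact stack.

```python
# Sketch: summarise a transcript with a locally hosted LLaMA 3.1 model.
# Assumes an Ollama server on localhost; the transcript never leaves our infrastructure.
import requests

def summarize_locally(transcript: str) -> str:
    response = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3.1",
            "prompt": f"Extract the key decisions and action items from this meeting:\n\n{transcript}",
            "stream": False,
        },
        timeout=300,
    )
    response.raise_for_status()
    return response.json()["response"]

print(summarize_locally("...meeting transcript text..."))
```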
This is where our DNA as a QA company played a pivotal role. Building an AI demo is easy, but building a reliable system and putting it into production is much harder. The challenge with GenAI is non-determinism – how do you evaluate if the knowledge is extracted properly, summaries are relevant, and key points are actually key?
When choosing QA strategies for non-deterministic LLM outputs, our team combines manual effort with the latest methodologies. This was not only a development project but a full R&D initiative focused on AI Testing & Evaluation Methodologies, and we implemented the resulting evaluation techniques directly in the pipeline.
This approach ensures that as we upgrade our models, we don’t silently degrade the quality of our insights.
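To give a flavour of what such a guardrail can look like, here is a minimal regression-style check that compares a newly generated summary against a human-approved golden one by embedding similarity. The sentence-transformers model, threshold, and example texts are illustrative assumptions, not our exact methodology.

```python
# Sketch of a regression check for non-deterministic summaries: instead of an exact
# string match, compare a new model's output to a human-approved "golden" summary
# by embedding similarity. Model name, threshold, and examples are illustrative.
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # small embedding model that runs locally

def summary_still_good(new_summary: str, golden_summary: str, threshold: float = 0.75) -> bool:
    """Pass if the regenerated summary stays semantically close to the approved one."""
    new_vec, gold_vec = embedder.encode([new_summary, golden_summary], convert_to_tensor=True)
    score = util.cos_sim(new_vec, gold_vec).item()
    return score >= threshold

ok = summary_still_good(
    "The team agreed to migrate the payment gateway tests to the new staging cluster.",
    "Decision: the payment gateway test suite moves to the new staging environment.",
)
print("summary regression check passed:", ok)
```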
Our QA team had already done this several times on large, modern projects, so this part posed no particular difficulty; it simply let us master a few new tools and approaches, which was both useful and interesting.
What we have built solved the problem of data capture and retrieval. The next phase is transforming this data into Active Intelligence.
Our next planned features include:
The Knowledge Graph
We are moving beyond linear search to a graph-based model connecting Topics→People→Projects→Decisions. This will allow us to visualise hidden dependencies between teams and instantly identify the true subject matter experts within the organisation.
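As a toy illustration of the model we are aiming for (not the production design), the sketch below wires up a few Topics→People→Projects→Decisions edges with networkx and uses a naive "most connected person" heuristic to surface a likely expert; the node names are hypothetical.

```python
# Toy knowledge-graph sketch: Topics -> People -> Projects -> Decisions.
# Node names and the "most connected person = expert" heuristic are illustrative.
import networkx as nx

g = nx.Graph()
g.add_edge("topic:payment-gateway", "person:alice", kind="discussed_by")
g.add_edge("topic:payment-gateway", "person:bob", kind="discussed_by")
g.add_edge("person:alice", "project:checkout-revamp", kind="works_on")
g.add_edge("project:checkout-revamp", "decision:migrate-to-new-psp", kind="produced")

def likely_expert(graph: nx.Graph, topic: str) -> str:
    """Naive lookup: the person linked to the topic with the most connections overall."""
    people = [n for n in graph.neighbors(topic) if n.startswith("person:")]
    return max(people, key=graph.degree)

print(likely_expert(g, "topic:payment-gateway"))  # -> person:alice
```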
Organisational Health Monitoring
By analysing sentiment and discussion dynamics over time, the system will act as an early warning signal for burnout, conflict, or project gridlock. This moves HR and management from reactive problem-solving to proactive culture management.
The AI Mentor
Imagine a new Junior QA Engineer joining a complex project. Instead of reading documentation for a week, they can ask the Corporate Brain: “Give me a 10-minute summary of the architectural decisions made regarding the payment gateway in the last 6 months”. This drastically reduces ramp-up time.
If you are considering building your own Corporate Brain, take the approach described above as our advice from the trenches.
This project has served a dual purpose: it has given our leadership unprecedented visibility into our operations, and it has served as a proving ground for our engineers to master the testing of event-based AI systems. We are now applying these advanced competencies to our client projects, ensuring we remain a quality frontrunner in AI.
Whether you’re working on internal AI projects or looking for an experienced team to ensure the quality of a customer-facing product, we can provide tailored QA support. Find cooperation details on our dedicated page.