Current Projects

Research projects exploring knowledge distillation, domain-specific model training, and affordable AI deployment. Led by Dave Gilligan, powered by Blue Note Logic infrastructure, operated through Gilligan Tech ENK in Norway.

Master's Thesis v1.0 Spec

CorpusAI CO2 Emissions Model

Nordic Non-ETS Emissions Analysis

A knowledge-distilled 3B-parameter model for interpreting Nordic greenhouse gas emissions data. Uses a 32B Qwen teacher model to distil reasoning capabilities into a compact student model, anchored to verified government statistics from SSB and Miljødirektoratet. Designed for CPU-only deployment on AMD EPYC hardware.

3B Parameters
32B Teacher
Q8_0 Quantisation
>99% Accuracy
Qwen 2.5, LoRA, Unsloth, Ollama, MariaDB, Qdrant, GGUF
Case Study Active

NorwAI Legal Intelligence

Do Better Norge × CorpusAI

A two-stage distillation pipeline producing affordable, Norwegian-specific legal AI. Stage 1 distils a Qwen 27B teacher into a 7B "Alpha v0.1" model. Stage 2 fine-tunes that model on Norwegian legal material for family law, Barnelova, and Barnevernsloven. Deployed through CorpusAI for Do Better Norge.

7B Parameters
27B Teacher
2-Stage Pipeline
EU Sovereignty
Qwen 2.5, NorwAI, LoRA, CorpusAI, Family Law, Bokmål, GDPR

The Distillation Framework

Both projects share a core methodology: compress large model intelligence into small, deployable models that organisations can afford to run. This is the innovation thesis — making expert AI accessible without enterprise GPU budgets.

Step 01
Large Teacher Training

A large model (27B–32B parameters) processes domain-specific data and generates Chain-of-Thought reasoning pairs. The teacher demonstrates how to think about the domain, not just what the answers are.
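A minimal sketch of how the teacher's output might be captured as Chain-of-Thought training pairs. The record layout (`question`, `reasoning`, `answer`), the `Reasoning:` / `Answer:` delimiters, and the `fake_teacher` stand-in are all assumptions made for illustration; the actual prompt template and teacher call are not described here.

```python
import json
from typing import Callable

def build_cot_pair(question: str, teacher: Callable[[str], str]) -> dict:
    """Ask the teacher for step-by-step reasoning and split it from the final answer."""
    prompt = (
        f"Question: {question}\n"
        "Think step by step, then give the final answer after 'Answer:'."
    )
    raw = teacher(prompt)
    # Everything before the last 'Answer:' marker is treated as reasoning.
    reasoning, sep, answer = raw.rpartition("Answer:")
    if not sep:
        raise ValueError("teacher output has no 'Answer:' marker")
    return {
        "question": question,
        "reasoning": reasoning.replace("Reasoning:", "", 1).strip(),
        "answer": answer.strip(),
    }

def fake_teacher(prompt: str) -> str:
    # Hypothetical stand-in for a call to the large teacher model.
    return "Reasoning: Step one. Step two.\nAnswer: 42"

pair = build_cot_pair("What is six times seven?", fake_teacher)
print(json.dumps(pair, ensure_ascii=False))  # one JSONL line per training pair
```

Keeping the reasoning and the answer in separate fields lets the student be trained on the full chain while still allowing the final answers to be checked on their own.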

Step 02
Knowledge Distillation

LoRA fine-tuning, run with Unsloth on RTX 5090 hardware, transfers the teacher's reasoning into a small student model (3B–7B). The student learns domain-specific reasoning at a fraction of the teacher's computational cost.
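To illustrate why LoRA keeps fine-tuning cheap: instead of updating a full weight matrix of shape d_out × d_in, LoRA trains two low-rank factors, B (d_out × r) and A (r × d_in), and adds their scaled product to the frozen weights. The sketch below only counts parameters; the layer size and rank are illustrative assumptions, not the configuration used in these projects.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two low-rank factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)

def dense_params(d_in: int, d_out: int) -> int:
    """Parameters in the full (frozen) weight matrix."""
    return d_in * d_out

# Illustrative values only: not the actual layer size or rank of the student model.
d_in = d_out = 4096
rank = 16

lora = lora_trainable_params(d_in, d_out, rank)
dense = dense_params(d_in, d_out)
print(f"LoRA trains {lora:,} of {dense:,} weights ({lora / dense:.2%})")
# → LoRA trains 131,072 of 16,777,216 weights (0.78%)
```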

Step 03
Custom Corpus Anchoring

Organisations upload their own documents (legal briefs, policies, reports) into a private CorpusAI corpus. The model is grounded in verified data — every response is traceable to source documents.
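A toy sketch of what "traceable to source documents" can look like in practice: each retrieved passage carries its source identifier into the prompt, so the answer can cite it. This example swaps in simple word-overlap scoring for brevity; it is not CorpusAI's retrieval method, and the file names and passages are invented for illustration.

```python
from dataclasses import dataclass

PUNCT = ".,;:()?!"

@dataclass(frozen=True)
class Passage:
    source: str  # document identifier (hypothetical file names below)
    text: str

def tokens(text: str) -> set[str]:
    """Lower-cased words with surrounding punctuation stripped."""
    return {w.strip(PUNCT).lower() for w in text.split() if w.strip(PUNCT)}

def retrieve(query: str, corpus: list[Passage], k: int = 2) -> list[Passage]:
    """Return up to k passages sharing at least one word with the query, best first."""
    q = tokens(query)
    scored = [(len(q & tokens(p.text)), p) for p in corpus]
    scored = [(s, p) for s, p in scored if s > 0]
    scored.sort(key=lambda sp: sp[0], reverse=True)
    return [p for _, p in scored[:k]]

def build_prompt(query: str, passages: list[Passage]) -> str:
    """Prefix every passage with its source so the answer can cite it."""
    context = "\n".join(f"[{p.source}] {p.text}" for p in passages)
    return (
        "Answer using only the sources below and cite them by name.\n"
        f"{context}\n\nQuestion: {query}"
    )

corpus = [
    Passage("policy_2023.txt", "Parental leave applications must be filed within six weeks."),
    Passage("report_q1.txt", "Road transport emissions fell during the first quarter."),
    Passage("brief_07.txt", "The court reviewed the parental responsibility agreement."),
]
query = "When must parental leave applications be filed?"
hits = retrieve(query, corpus)
prompt = build_prompt(query, hits)
print([p.source for p in hits])
```

A production system would typically replace the overlap score with vector search, but the principle of carrying the source identifier alongside each passage stays the same.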

Step 04
Affordable Deployment

Quantised to GGUF format and deployed via Ollama on commodity CPU hardware. No cloud dependency, no per-query costs, full data sovereignty within EU/EEA borders. Private AI that organisations actually own.
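A rough estimate of why the quantised models fit on commodity hardware. In GGUF's Q8_0 format, weights are stored in blocks of 32 signed 8-bit values plus one 16-bit scale, which works out to 34 bytes per 32 weights (8.5 bits per weight). The sketch below assumes nominal parameter counts of exactly 3B and 7B and that every tensor is stored as Q8_0; real model files differ somewhat, and the figures exclude the KV cache and runtime overhead.

```python
import math

Q8_0_BLOCK_WEIGHTS = 32    # weights per Q8_0 block
Q8_0_BLOCK_BYTES = 32 + 2  # 32 int8 values + one fp16 scale

def q8_0_bytes(n_params: int) -> int:
    """Approximate size of the weights if every tensor is stored as Q8_0."""
    blocks = math.ceil(n_params / Q8_0_BLOCK_WEIGHTS)
    return blocks * Q8_0_BLOCK_BYTES

# Nominal parameter counts (assumption): actual checkpoints are not exactly 3B / 7B.
for name, n in [("3B student", 3_000_000_000), ("7B student", 7_000_000_000)]:
    print(f"{name}: ~{q8_0_bytes(n) / 1e9:.2f} GB of weights")
# → 3B student: ~3.19 GB of weights
# → 7B student: ~7.44 GB of weights
```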

Unique in the Market

Your Data, Your Model

Law firms, NGOs, municipalities — any organisation can fine-tune a model on their own documents. The result is a private AI that knows your domain, not the internet's version of it.

No GPU Required

Distilled models run inference on standard CPU servers; GPU hardware is used only for fine-tuning. A municipal planning office doesn't need a data centre; it needs a model that fits on the hardware it already has.

European Data Sovereignty

All training data and inference stay within EU/EEA borders. Hosted on Hetzner infrastructure in Helsinki and Nuremberg. Full GDPR compliance with cryptographic tenant isolation.

👤 Dave Gilligan Creator & Architect
🎵 Blue Note Logic Inc. Infrastructure & Tech
🇳🇴 Gilligan Tech ENK Local Operations, Norway