Model Catalogue
Definitions of Modification States
| State | Description |
|---|---|
| Fine-tuned In-House | Further trained internally on task-specific data. |
| Externally Fine-tuned | Loaded from a third-party fine-tuned checkpoint. |
| Trained from Scratch | Trained entirely from randomly initialised parameters, with no pretraining. |
| None | Pretrained model loaded directly from the original creator, without modification. |
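The Modifications column in the tables that follow sometimes combines several of these states in one cell, optionally scoped to a project in parentheses (for example `Externally Fine-tuned (VCD), Fine-tuned In-House (VFR)`). The following is a minimal sketch of how such cells could be parsed for validation. The names `ModificationState` and `parse_modifications` are hypothetical and not part of any existing tooling.

```python
import re
from enum import Enum
from typing import Optional


class ModificationState(Enum):
    """The four modification states defined in the table above."""
    FINE_TUNED_IN_HOUSE = "Fine-tuned In-House"
    EXTERNALLY_FINE_TUNED = "Externally Fine-tuned"
    TRAINED_FROM_SCRATCH = "Trained from Scratch"
    NONE = "None"


def parse_modifications(cell: str) -> list[tuple[ModificationState, Optional[str]]]:
    """Split a Modifications cell into (state, project scope) pairs.

    Entries may be separated by "," or "/" and may carry a scope in
    parentheses, e.g. "Fine-tuned In-House (VFR)".
    """
    results = []
    # Split on commas or slashes that sit outside parentheses,
    # so a scope such as "(RoS/MM)" stays in one piece.
    for part in re.split(r"[,/](?![^()]*\))", cell):
        match = re.fullmatch(r"\s*(.+?)\s*(?:\((.+)\))?\s*", part)
        if match is None:
            raise ValueError(f"Empty modification entry in cell: {cell!r}")
        label, scope = match.group(1), match.group(2)
        # Enum lookup by value raises ValueError for an unknown state name.
        results.append((ModificationState(label), scope))
    return results


if __name__ == "__main__":
    for state, scope in parse_modifications(
        "Externally Fine-tuned (VCD), Fine-tuned In-House (VFR)"
    ):
        print(state.name, scope)
```

A cell containing a label outside the four defined states raises `ValueError`, which makes typos in the catalogue easy to catch.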
Embedding Models
Text Embedding Models
| Model | Inputs | Usage | Hosting | License | Modifications | Paper/Info | Repository |
|---|---|---|---|---|---|---|---|
| all-MiniLM-L6-v2 | Text | General-purpose embeddings (prototyping, multiple projects) | Self Hosted | Apache-2.0 | None | HuggingFace | GitHub |
| GIST-small-Embedding-v0 | Text | SetFit embeddings (RoS) | Self Hosted | MIT | None | HuggingFace | Paper |
| text-embedding-3-large | Text | High-quality embeddings (Sidekick) | OpenAI API | Proprietary (OpenAI) | None | OpenAI | N/A |
| e5-base-v2 | Text | Embeddings for SetFit classification (RoS) | Self Hosted | MIT | Fine-tuned In-House (RoS) | Paper | HuggingFace |
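Because the catalogue tables share a pipe-delimited layout, their rows can be read programmatically, for example to see which models depend on an external API rather than self-hosting. The following is a minimal sketch, assuming the table text is available as a string. `parse_table` and `by_hosting` are hypothetical helper names, and the two rows are copied from the table above.

```python
from collections import defaultdict


def parse_table(markdown: str) -> list[dict[str, str]]:
    """Turn a pipe-delimited table into one dict per data row."""
    lines = [ln.strip() for ln in markdown.strip().splitlines() if ln.strip()]
    header = [cell.strip() for cell in lines[0].strip("|").split("|")]
    rows = []
    for line in lines[2:]:  # lines[1] is the |---| separator row
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        rows.append(dict(zip(header, cells)))
    return rows


def by_hosting(rows: list[dict[str, str]]) -> dict[str, list[str]]:
    """Group model names by the value of their Hosting column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["Hosting"]].append(row["Model"])
    return dict(groups)


TABLE = """
| Model | Inputs | Usage | Hosting | License | Modifications | Paper/Info | Repository |
|---|---|---|---|---|---|---|---|
| all-MiniLM-L6-v2 | Text | General-purpose embeddings (prototyping, multiple projects) | Self Hosted | Apache-2.0 | None | HuggingFace | GitHub |
| text-embedding-3-large | Text | High-quality embeddings (Sidekick) | OpenAI API | Proprietary (OpenAI) | None | OpenAI | N/A |
"""

groups = by_hosting(parse_table(TABLE))

if __name__ == "__main__":
    for hosting, models in groups.items():
        print(hosting, models)
```

The sketch assumes no cell contains a literal `|` character, which holds for the tables in this catalogue.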
Visual and Multimodal Embeddings
| Model | Inputs | Usage | Hosting | License | Modifications | Paper/Info | Repository |
|---|---|---|---|---|---|---|---|
| Swin | Image | Visual embeddings for MM/VCD | Self Hosted | MIT | Externally Fine-tuned | Paper | GitHub |
| ViT | Image | Visual embeddings for MM/VCD and MM/VFR | Self Hosted | Apache-2.0 | Externally Fine-tuned (VCD), Fine-tuned In-House (VFR) | Paper | GitHub |
| CLIP-ViT | Image/Text | Zero-shot image-text matching (MM/semantic delta) | Self Hosted | MIT | None | Paper | HuggingFace |
| CLAP | Audio + Text | Audio embeddings (use to be confirmed) (MM/semantic delta) | Self Hosted | MIT | None | CLAP Overview | GitHub |
Image Analysis Models
| Model | Inputs | Usage | Hosting | License | Modifications | Paper/Info | Repository |
|---|---|---|---|---|---|---|---|
| Florence-2 | Image/Text | Image captioning, object detection (MM) | Self Hosted | MIT | None | Paper | HuggingFace |
| GroundingDINO | Image/Text queries | Open-set object detection (MM) | Self Hosted | Apache-2.0 | None | Paper | GitHub |
| SAM 2 | Image + Prompts (points/bbox) | Image segmentation (MM + MM/demos) | Self Hosted | Apache-2.0 | None | Paper | GitHub |
| HRNet (hrnet1) | Image | Segment alignment for MM/VCD | Self Hosted | Apache-2.0 | Externally Fine-tuned | Paper | GitHub |
| DETR | Image | Object detection for brand compliance (MM) | Self Hosted | Apache-2.0 | Fine-tuned In-House | Paper | GitHub |
| YOLO Models | Image | Real-time object detection (e.g. face detection in MM/demos) | Self Hosted | GPL-3.0/AGPL-3.0 | None/Externally Fine-tuned | Overview | GitHub |
| RAM | Image | Zero-shot image tagging (MM/demos) | Self Hosted | Apache-2.0 | None | Overview | GitHub |
| GPT-4o | Image/Text | Image QA (MM) | OpenAI API | Proprietary (OpenAI) | None | OpenAI | N/A |
Text Analysis Models
| Model | Usage | Hosting | License | Modifications | Paper/Info | Repository |
|---|---|---|---|---|---|---|
| RoBERTa | Classification and paraphrase detection (MM); tested for text QA in semantic extractors (RoS) | Self Hosted | MIT | None | Paper | GitHub, SQuAD |
| SetFit | Few-shot classification (MM, RoS) | Self Hosted | Apache-2.0 | Fine-tuned In-House (RoS/MM) | Overview | GitHub |
| Route0x | Intent classification, routing | Self Hosted | MIT | None | Overview | GitHub |
| BERTopic | Topic modelling (CNS) | Self Hosted | MIT | Fine-tuned In-House (CNS) | Paper | GitHub |
Audio Analysis Models
| Model | Inputs | Usage | Hosting | License | Modifications | Paper/Info | Repository |
|---|---|---|---|---|---|---|---|
| Whisper | Audio | Speech-to-text transcription (MM + CNS) | Self Hosted | MIT | None | Whisper Paper | GitHub |
| Pyannote | Audio + Metadata | Speaker diarisation (MM/demos) | Self Hosted | MIT | None | Pyannote Overview | GitHub |
Document Ingestion Tasks
| Model | Type | Usage | Hosting | License | Modifications | Paper/Info | Repository |
|---|---|---|---|---|---|---|---|
| Azure Doc Intelligence | Doc OCR (MM) | Document parsing and layout analysis | Azure Cloud Service | Proprietary (MS) | None | Azure Overview | N/A |
| Docling | Document Parsing (MM) | Document parsing and layout analysis | Self Hosted | MIT | None | Docling Overview | GitHub |
LLM APIs and Models
| Model | Usage | Hosting | License | Paper/Info | Repository |
|---|---|---|---|---|---|
| GPT-4o | SPARQL generation, structured outputs (Sidekick, MM) | OpenAI API/Azure | Proprietary (OpenAI) | OpenAI | N/A |
| GPT-4o-mini | Summarisation, SPARQL queries (Sidekick, MM) | OpenAI API/Azure | Proprietary (OpenAI) | N/A | N/A |
| GPT-4.1 | Structured outputs (MM) | OpenAI API/Azure | Proprietary (OpenAI) | N/A | N/A |
| Ollama/llama3.1 | SPARQL queries (Sidekick, MM) | Self Hosted | Llama 3.1 Community License (Meta) | Paper | GitHub |