    Course Outline
Introduction to Vision-Language Models
- Overview of VLMs and their role in multimodal AI
 - Popular architectures: CLIP, Flamingo, BLIP, etc.
 - Use cases: search, captioning, autonomous systems, content analysis
 
Preparing the Fine-Tuning Environment
- Setting up OpenCLIP and other VLM libraries
 - Dataset formats for image-text pairs
 - Preprocessing pipelines for vision and language inputs
 
Fine-Tuning CLIP and Similar Models
- Contrastive loss and joint embedding spaces
 - Hands-on: fine-tuning CLIP on custom datasets
 - Handling domain-specific and multilingual data
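
As a rough illustration of the contrastive loss and joint embedding space named above, the following is a minimal sketch of a symmetric, CLIP-style contrastive loss in plain Python. The function names, the toy two-dimensional embeddings, and the fixed temperature of 0.07 are assumptions made for this example; real training code would use PyTorch tensors and typically a learnable temperature.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    log_sum = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_sum for x in row]

def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric cross-entropy over an image-text similarity matrix.

    Pair i (image i, text i) is treated as the positive; every other
    combination in the batch is treated as a negative.
    """
    images = [l2_normalize(v) for v in image_embs]
    texts = [l2_normalize(v) for v in text_embs]
    n = len(images)
    # logits[i][j]: cosine similarity of image i and text j, divided by temperature
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        for img in images
    ]
    # Image-to-text direction: each row should peak on its diagonal entry.
    loss_i2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Text-to-image direction: same computation on the transposed matrix.
    logits_t = [list(col) for col in zip(*logits)]
    loss_t2i = -sum(log_softmax(logits_t[i])[i] for i in range(n)) / n
    return (loss_i2t + loss_t2i) / 2

# Toy batch (hypothetical data): matched pairs point in the same direction.
aligned_loss = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                     [[1.0, 0.0], [0.0, 1.0]])
# Same images, but the captions are swapped, so every pair is mismatched.
swapped_loss = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                     [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the loss is close to zero when each image lines up with its own caption and large when the captions are swapped, which is the behaviour fine-tuning tries to reinforce on a custom dataset.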
 
Advanced Fine-Tuning Techniques
- Using LoRA and adapter-based methods for efficiency
 - Prompt tuning and visual prompt injection
 - Zero-shot vs. fine-tuned evaluation trade-offs
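
To make the efficiency argument behind LoRA concrete, here is a small sketch of a low-rank adapter wrapped around a frozen linear layer, written in plain Python. The class name `LoRALinear`, the 8x8 layer size, and the `rank` and `alpha` values are illustrative assumptions, not settings taken from the course or from any particular library.

```python
import random

def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / rank) * B @ A."""

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight          # frozen: never updated during fine-tuning
        self.scale = alpha / rank
        # A starts with small random values, B with zeros, so the adapter
        # contributes nothing until training changes B.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        """Only A and B are trained; the base weight is excluded."""
        return sum(len(r) for r in self.A) + sum(len(r) for r in self.B)

# Hypothetical 8x8 layer with a rank-2 adapter.
rng = random.Random(1)
base_weight = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(8)]
layer = LoRALinear(base_weight, rank=2, alpha=4.0)
x = [1.0] * 8
full_params = len(base_weight) * len(base_weight[0])
lora_params = layer.trainable_params()
```

Here the full layer has 64 weights while the adapter trains 32, and the gap widens quickly at realistic layer sizes, because the adapter grows with `rank * (d_in + d_out)` rather than `d_in * d_out`.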
 
Evaluation and Benchmarking
- Metrics for VLMs: retrieval accuracy, BLEU, CIDEr, recall
 - Visual-text alignment diagnostics
 - Visualizing embedding spaces and misclassifications
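
One of the retrieval metrics listed above, recall at K, can be sketched in a few lines. The function below is a simplified illustration that assumes a square similarity matrix in which query `i` has exactly one correct match at index `i`; the scores are made-up example values.

```python
def recall_at_k(similarity, k):
    """Fraction of queries whose correct item (same index) is in the top k.

    similarity[i][j] is the score between query i and candidate j.
    """
    hits = 0
    for i, row in enumerate(similarity):
        # Candidate indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(similarity)

# Hypothetical image-to-text scores for three image-caption pairs.
scores = [
    [0.9, 0.1, 0.0],   # image 0 ranks its own caption first
    [0.8, 0.2, 0.1],   # image 1 ranks caption 0 above its own
    [0.1, 0.2, 0.7],   # image 2 ranks its own caption first
]
r_at_1 = recall_at_k(scores, 1)
r_at_2 = recall_at_k(scores, 2)
```

Comparing the metric at several values of K, before and after fine-tuning, is one simple way to look at the zero-shot versus fine-tuned trade-off on a held-out set.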
 
Deployment and Use in Real Applications
- Exporting models for inference (TorchScript, ONNX)
 - Integrating VLMs into pipelines or APIs
 - Resource considerations and model scaling
 
Case Studies and Applied Scenarios
- Media analysis and content moderation
 - Search and retrieval in e-commerce and digital libraries
 - Multimodal interaction in robotics and autonomous systems
 
Summary and Next Steps
Requirements
- An understanding of deep learning for vision and NLP
 - Experience with PyTorch and transformer-based models
 - Familiarity with multimodal model architectures
 
Audience
- Computer vision engineers
 - AI developers
 
Duration: 14 Hours
Delivery Options
Private Group Training
Our identity is rooted in delivering exactly what our clients need.
- Pre-course call with your trainer
 - Customisation of the learning experience to achieve your goals
 - Bespoke outlines
 - Practical hands-on exercises containing data / scenarios recognisable to the learners
 - Training scheduled on a date of your choice
 - Delivered online, onsite/classroom or hybrid by experts sharing real world experience
 
Private group prices: RRP from €4560 for online delivery, based on a group of 2 delegates, plus €1440 per additional delegate (excludes any certification or exam costs). We recommend a maximum group size of 12 for most learning events.
Contact us for an exact quote and to hear about our latest promotions.
Public Training
Please see our public courses