Core Capabilities
🔥 Hotspot Analysis
Automatically identifies CPU-intensive functions and code paths that consume the most execution time.
- Function-level profiling
- CPI (cycles per instruction) rate analysis
- Pipeline stall detection
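As a rough illustration of function-level hotspot ranking, the sketch below reads a hotspots report exported as CSV and lists the functions with the largest share of CPU time. The column names `Function` and `CPU Time`, the comma delimiter, and the sample rows are assumptions made for this example; real report headers depend on the VTune version and report options.

```python
import csv
import io


def top_hotspots(report_csv: str, limit: int = 3):
    """Return (function, cpu_seconds, share_percent) for the costliest functions."""
    rows = list(csv.DictReader(io.StringIO(report_csv)))
    # Assumed column names; adjust them to match the header row of your report.
    timed = [(row["Function"], float(row["CPU Time"])) for row in rows]
    total = sum(seconds for _, seconds in timed)
    ranked = sorted(timed, key=lambda item: item[1], reverse=True)[:limit]
    return [
        (name, seconds, round(100.0 * seconds / total, 1) if total else 0.0)
        for name, seconds in ranked
    ]


# Hypothetical sample data, not real profiler output.
SAMPLE = """Function,CPU Time
matrix_multiply,4.52
load_input,3.10
write_output,2.38
"""

for name, seconds, share in top_hotspots(SAMPLE):
    print(f"{name}: {seconds:.2f}s ({share}%)")
```

Ranking by share rather than absolute time makes results comparable across runs of different length.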
🧵 Threading Analysis
Detects threading bottlenecks, lock contention, and synchronization issues in multi-threaded applications.
- Wait time analysis
- Lock contention detection
- Parallel efficiency metrics
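To make "parallel efficiency" concrete, here is a minimal sketch using the textbook definition: speedup over a single-threaded run divided by the number of threads. This is an assumption about what the metric means in general; VTune reports its own utilization metrics, which may be defined differently. The function name and inputs are hypothetical.

```python
def parallel_efficiency(serial_seconds: float, parallel_seconds: float, threads: int) -> float:
    """Textbook parallel efficiency: (serial time / parallel time) / thread count."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    if parallel_seconds <= 0:
        raise ValueError("parallel_seconds must be positive")
    speedup = serial_seconds / parallel_seconds
    return speedup / threads


# Hypothetical timings: 8 s on one thread, 2 s on eight threads.
efficiency = parallel_efficiency(8.0, 2.0, 8)
print(f"Parallel efficiency: {efficiency:.0%}")
```

A value well below 1.0 suggests that threads spend time waiting (for example on locks) instead of doing useful work.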
⚡ Vectorization
Analyzes vectorization opportunities and provides recommendations for SIMD optimization.
- Loop vectorization analysis
- Dependency detection
- AVX-512 optimization guidance
💾 Memory Profiling
Examines memory access patterns, cache behavior, and DRAM utilization.
- L3 cache miss analysis
- Memory bandwidth utilization
- Data access optimization
🤖 AI Integration
Integrates with AI coding assistants so they can run performance diagnostics automatically.
- Claude Code support
- Cursor integration
- Automated report generation
Comprehensive Reports
Generates detailed diagnostic reports with actionable optimization recommendations.
- CSV export support
- Markdown reports
- Performance metrics tracking
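One way CSV export and Markdown reports can fit together is sketched below: CSV text with a header row is rendered as a Markdown table. The function name and the sample metrics are hypothetical and only illustrate the idea; this is not the skill's actual report generator.

```python
import csv
import io


def csv_to_markdown(csv_text: str) -> str:
    """Render CSV text (first row is the header) as a Markdown table."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    if not rows:
        return ""
    header, body = rows[0], rows[1:]
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)


# Hypothetical metrics, for illustration only.
SAMPLE_METRICS = "Metric,Value\nCPU Time,45.2%\nL3 Cache Miss,32%\n"
print(csv_to_markdown(SAMPLE_METRICS))
```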
Quick Start
Install the skill into your AI coding assistant in about 30 seconds:
git clone https://github.com/nv3ifu/vtune-advisor-cpu-profiling-skill.git
cd vtune-advisor-cpu-profiling-skill
.\install.ps1 -Claude     # For Claude Code
.\install.ps1 -Cursor     # For Cursor
.\install.ps1 -AllAgents  # Install to all supported AI tools
Usage Example
Once installed, simply ask your AI assistant to analyze performance:
You: Help me analyze this program's CPU performance bottleneck
AI: [Automatically invokes vtune-advisor-cpu-profiling skill]
I'll analyze the program's performance:
1️⃣ Running VTune to collect performance data...
$ vtune -collect hotspots -result-dir vtune_result\hotspots -- .\app.exe
2️⃣ Diagnosis Results:
┌─────────────────────────────────────┐
│ Main Bottleneck: CPU_TIME_BOUND     │
│ Hotspot Function: matrix_multiply   │
│ CPU Time: 45.2%                     │
│ L3 Cache Miss: 32% 🔴 (High)        │
└─────────────────────────────────────┘
3️⃣ Optimization Recommendations:
• Optimize data access patterns to improve cache hit rate
• Consider using SIMD vectorization for computation
Expected Performance Gain: 2-3x
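The collection step in the transcript above can also be scripted. The sketch below builds the same `vtune -collect hotspots` command plus a follow-up CSV report command, and runs them only if `vtune` is found on `PATH`. The function names are hypothetical, and the report options (`-format=csv`, `-csv-delimiter=comma`) reflect my understanding of the VTune command line; confirm them with `vtune -help report` for your version.

```python
import shutil
import subprocess


def hotspots_commands(app: str, result_dir: str):
    """Build the collect and report commands for a hotspots analysis."""
    collect = ["vtune", "-collect", "hotspots", "-result-dir", result_dir, "--", app]
    # Report options are assumptions to verify against your VTune version.
    report = [
        "vtune", "-report", "hotspots", "-result-dir", result_dir,
        "-format=csv", "-csv-delimiter=comma",
    ]
    return collect, report


def run_if_available(commands) -> bool:
    """Run each command in order; return False without running if vtune is missing."""
    if shutil.which("vtune") is None:
        return False
    for command in commands:
        subprocess.run(command, check=True)
    return True


# Same paths as the transcript; printing only, nothing is executed here.
collect, report = hotspots_commands(r".\app.exe", r"vtune_result\hotspots")
print(" ".join(collect))
print(" ".join(report))
```

Passing the command as a list avoids shell quoting problems with paths that contain spaces or backslashes.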
Technology Stack
Supported Analysis Types
VTune Profiler
- Performance Snapshot
- Hotspots Analysis
- Threading Analysis
- Memory Consumption
- Memory Access
- Microarchitecture Exploration
- Anomaly Detection
- HPC Performance
- I/O Analysis
- System Overview
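Each analysis type listed above corresponds to a name passed to `vtune -collect`. The mapping below is a sketch based on my understanding of the VTune command line; the helper name is hypothetical, and `vtune -help collect` lists the names your installed version actually supports.

```python
# Display name -> value for `vtune -collect` (assumed; verify with `vtune -help collect`).
VTUNE_COLLECT_NAMES = {
    "Performance Snapshot": "performance-snapshot",
    "Hotspots Analysis": "hotspots",
    "Threading Analysis": "threading",
    "Memory Consumption": "memory-consumption",
    "Memory Access": "memory-access",
    "Microarchitecture Exploration": "uarch-exploration",
    "Anomaly Detection": "anomaly-detection",
    "HPC Performance": "hpc-performance",
    "I/O Analysis": "io",
    "System Overview": "system-overview",
}


def vtune_collect_command(analysis: str, app: str, result_dir: str):
    """Return the argv list for collecting the given analysis type."""
    try:
        name = VTUNE_COLLECT_NAMES[analysis]
    except KeyError:
        raise ValueError(f"Unknown VTune analysis type: {analysis!r}") from None
    return ["vtune", "-collect", name, "-result-dir", result_dir, "--", app]


print(" ".join(vtune_collect_command("Memory Access", "./app", "vtune_result/memory")))
```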
Intel Advisor
- Survey Analysis
- Trip Counts
- Dependencies
- Vectorization
- Suitability Analysis
- Roofline Model
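Intel Advisor analyses are started in a similar way with `advisor --collect=<name>`. The sketch below maps the list above to collection names as I understand the Advisor command line; "Vectorization" is left out because, to my knowledge, vectorization insights come from the Survey results and not from a separate collection. The helper name is hypothetical; check `advisor --help collect` before relying on these names.

```python
# Display name -> value for `advisor --collect=` (assumed; verify with `advisor --help collect`).
ADVISOR_COLLECT_NAMES = {
    "Survey Analysis": "survey",
    "Trip Counts": "tripcounts",
    "Dependencies": "dependencies",
    "Suitability Analysis": "suitability",
    "Roofline Model": "roofline",
}


def advisor_collect_command(analysis: str, app: str, project_dir: str):
    """Return the argv list for running the given Advisor collection."""
    try:
        name = ADVISOR_COLLECT_NAMES[analysis]
    except KeyError:
        raise ValueError(f"Unknown Advisor analysis type: {analysis!r}") from None
    return ["advisor", f"--collect={name}", f"--project-dir={project_dir}", "--", app]


print(" ".join(advisor_collect_command("Survey Analysis", "./app", "advisor_result")))
```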