Development Guide
Last Updated: 2026-02-13
Comprehensive guide for developing, extending, and maintaining VaulType.
Table of Contents
- Project Structure
- Swift Package Organization
- C/C++ Bridging Conventions
- Naming Conventions
- Git Workflow
- How to Add a New Processing Mode
- How to Add a New Voice Command
- How to Integrate a New Model Format
- Testing Strategy
- Performance Profiling
- Memory Leak Detection
- Next Steps
Project Structure
```
VaulType/
├── VaulType.xcodeproj                      # Xcode project file
├── VaulType/                               # Main app target
│   ├── App/
│   │   ├── VaulTypeApp.swift               # @main entry point
│   │   ├── AppDelegate.swift               # NSApplicationDelegate (menu bar, lifecycle)
│   │   └── MenuBarManager.swift            # Menu bar icon and dropdown management
│   ├── Views/
│   │   ├── Settings/
│   │   │   ├── SettingsView.swift          # Root settings window
│   │   │   ├── GeneralSettingsTab.swift    # Launch at login, hotkey, etc.
│   │   │   ├── AudioSettingsTab.swift      # Input device, noise gate
│   │   │   ├── ModelsSettingsTab.swift     # Model download/management
│   │   │   ├── ModesSettingsTab.swift      # Processing mode configuration
│   │   │   └── AdvancedSettingsTab.swift   # Injection method, diagnostics
│   │   ├── Overlay/
│   │   │   ├── OverlayWindow.swift         # NSPanel for floating overlay
│   │   │   └── OverlayView.swift           # SwiftUI overlay content
│   │   └── Components/
│   │       ├── AudioLevelIndicator.swift   # Real-time audio level meter
│   │       ├── ModelDownloadRow.swift      # Model download progress UI
│   │       └── ModeSelector.swift          # Processing mode picker
│   ├── Services/
│   │   ├── Audio/
│   │   │   ├── AudioCaptureService.swift   # AVAudioEngine management
│   │   │   └── VoiceActivityDetector.swift # VAD implementation
│   │   ├── Speech/
│   │   │   ├── WhisperService.swift        # whisper.cpp Swift wrapper
│   │   │   └── WhisperContext.swift        # whisper_context lifecycle
│   │   ├── LLM/
│   │   │   ├── LLMService.swift            # llama.cpp Swift wrapper
│   │   │   ├── LlamaContext.swift          # llama_context lifecycle
│   │   │   ├── OllamaService.swift         # Ollama REST API client
│   │   │   └── PromptTemplateEngine.swift  # Template variable substitution
│   │   ├── Injection/
│   │   │   ├── TextInjectionService.swift  # CGEvent + clipboard injection
│   │   │   └── ClipboardManager.swift      # Clipboard save/restore
│   │   ├── Commands/
│   │   │   ├── CommandParser.swift         # Natural language → command
│   │   │   ├── CommandRegistry.swift       # Built-in command definitions
│   │   │   └── CommandExecutor.swift       # Execute parsed commands
│   │   ├── HotkeyManager.swift             # Global hotkey registration
│   │   ├── ModelManager.swift              # Model download/storage
│   │   └── AppContextService.swift         # Active app detection
│   ├── Models/
│   │   ├── DictationEntry.swift            # SwiftData: dictation history
│   │   ├── PromptTemplate.swift            # SwiftData: prompt templates
│   │   ├── AppProfile.swift                # SwiftData: per-app config
│   │   ├── VocabularyEntry.swift           # SwiftData: custom vocabulary
│   │   └── ModelInfo.swift                 # SwiftData: installed models
│   ├── Utilities/
│   │   ├── Constants.swift                 # App-wide constants
│   │   ├── Logger+Extensions.swift         # os_log category helpers
│   │   └── Permissions.swift               # Permission check helpers
│   └── Resources/
│       ├── Assets.xcassets                 # App icon, menu bar icons
│       ├── Entitlements.plist              # Accessibility, microphone
│       ├── Info.plist                      # App configuration
│       └── PromptTemplates/                # Built-in .json prompt templates
│           ├── clean.json
│           ├── structure.json
│           ├── prompt.json
│           └── code.json
├── WhisperKit/                             # whisper.cpp bridging module
│   ├── include/
│   │   └── whisper-bridging-header.h       # C bridging header
│   ├── Sources/
│   │   └── WhisperWrapper.swift            # High-level Swift API
│   └── Package.swift
├── LlamaKit/                               # llama.cpp bridging module
│   ├── include/
│   │   └── llama-bridging-header.h         # C bridging header
│   ├── Sources/
│   │   └── LlamaWrapper.swift              # High-level Swift API
│   └── Package.swift
├── VaulTypeTests/                          # Unit tests
│   ├── Services/
│   │   ├── CommandParserTests.swift
│   │   ├── PromptTemplateEngineTests.swift
│   │   └── TextInjectionTests.swift
│   └── Models/
│       └── SwiftDataModelTests.swift
├── VaulTypeUITests/                        # UI tests
│   ├── SettingsUITests.swift
│   └── OverlayUITests.swift
├── scripts/
│   ├── build-deps.sh                       # Build whisper.cpp + llama.cpp
│   ├── download-model.sh                   # CLI model downloader
│   ├── create-dmg.sh                       # DMG packaging
│   └── notarize.sh                         # Notarization script
└── docs/                                   # Documentation (this folder)
```
Swift Package Organization
VaulType uses Swift Package Manager (SPM) for dependency management alongside the Xcode project.
Local Packages
```
Package.swift (root)
├── WhisperKit      # Local package wrapping whisper.cpp
├── LlamaKit        # Local package wrapping llama.cpp
└── VaulTypeCore    # Shared models and utilities (future)
```
External Dependencies
Add dependencies in Package.swift or via Xcode’s package resolution:
```swift
let package = Package(
    name: "VaulType",
    platforms: [.macOS(.v14)],
    dependencies: [
        .package(url: "https://github.com/sparkle-project/Sparkle", from: "2.6.0"),
    ],
    targets: [
        .target(
            name: "VaulType",
            dependencies: ["Sparkle", "WhisperKit", "LlamaKit"]
        ),
    ]
)
```
Adding a New Package
- Add the dependency to `Package.swift` or via Xcode > File > Add Package Dependencies
- Import the module in the relevant source files
- Document the dependency in TECH_STACK.md
- Verify license compatibility (see LEGAL_COMPLIANCE.md)
C/C++ Bridging Conventions
VaulType bridges to whisper.cpp and llama.cpp via C interop. Follow these conventions:
Bridging Header Structure
```c
#ifndef WhisperBridgingHeader_h
#define WhisperBridgingHeader_h

// Include the whisper.cpp public API
#include "whisper.h"

// Any additional C helper functions
// Keep these minimal — prefer Swift wrappers
int whisper_helper_get_segment_count(struct whisper_context *ctx);

#endif
```
Swift Wrapper Pattern
Always wrap raw C API calls in a Swift class that manages memory:
```swift
import Foundation

final class WhisperContext: @unchecked Sendable {
    private let context: OpaquePointer

    init(modelPath: String) throws {
        var params = whisper_context_default_params()
        params.use_gpu = true // Metal acceleration

        guard let ctx = whisper_init_from_file_with_params(modelPath, params) else {
            throw WhisperError.modelLoadFailed(path: modelPath)
        }
        self.context = ctx
    }

    deinit {
        whisper_free(context)
    }

    func transcribe(audioData: [Float], language: String? = nil) async throws -> String {
        // Always dispatch to a background queue — never block the main thread
        return try await withCheckedThrowingContinuation { continuation in
            DispatchQueue.global(qos: .userInitiated).async { [self] in
                var params = whisper_full_default_params(WHISPER_SAMPLING_GREEDY)
                params.n_threads = Int32(ProcessInfo.processInfo.activeProcessorCount)
                params.language = language.map { ($0 as NSString).utf8String! }

                let result = whisper_full(self.context, params, audioData, Int32(audioData.count))

                if result == 0 {
                    let text = self.collectSegments()
                    continuation.resume(returning: text)
                } else {
                    continuation.resume(throwing: WhisperError.transcriptionFailed(code: result))
                }
            }
        }
    }

    private func collectSegments() -> String {
        let segmentCount = whisper_full_n_segments(context)
        var result = ""
        for i in 0..<segmentCount {
            if let text = whisper_full_get_segment_text(context, i) {
                result += String(cString: text)
            }
        }
        return result.trimmingCharacters(in: .whitespacesAndNewlines)
    }
}
```
Conventions
| Rule | Details |
|---|---|
| Memory management | Always pair _init with _free in init/deinit |
| Threading | Never call C APIs on the main thread |
| Error handling | Map C error codes to Swift Error types |
| Naming | Swift wrappers use Context suffix (e.g., WhisperContext, LlamaContext) |
| Sendable | Mark as @unchecked Sendable if the C context is thread-safe |
| Bridging headers | One per C library, kept minimal |
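The error-handling rule above can be sketched with a small typed error; the cases and helper below are illustrative assumptions, not VaulType's actual `WhisperError` definition:

```swift
import Foundation

// Hypothetical error type mapping whisper.cpp return codes to Swift errors.
// Case names are assumptions; align them with the real C API's codes.
enum WhisperError: Error, Equatable {
    case modelLoadFailed(path: String)
    case transcriptionFailed(code: Int32)

    // Map a C return code to a typed Swift error (0 means success)
    static func fromCode(_ code: Int32) -> WhisperError? {
        code == 0 ? nil : .transcriptionFailed(code: code)
    }
}
```

A wrapper's `transcribe` method can then throw the mapped error instead of surfacing a raw `Int32` to callers.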
Naming Conventions
VaulType follows the Swift API Design Guidelines with these project-specific rules:
| Type | Convention | Example |
|---|---|---|
| SwiftUI View | PascalCase + View suffix | SettingsView.swift |
| Service | PascalCase + Service suffix | AudioCaptureService.swift |
| SwiftData Model | PascalCase, noun | DictationEntry.swift |
| Extension | Type+Feature.swift | Logger+Extensions.swift |
| Protocol | Adjective or -able/-ible | Transcribable.swift |
| Test | TestedType + Tests suffix | CommandParserTests.swift |
```swift
// Types: PascalCase
struct DictationEntry { }
enum ProcessingMode { }
protocol AudioCapturing { }

// Properties and methods: camelCase
let currentMode: ProcessingMode
func startRecording() async throws

// Constants: camelCase (not UPPER_SNAKE)
let maxAudioBufferSize = 16_000 * 30 // 30 seconds at 16kHz

// Enum cases: camelCase
enum ProcessingMode: String, Codable {
    case raw
    case clean
    case structure
    case prompt
    case code
    case custom
}

// Boolean properties: use `is`, `has`, `should` prefix
var isRecording: Bool
var hasModelLoaded: Bool
var shouldAutoInject: Bool
```
os_log Categories
```swift
// Use subsystem + category pattern
import os

extension Logger {
    static let audio = Logger(subsystem: "com.vaultype.app", category: "audio")
    static let whisper = Logger(subsystem: "com.vaultype.app", category: "whisper")
    static let llm = Logger(subsystem: "com.vaultype.app", category: "llm")
    static let injection = Logger(subsystem: "com.vaultype.app", category: "injection")
    static let commands = Logger(subsystem: "com.vaultype.app", category: "commands")
    static let ui = Logger(subsystem: "com.vaultype.app", category: "ui")
}
```
Git Workflow
VaulType uses trunk-based development with short-lived feature branches.
Branch Naming
```
main                    # Always deployable
feature/add-code-mode   # New features
fix/clipboard-restore   # Bug fixes
chore/update-whisper    # Dependency updates, maintenance
docs/setup-guide        # Documentation changes
```
Commit Messages
Follow Conventional Commits:
```
feat: add Structure processing mode
fix: clipboard not restored after paste injection
perf: preload Whisper model on app launch
docs: update API documentation for LLMService
chore: bump whisper.cpp to v1.7.3
test: add CommandParser unit tests
refactor: extract PromptTemplateEngine from LLMService
```
Workflow
- Create a feature branch from `main`
- Make changes with focused, atomic commits
- Run tests locally: `xcodebuild test -scheme VaulType`
- Push and create a pull request
- CI runs tests and linting
- Code review and approval
- Squash-merge to `main`
- Delete feature branch
Tags and Releases
```shell
# Tag a release
git tag -a v0.1.0 -m "MVP: Menu bar + whisper.cpp + text injection"
git push origin v0.1.0

# CI automatically builds, signs, notarizes, and creates a GitHub Release
```
How to Add a New Processing Mode
Processing modes transform raw Whisper output through the LLM pipeline. Here’s how to add one:
Step 1: Define the Mode
Add a new case to the ProcessingMode enum:
```swift
enum ProcessingMode: String, Codable, CaseIterable, Identifiable {
    case raw
    case clean
    case structure
    case prompt
    case code
    case custom
    case email // <-- New mode

    var id: String { rawValue }

    var displayName: String {
        switch self {
        // ...existing cases...
        case .email: return "Email"
        }
    }

    var description: String {
        switch self {
        // ...existing cases...
        case .email: return "Format dictation as a professional email"
        }
    }

    var icon: String {
        switch self {
        // ...existing cases...
        case .email: return "envelope"
        }
    }
}
```
Step 2: Create the Prompt Template
Create a JSON template file:
```json
{
  "name": "Email",
  "mode": "email",
  "systemPrompt": "You are a writing assistant that formats dictated speech into professional emails. Maintain the sender's intent and tone while improving clarity and structure.",
  "userPromptTemplate": "Format the following dictated text as a professional email. Add appropriate greeting and sign-off if not present. Fix grammar and punctuation.\n\nDictated text: {text}\n\nApp context: {app_name}\nLanguage: {language}",
  "isBuiltIn": true
}
```
Step 3: Register in LLMService
```swift
func process(text: String, mode: ProcessingMode, context: AppContext) async throws -> String {
    switch mode {
    case .raw:
        return text
    // ...existing cases...
    case .email:
        let template = try loadTemplate(for: .email)
        return try await runInference(text: text, template: template, context: context)
    }
}
```
Step 4: Add Tests
```swift
func testEmailModeFormatsAsEmail() async throws {
    let service = LLMService(model: mockModel)
    let result = try await service.process(
        text: "hey john wanted to follow up on yesterdays meeting about the project timeline",
        mode: .email,
        context: .default
    )
    XCTAssertTrue(result.contains("Hi") || result.contains("Dear") || result.contains("Hello"))
}
```
Step 5: Update UI
The mode selector automatically picks up new `CaseIterable` cases. Verify it appears correctly in Settings > Modes tab.
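For reference, the `CaseIterable`-driven selection can be sketched without any UI code; the shortened enum below is illustrative, not the full `ProcessingMode`:

```swift
// Shortened stand-in for the real ProcessingMode enum
enum ProcessingMode: String, CaseIterable, Identifiable {
    case raw, clean, email

    var id: String { rawValue }
    var displayName: String { rawValue.capitalized }
}

// A SwiftUI ModeSelector iterates the same collection, e.g.:
//   ForEach(ProcessingMode.allCases) { mode in Text(mode.displayName) }
// Adding a case automatically extends the options list:
let options = ProcessingMode.allCases.map(\.displayName)
```

This is why Step 5 requires no code change: the picker's data source is `allCases` itself.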
How to Add a New Voice Command
Section titled “How to Add a New Voice Command”Step 1: Define the Command
```swift
struct CommandDefinition {
    let name: String
    let patterns: [String] // Regex patterns to match
    let handler: (CommandContext) async throws -> Void
}

extension CommandRegistry {
    static func registerBuiltinCommands() {
        // Existing commands...

        register(CommandDefinition(
            name: "screenshot",
            patterns: [
                "take a screenshot",
                "screenshot",
                "capture screen",
                "take screen capture"
            ],
            handler: { context in
                try await ScreenshotCommand.execute(context: context)
            }
        ))
    }
}
```
Step 2: Implement the Handler
```swift
enum ScreenshotCommand {
    static func execute(context: CommandContext) async throws {
        // Simulate Cmd+Shift+3 for full screenshot
        let event = CGEvent(
            keyboardEventSource: nil,
            virtualKey: 0x14, // '3' key
            keyDown: true
        )
        event?.flags = [.maskCommand, .maskShift]
        event?.post(tap: .cghidEventTap)

        // Key up
        let eventUp = CGEvent(
            keyboardEventSource: nil,
            virtualKey: 0x14,
            keyDown: false
        )
        eventUp?.post(tap: .cghidEventTap)

        Logger.commands.info("Screenshot command executed")
    }
}
```
Step 3: Add Tests
```swift
func testScreenshotCommandParsing() throws {
    let parser = CommandParser()
    let result = try parser.parse("take a screenshot")
    XCTAssertEqual(result?.name, "screenshot")
}

func testScreenshotVariations() throws {
    let parser = CommandParser()
    let variations = ["screenshot", "take a screenshot", "capture screen"]
    for phrase in variations {
        let result = try parser.parse(phrase)
        XCTAssertEqual(result?.name, "screenshot", "Failed to parse: \(phrase)")
    }
}
```
How to Integrate a New Model Format
To support a new model format beyond GGML (Whisper) and GGUF (LLM):
Step 1: Add a Model Type
```swift
enum ModelFormat: String, Codable {
    case ggml   // Whisper models
    case gguf   // LLM models (llama.cpp)
    case coreml // <-- New: Core ML models
}
```
Step 2: Create an Inference Adapter
```swift
protocol TranscriptionService {
    func transcribe(audioData: [Float], language: String?) async throws -> String
}

final class CoreMLWhisperService: TranscriptionService {
    private let model: MLModel

    init(modelPath: String) throws {
        let compiledURL = try MLModel.compileModel(at: URL(fileURLWithPath: modelPath))
        self.model = try MLModel(contentsOf: compiledURL)
    }

    func transcribe(audioData: [Float], language: String?) async throws -> String {
        // Core ML inference implementation
        // ...
    }
}
```
Step 3: Update ModelManager
```swift
func loadModel(info: ModelInfo) throws -> Any {
    switch info.format {
    case .ggml:
        return try WhisperContext(modelPath: info.localPath)
    case .gguf:
        return try LlamaContext(modelPath: info.localPath)
    case .coreml:
        return try CoreMLWhisperService(modelPath: info.localPath)
    }
}
```
Testing Strategy
Test Pyramid
```
          ╱ UI Tests ╲            ← Fewest: critical user flows
         ╱───────────────╲
        ╱ Integration Tests╲      ← Middle: service interactions
       ╱─────────────────────╲
      ╱      Unit Tests       ╲   ← Most: pure logic, parsers, templates
     ╱───────────────────────────╲
```
Unit Tests
Focus on pure logic that doesn’t require hardware or models:
```swift
// CommandParser, PromptTemplateEngine, text processing
func testCleanModeRemovesFillerWords() {
    let engine = TextProcessor()
    let result = engine.removeFillers("so um I think we should uh proceed")
    XCTAssertEqual(result, "I think we should proceed")
}
```
Integration Tests
Test whisper.cpp and llama.cpp Swift wrappers with small models:
```swift
// Requires a test model in the test bundle
func testWhisperTranscribesAudio() async throws {
    let whisper = try WhisperContext(modelPath: testModelPath)
    let audio = try loadTestAudio("hello_world.wav")
    let result = try await whisper.transcribe(audioData: audio)
    XCTAssertTrue(result.lowercased().contains("hello"))
}
```
UI Tests
Test SwiftUI settings and overlay with XCUITest:
```swift
func testSettingsWindowOpens() {
    let app = XCUIApplication()
    app.launch()
    // Click menu bar icon, then Settings
    app.menuBarItems["VaulType"].click()
    app.menuItems["Settings..."].click()
    XCTAssertTrue(app.windows["Settings"].waitForExistence(timeout: 3))
}
```
Mock Audio Input
For testing the audio pipeline without a real microphone:
```swift
final class MockAudioCaptureService: AudioCapturing {
    var mockAudioData: [Float] = []

    func startCapture() async throws {
        // Simulate audio callback with mock data
        delegate?.audioCaptureService(self, didCaptureAudio: mockAudioData)
    }
}
```
See TESTING.md for the complete testing guide.
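As an example of the pure-logic layer of the pyramid, `PromptTemplateEngine`-style `{variable}` substitution can be tested with plain assertions; the function name and signature here are assumptions for illustration, not the engine's real API:

```swift
import Foundation

// Minimal stand-in for PromptTemplateEngine's variable substitution:
// each "{key}" occurrence in the template is replaced by its value.
func render(template: String, variables: [String: String]) -> String {
    variables.reduce(template) { result, pair in
        result.replacingOccurrences(of: "{\(pair.key)}", with: pair.value)
    }
}

let prompt = render(
    template: "Dictated text: {text}\nApp context: {app_name}",
    variables: ["text": "hello world", "app_name": "Mail"]
)
```

Because this logic needs no hardware or models, it belongs at the base of the pyramid and can run on every CI push.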
Performance Profiling
Using Instruments
- Product > Profile (⌘I) in Xcode
- Choose the relevant template:
| Template | Use For |
|---|---|
| Time Profiler | Finding CPU hotspots during inference |
| Allocations | Tracking memory usage for model loading |
| Leaks | Detecting memory leaks in C bridging code |
| Metal System Trace | GPU utilization for whisper.cpp/llama.cpp |
| Energy Log | Battery impact during dictation |
Key Metrics to Monitor
```
┌────────────────────────┬───────────────┬───────────────┐
│ Metric                 │ Target        │ Alert         │
├────────────────────────┼───────────────┼───────────────┤
│ Idle memory            │ <50 MB        │ >100 MB       │
│ Whisper model loaded   │ <500 MB       │ >1 GB         │
│ Whisper + LLM loaded   │ <2 GB         │ >3 GB         │
│ Transcription latency  │ <2s (5s clip) │ >5s           │
│ LLM processing         │ <3s           │ >8s           │
│ Text injection         │ <100ms        │ >500ms        │
│ Idle CPU               │ ~0%           │ >2%           │
│ App launch time        │ <1s           │ >3s           │
└────────────────────────┴───────────────┴───────────────┘
```
Profiling Whisper Inference
Section titled “Profiling Whisper Inference”import os
let signpost = OSSignposter(subsystem: "com.vaultype.app", category: "whisper")
func transcribe(audio: [Float]) async throws -> String { let state = signpost.beginInterval("transcription", id: signpost.makeSignpostID()) defer { signpost.endInterval("transcription", state) }
return try await whisperContext.transcribe(audioData: audio)}View in Instruments > os_signpost to see exact timing per transcription.
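A useful number to derive from these signpost intervals is the real-time factor: the "Transcription latency" target of <2 s for a 5 s clip corresponds to an RTF of 0.4. A small helper for this calculation (an illustration, not part of VaulType's API):

```swift
// Real-time factor: processing time divided by audio duration.
// RTF < 1.0 means transcription runs faster than real time.
func realTimeFactor(latencySeconds: Double, audioSeconds: Double) -> Double {
    latencySeconds / audioSeconds
}

let rtf = realTimeFactor(latencySeconds: 2.0, audioSeconds: 5.0) // 0.4
```

Tracking RTF rather than raw latency keeps the metric comparable across clips of different lengths.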
Memory Leak Detection
C bridging code is the most common source of memory leaks. Follow these practices:
Use Instruments Leaks Template
- Run with the Leaks template in Instruments
- Perform several dictation cycles
- Check for leaked `whisper_context` or `llama_context` objects
RAII Pattern for C Resources
```swift
// Always pair init/free in a class with deinit
final class WhisperContext {
    private let ctx: OpaquePointer

    init(path: String) throws {
        guard let ctx = whisper_init_from_file(path) else {
            throw WhisperError.loadFailed
        }
        self.ctx = ctx
    }

    deinit {
        whisper_free(ctx) // ALWAYS free in deinit
    }
}
```
Detecting Retain Cycles
SwiftUI closures and Combine publishers can create retain cycles:
```swift
// BAD: retain cycle in Combine sink
cancellable = audioService.audioLevelPublisher
    .sink { self.updateLevel($0) } // Strong capture of self

// GOOD: weak capture
cancellable = audioService.audioLevelPublisher
    .sink { [weak self] level in
        self?.updateLevel(level)
    }
```
Memory Debugging Flags
Add to Xcode scheme > Run > Arguments > Environment Variables:
```
MallocStackLogging = 1
MallocScribble = 1
ASAN_OPTIONS = detect_leaks=1
```
Next Steps
- Setup Guide — Set up your development environment
- Testing Guide — Detailed testing practices
- Architecture — System architecture deep dive
- Contributing — How to contribute to VaulType
- API Documentation — Internal API reference