Skip to content

Development Guide

Last Updated: 2026-02-13

Comprehensive guide for developing, extending, and maintaining VaulType.


VaulType/
├── VaulType.xcodeproj # Xcode project file
├── VaulType/ # Main app target
│ ├── App/
│ │ ├── VaulTypeApp.swift # @main entry point
│ │ ├── AppDelegate.swift # NSApplicationDelegate (menu bar, lifecycle)
│ │ └── MenuBarManager.swift # Menu bar icon and dropdown management
│ ├── Views/
│ │ ├── Settings/
│ │ │ ├── SettingsView.swift # Root settings window
│ │ │ ├── GeneralSettingsTab.swift # Launch at login, hotkey, etc.
│ │ │ ├── AudioSettingsTab.swift # Input device, noise gate
│ │ │ ├── ModelsSettingsTab.swift # Model download/management
│ │ │ ├── ModesSettingsTab.swift # Processing mode configuration
│ │ │ └── AdvancedSettingsTab.swift # Injection method, diagnostics
│ │ ├── Overlay/
│ │ │ ├── OverlayWindow.swift # NSPanel for floating overlay
│ │ │ └── OverlayView.swift # SwiftUI overlay content
│ │ └── Components/
│ │ ├── AudioLevelIndicator.swift # Real-time audio level meter
│ │ ├── ModelDownloadRow.swift # Model download progress UI
│ │ └── ModeSelector.swift # Processing mode picker
│ ├── Services/
│ │ ├── Audio/
│ │ │ ├── AudioCaptureService.swift # AVAudioEngine management
│ │ │ └── VoiceActivityDetector.swift # VAD implementation
│ │ ├── Speech/
│ │ │ ├── WhisperService.swift # whisper.cpp Swift wrapper
│ │ │ └── WhisperContext.swift # whisper_context lifecycle
│ │ ├── LLM/
│ │ │ ├── LLMService.swift # llama.cpp Swift wrapper
│ │ │ ├── LlamaContext.swift # llama_context lifecycle
│ │ │ ├── OllamaService.swift # Ollama REST API client
│ │ │ └── PromptTemplateEngine.swift # Template variable substitution
│ │ ├── Injection/
│ │ │ ├── TextInjectionService.swift # CGEvent + clipboard injection
│ │ │ └── ClipboardManager.swift # Clipboard save/restore
│ │ ├── Commands/
│ │ │ ├── CommandParser.swift # Natural language → command
│ │ │ ├── CommandRegistry.swift # Built-in command definitions
│ │ │ └── CommandExecutor.swift # Execute parsed commands
│ │ ├── HotkeyManager.swift # Global hotkey registration
│ │ ├── ModelManager.swift # Model download/storage
│ │ └── AppContextService.swift # Active app detection
│ ├── Models/
│ │ ├── DictationEntry.swift # SwiftData: dictation history
│ │ ├── PromptTemplate.swift # SwiftData: prompt templates
│ │ ├── AppProfile.swift # SwiftData: per-app config
│ │ ├── VocabularyEntry.swift # SwiftData: custom vocabulary
│ │ └── ModelInfo.swift # SwiftData: installed models
│ ├── Utilities/
│ │ ├── Constants.swift # App-wide constants
│ │ ├── Logger+Extensions.swift # os_log category helpers
│ │ └── Permissions.swift # Permission check helpers
│ └── Resources/
│ ├── Assets.xcassets # App icon, menu bar icons
│ ├── Entitlements.plist # Accessibility, microphone
│ ├── Info.plist # App configuration
│ └── PromptTemplates/ # Built-in .json prompt templates
│ ├── clean.json
│ ├── structure.json
│ ├── prompt.json
│ └── code.json
├── WhisperKit/ # whisper.cpp bridging module
│ ├── include/
│ │ └── whisper-bridging-header.h # C bridging header
│ ├── Sources/
│ │ └── WhisperWrapper.swift # High-level Swift API
│ └── Package.swift
├── LlamaKit/ # llama.cpp bridging module
│ ├── include/
│ │ └── llama-bridging-header.h # C bridging header
│ ├── Sources/
│ │ └── LlamaWrapper.swift # High-level Swift API
│ └── Package.swift
├── VaulTypeTests/ # Unit tests
│ ├── Services/
│ │ ├── CommandParserTests.swift
│ │ ├── PromptTemplateEngineTests.swift
│ │ └── TextInjectionTests.swift
│ └── Models/
│ └── SwiftDataModelTests.swift
├── VaulTypeUITests/ # UI tests
│ ├── SettingsUITests.swift
│ └── OverlayUITests.swift
├── scripts/
│ ├── build-deps.sh # Build whisper.cpp + llama.cpp
│ ├── download-model.sh # CLI model downloader
│ ├── create-dmg.sh # DMG packaging
│ └── notarize.sh # Notarization script
└── docs/ # Documentation (this folder)

VaulType uses Swift Package Manager (SPM) for dependency management alongside the Xcode project.

Package.swift (root)
├── WhisperKit # Local package wrapping whisper.cpp
├── LlamaKit # Local package wrapping llama.cpp
└── VaulTypeCore # Shared models and utilities (future)

Add dependencies in Package.swift or via Xcode’s package resolution:

Package.swift
let package = Package(
name: "VaulType",
platforms: [.macOS(.v14)],
dependencies: [
.package(url: "https://github.com/sparkle-project/Sparkle", from: "2.6.0"),
],
targets: [
.target(
name: "VaulType",
dependencies: ["Sparkle", "WhisperKit", "LlamaKit"]
),
]
)
  1. Add the dependency to Package.swift or via Xcode > File > Add Package Dependencies
  2. Import the module in the relevant source files
  3. Document the dependency in TECH_STACK.md
  4. Verify license compatibility (see LEGAL_COMPLIANCE.md)

VaulType bridges to whisper.cpp and llama.cpp via C interop. Follow these conventions:

WhisperKit/include/whisper-bridging-header.h
#ifndef WhisperBridgingHeader_h
#define WhisperBridgingHeader_h
// Include the whisper.cpp public API
#include "whisper.h"
// Any additional C helper functions
// Keep these minimal — prefer Swift wrappers
int whisper_helper_get_segment_count(struct whisper_context *ctx);
#endif

Always wrap raw C API calls in a Swift class that manages memory:

WhisperKit/Sources/WhisperWrapper.swift
import Foundation
final class WhisperContext: @unchecked Sendable {
private let context: OpaquePointer
init(modelPath: String) throws {
var params = whisper_context_default_params()
params.use_gpu = true // Metal acceleration
guard let ctx = whisper_init_from_file_with_params(modelPath, params) else {
throw WhisperError.modelLoadFailed(path: modelPath)
}
self.context = ctx
}
deinit {
whisper_free(context)
}
func transcribe(audioData: [Float], language: String? = nil) async throws -> String {
// Always dispatch to a background queue — never block the main thread
return try await withCheckedThrowingContinuation { continuation in
DispatchQueue.global(qos: .userInitiated).async { [self] in
var params = whisper_full_default_params(WHISPER_SAMPLING_GREEDY)
params.n_threads = Int32(ProcessInfo.processInfo.activeProcessorCount)
params.language = language.map { ($0 as NSString).utf8String! }
let result = whisper_full(self.context, params, audioData, Int32(audioData.count))
if result == 0 {
let text = self.collectSegments()
continuation.resume(returning: text)
} else {
continuation.resume(throwing: WhisperError.transcriptionFailed(code: result))
}
}
}
}
private func collectSegments() -> String {
let segmentCount = whisper_full_n_segments(context)
var result = ""
for i in 0..<segmentCount {
if let text = whisper_full_get_segment_text(context, i) {
result += String(cString: text)
}
}
return result.trimmingCharacters(in: .whitespacesAndNewlines)
}
}
RuleDetails
Memory managementAlways pair _init with _free in init/deinit
ThreadingNever call C APIs on the main thread
Error handlingMap C error codes to Swift Error types
NamingSwift wrappers use Context suffix (e.g., WhisperContext, LlamaContext)
SendableMark as @unchecked Sendable if the C context is thread-safe
Bridging headersOne per C library, kept minimal

VaulType follows the Swift API Design Guidelines with these project-specific rules:

TypeConventionExample
SwiftUI ViewPascalCase + View suffixSettingsView.swift
ServicePascalCase + Service suffixAudioCaptureService.swift
SwiftData ModelPascalCase, nounDictationEntry.swift
ExtensionType+Feature.swiftLogger+Extensions.swift
ProtocolAdjective or -able/-ibleTranscribable.swift
TestTestedType + Tests suffixCommandParserTests.swift
// Types: PascalCase
struct DictationEntry { }
enum ProcessingMode { }
protocol AudioCapturing { }
// Properties and methods: camelCase
let currentMode: ProcessingMode
func startRecording() async throws
// Constants: camelCase (not UPPER_SNAKE)
let maxAudioBufferSize = 16_000 * 30 // 30 seconds at 16kHz
// Enum cases: camelCase
enum ProcessingMode: String, Codable {
case raw
case clean
case structure
case prompt
case code
case custom
}
// Boolean properties: use `is`, `has`, `should` prefix
var isRecording: Bool
var hasModelLoaded: Bool
var shouldAutoInject: Bool
// Use subsystem + category pattern
import os
extension Logger {
static let audio = Logger(subsystem: "com.vaultype.app", category: "audio")
static let whisper = Logger(subsystem: "com.vaultype.app", category: "whisper")
static let llm = Logger(subsystem: "com.vaultype.app", category: "llm")
static let injection = Logger(subsystem: "com.vaultype.app", category: "injection")
static let commands = Logger(subsystem: "com.vaultype.app", category: "commands")
static let ui = Logger(subsystem: "com.vaultype.app", category: "ui")
}

VaulType uses trunk-based development with short-lived feature branches.

main # Always deployable
feature/add-code-mode # New features
fix/clipboard-restore # Bug fixes
chore/update-whisper # Dependency updates, maintenance
docs/setup-guide # Documentation changes

Follow Conventional Commits:

feat: add Structure processing mode
fix: clipboard not restored after paste injection
perf: preload Whisper model on app launch
docs: update API documentation for LLMService
chore: bump whisper.cpp to v1.7.3
test: add CommandParser unit tests
refactor: extract PromptTemplateEngine from LLMService
  1. Create a feature branch from main
  2. Make changes with focused, atomic commits
  3. Run tests locally: xcodebuild test -scheme VaulType
  4. Push and create a pull request
  5. CI runs tests and linting
  6. Code review and approval
  7. Squash-merge to main
  8. Delete feature branch
Terminal window
# Tag a release
git tag -a v0.1.0 -m "MVP: Menu bar + whisper.cpp + text injection"
git push origin v0.1.0
# CI automatically builds, signs, notarizes, and creates a GitHub Release

Processing modes transform raw Whisper output through the LLM pipeline. Here’s how to add one:

Add a new case to the ProcessingMode enum:

VaulType/Models/ProcessingMode.swift
enum ProcessingMode: String, Codable, CaseIterable, Identifiable {
case raw
case clean
case structure
case prompt
case code
case custom
case email // <-- New mode
var id: String { rawValue }
var displayName: String {
switch self {
// ...existing cases...
case .email: return "Email"
}
}
var description: String {
switch self {
// ...existing cases...
case .email: return "Format dictation as a professional email"
}
}
var icon: String {
switch self {
// ...existing cases...
case .email: return "envelope"
}
}
}

Create a JSON template file:

VaulType/Resources/PromptTemplates/email.json
{
"name": "Email",
"mode": "email",
"systemPrompt": "You are a writing assistant that formats dictated speech into professional emails. Maintain the sender's intent and tone while improving clarity and structure.",
"userPromptTemplate": "Format the following dictated text as a professional email. Add appropriate greeting and sign-off if not present. Fix grammar and punctuation.\n\nDictated text: {text}\n\nApp context: {app_name}\nLanguage: {language}",
"isBuiltIn": true
}
VaulType/Services/LLM/LLMService.swift
func process(text: String, mode: ProcessingMode, context: AppContext) async throws -> String {
switch mode {
case .raw:
return text
// ...existing cases...
case .email:
let template = try loadTemplate(for: .email)
return try await runInference(text: text, template: template, context: context)
}
}
VaulTypeTests/Services/LLMServiceTests.swift
func testEmailModeFormatsAsEmail() async throws {
let service = LLMService(model: mockModel)
let result = try await service.process(
text: "hey john wanted to follow up on yesterdays meeting about the project timeline",
mode: .email,
context: .default
)
XCTAssertTrue(result.contains("Hi") || result.contains("Dear") || result.contains("Hello"))
}

The mode selector automatically picks up new CaseIterable cases. Verify it appears correctly in Settings > Modes tab.


VaulType/Services/Commands/CommandRegistry.swift
struct CommandDefinition {
let name: String
let patterns: [String] // Regex patterns to match
let handler: (CommandContext) async throws -> Void
}
extension CommandRegistry {
static func registerBuiltinCommands() {
// Existing commands...
register(CommandDefinition(
name: "screenshot",
patterns: [
"take a screenshot",
"screenshot",
"capture screen",
"take screen capture"
],
handler: { context in
try await ScreenshotCommand.execute(context: context)
}
))
}
}
VaulType/Services/Commands/Handlers/ScreenshotCommand.swift
enum ScreenshotCommand {
static func execute(context: CommandContext) async throws {
// Simulate Cmd+Shift+3 for full screenshot
let event = CGEvent(
keyboardEventSource: nil,
virtualKey: 0x14, // '3' key
keyDown: true
)
event?.flags = [.maskCommand, .maskShift]
event?.post(tap: .cghidEventTap)
// Key up
let eventUp = CGEvent(
keyboardEventSource: nil,
virtualKey: 0x14,
keyDown: false
)
eventUp?.post(tap: .cghidEventTap)
Logger.commands.info("Screenshot command executed")
}
}
VaulTypeTests/Services/CommandParserTests.swift
func testScreenshotCommandParsing() throws {
let parser = CommandParser()
let result = try parser.parse("take a screenshot")
XCTAssertEqual(result?.name, "screenshot")
}
func testScreenshotVariations() throws {
let parser = CommandParser()
let variations = ["screenshot", "take a screenshot", "capture screen"]
for phrase in variations {
let result = try parser.parse(phrase)
XCTAssertEqual(result?.name, "screenshot", "Failed to parse: \(phrase)")
}
}

To support a new model format beyond GGML (Whisper) and GGUF (LLM):

VaulType/Models/ModelInfo.swift
enum ModelFormat: String, Codable {
case ggml // Whisper models
case gguf // LLM models (llama.cpp)
case coreml // <-- New: Core ML models
}
VaulType/Services/Speech/CoreMLWhisperService.swift
protocol TranscriptionService {
func transcribe(audioData: [Float], language: String?) async throws -> String
}
final class CoreMLWhisperService: TranscriptionService {
private let model: MLModel
init(modelPath: String) throws {
let compiledURL = try MLModel.compileModel(at: URL(fileURLWithPath: modelPath))
self.model = try MLModel(contentsOf: compiledURL)
}
func transcribe(audioData: [Float], language: String?) async throws -> String {
// Core ML inference implementation
// ...
}
}
VaulType/Services/ModelManager.swift
func loadModel(info: ModelInfo) throws -> Any {
switch info.format {
case .ggml:
return try WhisperContext(modelPath: info.localPath)
case .gguf:
return try LlamaContext(modelPath: info.localPath)
case .coreml:
return try CoreMLWhisperService(modelPath: info.localPath)
}
}

╱ UI Tests ╲ ← Fewest: critical user flows
╱───────────────╲
╱ Integration Tests╲ ← Middle: service interactions
╱─────────────────────╲
╱ Unit Tests ╲ ← Most: pure logic, parsers, templates
╱───────────────────────────╲

Focus on pure logic that doesn’t require hardware or models:

// CommandParser, PromptTemplateEngine, text processing
func testCleanModeRemovesFillerWords() {
let engine = TextProcessor()
let result = engine.removeFillers("so um I think we should uh proceed")
XCTAssertEqual(result, "I think we should proceed")
}

Test whisper.cpp and llama.cpp Swift wrappers with small models:

// Requires a test model in the test bundle
func testWhisperTranscribesAudio() async throws {
let whisper = try WhisperContext(modelPath: testModelPath)
let audio = try loadTestAudio("hello_world.wav")
let result = try await whisper.transcribe(audioData: audio)
XCTAssertTrue(result.lowercased().contains("hello"))
}

Test SwiftUI settings and overlay with XCUITest:

func testSettingsWindowOpens() {
let app = XCUIApplication()
app.launch()
// Click menu bar icon, then Settings
app.menuBarItems["VaulType"].click()
app.menuItems["Settings..."].click()
XCTAssertTrue(app.windows["Settings"].waitForExistence(timeout: 3))
}

For testing the audio pipeline without a real microphone:

final class MockAudioCaptureService: AudioCapturing {
var mockAudioData: [Float] = []
func startCapture() async throws {
// Simulate audio callback with mock data
delegate?.audioCaptureService(self, didCaptureAudio: mockAudioData)
}
}

See TESTING.md for the complete testing guide.


  1. Product > Profile (⌘I) in Xcode
  2. Choose the relevant template:
TemplateUse For
Time ProfilerFinding CPU hotspots during inference
AllocationsTracking memory usage for model loading
LeaksDetecting memory leaks in C bridging code
Metal System TraceGPU utilization for whisper.cpp/llama.cpp
Energy LogBattery impact during dictation
┌────────────────────────┬───────────────┬───────────────┐
│ Metric │ Target │ Alert │
├────────────────────────┼───────────────┼───────────────┤
│ Idle memory │ <50 MB │ >100 MB │
│ Whisper model loaded │ <500 MB │ >1 GB │
│ Whisper + LLM loaded │ <2 GB │ >3 GB │
│ Transcription latency │ <2s (5s clip) │ >5s │
│ LLM processing │ <3s │ >8s │
│ Text injection │ <100ms │ >500ms │
│ Idle CPU │ ~0% │ >2% │
│ App launch time │ <1s │ >3s │
└────────────────────────┴───────────────┴───────────────┘
import os
let signpost = OSSignposter(subsystem: "com.vaultype.app", category: "whisper")
func transcribe(audio: [Float]) async throws -> String {
let state = signpost.beginInterval("transcription", id: signpost.makeSignpostID())
defer { signpost.endInterval("transcription", state) }
return try await whisperContext.transcribe(audioData: audio)
}

View in Instruments > os_signpost to see exact timing per transcription.


C bridging code is the most common source of memory leaks. Follow these practices:

  1. Run with the Leaks template in Instruments
  2. Perform several dictation cycles
  3. Check for leaked whisper_context or llama_context objects
// Always pair init/free in a class with deinit
final class WhisperContext {
private let ctx: OpaquePointer
init(path: String) throws {
guard let ctx = whisper_init_from_file(path) else {
throw WhisperError.loadFailed
}
self.ctx = ctx
}
deinit {
whisper_free(ctx) // ALWAYS free in deinit
}
}

SwiftUI closures and Combine publishers can create retain cycles:

// BAD: retain cycle in Combine sink
cancellable = audioService.audioLevelPublisher
.sink { self.updateLevel($0) } // Strong capture of self
// GOOD: weak capture
cancellable = audioService.audioLevelPublisher
.sink { [weak self] level in
self?.updateLevel(level)
}

Add to Xcode scheme > Run > Arguments > Environment Variables:

MallocStackLogging = 1
MallocScribble = 1
ASAN_OPTIONS = detect_leaks=1