# Pattern-Driven Knowledge Bases with AI Assistance

A methodology for building scalable, consistent knowledge bases that leverage both AI creativity and programmatic efficiency.

## The Core Insight

**Traditional approach:** Create files ad hoc, let chaos accumulate, manually organize later
**Pattern-driven approach:** Identify content patterns early, codify them as templates, enforce them programmatically

**Result:** Consistency without rigidity, automation without losing creative flexibility

## The Three-Layer System

### Layer 1: Content Patterns (Human Intelligence)

Analyze existing content to identify recurring structures:

- **Feature Guides:** What It Does → How to Use It → When to Use It → How It Works → Pro Tip
- **Conceptual Explainers:** Core Principle → How It Works → Examples → Misconceptions
- **Workflow Guides:** Prerequisites → Steps → Troubleshooting
- **Troubleshooting Guides:** Problem → Symptoms → Solution → Prevention

**Why patterns matter:**

- Readers know what to expect
- Writers have clear frameworks
- Content is scannable and predictable
- Quality remains consistent

### Layer 2: AI-Assisted Creation (AI Intelligence)

Use AI agents to:

1. **Detect** which pattern fits the content being created
2. **Apply** the appropriate template structure
3. **Propose** new patterns when existing ones don't fit
4. **Generate** frontmatter with proper timestamps and metadata

**What AI does well:**

- Pattern matching and recognition
- Content generation within structure
- Detecting anomalies ("this doesn't fit any pattern")
- Proposing improvements

**What AI shouldn't do:**

- Validate 500 files (use bash scripts)
- Enforce schema compliance (use grep/yq)
- Bulk refactoring (use sed/awk)

### Layer 3: Programmatic Enforcement (Machine Efficiency)

Use scripts and command-line tools to:

- Validate frontmatter schema
- Find files missing required fields
- Bulk update metadata
- Enforce section ordering
- Track validation status

**Key enabler:** Structured frontmatter

```yaml
---
created: 2025-11-08T14:57:13-0800
slug: cl3l2tmhjp5
template_type: feature-guide    # ← Enables filtering by pattern
schema_validated: 2025-11-08    # ← Tracks validation status
---
```

**Power move:** Use `template_type` to:

```bash
# Find all feature guides (recursive search; no globstar needed)
grep -rl --include='*.md' "template_type: feature-guide" .

# Add new section to all workflow guides
grep -rl --include='*.md' "template_type: workflow-guide" . |
while IFS= read -r file; do
  # Programmatic update goes here; read -r keeps paths with spaces intact
  echo "would update: $file"
done
```

## How the Layers Work Together

### Scenario: Creating a New Article

**Step 1 (Human):** "I need to document the Capture Recording feature in Logic"

**Step 2 (AI):**

- Recognizes this is a software feature
- Selects "Feature Guide" pattern
- Generates file with template structure
- Calculates frontmatter fields (timestamp, slug)
- Sets `template_type: feature-guide`

**Step 3 (Programmatic):**

- Validation script confirms all required fields present
- Git pre-commit hook checks schema compliance
- File appears in `grep -l "template_type: feature-guide"` queries

**Step 4 (Human):** Fills in content within the structure

### Scenario: Schema Evolution

**Trigger:** "All gear guides should now include a `manufacturer_url` field"

**Step 1 (Human):** Update template in `_templates/Gear Guide.md`

**Step 2 (Programmatic):**

```bash
# Find all gear guides (read -r keeps paths with spaces intact)
grep -rl --include='*.md' "template_type: gear-guide" . |
while IFS= read -r file; do
  # Add manufacturer_url after manufacturer field.
  # BSD/macOS sed syntax; with GNU sed, use `sed -i` without the empty ''.
  sed -i '' '/^manufacturer:/a\
manufacturer_url: ' "$file"
done
```

**Step 3 (AI):**

- New articles automatically include `manufacturer_url`
- AGENTS.md documents the field requirement

**Step 4 (Validation):**

```bash
# Verify all gear guides have the field (NUL-delimited so paths with spaces survive)
grep -rl --null --include='*.md' "template_type: gear-guide" . | xargs -0 grep -L "manufacturer_url"
# Should return empty
```

### Scenario: Discovering a New Pattern

**Trigger:** AI creates 3rd article with similar non-standard structure

**Step 1 (AI):** "This content doesn't fit existing patterns. Should we create Pattern 11: Comparison Matrix?"

**Step 2 (Human):** Reviews, confirms new pattern makes sense

**Step 3 (Human):** Documents pattern in pattern guide

**Step 4 (Human):** Creates template in `_templates/`

**Step 5 (Programmatic):** Retrospectively tags existing files:

```bash
# Find files with comparison tables (-F treats the table header as a literal string)
grep -rlF --include='*.md' "| Feature | Option A | Option B |" .
# Add template_type: comparison-matrix
```

## Key Design Principles

### 1. Patterns Emerge, Don't Prescribe

**Don't:** Create 20 theoretical patterns upfront
**Do:** Document patterns as they appear organically (3+ uses = document it)

**Why:** Premature abstraction creates unused patterns. Let content needs drive pattern creation.

### 2. Flexible Structure, Strict Metadata

**Flexible:** Content patterns are guidelines, not laws. Adapt when needed.

**Strict:** Frontmatter schema is enforced. Every file MUST have:

- `created`, `updated`, `slug`
- `template_type`, `schema_validated`

**Why:** Flexible content allows creativity. Strict metadata enables automation.

### 3. Right Tool for the Job

| Task | Tool | Why |
|------|------|-----|
| Content creation | AI | Creative, context-aware |
| Pattern detection | AI | Recognizes similarities |
| Schema validation | bash/grep | Fast, free, deterministic |
| Bulk updates | sed/awk/yq | Efficient, scriptable |
| Version control | git | Tracks changes over time |

**Anti-pattern:** Using AI to iterate through 500 files to check frontmatter

### 4. Documentation as Code

Pattern definitions live in markdown files:

- `AGENTS.md` - Instructions for AI agents
- `Article Structure Patterns.md` - Detailed pattern documentation
- `_templates/` - Executable templates

**Benefits:**

- Version controlled
- Searchable
- Self-documenting
- AI-readable

### 5. Validate On-Demand, Not Continuously

**Don't:** Track `schema_version` in every file
**Do:** Track `schema_validated` date, run validation when needed

**Why:** Git already tracks schema evolution. Avoid redundant versioning.

**When to validate:**

- After schema changes
- Before publishing
- On-demand via `validate-schema.sh`

## Implementation Guide

### Phase 1: Pattern Discovery (Weeks 1-2)

1. Analyze 20-30 existing articles
2. Identify 3-5 recurring structures
3. Document patterns with examples
4. Create templates for each pattern

**Deliverable:** Pattern guide with template files

### Phase 2: Schema Definition (Week 3)

1. Design frontmatter schema
2. Add `template_type` and `schema_validated` fields
3. Update templates with new schema
4. Document in AGENTS.md

**Deliverable:** Standardized frontmatter across templates

### Phase 3: AI Integration (Week 4)

1. Train AI agents on pattern selection
2. Configure AI to calculate timestamps/slugs
3. Set up pattern proposal workflow
4. Test with 10-20 new articles

**Deliverable:** AI can create files with correct patterns

### Phase 4: Programmatic Tooling (Week 5)

1. Write validation scripts
2. Create bulk update scripts
3. Set up git hooks
4. Document command-line workflows

**Deliverable:** `validate-schema.sh`, `fix-schema.sh`

### Phase 5: Iteration and Refinement (Ongoing)

1. Monitor for new pattern emergence
2. Refine existing patterns based on usage
3. Update tooling as needs evolve
4. Document learnings

## Real-World Benefits

### Consistency at Scale

**Before:** 500 articles with varying structures, hard to navigate
**After:** 500 articles following 8 clear patterns, predictable and scannable

### Efficient Bulk Operations

**Before:** "I need to add a field to all gear guides" → manually edit 50 files
**After:** A short bash script updates all 50 files in seconds

### AI Compute Savings

**Before:** AI validates every file on every run → expensive, slow
**After:** AI creates content, scripts validate → cheap, fast

### Pattern Evolution

**Before:** New content types = chaos
**After:** New patterns emerge, get documented, become reusable

### Collaborative Authoring

**Before:** Each contributor writes differently
**After:** Templates guide consistent structure, easier to review/edit

## Common Pitfalls

### Over-Patterning

**Symptom:** 30 hyper-specific patterns that are rarely used
**Fix:** Merge similar patterns, wait for 3+ uses before documenting

### Schema Bloat

**Symptom:** Frontmatter with 20+ fields, most unused
**Fix:** Only add fields that serve a clear purpose (filtering, tracking, automation)

### AI Over-Reliance

**Symptom:** Using AI for tasks bash does instantly
**Fix:** Profile your workflows - use the right tool for each job

### Rigid Enforcement

**Symptom:** Rejecting good content because it doesn't fit a pattern
**Fix:** Patterns are guidelines. When content truly needs different structure, create a new pattern or make a one-off

## Tools and Technologies

### Core Stack

- **Obsidian:** Knowledge base interface
- **Markdown:** Content format
- **Git:** Version control and change tracking
- **bash/grep/sed:** Validation and bulk operations
- **yq:** YAML parsing and manipulation
- **AI (OpenCode/Claude):** Content creation and pattern detection

### Optional Enhancements

- **Templater plugin:** Dynamic template syntax in Obsidian
- **Dataview plugin:** Query frontmatter metadata
- **Obsidian Publish:** Share content online
- **CI/CD:** Automated validation on git push

## Measuring Success

### Quantitative Metrics

- **Pattern coverage:** % of files with `template_type`
- **Schema compliance:** % of files passing validation
- **Time to create:** Average time from idea to published article
- **AI cost:** Token spend per article created

### Qualitative Metrics

- **Findability:** Can readers quickly locate relevant content?
- **Consistency:** Do similar topics have similar structures?
- **Maintainability:** How easy is it to update 100 articles?
- **Contributor experience:** Can new writers create conformant content?
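
The two file-based metrics in this section (pattern coverage and schema compliance) can be computed without AI. The following is a minimal sketch, not the `validate-schema.sh` mentioned earlier. It assumes the vault is a directory tree of `.md` files and that a file "passes validation" when each of the five required fields appears at the start of some line. The function names (`count_md`, `count_patterned`, `count_compliant`, `percent`, `report_metrics`) are hypothetical and chosen for illustration.

```shell
# Sketch: pattern coverage and schema compliance for a vault of .md files.
# Assumption: compliance means every required key starts some line of the file.
# This scans the whole file, not only the frontmatter block; a stricter
# check would parse the YAML (for example with yq).

REQUIRED_FIELDS="created updated slug template_type schema_validated"

# Count all markdown files under a directory.
count_md() {
  find "$1" -type f -name '*.md' | wc -l | tr -d ' '
}

# Count markdown files that declare a template_type.
count_patterned() {
  find "$1" -type f -name '*.md' -exec grep -l '^template_type:' {} + | wc -l | tr -d ' '
}

# Count markdown files that contain every required field.
count_compliant() {
  find "$1" -type f -name '*.md' | {
    n=0
    while IFS= read -r file; do
      ok=1
      for field in $REQUIRED_FIELDS; do
        grep -q "^${field}:" "$file" || { ok=0; break; }
      done
      n=$((n + ok))
    done
    echo "$n"
  }
}

# Integer percentage (rounded down), guarding against an empty vault.
percent() {
  if [ "$2" -eq 0 ]; then echo 0; else echo $(( 100 * $1 / $2 )); fi
}

# Print both metrics for the vault rooted at $1.
report_metrics() {
  total=$(count_md "$1")
  echo "Pattern coverage: $(percent "$(count_patterned "$1")" "$total")%"
  echo "Schema compliance: $(percent "$(count_compliant "$1")" "$total")%"
}
```

Calling `report_metrics path/to/vault` prints both percentages. Counting lines of `find` output assumes no file name contains a newline, which is a reasonable simplification for a notes vault.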
## Future Directions

### Advanced Pattern Detection

Train AI to:

- Analyze content and suggest pattern improvements
- Detect pattern drift (articles marked as X that look like Y)
- Propose pattern mergers when overlap is high

### Smart Validation

Scripts that:

- Auto-detect probable `template_type` from content structure
- Suggest fixes (not just report errors)
- Learn from corrections

### Pattern Analytics

Dashboard showing:

- Most-used patterns
- Pattern effectiveness (page views, time-on-page)
- Orphaned patterns (defined but unused)

### Cross-Vault Patterns

Reusable pattern libraries:

- Export patterns as shareable format
- Import patterns from community
- Pattern marketplace for common knowledge base types

## Related Concepts

### Similar Methodologies

- **Design Systems:** UI pattern libraries (applies to content)
- **Style Guides:** Writing consistency (we add structure)
- **Schema.org:** Structured data for SEO (similar metadata approach)
- **Docs-as-Code:** Documentation in version control (our foundation)

### Complementary Practices

- **Zettelkasten:** Note-taking method (can use our patterns)
- **PARA Method:** Organization system (orthogonal to our patterns)
- **Atomic Design:** Component-based thinking (similar philosophy)

## Conclusion

Pattern-driven knowledge bases combine:

- **Human insight** (pattern identification)
- **AI creativity** (content generation within patterns)
- **Machine efficiency** (validation and enforcement)

The result: A knowledge base that's consistent without being rigid, automated without losing quality, and scalable without chaos.

**Start small:** Document 3 patterns, create templates, validate with scripts
**Iterate:** Let new patterns emerge from actual content needs
**Scale:** Leverage `template_type` for programmatic operations as vault grows

## Related Articles

- `_Nakul/5. Coding Actions/midimaze/Article Structure Patterns.md` - Full pattern documentation
- [[Programmatic Schema Validation and Refactoring]] - Validation workflows
- [[Template Systems - Templater vs AI Agent Creation]] - Template processing
- [[Verifying File Changes with Terminal Inspection]] - Validation techniques