Prompt:
Build me a personal CRM system that automatically tracks everyone I interact with, with smart filtering so it only adds real people — not newsletters, bots, or cold outreach.
Data sources:
- Connect to my email (Gmail API or IMAP) and scan the last 60 days of messages.
- Connect to my calendar (Google Calendar API) and scan the last 60 days of events.
- Run this ingestion on a daily cron schedule.
Contact extraction from email:
- Extract sender/recipient email addresses and names from messages.
- Estimate the number of exchanges (back-and-forth threads, not just raw message count): Math.min(Math.floor(totalMessages / 2), threadCount).
- Collect sample subject lines and message snippets for classification.
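The exchange estimate above can be sketched as a small function. This is only an illustration: the `EmailStats` shape and its field names are assumptions, and how `totalMessages` and `threadCount` get counted depends on the mail API in use.

```typescript
// Hypothetical per-contact email statistics; field names are assumptions.
interface EmailStats {
  totalMessages: number; // messages sent to or received from this address
  threadCount: number;   // distinct threads that include this address
}

// Formula from the prompt: half the messages (one send plus one reply
// per exchange), capped at the number of threads.
function estimateExchanges(stats: EmailStats): number {
  return Math.min(Math.floor(stats.totalMessages / 2), stats.threadCount);
}

// A single unanswered email counts as zero exchanges.
const coldEmail = estimateExchanges({ totalMessages: 1, threadCount: 1 }); // 0

// Six messages spread over two threads count as two exchanges.
const conversation = estimateExchanges({ totalMessages: 6, threadCount: 2 }); // 2
```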
Contact extraction from calendar:
- Only include meetings with 1-10 attendees (skip large all-hands).
- Only include meetings at least 15 minutes long (skip quick check-ins that are really just reminders).
- Extract attendee names, emails, and the meeting title.
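As a rough sketch of the calendar rules above, assuming events have already been mapped into a simple `CalendarEvent` shape (a hypothetical type, not the Google Calendar API's own event format) and that the attendee list is counted as given:

```typescript
// Hypothetical event shape; a real calendar API response would need mapping into it.
interface CalendarEvent {
  title: string;
  start: Date;
  end: Date;
  attendees: { name: string; email: string }[];
}

interface AttendeeRow {
  name: string;
  email: string;
  meetingTitle: string;
}

// Thresholds taken from the rules above.
const MAX_ATTENDEES = 10;
const MIN_DURATION_MINUTES = 15;

// True when the meeting has 1-10 attendees and lasts at least 15 minutes.
function isQualifyingMeeting(event: CalendarEvent): boolean {
  const minutes = (event.end.getTime() - event.start.getTime()) / 60000;
  const count = event.attendees.length;
  return count >= 1 && count <= MAX_ATTENDEES && minutes >= MIN_DURATION_MINUTES;
}

// Flatten qualifying meetings into one row per attendee.
function extractAttendees(events: CalendarEvent[]): AttendeeRow[] {
  const rows: AttendeeRow[] = [];
  for (const event of events.filter(isQualifyingMeeting)) {
    for (const a of event.attendees) {
      rows.push({ name: a.name, email: a.email, meetingTitle: event.title });
    }
  }
  return rows;
}
```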
Filtering — this is critical. Most contacts from email are noise. Use a two-stage filter:
Stage 1 — Hard filters (always reject):
- My own email addresses and domains.
- Emails from family or personal contacts I've explicitly excluded (configurable list).
- Contacts already in the CRM or previously rejected.
- Generic role-based inboxes: info@, team@, partnerships@, collabs@, noreply@.
- Marketing/transactional senders: noreply@ addresses, plus sending domains that start with tx., cx., mail., or email.
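One possible shape for the Stage 1 hard filters is sketched below. The `HardFilterConfig` type, the `hardReject` name, and the choice to return a rejection reason (or `null` to pass) are all assumptions for illustration; the role inboxes and domain prefixes are the ones listed above.

```typescript
// Hypothetical configuration; list names are assumptions.
interface HardFilterConfig {
  ownAddresses: string[];      // my own email addresses
  ownDomains: string[];        // my own domains
  excludedAddresses: string[]; // family/personal contacts I excluded
  knownAddresses: string[];    // already in the CRM or previously rejected
}

const ROLE_LOCAL_PARTS = ["info", "team", "partnerships", "collabs", "noreply"];
const BULK_DOMAIN_PREFIXES = ["tx.", "cx.", "mail.", "email."];

// Returns a rejection reason, or null when the address passes Stage 1.
// All config lists are assumed to be stored in lowercase.
function hardReject(email: string, cfg: HardFilterConfig): string | null {
  const address = email.trim().toLowerCase();
  const at = address.lastIndexOf("@");
  if (at < 1 || at === address.length - 1) return "malformed address";
  const local = address.slice(0, at);
  const domain = address.slice(at + 1);

  if (cfg.ownAddresses.indexOf(address) !== -1 || cfg.ownDomains.indexOf(domain) !== -1) {
    return "own address or domain";
  }
  if (cfg.excludedAddresses.indexOf(address) !== -1) return "explicitly excluded";
  if (cfg.knownAddresses.indexOf(address) !== -1) return "already known or rejected";
  if (ROLE_LOCAL_PARTS.indexOf(local) !== -1) return "role-based inbox";
  if (BULK_DOMAIN_PREFIXES.some((p) => domain.indexOf(p) === 0)) {
    return "marketing/transactional domain";
  }
  return null;
}
```

Returning the reason rather than a bare boolean makes it easier to report rejections in the run summary.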
Stage 2 — AI classification (use a fast, cheap LLM like Gemini Flash or Haiku):
Send the candidate's name, email, exchange count, and sample subject lines/snippets to an LLM with these rules:
- REJECT clearly automated or notification-only senders.
- REJECT if all sample subjects look like newsletters, digests, or automated reports ("weekly roundup", "monthly update", "AI news").
- REJECT cold outreach with low engagement — if exchanges are low relative to total emails, it's one-way pitching.
- REJECT if snippets show repetitive promotional content (product launches, feature announcements, affiliate reports).
- APPROVE only if it looks like a real person with genuine two-way interaction or a meaningful business relationship.
- Higher confidence for real back-and-forth conversations with varied, substantive topics.
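The Stage 2 call could be wrapped in two small helpers: one that builds the prompt from a candidate and one that parses the model's reply. This sketch deliberately leaves out the actual LLM request, since that depends on the provider. The `Candidate` and `Verdict` types, the JSON reply format, and the choice to treat any unparseable reply as a rejection are assumptions, not part of the spec above.

```typescript
// Hypothetical candidate shape; field names are assumptions.
interface Candidate {
  name: string;
  email: string;
  exchanges: number;
  totalEmails: number;
  subjects: string[];
  snippets: string[];
}

interface Verdict {
  decision: "APPROVE" | "REJECT";
  confidence: number; // clamped to the range 0..1
  reason: string;
}

// Build the classification prompt from the rules listed above.
function buildClassificationPrompt(c: Candidate): string {
  return [
    "Decide whether this contact belongs in a personal CRM.",
    "- REJECT clearly automated or notification-only senders.",
    "- REJECT if all sample subjects look like newsletters, digests, or automated reports.",
    "- REJECT cold outreach with low engagement (few exchanges relative to total emails).",
    "- REJECT if snippets show repetitive promotional content.",
    "- APPROVE only for a real person with genuine two-way interaction or a meaningful business relationship.",
    "- Give higher confidence to real back-and-forth conversations with varied, substantive topics.",
    'Reply with JSON only: {"decision": "APPROVE" or "REJECT", "confidence": 0 to 1, "reason": "..."}',
    "",
    `Name: ${c.name}`,
    `Email: ${c.email}`,
    `Exchanges: ${c.exchanges} (of ${c.totalEmails} total emails)`,
    `Subjects: ${c.subjects.join(" | ")}`,
    `Snippets: ${c.snippets.join(" | ")}`,
  ].join("\n");
}

// Parse the model reply. Anything malformed is treated as a rejection
// (an assumed fail-closed policy), so bad output never adds a contact.
function parseVerdict(raw: string): Verdict {
  const rejected: Verdict = { decision: "REJECT", confidence: 0, reason: "unparseable model response" };
  try {
    const data = JSON.parse(raw);
    if (data === null || typeof data !== "object") return rejected;
    if (data.decision !== "APPROVE" && data.decision !== "REJECT") return rejected;
    const confidence =
      typeof data.confidence === "number" ? Math.max(0, Math.min(1, data.confidence)) : 0;
    const reason = typeof data.reason === "string" ? data.reason : "";
    return { decision: data.decision, confidence, reason };
  } catch {
    return rejected;
  }
}
```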
Contact scoring (used for ranking, not filtering):
- Base score: 50
- +5 per email exchange (max +20)
- +3 per meeting (max +15)
- +15 if their title matches preferred titles (CEO, Founder, VP, Head of, Engineer, Partner, etc.)
- +10 if they appeared in small meetings (≤3 attendees)
- +10 if last interaction was within 7 days, +5 if within 30 days
- +25 bonus if the person appears in both email AND calendar (stronger signal)
- +10 if they have a recognizable role, +5 if they have a company
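The scoring rules above translate fairly directly into code. In this sketch the `ScoreInput` shape is an assumption, and title matching is done as a case-insensitive substring check, which is one possible reading of "matches preferred titles".

```typescript
// Hypothetical input shape; field names are assumptions.
interface ScoreInput {
  emailExchanges: number;
  meetings: number;
  title: string | null;
  company: string | null;
  hasRecognizableRole: boolean;
  inSmallMeeting: boolean;          // appeared in a meeting with <= 3 attendees
  daysSinceLastInteraction: number;
  inEmail: boolean;
  inCalendar: boolean;
}

const PREFERRED_TITLES = ["CEO", "Founder", "VP", "Head of", "Engineer", "Partner"];

function scoreContact(c: ScoreInput, preferredTitles: string[] = PREFERRED_TITLES): number {
  let score = 50;                                    // base score
  score += Math.min(c.emailExchanges * 5, 20);       // +5 per exchange, max +20
  score += Math.min(c.meetings * 3, 15);             // +3 per meeting, max +15

  const title = c.title ? c.title.toLowerCase() : "";
  if (title !== "" && preferredTitles.some((t) => title.indexOf(t.toLowerCase()) !== -1)) {
    score += 15;                                     // preferred title
  }
  if (c.inSmallMeeting) score += 10;                 // small meeting
  if (c.daysSinceLastInteraction <= 7) score += 10;  // recent
  else if (c.daysSinceLastInteraction <= 30) score += 5;
  if (c.inEmail && c.inCalendar) score += 25;        // seen in both sources
  if (c.hasRecognizableRole) score += 10;
  if (c.company) score += 5;
  return score;
}

// Worked example: 3 exchanges (+15), 2 meetings (+6), preferred title (+15),
// small meeting (+10), 5 days ago (+10), both sources (+25), role (+10),
// company (+5), on top of the base 50, gives 146.
const example = scoreContact({
  emailExchanges: 3,
  meetings: 2,
  title: "VP of Engineering",
  company: "Acme",
  hasRecognizableRole: true,
  inSmallMeeting: true,
  daysSinceLastInteraction: 5,
  inEmail: true,
  inCalendar: true,
});
```

With every bonus at its cap the score tops out at 160, so the value is only meaningful for ranking contacts against each other.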
For each approved contact, store:
- Name, email(s), company, role/context
- Interaction timeline with dates
- Last-touch timestamp (auto-updated)
- Contact score
- Tags or categories
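A stored contact might look something like the record below, with a helper that keeps the last-touch timestamp in sync whenever an interaction is added. The type and field names are assumptions, and dates are assumed to be ISO 8601 UTC strings in one consistent format so that plain string comparison orders them correctly.

```typescript
// Hypothetical record shapes; names are assumptions.
interface Interaction {
  date: string;              // ISO 8601 UTC, e.g. "2024-03-10T09:00:00Z"
  kind: "email" | "meeting";
  summary: string;
}

interface ContactRecord {
  name: string;
  emails: string[];
  company: string | null;
  role: string | null;
  timeline: Interaction[];   // kept sorted oldest to newest
  lastTouch: string | null;  // date of the newest interaction
  score: number;
  tags: string[];
}

// Return a new record with the interaction added and lastTouch refreshed.
// The input record is not mutated.
function recordInteraction(contact: ContactRecord, interaction: Interaction): ContactRecord {
  const timeline = contact.timeline
    .concat([interaction])
    .sort((a, b) => (a.date < b.date ? -1 : a.date > b.date ? 1 : 0));
  const lastTouch = timeline[timeline.length - 1].date;
  return { ...contact, timeline, lastTouch };
}
```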
Learning system:
- Maintain a learning.json config with:
  - skip_domains: domains to always reject (populated over time from rejections)
  - prefer_titles: titles that boost contact scores
  - skip_keywords: subject-line keywords that indicate spam
  - min_exchanges: minimum exchange threshold (default 1)
  - max_days_between: max days since last interaction (default 60)
  - max_attendees: meeting size cap (default 10)
  - min_duration_minutes: meeting length minimum (default 15)
- When I reject a contact, learn from it — add their domain to skip_domains if appropriate.
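For illustration, the learning.json contents and the rejection-learning step might be modeled as below. The key names and defaults come from the list above; the `SHARED_PROVIDERS` list is an assumed interpretation of "if appropriate", on the reasoning that blocking a shared mailbox provider would reject every real person who uses it.

```typescript
// Mirrors the learning.json keys described above.
interface LearningConfig {
  skip_domains: string[];
  prefer_titles: string[];
  skip_keywords: string[];
  min_exchanges: number;
  max_days_between: number;
  max_attendees: number;
  min_duration_minutes: number;
}

const DEFAULT_LEARNING: LearningConfig = {
  skip_domains: [],
  prefer_titles: ["CEO", "Founder", "VP", "Head of", "Engineer", "Partner"],
  skip_keywords: [],
  min_exchanges: 1,
  max_days_between: 60,
  max_attendees: 10,
  min_duration_minutes: 15,
};

// Assumption: domains shared by many unrelated people are never auto-blocked.
const SHARED_PROVIDERS = ["gmail.com", "outlook.com", "yahoo.com", "icloud.com"];

// Return an updated config after a manual rejection; the input is not mutated.
function learnFromRejection(cfg: LearningConfig, rejectedEmail: string): LearningConfig {
  const address = rejectedEmail.trim().toLowerCase();
  const at = address.lastIndexOf("@");
  if (at < 1 || at === address.length - 1) return cfg;      // malformed address
  const domain = address.slice(at + 1);
  if (SHARED_PROVIDERS.indexOf(domain) !== -1) return cfg;  // too broad to block
  if (cfg.skip_domains.indexOf(domain) !== -1) return cfg;  // already learned
  return { ...cfg, skip_domains: cfg.skip_domains.concat([domain]) };
}
```

The resulting object can be written back to learning.json with `JSON.stringify(cfg, null, 2)`.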
Deduplication:
- When a new contact is found, check by email, then by name+company combination.
- Merge records rather than creating duplicates.
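The lookup order above (email first, then name plus company) could be sketched like this. The `Contact` shape and the merge policy, which keeps the existing name and unions the email lists, are assumptions; a real merge would also combine timelines and tags.

```typescript
// Hypothetical minimal contact shape for matching purposes.
interface Contact {
  name: string;
  emails: string[];
  company: string | null;
}

function normalize(s: string): string {
  return s.trim().toLowerCase();
}

// Index of the matching contact, or -1. Email matches take priority;
// name+company is only tried when both sides have a company.
function findExisting(contacts: Contact[], incoming: Contact): number {
  for (let i = 0; i < contacts.length; i++) {
    const shared = contacts[i].emails.some((e) =>
      incoming.emails.some((n) => normalize(n) === normalize(e))
    );
    if (shared) return i;
  }
  if (incoming.company) {
    for (let i = 0; i < contacts.length; i++) {
      const c = contacts[i];
      if (
        c.company &&
        normalize(c.name) === normalize(incoming.name) &&
        normalize(c.company) === normalize(incoming.company)
      ) {
        return i;
      }
    }
  }
  return -1;
}

// Insert a new contact or merge into the existing one; returns a new array.
function upsert(contacts: Contact[], incoming: Contact): Contact[] {
  const idx = findExisting(contacts, incoming);
  if (idx === -1) return contacts.concat([incoming]);
  const existing = contacts[idx];
  const emails = existing.emails.slice();
  for (const e of incoming.emails) {
    if (!emails.some((x) => normalize(x) === normalize(e))) emails.push(e);
  }
  const merged: Contact = {
    name: existing.name,
    emails,
    company: existing.company || incoming.company,
  };
  return contacts.map((c, i) => (i === idx ? merged : c));
}
```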
Semantic retrieval:
- Generate embeddings for each contact record.
- Let me ask natural-language questions like:
  - "Who did I meet from [company] last month?"
  - "When did I last talk to [name]?"
  - "Show contacts I haven't spoken to in 30+ days."
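Once embeddings exist, retrieval can be as simple as ranking contacts by cosine similarity to the embedded question. The sketch below assumes the vectors have already been produced by whatever embedding model is chosen; the `EmbeddedContact` type and the function names are hypothetical.

```typescript
// Hypothetical shape: a contact plus its precomputed embedding vector.
interface EmbeddedContact {
  name: string;
  embedding: number[];
}

// Cosine similarity of two equal-length vectors; 0 if either is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k contacts whose embeddings are closest to the query embedding.
function topMatches(query: number[], contacts: EmbeddedContact[], k: number): EmbeddedContact[] {
  return contacts
    .map((contact) => ({ contact, score: cosineSimilarity(query, contact.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.contact);
}
```

Questions with a hard date condition, such as the 30+ days example, are probably better served by filtering on the stored last-touch timestamp, with embeddings used for the fuzzier parts of the query.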
Storage: SQLite with WAL mode and foreign keys enabled.
Notifications: After each ingestion run, send a summary of new contacts, merges, rejections, and any issues.
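The run summary could be assembled as plain text before being handed to whatever notification channel is used. The `RunSummary` shape and the output layout here are assumptions for illustration.

```typescript
// Hypothetical result of one ingestion run.
interface RunSummary {
  newContacts: string[];
  merges: string[];
  rejections: string[];
  issues: string[];
}

// One line per category, with a count and the affected names.
function formatSummary(s: RunSummary): string {
  const section = (label: string, items: string[]): string =>
    items.length === 0 ? `${label}: none` : `${label} (${items.length}): ${items.join(", ")}`;
  return [
    section("New contacts", s.newContacts),
    section("Merges", s.merges),
    section("Rejections", s.rejections),
    section("Issues", s.issues),
  ].join("\n");
}
```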