How Events Graph turns raw news into ranked, personalised intelligence
Events Graph continuously scrapes news sources and passes each article through
GPT-4o-mini to produce a structured event record: title, summary,
entities, importance score, sentiment, and category. Each event is then converted
into a 1,536-dimension vector embedding using OpenAI's
text-embedding-3-small model, capturing the semantic meaning of the event
rather than its literal words. At query time, the user's search string is embedded
using the same model and compared against all stored event vectors via
cosine similarity. When a client profile is present, each event is also scored
against that client's stored interest vector, and the two scores are blended to
personalise the ranking, surfacing events that match both the topic and the
client's known interests.
The result is a ranked list of events sorted by relevance, not recency.
Every news article follows this two-stage pipeline before being queryable:
            News Article
                  │
                  ▼
┌─────────────────────────────────────┐
│  GPT-4o-mini Structuring            │
│  title, summary, entities,          │
│  importance (1-5), sentiment,       │
│  category, source_urls              │
└─────────────────┬───────────────────┘
                  │
                  ▼
┌─────────────────────────────────────┐
│  Embedding Generation               │
│  text-embedding-3-small             │
│  "title. summary. entities..."      │
│  → 1,536-dimension float32 vector   │
└─────────────────┬───────────────────┘
                  │
                  ▼
            Stored in DB
          event_embeddings
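The embedding stage of the pipeline above can be sketched as follows. This is a minimal illustration, not the service's actual code: the `Event` class, `embedding_text`, and `embed_event` are hypothetical names, joining entities with ", " is an assumption, and the embedding call is passed in as `embed_fn` so the sketch runs without network access.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence

EMBEDDING_DIM = 1536  # vector size stated in the text for text-embedding-3-small


@dataclass
class Event:
    # Hypothetical container; field names mirror the structured record above.
    title: str
    summary: str
    entities: List[str] = field(default_factory=list)


def embedding_text(event: Event) -> str:
    """Build the "title. summary. entities..." string shown in the diagram.

    Joining the entities with ", " is an assumption made for this sketch.
    """
    parts = [event.title.strip(), event.summary.strip(), ", ".join(event.entities)]
    # Drop empty parts and trailing full stops so the ". " separator is not doubled.
    return ". ".join(p.rstrip(".") for p in parts if p)


def embed_event(event: Event, embed_fn: Callable[[str], Sequence[float]]) -> List[float]:
    """Embed one event with a caller-supplied function and check the vector size."""
    vector = list(embed_fn(embedding_text(event)))
    if len(vector) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM} dimensions, got {len(vector)}")
    return vector
```

In a real deployment `embed_fn` would presumably wrap a call to the embeddings API with the text-embedding-3-small model; a stub that returns a fixed-length list is enough to exercise the sketch.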
When a search request arrives, the query string undergoes the same embedding process and is compared against all pre-loaded event vectors in memory:
User Query: "China export controls rare earth"
                  │
                  ▼
┌─────────────────────────────────────┐
│  Query Embedding                    │
│  text-embedding-3-small             │
│  → 1,536-dimension query vector     │
└─────────────────┬───────────────────┘
                  │
                  ▼
┌─────────────────────────────────────┐
│  Cosine Similarity                  │
│  score = q⃗ · e⃗ / (|q⃗| × |e⃗|)        │
│                                     │
│  Computed against all ~1,000+       │
│  pre-loaded event vectors           │
│  (numpy matrix multiply, ~1ms)      │
└─────────────────┬───────────────────┘
                  │
                  ▼
Top-K events by similarity
→ Apply filters (category, days, importance)
→ Return ranked results
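The similarity step can be illustrated with a short numpy sketch. It assumes the event vectors are held as rows of a single in-memory matrix and that no vector is all zeros; the function name `top_k_by_cosine` is hypothetical, and the category, days, and importance filters are left out.

```python
import numpy as np


def top_k_by_cosine(query_vec, event_matrix, k=10):
    """Return (indices, scores) of the k events most similar to the query.

    event_matrix holds one event vector per row; query_vec is a 1-D vector.
    """
    events = np.asarray(event_matrix, dtype=np.float32)
    query = np.asarray(query_vec, dtype=np.float32)

    # Normalise to unit length so a plain dot product equals cosine similarity.
    events_unit = events / np.linalg.norm(events, axis=1, keepdims=True)
    query_unit = query / np.linalg.norm(query)

    scores = events_unit @ query_unit  # one matrix-vector multiply for all events
    order = np.argsort(-scores)[:k]    # indices sorted by descending score
    return order, scores[order]


# Tiny 2-D example: the first event points the same way as the query.
indices, scores = top_k_by_cosine([1.0, 0.0], [[2.0, 0.0], [1.0, 1.0], [0.0, 3.0]], k=2)
```

If the stored vectors are normalised once at load time, each query only costs the final multiply and sort, which is consistent with the roughly 1 ms figure quoted in the diagram for about a thousand events.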
A profile is a stored interest vector. It is created once from a
plain-text description of a client's interests using the same
text-embedding-3-small model. The resulting vector captures what that
client cares about semantically. Once stored, it is applied to every query that
includes that client's client_id — with no additional API calls at runtime.
zariff-daily              arthur-straits-signal     tom-ree
"Malaysia business        "MY+SG editorial          "REE mining processing
 macro deals AI..."        newsletter deals..."      Malaysia Lynas NdPr..."
      │                         │                         │
      ▼                         ▼                         ▼
  [vector₁]                 [vector₂]                 [vector₃]
 stored in DB              stored in DB              stored in DB
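The "embed once, reuse on every query" behaviour can be sketched with a small in-memory store. `ProfileStore` and its methods are hypothetical names used only for illustration, and a dictionary stands in for the database; the embedding call is injected as `embed_fn`.

```python
from typing import Callable, Dict, List, Sequence


class ProfileStore:
    """In-memory stand-in for the stored profile vectors (hypothetical)."""

    def __init__(self, embed_fn: Callable[[str], Sequence[float]]):
        self._embed_fn = embed_fn
        self._vectors: Dict[str, List[float]] = {}

    def create(self, client_id: str, description: str) -> None:
        # The only place a profile triggers an embedding call.
        self._vectors[client_id] = list(self._embed_fn(description))

    def get(self, client_id: str) -> List[float]:
        # Query-time lookup: a plain read, with no embedding call.
        return self._vectors[client_id]
```

Calling `create` again with a new description would overwrite the stored vector, which matches the idea that re-embedding is only needed when a client's interests change.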
When a client_id is passed in a query, two similarity scores are computed for every candidate event and combined into a single final score:
QUERY VECTOR
     │
     │ cosine similarity
     ▼
            ┌─────────────────────┐
Event A ──▶ │ query_score = 0.92  │
Event B ──▶ │ query_score = 0.87  │
Event C ──▶ │ query_score = 0.61  │
            └─────────────────────┘

PROFILE VECTOR
     │
     │ cosine similarity
     ▼
            ┌─────────────────────┐
Event A ──▶ │ profile_score = 0.45│
Event B ──▶ │ profile_score = 0.91│
Event C ──▶ │ profile_score = 0.88│
            └─────────────────────┘

BLEND
            ┌─────────────────────────────────────┐
Event A ──▶ │ 0.7 × 0.92 + 0.3 × 0.45 = 0.779     │ → Rank #2
Event B ──▶ │ 0.7 × 0.87 + 0.3 × 0.91 = 0.882     │ → Rank #1
Event C ──▶ │ 0.7 × 0.61 + 0.3 × 0.88 = 0.691     │ → Rank #3
            └─────────────────────────────────────┘
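The blend in the worked example can be reproduced in a few lines. This sketch assumes the query weight is always `1 - profile_weight`, which matches the 0.7 / 0.3 split shown but is not confirmed elsewhere in this document; `blend_scores` is a hypothetical name.

```python
def blend_scores(query_scores, profile_scores, profile_weight=0.3):
    """Combine per-event query and profile scores and rank by the result.

    Assumes final = (1 - profile_weight) * query_score
                    + profile_weight * profile_score.
    """
    if not 0.0 <= profile_weight <= 1.0:
        raise ValueError("profile_weight must be between 0 and 1")
    blended = {
        event_id: (1.0 - profile_weight) * q + profile_weight * profile_scores[event_id]
        for event_id, q in query_scores.items()
    }
    # Highest blended score first.
    return sorted(blended.items(), key=lambda item: item[1], reverse=True)


ranked = blend_scores(
    {"A": 0.92, "B": 0.87, "C": 0.61},
    {"A": 0.45, "B": 0.91, "C": 0.88},
)
for event_id, score in ranked:
    print(event_id, round(score, 3))
# B 0.882
# A 0.779
# C 0.691
```

Event A has the best query match but falls to second place because Event B is nearly as relevant to the query and far closer to the client's profile.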
Events Graph uses cosine similarity as its core relevance metric. It measures the angle between two vectors in high-dimensional space, independent of their magnitude.
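To make the magnitude independence concrete, here is a small pure-Python sketch of the formula score = q · e / (|q| × |e|). The vectors are toy three-dimensional values chosen for illustration, not real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


a = [1.0, 2.0, 3.0]
scaled = [10.0 * x for x in a]   # same direction, ten times the length
orthogonal = [3.0, 0.0, -1.0]    # dot product with a is 3 + 0 - 3 = 0

same_direction = cosine_similarity(a, scaled)    # approximately 1.0
unrelated = cosine_similarity(a, orthogonal)     # 0.0
```

Scaling a vector changes its length but not its direction, so the score stays at 1; only a change in direction moves the score.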
Base URL: https://events.straitssignal.com
{
  "query": "China export controls rare earth",
  "top_k": 10,
  "days": 30,
  "min_importance": 3
}

{
  "query": "China export controls rare earth",
  "client_id": "tom-ree",
  "top_k": 10,
  "days": 30,
  "min_importance": 3,
  "profile_weight": 0.3
}
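The two request bodies above differ only in the `client_id` and `profile_weight` fields, so a client could build both from one helper. This is a sketch: `build_search_body` is a hypothetical name, the default values simply copy the examples rather than documented server defaults, and the 0 to 1 range check on `profile_weight` is an assumption. The endpoint path is not shown in this section, so the sketch stops at producing the JSON body.

```python
import json
from typing import Optional


def build_search_body(
    query: str,
    client_id: Optional[str] = None,
    top_k: int = 10,
    days: int = 30,
    min_importance: int = 3,
    profile_weight: float = 0.3,
) -> str:
    """Return the JSON body for a search, personalised when client_id is given."""
    body = {
        "query": query,
        "top_k": top_k,
        "days": days,
        "min_importance": min_importance,
    }
    if client_id is not None:
        # Assumed valid range; the document only shows the value 0.3.
        if not 0.0 <= profile_weight <= 1.0:
            raise ValueError("profile_weight must be between 0 and 1")
        body["client_id"] = client_id
        body["profile_weight"] = profile_weight
    return json.dumps(body)
```

Omitting `client_id` yields the plain search body; passing it adds the two personalisation fields.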
→ Returns a list of all entity names extracted from events, sorted by frequency of appearance.
GET /events/entity/Lynas%20Corporation
→ Returns all events where this entity appears, sorted by importance then recency.
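Entity names containing spaces or other reserved characters need to be percent-encoded in the path, as in `Lynas%20Corporation`. A minimal sketch using the standard library (the helper name `entity_events_url` is hypothetical; the base URL and path come from this document):

```python
from urllib.parse import quote

BASE_URL = "https://events.straitssignal.com"


def entity_events_url(entity_name: str) -> str:
    """Build the entity lookup URL, percent-encoding the entity name."""
    # safe="" also encodes "/" so a name containing a slash stays one path segment.
    return f"{BASE_URL}/events/entity/{quote(entity_name, safe='')}"


print(entity_events_url("Lynas Corporation"))
# https://events.straitssignal.com/events/entity/Lynas%20Corporation
```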
{
  "client_id": "tom-ree",
  "description": "REE mining and processing, Malaysia, Lynas Corporation, NdPr pricing, rare earth supply chains, China export controls"
}
→ Embeds the description and stores the resulting vector.
Called once per client. No re-embedding needed unless
interests change.
→ Returns a structured briefing for AI agents: available categories, entity index, recent high-importance events, and query usage examples. Use as context injection before starting a research session.