Events Graph — Technical Reference

Semantic Ranking Methodology

How Events Graph turns raw news into ranked, personalised intelligence

01 / Overview

How it works

Events Graph continuously scrapes news sources and passes each article through GPT-4o-mini to produce a structured event record: title, summary, entities, importance score, sentiment, and category. Each event is then converted into a 1,536-dimension vector embedding using OpenAI's text-embedding-3-small model, capturing the semantic meaning of the event rather than its literal words. At query time, the user's search string is embedded using the same model and compared against all stored event vectors via cosine similarity. When a client profile is present, a stored interest vector for that client is blended with the query vector to personalise the ranking — surfacing events that match both the topic and the client's known interests. The result is a ranked list of events sorted by relevance, not recency.

02 / Ingestion

Event Ingestion Pipeline

Every news article follows this two-stage pipeline before being queryable:

News Article
     │
     ▼
┌─────────────────────────────────────┐
│  GPT-4o-mini Structuring            │
│  title, summary, entities,          │
│  importance (1-5), sentiment,       │
│  category, source_urls              │
└─────────────────┬───────────────────┘
                  │
                  ▼
┌─────────────────────────────────────┐
│  Embedding Generation               │
│  text-embedding-3-small             │
│  "title. summary. entities..."      │
│  → 1,536-dimension float32 vector   │
└─────────────────┬───────────────────┘
                  │
                  ▼
          Stored in DB
          event_embeddings
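
The embedding stage can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the production code: the event field names, the helper names build_embedding_text and embed_event, and the exact join format are hypothetical (the diagram shows only "title. summary. entities..."). The live call uses the OpenAI Python SDK's embeddings.create method and needs an API key in the environment; a substitute embed function can be passed in for offline use.

```python
from typing import Callable, Sequence


def build_embedding_text(title: str, summary: str, entities: Sequence[str]) -> str:
    """Join the structured fields into the single string that gets embedded.

    Assumed format: "title. summary. entity1, entity2".
    """
    parts = [title.strip().rstrip("."), summary.strip().rstrip(".")]
    if entities:
        parts.append(", ".join(entities))
    return ". ".join(p for p in parts if p)


def openai_embed(text: str) -> list[float]:
    """Embed text with text-embedding-3-small (needs the openai package and an API key)."""
    from openai import OpenAI  # imported lazily so the helpers above run without the SDK

    client = OpenAI()
    response = client.embeddings.create(model="text-embedding-3-small", input=text)
    return response.data[0].embedding


def embed_event(event: dict, embed: Callable[[str], list[float]] = openai_embed) -> list[float]:
    """Build the embedding text for one event record and embed it."""
    text = build_embedding_text(event["title"], event["summary"], event.get("entities", []))
    return embed(text)
```

Passing embed as a parameter keeps the text-building step testable without network access.
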

03 / Query

Query Pipeline

When a search request arrives, the query string undergoes the same embedding process and is compared against all pre-loaded event vectors in memory:

User Query: "China export controls rare earth"
     │
     ▼
┌─────────────────────────────────────┐
│  Query Embedding                    │
│  text-embedding-3-small             │
│  → 1,536-dimension query vector     │
└─────────────────┬───────────────────┘
                  │
                  ▼
┌─────────────────────────────────────┐
│  Cosine Similarity                  │
│  score = (q · e) / (‖q‖ × ‖e‖)      │
│                                     │
│  Computed against all ~1,000+       │
│  pre-loaded event vectors           │
│  (numpy matrix multiply, ~1ms)      │
└─────────────────┬───────────────────┘
                  │
                  ▼
          Top-K events by similarity
          → Apply filters (category, days, importance)
          → Return ranked results
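
The similarity step can be sketched with numpy as below. This is an illustrative sketch only: the function name top_k_by_cosine and the toy 3-dimensional vectors are assumptions standing in for the real 1,536-dimension embeddings, and the category, days, and importance filters are left out.

```python
import numpy as np


def top_k_by_cosine(query_vec: np.ndarray, event_matrix: np.ndarray, k: int) -> list[tuple[int, float]]:
    """Return (row_index, similarity) pairs for the k most similar events.

    event_matrix has shape (n_events, dim); query_vec has shape (dim,).
    """
    # Normalise both sides so that a plain dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    e = event_matrix / np.linalg.norm(event_matrix, axis=1, keepdims=True)
    scores = e @ q  # one matrix-vector product scores every event at once
    order = np.argsort(-scores)[:k]  # highest similarity first
    return [(int(i), float(scores[i])) for i in order]


# Toy vectors: three events in a 3-dimensional space.
events = np.array([[1.0, 0.0, 0.0], [0.6, 0.8, 0.0], [0.0, 0.0, 1.0]], dtype=np.float32)
query = np.array([1.0, 1.0, 0.0], dtype=np.float32)
ranked = top_k_by_cosine(query, events, k=2)  # event 1 first, then event 0
```

If the stored vectors are normalised once at load time, only the query needs normalising per request.
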

04 / Profiles

User Profiles

A profile is a stored interest vector. It is created once from a plain-text description of a client's interests using the same text-embedding-3-small model. The resulting vector captures what that client cares about semantically. Once stored, it is applied to every query that includes that client's client_id — with no additional API calls at runtime.

  zariff-daily           arthur-straits-signal       tom-ree
"Malaysia business     "MY+SG editorial             "REE mining processing
 macro deals AI..."     newsletter deals..."         Malaysia Lynas NdPr..."
       │                      │                           │
       ▼                      ▼                           ▼
  [vector₁]             [vector₂]                   [vector₃]
  stored in DB          stored in DB                stored in DB
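
The create-once, read-many behaviour described above might look like the sketch below. The ProfileStore class, its method names, and the in-memory dict are hypothetical stand-ins for the real database table; fake_embed replaces the embedding model so the example runs offline.

```python
from typing import Callable, Optional


class ProfileStore:
    """In-memory stand-in for the database table that holds profile vectors."""

    def __init__(self, embed: Callable[[str], list[float]]):
        self._embed = embed
        self._vectors: dict[str, list[float]] = {}

    def create(self, client_id: str, description: str) -> None:
        # The only step that calls the embedding model.
        self._vectors[client_id] = self._embed(description)

    def get(self, client_id: str) -> Optional[list[float]]:
        # Query-time lookup: a plain read, no embedding call.
        return self._vectors.get(client_id)


calls = []


def fake_embed(text: str) -> list[float]:
    """Offline substitute for the embedding model; records each call."""
    calls.append(text)
    return [float(len(text)), 1.0]


store = ProfileStore(fake_embed)
store.create("tom-ree", "REE mining and processing, Malaysia, Lynas Corporation")
vec_first = store.get("tom-ree")
vec_second = store.get("tom-ree")  # repeated lookups do not re-embed
```
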

05 / Blend

Personalised Ranking — The Blend

When a client_id is passed in a query, two similarity scores are computed for every candidate event and combined into a single final score:

                    QUERY VECTOR
                         │
                         │  cosine similarity
                         ▼
              ┌─────────────────────┐
  Event A ──▶ │ query_score = 0.92  │
  Event B ──▶ │ query_score = 0.87  │
  Event C ──▶ │ query_score = 0.61  │
              └─────────────────────┘

                    PROFILE VECTOR
                         │
                         │  cosine similarity
                         ▼
              ┌─────────────────────┐
  Event A ──▶ │ profile_score = 0.45│
  Event B ──▶ │ profile_score = 0.91│
  Event C ──▶ │ profile_score = 0.88│
              └─────────────────────┘

                         BLEND
              ┌─────────────────────────────────────┐
  Event A ──▶ │ 0.7 × 0.92 + 0.3 × 0.45 = 0.779     │ → Rank #2
  Event B ──▶ │ 0.7 × 0.87 + 0.3 × 0.91 = 0.882     │ → Rank #1
  Event C ──▶ │ 0.7 × 0.61 + 0.3 × 0.88 = 0.691     │ → Rank #3
              └─────────────────────────────────────┘
Event B ranks #1 despite a lower raw query score than Event A, because it strongly matches this client's stored interests. This is how a profile shifts results toward what matters to the client, rather than toward whatever scores highest against the query text alone.
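
The worked example above can be reproduced in a few lines. This is a sketch, not the service's code: the function name blend is hypothetical, and setting the query weight to 1 minus profile_weight is an assumption inferred from the 0.7 / 0.3 split shown in the diagram.

```python
def blend(query_score: float, profile_score: float, profile_weight: float = 0.3) -> float:
    """Weighted average of the two cosine similarities.

    Assumption: the query weight is (1 - profile_weight), matching the 0.7 / 0.3 split.
    """
    return (1.0 - profile_weight) * query_score + profile_weight * profile_score


# (query_score, profile_score) pairs taken from the worked example.
scores = {"Event A": (0.92, 0.45), "Event B": (0.87, 0.91), "Event C": (0.61, 0.88)}
final = {name: round(blend(q, p), 3) for name, (q, p) in scores.items()}
ranking = sorted(final, key=final.get, reverse=True)
# final   -> {'Event A': 0.779, 'Event B': 0.882, 'Event C': 0.691}
# ranking -> ['Event B', 'Event A', 'Event C']
```
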

06 / Math

The Math

Events Graph uses cosine similarity as its core relevance metric. It measures the angle between two vectors in high-dimensional space, independent of their magnitude.

similarity(A, B)  =  (A · B)  /  (‖A‖ × ‖B‖)

where:
A · B   = dot product (sum of element-wise products)
‖A‖     = L2 norm (magnitude of vector A)
‖B‖     = L2 norm (magnitude of vector B)

Result range:
 1.0 = identical direction (perfectly relevant)
 0.0 = orthogonal (unrelated)
-1.0 = opposite direction (irrelevant)

Final score  =  0.7 × query_similarity  +  0.3 × profile_similarity
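
The three reference points of the result range, and the independence from magnitude, can be checked numerically with a small pure-Python sketch. The function name and the 2-dimensional test vectors are illustrative assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """(A · B) / (‖A‖ × ‖B‖) for two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


a = [1.0, 2.0]
same_direction = cosine_similarity(a, [3.0, 6.0])  # a scaled by 3: about 1.0
orthogonal = cosine_similarity(a, [-2.0, 1.0])     # dot product is zero: 0.0
opposite = cosine_similarity(a, [-1.0, -2.0])      # a negated: about -1.0
```

Scaling a vector changes its norm and the dot product by the same factor, so the ratio is unchanged; this is why only direction matters.
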

07 / Design

Why This Approach

08 / API

API Quick Reference

Base URL: https://events.straitssignal.com

Query events — no profile

POST /events/query
{
  "query": "China export controls rare earth",
  "top_k": 10,
  "days": 30,
  "min_importance": 3
}

Query events — with personalised profile blend

POST /events/query
{
  "query": "China export controls rare earth",
  "client_id": "tom-ree",
  "top_k": 10,
  "days": 30,
  "min_importance": 3,
  "profile_weight": 0.3
}
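
A client for this endpoint could be sketched with the Python standard library as below. The helper name build_query_request is hypothetical, and the sketch assumes the endpoint accepts a JSON body with a Content-Type: application/json header and needs no authentication; neither is stated in this reference. The request is built but not sent.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://events.straitssignal.com"


def build_query_request(query: str, client_id: Optional[str] = None, **options) -> urllib.request.Request:
    """Build (but do not send) a POST /events/query request."""
    payload = {"query": query, **options}
    if client_id is not None:
        payload["client_id"] = client_id  # including client_id requests the profile blend
    return urllib.request.Request(
        f"{BASE_URL}/events/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed, not documented here
        method="POST",
    )


request = build_query_request(
    "China export controls rare earth",
    client_id="tom-ree",
    top_k=10,
    days=30,
    min_importance=3,
    profile_weight=0.3,
)
# To send it: urllib.request.urlopen(request) returns the HTTP response object.
```
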

List all tracked entities

GET /events/entities
→ Returns list of all entity names extracted from events,
  sorted by frequency of appearance.

Get events for a specific entity

GET /events/entity/{name}
GET /events/entity/Lynas%20Corporation

→ Returns all events where this entity appears,
  sorted by importance then recency.
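
Entity names containing spaces must be percent-encoded in the path, as in the Lynas%20Corporation example. A minimal sketch using the standard library follows; the helper name entity_url is hypothetical.

```python
from urllib.parse import quote

BASE_URL = "https://events.straitssignal.com"


def entity_url(name: str) -> str:
    """Percent-encode the entity name before placing it in the URL path."""
    # safe="" also encodes "/" so a name cannot add extra path segments.
    return f"{BASE_URL}/events/entity/{quote(name, safe='')}"


url = entity_url("Lynas Corporation")
# url -> "https://events.straitssignal.com/events/entity/Lynas%20Corporation"
```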

Create a client profile

POST /events/profile/create
{
  "client_id": "tom-ree",
  "description": "REE mining and processing, Malaysia, Lynas Corporation, NdPr pricing, rare earth supply chains, China export controls"
}

→ Embeds the description and stores the resulting vector.
  Called once per client. No re-embedding needed unless
  interests change.

Agent onboarding briefing

GET /events/agent/onboard
→ Returns a structured briefing for AI agents:
  available categories, entity index, recent high-importance
  events, and query usage examples.
  Use as context injection before starting a research session.