Graph–Text Fusion for Ad Recommendations: Cold-Start, Reranking, and Real-World Ops

2025-09-01

TL;DR — We built a multilingual recommendation system that fuses graph signals (Node2Vec on user/search/item interactions) with text semantics (bi-encoder). It serves via Qdrant/HNSW, uses dynamic fusion to survive cold-start, and shows fusion > unimodal in offline metrics. Cross-encoder reranking helps only on weak text-only setups.


Why this system?

Given a job/advert item i, we want a ranked list of similar items that a user would plausibly click or apply to next. Two complementary notions of “closeness” matter:

  • Behavioral proximity (graph): co-clicks, purchases, co-views within the same searches/sessions
  • Semantic proximity (text): similar roles/skills/seniority inferred from titles + descriptions

Key constraints: data sparsity, multilingual/noisy text, cold-start items, and strict latency SLOs. We therefore:

  • keep text/graph encoders separate (late fusion),
  • rely on implicit feedback only,
  • serve via ANN (Qdrant/HNSW) with a small reranking budget.

Data schema & cleaning

Events

(event_type, client_id, item_id, ds_search_id, timestamp), with event_type ∈ {click, purchase}

Item text

(item_id, pozisyon_adi, item_id_aciklama), where pozisyon_adi is the position title and item_id_aciklama the item description (Turkish column names)

Text normalization checklist

  • Strip HTML/scripts/entities → collapse whitespace → lowercase
  • Concatenate title + description; up-weight the title (e.g., repeat its tokens for ~2× weight; see the sketch after this list)
  • Optional: de-dup, light stopwording, preserve domain terms
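
A minimal sketch of this checklist (regex-based; TITLE_WEIGHT and the exact patterns are illustrative assumptions, not the production pipeline):

import html
import re

TITLE_WEIGHT = 2  # assumed: repeat the title so its tokens carry ~2x weight

def clean(text):
    """Strip HTML/scripts/entities, collapse whitespace, lowercase."""
    text = html.unescape(text or "")
    text = re.sub(r"<(script|style).*?</\1>", " ", text, flags=re.S | re.I)  # drop scripts/styles
    text = re.sub(r"<[^>]+>", " ", text)      # drop remaining tags
    text = re.sub(r"\s+", " ", text).strip()  # collapse whitespace
    return text.lower()

def build_item_text(title, desc):
    """Concatenate title + description, up-weighting the title by repetition."""
    return " ".join([clean(title)] * TITLE_WEIGHT + [clean(desc)])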

ID hygiene

  • Cast IDs to strings, trim, prefix types (u:, i:, s:)
  • Parse timestamps permissively (e.g., errors="coerce"); a pandas sketch follows
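
In pandas terms (column names follow the events schema above; a sketch, not the exact job):

import pandas as pd

events["client_id"] = "u:" + events["client_id"].astype(str).str.strip()
events["item_id"] = "i:" + events["item_id"].astype(str).str.strip()
events["ds_search_id"] = "s:" + events["ds_search_id"].astype(str).str.strip()
# Permissive parsing: malformed timestamps become NaT instead of raising
events["timestamp"] = pd.to_datetime(events["timestamp"], errors="coerce")
events = events.dropna(subset=["timestamp"])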

Building the interaction graph

Nodes: users (u:), items (i:), searches (s:)
Edges:

  • u→i interaction: weights click=1, purchase=5
  • u→s (optional): user performed search
  • s→i: item displayed in a search
  • i↔i co-view: items co-appearing under the same ds_search_id (undirected)

Co-view weighting

c_{ij} = Σ_s 1{i∈s}·1{j∈s}, then w_{ij} = log(1 + c_{ij}) (or a linear alternative w_{ij} = γ·c_{ij} for a scaling constant γ), keeping only pairs with c_{ij} ≥ c_min.

  • Use c_min ≥ 2 to avoid pair explosion
  • Log-scale to reduce popularity bias
  • Co-views act as weak supervision that densifies neighborhoods early

Operational knobs

  • Cap per search (pair only the top-m results; see the sketch after this list)
  • Keep i↔i undirected unless you really need directionality
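
A sketch of the co-view edge construction under these knobs (search_to_items maps each ds_search_id to its displayed items in display order; top_m is an assumption):

import math
from collections import Counter
from itertools import combinations

def coview_edges(search_to_items, c_min=2, top_m=20):
    """w_ij = log(1 + c_ij) for undirected pairs with c_ij >= c_min."""
    counts = Counter()
    for items in search_to_items.values():
        uniq = list(dict.fromkeys(items))[:top_m]  # cap per search: pair only top-m results
        for i, j in combinations(uniq, 2):
            counts[tuple(sorted((i, j)))] += 1     # canonical order: undirected edge
    return {pair: math.log1p(c) for pair, c in counts.items() if c >= c_min}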

Embeddings (graph + text)

Graph side — Node2Vec

  • Biased random walks over a weighted heterogeneous graph
  • Tune (p, q) for a mild BFS bias (stays local → good for similar-item retrieval)
  • Up-weight reliable ties (purchases > clicks; frequent co-views > rare)

Items with no edges have no walk context, so their graph vectors come out missing or near-zero; we handle that via dynamic fusion.

Text side — Multilingual bi-encoder

  • Encode the cleaned (title + description) → t_i (encoding sketch after this list)
  • L2-normalize: stability + cosine reduces to dot product
  • Pros: robust cross-lingual semantics, strong zero-shot, excellent for cold-start
  • Cons: may over-rank semantically similar but commercially weak items if used alone
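
A sketch with sentence-transformers (the model name is an assumption; any multilingual bi-encoder fits; build_item_text is the cleaning helper above):

from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")  # assumed model choice
texts = [build_item_text(title, desc) for title, desc in zip(titles, descs)]
# normalize_embeddings=True: L2-normalized output, so cosine reduces to dot product
T = encoder.encode(texts, normalize_embeddings=True, batch_size=256)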

Fusion that survives cold-start

Concat fusion preserves modality axes:

z_i = [α·t̂_i ; β·ĝ_i] (with both halves L2-normalized)

  • Search with cosine over z_i
  • Choose (α, β) to reflect business priors (more graph when available)

Dynamic fusion rule

if ‖g_i‖ < τ → (α, β) = (1, 0) (text-only fallback)
else → (α, β) = tuned weights
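
A sketch of the fusion vector with the fallback baked in (the default alpha/beta here are placeholders to be tuned):

import numpy as np

def l2(v):
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def fuse(t, g, alpha=0.6, beta=0.4, tau=1e-6):
    """z = [alpha * t_hat ; beta * g_hat]; text-only when the graph vector is degenerate."""
    if np.linalg.norm(g) < tau:  # cold-start item: no usable graph signal
        alpha, beta = 1.0, 0.0
    return np.concatenate([alpha * l2(t), beta * l2(g)])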

Alternatives

  • Weighted sum (requires text and graph embeddings of identical dimensionality)
  • Learned fusion: small MLP over [t; g] trained on CTR/CVR
  • GNNs (GraphSAGE/PinSAGE): richer context, higher serving cost

Serving: Qdrant + HNSW

  • Index z_i with HNSW (cosine); payload keeps item_id + metadata
  • Millisecond top-K at high recall with well-tuned HNSW parameters (e.g., M and ef)

Ops tips

  • Batch ingestion (500–2000 points per upsert) to avoid timeouts; see the sketch after this list
  • Monitor ANN recall/latency; reindex if params change
  • Keep ID mapping consistent from embedding to index
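
A minimal ingest/search sketch with qdrant-client (collection name, IDs, and batch size are assumptions):

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

client = QdrantClient(url="http://localhost:6333")
client.recreate_collection(
    collection_name="items",
    vectors_config=VectorParams(size=len(vectors[0]), distance=Distance.COSINE),  # HNSW index, cosine
)
for start in range(0, len(vectors), 1000):  # batch ingestion to avoid timeouts
    points = [
        PointStruct(id=start + k, vector=v.tolist(), payload={"item_id": ids[start + k]})
        for k, v in enumerate(vectors[start:start + 1000])
    ]
    client.upsert(collection_name="items", points=points)

hits = client.search(collection_name="items", query_vector=query_vec.tolist(), limit=50)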

Should you add a reranker?

A cross-encoder reads each (query, candidate) pair jointly and rescores it, often improving NDCG at the top of the list, but it adds latency.

  • Budget K ∈ [20, 100]
  • Cache candidate texts and consider mixed precision; a rerank sketch follows
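
A rerank sketch with sentence-transformers' CrossEncoder (the model name is an assumption; keep the candidate budget small):

from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/mmarco-mMiniLMv2-L12-H384-v1")  # assumed multilingual reranker
pairs = [(query_text, text) for text in candidate_texts[:50]]  # K = 50 budget
scores = reranker.predict(pairs, batch_size=32)                # joint (query, candidate) scoring
order = scores.argsort()[::-1]
reranked = [candidates[i] for i in order]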

When it helps / hurts

  • On text-only retrieval → small gains (sharper top of list)
  • On strong fusion retrieval → can hurt (misaligned with behavior-driven utility, thin descriptions, or “don’t fix what isn’t broken”)

Offline results (highlights)

Retrieval (Recall@10 / NDCG@10 / MAP@10 / Precision@10)

  • Text-only: 0.1712 / 0.1963 / 0.0966 / 0.0409
  • Graph-only: 0.6877 / 0.6027 / 0.4521 / 0.1967
  • Fusion: 0.7120 / 0.6214 / 0.4636 / 0.2010
  • Fusion (+clean): 0.7119 / 0.6257 / 0.4639 / 0.1987

Rerank on fusion (top-50)

  • Baseline Fusion: Recall@10 0.7312, NDCG@10 0.4469
  • Fusion + Rerank: Recall@10 0.3874, NDCG@10 0.2068 → degrades

Rerank on text-only

  • Baseline: NDCG@10 0.0909
  • +Rerank: 0.0988 (small lift; still far below graph/fusion)
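
For reference, minimal implementations of the @K metrics above over binary relevance (ranked is a retrieved item list, relevant a set of held-out positives):

import math

def recall_at_k(ranked, relevant, k=10):
    return len(set(ranked[:k]) & relevant) / max(len(relevant), 1)

def ndcg_at_k(ranked, relevant, k=10):
    dcg = sum(1.0 / math.log2(r + 2) for r, item in enumerate(ranked[:k]) if item in relevant)
    idcg = sum(1.0 / math.log2(r + 2) for r in range(min(len(relevant), k)))
    return dcg / idcg if idcg > 0 else 0.0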

Engineering playbook

Graph construction

# pseudocode: helper names are illustrative
# u→i edges with implicit-feedback weights
edges_ui = aggregate_user_item(events, weights={"click": 1, "purchase": 5})
# s→i edges: item displayed under a search, weighted by display count
edges_si = from_search_results(events, weight="count")
# i↔i co-view edges: thresholded and log-scaled (see coview_edges above)
edges_ii = coview_pairs(events, c_min=2, weight="log1p")
G = build_graph(nodes=[users, items, searches], edges=[edges_ui, edges_si, edges_ii])

Node2Vec

# node2vec package API (assumed); reads edge weights from the 'weight' attribute
n2v = Node2Vec(G, dimensions=dg, walk_length=40, num_walks=10, p=1.0, q=0.75, weight_key="weight")
model = n2v.fit(window=10, min_count=1)  # trains gensim Word2Vec over the walks
g = {node: model.wv[node] for node in G.nodes()}  # per-node graph vectors

Text encoder

# sentence-transformers style: normalized output so cosine reduces to dot product
t = bi_encoder.encode(build_item_text(title, desc), normalize_embeddings=True)

Fusion & index

z = np.concatenate([alpha * t_hat, beta * g_hat])  # both halves L2-normalized beforehand
qdrant.upsert(collection_name="items", points=[PointStruct(id=item_id, vector=z.tolist(), payload=meta)])

Dynamic fusion query

if np.linalg.norm(g_hat) < tau:  # cold start: zero out the graph half (dims must match)
    vec = np.concatenate([t_hat, np.zeros_like(g_hat)])
else:
    vec = np.concatenate([alpha * t_hat, beta * g_hat])
results = qdrant.search(collection_name="items", query_vector=vec.tolist(), limit=50)

Optional rerank

if pipeline == "text_only":  # rerank only where it helps (see offline results)
    pairs = [(query_text, text) for text in candidate_texts]
    scores = cross_encoder.predict(pairs)  # joint (query, candidate) scoring
    results = [cand for _, cand in sorted(zip(scores, candidates), key=lambda s: -s[0])]


Diagnostics

  • Ego-nets: visualize subgraphs; confirm weights (purchase > click) and coherent clusters
  • UMAP on neighbors: compact clusters ≈ better perceived relevance
  • Segmented metrics: track Recall@K by item age, category, locale (groupby sketch below)
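
Once per-item metrics exist, segmentation is a one-line groupby (per_item and its columns are assumptions):

# per_item: one row per evaluated item with recall_at_10, item_age_bucket, category, locale
segmented = per_item.groupby(["item_age_bucket", "category", "locale"])["recall_at_10"].mean()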

Production notes

  • Monitor latency (P50, P95) of ANN
  • Track zero-graph rate; enrich co-views if high
  • Apply log-scaling on edges to mitigate popularity bias
  • Track multilingual fairness; watch embedding drift

Limitations & roadmap

Current gaps

  • No user cold-start (anonymous/new users)
  • No recency weighting (all interactions equal)
  • Fusion is heuristic, not learned end-to-end

Next

  • Temporal dynamics (recency, sessions)
  • Learned fusion (MLP over [t; g])
  • Graph neural recommenders (GraphSAGE, PinSAGE)
  • Online A/B for CTR/CVR validation

Key takeaways

  1. Graph rules, text rescues — fusion wins across segments
  2. Dynamic fusion handles cold-start smoothly
  3. Search-derived co-views are cheap, effective enrichment
  4. Rerankers help text-only, can hurt strong fusion
  5. Modular stack = offline embeddings + ANN + optional rerank

Further Reading

Full technical report:
Advertisement Suggestion System — Methods-First Guide to Graph–Text Fusion with Cold-Start Awareness and Re-Ranking
Report PDF