One analyst. One day of MDO Surge prompts. Built the factors, wired the pipelines,
ready to test any frequency, any history, any universe. In this case: 50 factors,
25 years of history, 3,000 global stocks, point-in-time accurate, live in your Snowflake.
AI generates answers. MDO ensures they’re correct.
50
Production-grade factors built in one day — a starting point, not a ceiling
25Y
History · Jan 2000 – present · monthly rebalance
3,000
Global stocks · top market cap · survivorship-free
912K
Rows in your Snowflake · PIT-aligned · FX-normalized
Before MDO · After MDO
Typical in-house build: a project. Engineering, planning, vendor integrations — the factory before the factors. With MDO: skip the factory. Use the engine.
Worldscope · IBES Estimates · Datastream
Built in one day with MDO Surge + the Financial Intelligence Engine
The Engineering Tax
Six problems your team solves before testing a single factor.
None of these are factor research. All of them must be solved first.
MDO ships every one of them built in.
Problem
Why it’s hard
How MDO handles it
Point-in-time alignment
Raw warehouse data is restated — look-ahead bias by default.
✓ dateAlign is a built-in parameter on every fundamental call: prelim, report-date, fiscal-period-end, and other PIT conventions are supported out of the box. Look-ahead bias is never an afterthought.
Survivorship-bias-free universe
Delisted and bankrupt names disappear from current constituent lists.
✓ QAD-sourced monthly constituent history rebuilds the top-3,000 universe at each month-end.
Cross-currency normalization
Global factors need FX as-of the calculation date, per security, every call.
✓ toCurrency converts to USD, EUR, JPY, GBP, or any supported currency inline, with FX rates applied as-of the factor date. No separate FX pipeline, regardless of base currency.
TTM across reporting frequencies
US quarterly, Japan semi-annual, EU annual — one calc must handle all three.
✓ Batched Q/S/A pulls with ttm=True and Q→S→A coalesce. No bespoke logic.
Reporting lag enforcement
A Q3 filing published in Feb must not be visible in Oct; preventing that takes dedicated lag logic.
✓ Preliminary filing lag enforced per company. restated=False locks the originally reported figure.
Snowflake-native execution
Pandas-based factor logic forces data movement and breaks at scale.
✓ Factors run as Snowpark UDFs. Data never leaves your account. Scales with your warehouse tier.
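The point-in-time and reporting-lag rows above come down to one rule: a figure is usable only on or after the date it was published. A minimal sketch of that rule in plain Python follows. The `Filing` record, the sample dates, and the `as_of` helper are hypothetical illustrations, not MDO's API or implementation.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Filing:
    period_end: date   # fiscal period the figure describes
    published: date    # date the figure first became public
    value: float       # originally reported value (not restated)


def as_of(filings: List[Filing], calc_date: date) -> Optional[Filing]:
    """Return the latest-period filing that was already public on calc_date."""
    visible = [f for f in filings if f.published <= calc_date]
    if not visible:
        return None
    return max(visible, key=lambda f: (f.period_end, f.published))


# Hypothetical history for one company (values are invented).
history = [
    Filing(date(2023, 6, 30), date(2023, 8, 10), 120.0),
    Filing(date(2023, 9, 30), date(2024, 2, 5), 135.0),  # Q3 published late
]

oct_view = as_of(history, date(2023, 10, 31))  # Q3 not yet public
feb_view = as_of(history, date(2024, 2, 29))   # Q3 now visible
```

Filtering on the publication date rather than the period-end date is what keeps the October calculation from seeing a figure that only became public in February.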
Pipeline Architecture
From raw data to 912,000 factor rows — one pipeline, your infrastructure.
Three LSEG sources through MDO’s Financial Intelligence Engine, into a point-in-time
accurate factor dataset in your Snowflake. Every number below is real — measured
end-to-end on a standard Snowflake warehouse.
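The 912,000-row figure is consistent with the stated universe and history. A quick arithmetic check, assuming one row per security per month-end and a full 3,000 names every month; the April 2025 end month is inferred from the arithmetic and is not stated in the source.

```python
from datetime import date

SECURITIES = 3_000
TOTAL_ROWS = 912_000


def months_inclusive(start: date, end: date) -> int:
    """Count calendar months from start to end, both included."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1


month_ends = TOTAL_ROWS // SECURITIES          # rows per security
years, extra_months = divmod(month_ends, 12)   # history length
# Assumed window: Jan 2000 through Apr 2025 (end month is an inference).
implied_months = months_inclusive(date(2000, 1, 1), date(2025, 4, 1))
```

Under those assumptions the row count works out to 304 month-ends, or 25 years and 4 months of monthly rebalances.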
Measured compute times — 25-year full backfill, 3,000 securities
Universe build (top 3,000 global)
139 seconds · 25Y full backfill
FL_MOM_SURP (IBES surprise, fastest)
40 seconds · 25Y full backfill
FL_VAL_EVEBITDA (EV/EBITDA, median)
170 seconds · 25Y full backfill
FL_RSK_BETA (12M OLS regression, slowest)
234 seconds · 25Y full backfill
50
factors loaded into your Snowflake
91 min
factor compute · Snowpark UDFs · no pandas
~8 min
incremental daily update · just the latest month
Your cloud. Your control.
Every factor table lives in your Snowflake account. Data never moves, never replicates,
never egresses. Compute scales with the warehouse tier you already pay for —
compliance-friendly by construction, no new vendor surface area to audit.
Meet Surge · MDO's AI Agent
Two paths to the same answer. Both correct by construction.
AI generates answers. MDO ensures they’re correct. Same engine, two interfaces:
For quants & engineers
MDO Surge → editable Python
Prompt in plain English, get production Python code you can audit, modify, and extend.
"Price acceleration — is the stock moving faster in the last 3 months versus the 3 months before that?"
↓ Surge generates production code
FL_TEC_PRICEACC · Price Acceleration
def FL_TEC_PRICEACC(universe, params=None):
    import mdosnow as mdo
    from snowflake.snowpark.functions import expr

    dims = ['day', 'security']
    universe = mdo.as_universe(universe[dims], dimensions=dims)

    # One call fetches both 3M and 6M returns simultaneously
    mdo.PlugDataChange(universe, items={'RET': 'DS_TOTRET'},
                       shift=[-3, -6], periodType='month',
                       toCurrency='USD', recent=-10, units='P')

    # Acceleration = current 3M minus prior 3M (non-overlapping)
    # Prior 3M ≈ 6M - current 3M => 2×3M - 6M
    universe.with_column('FL_TEC_PRICEACC',
                         expr('2.0 * RET_T3_CHG - RET_T6_CHG'))

    return mdo.as_universe(universe[dims + ['FL_TEC_PRICEACC']], dimensions=dims)
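The comment in the code above relies on the approximation prior 3M ≈ 6M − current 3M, which treats returns as additive. A small check in plain Python, using made-up monthly returns, suggests how close that is: the identity is exact for log returns and only approximate for compounded simple returns. Which convention `PlugDataChange` applies is not stated in the source, so this is illustrative only.

```python
import math

# Hypothetical simple monthly returns: prior 3 months, then current 3 months.
prior_3m = [0.01, 0.02, -0.01]
current_3m = [0.03, 0.04, 0.02]


def compound(returns):
    """Compound simple periodic returns into one simple return."""
    total = 1.0
    for r in returns:
        total *= 1.0 + r
    return total - 1.0


def log_sum(returns):
    """Sum log returns over the window."""
    return sum(math.log1p(r) for r in returns)


# Simple returns: 2 x 3M - 6M only approximates (current 3M - prior 3M).
r3 = compound(current_3m)
r6 = compound(prior_3m + current_3m)
accel_formula = 2.0 * r3 - r6
accel_direct = r3 - compound(prior_3m)
approx_error = abs(accel_formula - accel_direct)

# Log returns: the same formula is exact, since 6M = prior 3M + current 3M.
l3 = log_sum(current_3m)
l6 = log_sum(prior_3m + current_3m)
log_error = abs((2.0 * l3 - l6) - (l3 - log_sum(prior_3m)))
```

With simple returns the gap equals the cross term (prior 3M return times current 3M return), which is small for typical quarterly moves but not zero.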
Analyst Prompt to MDO Surge
"Return on equity — TTM net income over average shareholder equity. Make sure semi-annual reporters and companies with only one year of equity history are still included."
↓ Surge generates production code
FL_QL_ROE · Return on Equity
def FL_QL_ROE(universe, params=None):
    import mdosnow as mdo
    from snowflake.snowpark.functions import coalesce, expr

    dims = ['day', 'security']
    ws = dict(dateAlign='prelim', restated=False, toCurrency='USD', units='M')
    universe = mdo.as_universe(universe[dims], dimensions=dims)

    # TTM net income: Q/S/A in one call, coalesced for global coverage
    # (US quarterly · Japan/Korea semi-annual · EU annual all handled)
    mdo.FundamentalData(universe, {
        'NI_Q': 'WS_1651_Q', 'NI_S': 'WS_1651_S', 'NI_A': 'WS_1651_A',
    }, period=0, ttm=True, **ws)
    universe.with_column('NI_TTM', coalesce('NI_Q', 'NI_S', 'NI_A'))

    # Current book equity: Q/S/A coalesced to most-recent reported value
    mdo.FundamentalData(universe, {
        'EQ_Q': 'WS_3995_Q', 'EQ_S': 'WS_3995_S', 'EQ_A': 'WS_3995_A',
    }, period=0, **ws)
    universe.with_column('EQ', coalesce('EQ_Q', 'EQ_S', 'EQ_A'))

    # Prior-year equity (annual at period=-1; MDO writes column as EQ_T1).
    # If prior is missing, fall back to current equity so the row isn't dropped,
    # which is critical for recent IPOs and stub-year filings.
    mdo.FundamentalData(universe, items={'EQ': 'WS_3995_A'}, period=-1, **ws)
    universe.with_column('FL_QL_ROE',
                         expr('NI_TTM / NULLIFZERO((EQ + COALESCE(EQ_T1, EQ)) / 2.0)'))

    return mdo.as_universe(universe[dims + ['FL_QL_ROE']], dimensions=dims)
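The SQL expression in the final step combines two safeguards: COALESCE(EQ_T1, EQ) keeps rows that have no prior-year equity, and NULLIFZERO turns a zero denominator into NULL instead of a division error. A plain-Python sketch of the same row-level logic, with None standing in for SQL NULL; the function names and sample figures are hypothetical, and this is not the MDO implementation.

```python
from typing import Optional


def coalesce(*values: Optional[float]) -> Optional[float]:
    """Return the first non-None value, like SQL COALESCE."""
    for v in values:
        if v is not None:
            return v
    return None


def nullif_zero(x: Optional[float]) -> Optional[float]:
    """Map 0 to None, like Snowflake's NULLIFZERO."""
    return None if x is None or x == 0 else x


def roe(ni_ttm: Optional[float], eq: Optional[float],
        eq_prior: Optional[float]) -> Optional[float]:
    """TTM net income over average equity; None propagates like SQL NULL."""
    if ni_ttm is None or eq is None:
        return None
    avg_eq = nullif_zero((eq + coalesce(eq_prior, eq)) / 2.0)
    return None if avg_eq is None else ni_ttm / avg_eq


normal = roe(50.0, 420.0, 380.0)     # average equity = 400
recent_ipo = roe(50.0, 400.0, None)  # prior missing: falls back to current
zero_equity = roe(50.0, 0.0, 0.0)    # zero denominator: None, no error
```

The fallback means a company with a single year of equity history still gets a value, computed on current equity alone, rather than dropping out of the universe.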
Analyst Prompt to MDO Surge
"Earnings surprise — how much did the company beat or miss analyst consensus last quarter?"
↓ Surge generates production code
FL_MOM_SURP · Earnings Surprise
def FL_MOM_SURP(universe, params=None):
    import mdosnow as mdo
    from snowflake.snowpark.functions import coalesce

    dims = ['day', 'security']
    universe = mdo.as_universe(universe[dims], dimensions=dims)

    # IBES pre-computed standardized surprise: (Actual - Consensus) / |Consensus|
    # Q/S/A in one call: US quarterly, Japan/Korea/Australia semi-annual,
    # EU annual all coalesced into one signal. dateAlign='prelim' enforces PIT.
    mdo.FundamentalData(universe, {
        'SURP_Q': 'IB_ACT_EPS_SURPRISE_Q',
        'SURP_S': 'IB_ACT_EPS_SURPRISE_S',
        'SURP_A': 'IB_ACT_EPS_SURPRISE_A',
    }, period=0, dateAlign='prelim', restated=False)
    universe.with_column('FL_MOM_SURP', coalesce('SURP_Q', 'SURP_S', 'SURP_A'))

    return mdo.as_universe(universe[dims + ['FL_MOM_SURP']], dimensions=dims)
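The comment above describes the standardized surprise as (Actual − Consensus) / |Consensus|, with the quarterly, semi-annual, and annual series coalesced in that order. A plain-Python sketch under those assumptions follows. The company labels and figures are invented, and since IBES supplies the surprise pre-computed, the formula here is only for intuition.

```python
from typing import Optional


def surprise(actual: Optional[float],
             consensus: Optional[float]) -> Optional[float]:
    """(Actual - Consensus) / |Consensus|; None when undefined."""
    if actual is None or consensus is None or consensus == 0:
        return None
    return (actual - consensus) / abs(consensus)


def first_available(*values):
    """Return the first non-None value (Q, then S, then A)."""
    for v in values:
        if v is not None:
            return v
    return None


# Hypothetical inputs per company: (actual, consensus) for the quarterly,
# semi-annual, and annual series, or None when that series is not reported.
companies = {
    "US_QUARTERLY": ((1.10, 1.00), None, (4.00, 4.10)),
    "JP_SEMIANNUAL": (None, (-0.40, -0.50), None),
    "EU_ANNUAL": (None, None, (2.85, 3.00)),
}

signal = {}
for name, series in companies.items():
    per_freq = [surprise(*pair) if pair else None for pair in series]
    signal[name] = first_available(*per_freq)
```

The absolute value in the denominator matters for the semi-annual example: a smaller loss than expected comes out as a positive surprise rather than a negative one.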
Not just an AI chatbot on top of raw data. Surge knows the correct LSEG
item codes, applies NULLIFZERO in ratio formulas, and handles Japanese semi-annual reporters in
global TTM calcs. The financial domain knowledge lives in the Financial Intelligence Engine,
not the prompt, so every output is correct, reproducible, and auditable.
New · MDO + Claude via MCP
Investment research without code — now inside Claude.
MDO is now an MCP server integrated with Claude. Ask in plain English —
factor research, screening, backtesting — and Claude runs it against MDO’s hardened
data layer. The responsibility shifts to asking the right question.
“Run a Russell 3000 value-momentum backtest, last 10 years, monthly rebalance.”
→ Claude calls MDO via MCP
→ Point-in-time accurate, FX-normalized
→ Pre-built, audited workflow
→ Reproducible Python in your Snowflake
Factor Library
18 of the 50. Every formula is plain, editable Python.
A curated cross-section — three per category — from a library built in one day.
Add, modify, or extend any factor: the framework scales with your research.
Factor
Signal
Category
Source
Formula
Coverage
FL_VAL_EP
Earnings Yield
Value
Worldscope
WS_1651_Q (TTM NI) / DS_MKTCAP
98%
FL_VAL_BP
Book-to-Price
Value
Worldscope
WS_3995_A / DS_MKTCAP
98%
FL_VAL_DIVYIELD
Dividend Yield
Value
Datastream
DS_DY (direct)
98%
FL_QL_ROE
Return on Equity
Quality
Worldscope
TTM NI / Avg Equity
98%
FL_QL_ROA
Return on Assets
Quality
Worldscope
TTM NI / Avg Total Assets
98%
FL_QL_OPMGN
Operating Margin
Quality
Worldscope
TTM EBIT / TTM Revenue
96%
FL_GR_EPSREV
EPS Revision Score
Growth
IBES
(Up30D − Down30D) / NumEst
94%
FL_GR_SALESGROW
Sales Growth 1Y
Growth
Worldscope
TTM Revenue YoY
94%
FL_GR_EPSGROW1Y
EPS Growth 1Y
Growth
Worldscope
TTM NI YoY / |Prior|
94%
FL_MOM_12M1M
12M1M Price Momentum
Momentum
Datastream
12M Return − 1M (skip-month)
97%
FL_MOM_SURP
Earnings Surprise
Momentum
IBES
IB_ACT_EPS_SURPRISE_Q/S/A (coalesced)
92%
FL_MOM_52WHIGH
52-Week High Proximity
Momentum
Datastream
Price / Rolling 52W High
100%
FL_RSK_BETA
Market Beta
Risk
Datastream
OLS β vs EW universe (12M)
97%
FL_RSK_IDIOVOL
Idiosyncratic Volatility
Risk
Datastream
StdDev of Market-Adj Residuals
97%
FL_RSK_VOLRET
Return Volatility
Risk
Datastream
12M StdDev of Monthly Returns
100%
FL_TEC_52WHIGH
Annual Range Position
Technical
Datastream
(Price − 52W Low) / (High − Low)
100%
FL_TEC_PRICEACC
Price Acceleration
Technical
Datastream
2 × 3M Return − 6M Return
98%
FL_TEC_TURN
Share Turnover
Technical
Datastream
3M Avg Volume / Implied Shares
100%
A note on coverage: percentages reflect what global
companies actually report — not MDO limitations. Lower-coverage factors apply where the
underlying data exists; the survivorship-safe universe is preserved otherwise.
Full 50-factor library available on request.
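Several formulas in the table are short enough to state directly. A sketch of three of them as plain Python functions on scalar inputs follows. The function and argument names are hypothetical, the sample inputs are invented, and the library's actual implementations run in Snowflake and may differ in windowing and null handling.

```python
from typing import Optional


def range_position(price: float, low_52w: float,
                   high_52w: float) -> Optional[float]:
    """FL_TEC_52WHIGH: (Price - 52W Low) / (High - Low)."""
    spread = high_52w - low_52w
    return None if spread == 0 else (price - low_52w) / spread


def eps_revision_score(up_30d: int, down_30d: int,
                       num_est: int) -> Optional[float]:
    """FL_GR_EPSREV: (Up30D - Down30D) / NumEst."""
    return None if num_est == 0 else (up_30d - down_30d) / num_est


def momentum_12m1m(ret_12m: float, ret_1m: float) -> float:
    """FL_MOM_12M1M: 12M return minus the most recent month (skip-month)."""
    return ret_12m - ret_1m


pos = range_position(price=75.0, low_52w=50.0, high_52w=100.0)
rev = eps_revision_score(up_30d=6, down_30d=2, num_est=16)
mom = momentum_12m1m(ret_12m=0.30, ret_1m=0.05)
```

Returning None on a zero denominator mirrors the NULLIFZERO pattern used in the generated factor code, so an undefined ratio becomes a missing value instead of an error.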
Custom Formulas
Plain Python, registered as a managed MDO Custom Formula.
Every factor isn’t just code — it’s a Custom Formula in your Snowflake.
Once registered, anyone in your user group can query it like any native MDO data item,
with full PIT, FX, and survivorship guarantees baked in.
1
Write plain Python
def FL_VAL_MYMETRIC(universe, params=None):
Analyst writes the factor as a function. Editable, auditable, in source control.
Values computed and persisted to a managed table in your account.
→
4
Anyone queries it
mdo.CFData(universe, name='FL_VAL_MYMETRIC', ...)
Other analysts, models, and dashboards consume it like any MDO item.
This is not a folder of scripts. It’s a versioned, managed catalog of factors
living inside your Snowflake. Edit the Python → re-register the Custom Formula →
reload — downstream consumers keep working without code changes. The factor library becomes
institutional infrastructure, not a side project.
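The edit → re-register → reload cycle works because consumers look a factor up by name rather than importing its code. A toy in-memory sketch of that pattern follows. The `FormulaCatalog` class and its methods are hypothetical stand-ins, not the MDO Custom Formula API, which persists computed values to managed tables in Snowflake.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]
Formula = Callable[[Row], float]


class FormulaCatalog:
    """Maps a factor name to its versions; consumers only use the name."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[Formula]] = {}

    def register(self, name: str, fn: Formula) -> int:
        """Add a new version under `name` and return its version number."""
        self._versions.setdefault(name, []).append(fn)
        return len(self._versions[name])

    def query(self, name: str, row: Row) -> float:
        """Evaluate the latest registered version for one row."""
        return self._versions[name][-1](row)


catalog = FormulaCatalog()
v1 = catalog.register("FL_VAL_MYMETRIC",
                      lambda r: r["earnings"] / r["mktcap"])


def downstream(row: Row) -> float:
    # Consumer code: stays the same across re-registrations.
    return catalog.query("FL_VAL_MYMETRIC", row)


row = {"earnings": 5.0, "cash": 20.0, "mktcap": 100.0}
before = downstream(row)

# The analyst edits the formula and re-registers it under the same name.
v2 = catalog.register("FL_VAL_MYMETRIC",
                      lambda r: r["earnings"] / (r["mktcap"] - r["cash"]))
after = downstream(row)
```

The consumer function is identical before and after the change; only the registered definition moved, which is the property the catalog approach is meant to provide.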
From there it’s your workflow: connect the factor tables (or the consolidated CSV) to your
existing backtesting, portfolio-construction, and risk-attribution tools. Your data, your tools,
your edge.
Why MDO
AI generates answers. MDO ensures they're correct.
Generic AI on raw data writes plausible code. MDO’s investment-ready framework makes
the answer institutional-grade: point-in-time accurate, survivorship-safe, FX-normalized,
reproducible.
Capability
Generic AI on Your Data
MDO Surge + Financial Intelligence Engine
Financial domain knowledge
◦ Generates syntactically valid code — financial correctness depends on the prompt
✓ Generates code with correct LSEG item codes validated against the FIE catalog
Point-in-time alignment
◦ Has no concept of when data was available — look-ahead bias is possible
✓ dateAlign is a built-in parameter on every data call, so PIT is never an afterthought
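Global reporting frequencies
◦ Treats all companies as quarterly reporters by default
✓ Batched Q/S/A pulls with ttm=True and Q→S→A coalesce cover quarterly, semi-annual, and annual reporters
Cross-currency normalization
◦ FX conversion is left to your code: rates, timing, and source are your problem
✓ toCurrency='USD' on any data call, with rates sourced from Datastream as-of the factor date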
Survivorship bias
◦ Generated code queries your current universe — delisted stocks require separate handling
✓ QAD-sourced universe includes all delisted tickers through their last active date
Execution environment
◦ Generates pandas code — runs locally or requires ETL into your warehouse
✓ Pushes execution into Snowflake UDFs — data never leaves your cloud
Adding new data items
◦ Model may hallucinate item codes; validation requires manual lookup
✓ Item codes are catalogued in the FIE; Surge references only validated codes
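The survivorship row above amounts to rebuilding the universe at each month-end from the names that were actually listed at the time. A minimal sketch with invented data follows. The `listings` structure, the tickers, and the top-2 cutoff are hypothetical, and the real universe is sourced from QAD constituent history rather than built this way.

```python
from datetime import date
from typing import Dict, List, Optional, Tuple

# security -> (last active date, or None if still listed; market cap by month-end)
listings: Dict[str, Tuple[Optional[date], Dict[date, float]]] = {
    "AAA": (None, {date(2008, 8, 31): 90.0, date(2008, 9, 30): 95.0}),
    "BBB": (date(2008, 8, 31), {date(2008, 8, 31): 80.0}),  # delisted after Aug
    "CCC": (None, {date(2008, 8, 31): 40.0, date(2008, 9, 30): 45.0}),
}


def universe_at(month_end: date, top_n: int) -> List[str]:
    """Top-N names by market cap among those active on month_end."""
    active = []
    for sec, (last_active, caps) in listings.items():
        still_listed = last_active is None or month_end <= last_active
        if still_listed and month_end in caps:
            active.append((caps[month_end], sec))
    return [sec for _, sec in sorted(active, reverse=True)[:top_n]]


aug = universe_at(date(2008, 8, 31), top_n=2)
sep = universe_at(date(2008, 9, 30), top_n=2)
```

A query against only the current constituent list would never return BBB; keeping it in the August universe is what makes the historical test free of survivorship bias.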
The intelligence isn’t just the language model.
It’s the Financial Intelligence Engine underneath — point-in-time alignment, the
survivorship-safe universe, the validated item catalog, the Snowflake-native execution
layer. Across over 70 data sources, any time series, any universe.
Your Next Step
Ready to build your factor library?
MDO connects directly to over 70 data sources inside Snowflake — point-in-time
accurate, FX-normalized, institutional-grade. Now available with Surge and Claude
via MCP. No data movement, no replication lag, no infrastructure to manage.