
Pacifica Delta-Neutral Scalping Implementation Plan

Overview

This document consolidates the product requirements, technical architecture, and execution roadmap for the Pacifica delta-neutral market-making plus micro-scalping system. It targets a TypeScript/Node 22 stack and covers connector integration, risk controls, strategy logic, observability, and operational practices under a strict compliance posture.

Documentation contract
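This plan is used together with the following supporting documents: docs/API_CONNECTOR_SPEC.md, docs/MODULE_INTERFACES.md, docs/SEQUENCE_FLOW.md, docs/CONFIG_REFERENCE.md, docs/TESTING_PLAN.md, docs/OPERATIONS_PLAYBOOK.md. Any implementation, configuration, or process change that deviates from these documents requires the documents to be updated first, before the development task is carried out.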

Objectives & Constraints

  • Maintain net delta ≈ 0 while capturing spread and maker rebates across BTC/ETH/SOL perpetual markets.
  • Enforce self-trade prevention, audit logging, and regulatory safeguards across all order flows.
  • Prioritise risk: enforce notional/inventory/order caps, intraday drawdown kill-switch, latency and data-gap protection.
  • Deliver production-grade observability (Prometheus metrics, alerting, traceable decisions) and hot-reloadable configuration with audit trails.

Functional Scope

  • Exchange connectivity: Pacifica REST (and optional WS) adapter, Ed25519 signing, rate-limit handling, structured error surfacing.
  • Market data: shadow order book with mid/spread/OBI/short-term RV derivations; latency tracking.
  • Execution: order router with limit/IOC modes, slippage guard, post-only support, deterministic clientId, and STP cross-checks.
  • Risk & compliance: order pre-checks, inventory/notional/order-size constraints, realised PnL accumulation, kill-switch triggers, append-only audit logs.
  • Strategy layer: passive market making (multi-layer reprice loop) and micro-scalping (spread expansion + flow imbalance triggers).
  • Hedging: PI-controlled cross-venue/account hedger with throttle and funding-rate bias adjustments.
  • Trigger/OCO: take-profit/stop-loss legs, timeout exits, shared STP/risk pipeline.
  • Telemetry: Prometheus metrics (maker_ratio, delta_abs, latency_p99, pnl_intraday, hedge_cost_bps, ev_estimate, cancel_rate, stp_hits) and alert definitions.
  • Backtesting: event replay (books+trades), fee/funding models, EV evaluation, parameter search, Sharpe and bucketed reports.
  • Configuration: .env secrets, config.yaml strategy parameters (symbols, mm, scalper, risk, hedge), zod validation, hot reload with audit entries.
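
The market data item above derives mid, spread, and order-book imbalance (OBI) from the shadow book. The following is a minimal sketch of those derivations, not the plan's actual interface: the `BookLevel` shape, the `deriveTopOfBook` name, the default depth of 5, and the OBI formula (bid size minus ask size over their sum) are all assumptions made for illustration.

```typescript
// Hypothetical types: the plan does not define the shadow book's shape.
interface BookLevel {
  price: number;
  size: number;
}

interface BookDerivations {
  mid: number;
  spreadBps: number;
  obi: number; // in [-1, 1]; positive means more resting size on the bid side
}

// Expects bids sorted best (highest) first and asks sorted best (lowest) first.
function deriveTopOfBook(
  bids: BookLevel[],
  asks: BookLevel[],
  depth = 5,
): BookDerivations | null {
  if (bids.length === 0 || asks.length === 0) return null;
  const bestBid = bids[0].price;
  const bestAsk = asks[0].price;
  // Assumption: a crossed or locked book is treated as unusable data.
  if (bestBid >= bestAsk) return null;

  const mid = (bestBid + bestAsk) / 2;
  const spreadBps = ((bestAsk - bestBid) / mid) * 10_000;

  // Sum resting size over the first `depth` levels of one side.
  const sumSize = (levels: BookLevel[]): number =>
    levels.slice(0, depth).reduce((acc, level) => acc + level.size, 0);
  const bidSize = sumSize(bids);
  const askSize = sumSize(asks);
  const total = bidSize + askSize;
  const obi = total === 0 ? 0 : (bidSize - askSize) / total;

  return { mid, spreadBps, obi };
}
```

Returning `null` rather than throwing lets a caller feed the same signal into the data-gap protection listed under Objectives & Constraints.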

Non-Functional Targets (KPI / SLO)

  • EV_p50 ≥ 1.5 bps; EV_p10 ≥ 0.
  • |delta| P95 ≤ 0.5 × max_base_abs; taker_ratio ≤ 35%; hedge_cost_bps ≤ 0.4 × edge_bps.
  • Hedge efficiency (new): hedge_success_rate > 98%; hedge_latency_p50 < 500 ms, p99 < 2 s; hedge_slippage_bps < 0.5.
  • Funding rate (new): funding_rate_correlation > 0.8; funding_cost_net_bps < 1 per 8h; funding_same_sign_ratio < 20%.
  • Latency_p99 within venue SLA; zero self-trade incidents; compliant audit log coverage.
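
The numeric targets above lend themselves to a machine-checkable gate. The sketch below encodes them as a list of named checks; the `KpiSnapshot` field names and the `kpiViolations` function are hypothetical, ratios are assumed to be expressed as fractions in [0, 1], and the venue-SLA latency target is left out because this plan does not state the SLA value.

```typescript
// Hypothetical snapshot shape; field names are illustrative only.
interface KpiSnapshot {
  evP50Bps: number;
  evP10Bps: number;
  deltaAbsP95: number;
  maxBaseAbs: number;
  takerRatio: number; // fraction, e.g. 0.35 for 35%
  hedgeCostBps: number;
  edgeBps: number;
  hedgeSuccessRate: number; // fraction
  hedgeLatencyP50Ms: number;
  hedgeLatencyP99Ms: number;
  hedgeSlippageBps: number;
  fundingRateCorrelation: number;
  fundingCostNetBpsPer8h: number;
  fundingSameSignRatio: number; // fraction
  selfTradeIncidents: number;
}

// Returns the names of all targets the snapshot fails; empty means all pass.
function kpiViolations(k: KpiSnapshot): string[] {
  const checks: Array<[string, boolean]> = [
    ["EV_p50 >= 1.5 bps", k.evP50Bps >= 1.5],
    ["EV_p10 >= 0", k.evP10Bps >= 0],
    ["|delta| P95 <= 0.5 x max_base_abs", k.deltaAbsP95 <= 0.5 * k.maxBaseAbs],
    ["taker_ratio <= 35%", k.takerRatio <= 0.35],
    ["hedge_cost_bps <= 0.4 x edge_bps", k.hedgeCostBps <= 0.4 * k.edgeBps],
    ["hedge_success_rate > 98%", k.hedgeSuccessRate > 0.98],
    ["hedge_latency_p50 < 500ms", k.hedgeLatencyP50Ms < 500],
    ["hedge_latency_p99 < 2s", k.hedgeLatencyP99Ms < 2000],
    ["hedge_slippage_bps < 0.5", k.hedgeSlippageBps < 0.5],
    ["funding_rate_correlation > 0.8", k.fundingRateCorrelation > 0.8],
    ["funding_cost_net_bps < 1 per 8h", k.fundingCostNetBpsPer8h < 1],
    ["funding_same_sign_ratio < 20%", k.fundingSameSignRatio < 0.2],
    ["zero self-trade incidents", k.selfTradeIncidents === 0],
  ];
  return checks.filter(([, ok]) => !ok).map(([name]) => name);
}
```

A gate of this kind could back the canary KPI verification step or the KPI snapshots attached to PRs, though where it runs is a design decision this plan leaves open.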

Milestones & Deliverables

  1. M1 – Core Skeleton & Multi-Account Infrastructure (≈1–2 weeks)
    • Type/domain definitions, Pacifica adapter with signing placeholder replaced by official implementation.
    • Multi-account adapter registry: supports independent configuration and management of Account A (maker) and Account B (hedger).
    • Shadow book pipeline, latency metrics, baseline Prometheus exporter.
    • Symbol Registry & Allocator (new): multi-symbol lifecycle management and global risk-budget allocation.
    • Global Order Coordinator (new): cross-account order registry and the base framework for STP checks.
    • OrderRouter + SlippageGuard with post-only/STP checks.
    • RiskEngine v1 covering inventory/notional/order caps and an enhanced kill-switch (cross-account aggregation, multi-dimensional triggers).
  2. M2 – Strategy Loop & Enhanced Hedging (≈1 week)
    • Strategy Coordinator (new): signal aggregation and conflict detection; MM and Scalper submit intents rather than placing orders directly.
    • MarketMaker v1 (mid±δ layers, periodic reprice) and MicroScalper v1 (spread + flow triggers, IOC exit fallback).
    • PositionManager & HedgeEngine (PI-controlled cross-account hedging, throttle, funding bias options).
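    • Hedge latency budget and retry mechanism (new): P50/P99 tracking, forced market order on timeout, automatic retry on failure (up to 2 times).
    • Funding rate monitoring (new): funding rate ingestion from both venues, correlation calculation, same-direction payment detection.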
    • TriggerEngine for OCO tp/sl and timeout exits; funding rate ingestion.
    • Degradation state machine (new): implements the core scenarios of the degradation matrix (data feed interruption, hedge failure, runaway delta).
  3. M3 – Backtest & Parameterisation (≈1 week)
    • Event replay harness reusing production interfaces; fee/funding/slippage modelling.
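    • Latency injection module (new): records the live latency distribution and injects realistic latency during backtests to prevent look-ahead bias.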
    • EV/Sharpe/bucket reporting, parameter sweeps (grid/random) with config snapshots.
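    • Config hot-reload canary flow (new): zod validation → backtest smoke test → single-symbol trial run for 10 minutes → KPI verification → full rollout or automatic rollback.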
    • Audit trail for parameter edits (who/when/what/result).
  4. M4 – Compliance Hardening & Multi-Venue (≈1–2 weeks)
    • Second venue or sub-account integration; cross-venue STP; consistent snapshot+incremental recovery.
    • Complete the Global Order Coordinator: end-to-end integration of cross-account economic STP, conflict detection, and degradation-strategy triggering.
    • Append-only audit log store with trace IDs across signal→order→fill→hedge.
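    • Liquidity impact cost monitoring (new): top10_depth tracking, dynamic clip adjustment, alerts on realised vs expected slippage.
    • Funding rate screener (new): evaluates funding rate correlation across the two venues before a symbol goes live and automatically rejects venue pairs that do not qualify.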
    • Adaptive parameter policy reacting to RV/OBI regimes; expanded symbol onboarding automation.
  5. M5 – Operations & Resilience (continuous)
    • Full degradation matrix implementation: covers all 9 failure scenarios (see ARCHITECTURE_DESIGN.md, section 5.5).
    • SRE playbooks (volatility shock, data gap, signing failure, hedge miss, funding rate anomaly) and regular drills.
    • Hedge efficiency dashboard: real-time display and alerting for hedge_success_rate, hedge_latency_p50/p95/p99, hedge_slippage_bps, cross_venue_basis.
    • Risk parameter auto-rollback and alert-driven mitigation; blue/green deploy + canary configs.
    • Manual intervention endpoint /api/override-degradation (authentication required), allowing operators to force an exit from degraded mode.
    • Periodic KPI reviews (EV, hedge_cost, delta_abs, hedge efficiency, funding cost) feeding strategy tuning roadmap.
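
M2 describes the HedgeEngine as PI-controlled with a throttle. The sketch below illustrates that control idea under stated assumptions: the controller acts on net delta in base units, a positive return value means buy and a negative one means sell, and the gains, clip size, minimum interval, and anti-windup limit are hypothetical parameters. Funding-rate bias and retries are deliberately left out.

```typescript
// Hypothetical parameters; none of these names or values come from the plan.
interface HedgeParams {
  kp: number; // proportional gain
  ki: number; // integral gain, per second of accumulated delta
  maxClip: number; // largest hedge order size, base units
  minIntervalMs: number; // throttle: minimum spacing between hedge orders
  integralLimit: number; // anti-windup clamp on the integral term
}

function clamp(x: number, lo: number, hi: number): number {
  return Math.min(hi, Math.max(lo, x));
}

class PiHedger {
  private readonly p: HedgeParams;
  private integral = 0;
  private lastUpdateMs: number | null = null;
  private lastHedgeMs = Number.NEGATIVE_INFINITY;

  constructor(params: HedgeParams) {
    this.p = params;
  }

  // Returns the signed hedge size to send now, or 0 when throttled or flat.
  step(netDelta: number, nowMs: number): number {
    const dtSec =
      this.lastUpdateMs === null ? 0 : (nowMs - this.lastUpdateMs) / 1000;
    this.lastUpdateMs = nowMs;

    // Integrate the error even while throttled, but clamp it (anti-windup).
    this.integral = clamp(
      this.integral + netDelta * dtSec,
      -this.p.integralLimit,
      this.p.integralLimit,
    );

    if (nowMs - this.lastHedgeMs < this.p.minIntervalMs) return 0;

    // Hedge opposes the delta: long inventory produces a sell.
    const raw = -(this.p.kp * netDelta + this.p.ki * this.integral);
    const size = clamp(raw, -this.p.maxClip, this.p.maxClip);
    if (size === 0) return 0;

    this.lastHedgeMs = nowMs;
    return size;
  }
}
```

The proportional term reacts to the current imbalance, while the integral term slowly corrects a residual delta that the clip or throttle keeps leaving behind. Clamping the integral stops it from growing without bound during long throttled periods.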

Work Cadence & Governance

  • Each milestone concludes with code, self-tests (simulation/backtest), and documentation updates.
  • apps/runner exposes CLI for live, dry-run, and replay modes; CI runs lint/test/backtest smoke.
  • PRs must attach KPI snapshots or logs; doc updates tracked under docs/ with revision history.
  • Maintain change log of strategy parameters and incidents; schedule monthly compliance review.

Immediate Next Steps

  1. Finalise Pacifica signing specification and implement signRequest in packages/connectors/pacifica/src/signing.ts.
  2. Wire Runner / MarketData to pass API credentials for private WebSocket channels, then run live smoke tests against Pacifica orders/fills/account feeds to validate the signed payloads.
  3. Stand up the shadow book + metrics pipeline, ensuring data feeds for MarketMaker/Scalper prototypes.
  4. Define baseline Prometheus dashboard and risk parameter defaults to support M1 testing.
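
Step 1 depends on the final Pacifica signing specification, which this plan does not reproduce. The sketch below therefore shows only the Ed25519 mechanics, using Node's built-in `node:crypto` module. The payload fields, the key-sorted JSON canonicalisation, the base64 encoding, and the `verifyRequest` helper are placeholder assumptions to be replaced by whatever the official specification mandates.

```typescript
import { generateKeyPairSync, sign, verify } from "node:crypto";
import type { KeyObject } from "node:crypto";

// ASSUMPTION: deterministic serialisation by recursively sorting object keys.
// The real canonical form must come from the Pacifica signing spec.
function canonicalJson(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(canonicalJson).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const parts = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonicalJson(obj[k])}`);
    return `{${parts.join(",")}}`;
  }
  return JSON.stringify(value);
}

// Signs the canonical payload bytes. Ed25519 hashes internally, so the
// digest-algorithm argument to sign() is null.
function signRequest(
  payload: Record<string, unknown>,
  privateKey: KeyObject,
): string {
  const message = Buffer.from(canonicalJson(payload), "utf8");
  return sign(null, message, privateKey).toString("base64");
}

// Hypothetical helper for local round-trip checks in tests.
function verifyRequest(
  payload: Record<string, unknown>,
  signatureB64: string,
  publicKey: KeyObject,
): boolean {
  const message = Buffer.from(canonicalJson(payload), "utf8");
  return verify(null, message, publicKey, Buffer.from(signatureB64, "base64"));
}

// Throwaway key pair for illustration; real keys come from .env secrets.
const demoKeys = generateKeyPairSync("ed25519");
const demoOrder = { symbol: "BTC", side: "bid", amount: "0.01", timestamp: 1700000000000 };
const demoSignature = signRequest(demoOrder, demoKeys.privateKey);
```

A local sign-then-verify round trip of this kind catches canonicalisation bugs before the live smoke tests in step 2, since a verifier that serialises the payload differently will reject an otherwise valid signature.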

Upcoming Iteration Focus

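  • Incremental grid maintenance (replacing "cancel all, re-place all")
    • Implement reconcileGrid() in GridMaker to compare target price levels against the state in grids, distinguishing three kinds of operation: "minor move", "add", and "remove".
    • Prefer a local modify (or a cancel+place completed within the same tick); fall back to a full re-quote only when the offset exceeds a threshold or local updates have failed too many times.
    • Add dirty/pending flags and timeout rollback logic to GridLevel to avoid duplicate submissions; align with the OrderRouter interface by adding modifyLimitChild and batch execution capability.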
  • Placement throttling & latency guard
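    • Introduce a token bucket (burst ≤ N, refillMs) and configurable batching in the WS gateway so that 20+ parallel requests are queued in batches, easing exchange rate limiting.
    • Record the RTT of every create_order/cancel and expose it as placement_latency_{p50,p95,p99}; when a threshold is exceeded, automatically slow down or defer subsequent batches.
    • Execute cancels in batches as well (e.g. 5–10 orders every 50–100 ms) so that most grid levels stay live during an update.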
  • Multi-account, multi-instance grid orchestration
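    • Extend config.yaml so that grid.instances[] can specify multiple {symbol, account_id, parameters} entries, and record an audit trail.
    • Introduce GridFleetManager to create ShadowBook, GridMaker, and timers in one place and to route fills/orders by (accountId, symbol); instantiation must support hot reload and graceful shutdown.
    • Update GlobalOrderCoordinator / OrderRouter to run STP/conflict detection and global throttling across instances, so that different accounts or symbols do not collide.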
  • Observability & ops playbooks
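    • Add instance-level metrics: grid_active_orders, grid_pending_levels, grid_incremental_rate, order_gateway_queue_depth, stp_conflicts.
    • Define alerts: trigger degradation (fall back to full mode or pause the instance) when pending_levels > 0 for several consecutive ticks, placement_latency_p95 > 5 s, or an instance exceeds its retry limit.
    • Update the operations playbook to cover multi-instance start/stop, dynamic scaling, manual degradation/recovery, and rollback steps when incremental mode fails.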
  • Ultra-tight micro-grid configuration
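    • Study how step sizes as low as 1–2 bps combined with a post_only_cushion_bps of 0–1 bps affect rate limiting and the post-only success rate, with layered deployment across multi-account instances.
    • Define a throttling/batched order placement strategy for the extreme near-touch mode (lower burst, asynchronous confirmation) and provide a rollback config template (e.g. micro_grid_extreme.yaml).
    • Publish risk guidance: how to allocate layers in a multi-account environment (inner-ring accounts, outer-ring accounts), plus tuning advice for delta/hedge thresholds in extreme mode.
  • Adaptive adjustment driven by missing fills
    • Introduce metrics such as "N consecutive ticks without a fill" and average_fill_interval as triggers for dynamically compressing step size/range.
    • Design the compression policy: when fills are missing, briefly lower min_grid_step_bps / raise min_layers, then ease back gradually once fills resume, to avoid frequent oscillation.
    • Coordinate compression across multi-account/multi-instance setups so that several instances do not over-tighten at once and trigger rate limiting; provide supporting metrics and operational rollback steps.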
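
The reconcileGrid() step from the incremental grid maintenance item can be sketched as a pure diff between target prices and live levels. This is an illustration under assumptions, not the GridMaker design: levels are matched by (side, layer), a shift of at most `maxMoveTicks` becomes a move, and any larger shift makes the whole plan fall back to a full re-quote. The `GridLevel` fields, the `GridOp` and `ReconcilePlan` shapes, and the parameter names are hypothetical; dirty/pending flags and failure counters are omitted.

```typescript
type Side = "bid" | "ask";

// Hypothetical shapes; the real ones belong in docs/MODULE_INTERFACES.md.
interface GridLevel { side: Side; layer: number; price: number; orderId: string; }
interface TargetLevel { side: Side; layer: number; price: number; }

type GridOp =
  | { kind: "move"; orderId: string; toPrice: number }
  | { kind: "add"; side: Side; layer: number; price: number }
  | { kind: "remove"; orderId: string };

type ReconcilePlan = { mode: "incremental"; ops: GridOp[] } | { mode: "full" };

function reconcileGrid(
  live: GridLevel[],
  targets: TargetLevel[],
  tickSize: number,
  maxMoveTicks: number,
): ReconcilePlan {
  const keyOf = (l: { side: Side; layer: number }): string => `${l.side}:${l.layer}`;

  // Index live orders by (side, layer); entries are deleted as they match.
  const unmatched: Record<string, GridLevel | undefined> = {};
  for (const level of live) unmatched[keyOf(level)] = level;

  const ops: GridOp[] = [];
  for (const target of targets) {
    const key = keyOf(target);
    const current = unmatched[key];
    if (current === undefined) {
      ops.push({ kind: "add", side: target.side, layer: target.layer, price: target.price });
      continue;
    }
    delete unmatched[key];

    const ticks = Math.round(Math.abs(target.price - current.price) / tickSize);
    if (ticks === 0) continue; // already at the target price
    if (ticks > maxMoveTicks) return { mode: "full" }; // offset too large
    ops.push({ kind: "move", orderId: current.orderId, toPrice: target.price });
  }

  // Live levels with no matching target are stale and get cancelled.
  for (const key of Object.keys(unmatched)) {
    const stale = unmatched[key];
    if (stale !== undefined) ops.push({ kind: "remove", orderId: stale.orderId });
  }
  return { mode: "incremental", ops };
}
```

Keeping the diff free of I/O makes it straightforward to unit-test and to replay in the backtest harness; the resulting ops would then pass through the throttled, batched gateway described under placement throttling.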