You Can't Fix What You Can't See
In fintech, monitoring isn't optional; it's how you sleep at night. A commission engine that silently miscalculates for 6 hours means thousands of dollars in discrepancies. A delay in payment webhook processing means clients see stale balances. A memory leak in the WebSocket gateway means disconnected traders.
New Relic has been our monitoring backbone. Here's how we configure it for fintech workloads.
Application Performance Monitoring
Standard APM setup captures request times and error rates. For fintech, we need custom instrumentation on critical paths:
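// Custom transaction tracking for commission calculations.
// Assumes a background worker, where each calculation can be its own transaction.
class CommissionCalculator
{
    public function calculate(Trade $trade): array
    {
        // newrelic_start_transaction() takes the application name, not a
        // transaction name. End the transaction the agent started automatically,
        // start a fresh one, then name it.
        newrelic_end_transaction();
        newrelic_start_transaction(ini_get('newrelic.appname'));
        newrelic_name_transaction('commission-calculation');

        newrelic_add_custom_parameter('trade_id', $trade->id);
        newrelic_add_custom_parameter('symbol', $trade->symbol);
        newrelic_add_custom_parameter('volume', $trade->volume);

        $startTime = microtime(true);
        $commissions = $this->processHierarchy($trade);
        $duration = microtime(true) - $startTime;

        newrelic_add_custom_parameter('calculation_time_ms', $duration * 1000);
        newrelic_add_custom_parameter('levels_processed', count($commissions));

        // Send this transaction's data now instead of waiting for the script to exit.
        newrelic_end_transaction();

        return $commissions;
    }
}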
This lets us track:
- Average commission calculation time
- Calculations per minute (throughput)
- Which symbols take longest to process
- How hierarchy depth affects performance
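The timing logic above can be pulled into a reusable helper so that failed calculations are measured as well. This is a minimal sketch, not our production code: instrumentCalculation and its $addParameter argument are hypothetical names, and the helper falls back to a no-op when the New Relic extension is not loaded.

```php
<?php
// Hypothetical helper: times a callable and reports the duration and outcome
// as custom parameters. $addParameter defaults to newrelic_add_custom_parameter
// when the New Relic extension is loaded, and to a no-op otherwise.
function instrumentCalculation(string $prefix, callable $work, ?callable $addParameter = null)
{
    if ($addParameter === null) {
        $addParameter = function_exists('newrelic_add_custom_parameter')
            ? 'newrelic_add_custom_parameter'
            : static function (string $key, $value): bool {
                return false;
            };
    }

    $start = microtime(true);
    $status = 'ok';
    try {
        return $work();
    } catch (Throwable $e) {
        $status = 'failed';
        throw $e;
    } finally {
        // Runs on success and on failure, so slow failures are measured too.
        $addParameter($prefix . '_time_ms', (microtime(true) - $start) * 1000);
        $addParameter($prefix . '_status', $status);
    }
}
```

Inside calculate(), the call would look like instrumentCalculation('calculation', fn () => $this->processHierarchy($trade)). Passing the recorder as an argument keeps the helper testable without the agent installed.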
Custom Dashboards
We build dashboards for each stakeholder:
Engineering Dashboard
-- Transactions spending the most time in the database
SELECT average(databaseDuration) FROM Transaction
WHERE appName = 'fintech-crm'
FACET name SINCE 1 hour ago
-- Error rate by endpoint
SELECT percentage(count(*), WHERE error IS true) FROM Transaction
WHERE appName = 'fintech-crm'
FACET request.uri SINCE 1 hour ago
Business Dashboard
-- Deposits per hour
SELECT count(*) FROM Transaction
WHERE name = 'WebTransaction/deposits/store'
AND response.status = 200
TIMESERIES 1 hour SINCE 24 hours ago
-- Average deposit processing time
SELECT average(duration) FROM Transaction
WHERE name = 'WebTransaction/webhooks/payment'
SINCE 1 hour ago
Operations Dashboard
-- Queue depth over time
SELECT latest(queue_depth) FROM QueueMetric
FACET queue_name TIMESERIES 5 minutes SINCE 6 hours ago
-- Failed jobs in the last hour
SELECT count(*) FROM JobFailed
FACET job_class SINCE 1 hour ago
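The QueueMetric and JobFailed events queried above are not built in; the application has to record them. The sketch below shows one way to do that, assuming the attribute names used in the queries (queue_name, queue_depth, job_class). The function names, the error_class and error_message attributes, and the 255-character cap are illustrative assumptions.

```php
<?php
// Hypothetical helpers that emit the QueueMetric and JobFailed custom events.

function queueMetricAttributes(string $queueName, int $depth): array
{
    return ['queue_name' => $queueName, 'queue_depth' => $depth];
}

function jobFailedAttributes(string $jobClass, Throwable $error): array
{
    return [
        'job_class' => $jobClass,
        'error_class' => get_class($error),
        // Keep the message short; the 255-character cap is an assumption.
        'error_message' => substr($error->getMessage(), 0, 255),
    ];
}

function recordEvent(string $type, array $attributes): bool
{
    // Skip silently when the New Relic extension is not loaded (local dev, CI).
    if (!function_exists('newrelic_record_custom_event')) {
        return false;
    }
    newrelic_record_custom_event($type, $attributes);
    return true;
}
```

A scheduled task could call recordEvent('QueueMetric', queueMetricAttributes($name, $depth)) once a minute per queue, and a failed-job handler could call recordEvent('JobFailed', jobFailedAttributes($class, $exception)).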
Alerting Strategy
Alerts need to be actionable, not noisy. We tier alerts by severity:
Critical (Pages On-Call)
- Payment webhook processing time > 10 seconds
- Error rate > 5% on any financial endpoint
- Database connection pool exhausted
- Commission calculation failure rate > 0.1%
- MT4/MT5 connectivity lost
Warning (Slack Notification)
- Response time P95 > 2 seconds
- Queue wait time > 30 seconds
- Memory usage > 80% on any service
- SSL certificate expiring within 14 days
- Disk usage > 75%
Info (Dashboard Only)
- Deployment completed
- Daily reconciliation report
- Weekly performance summary
// NRQL alert condition example.
// The evaluation window is configured on the alert condition itself, so the query has no SINCE clause.
SELECT count(*) FROM Transaction
WHERE appName = 'fintech-crm'
AND request.uri LIKE '/webhooks/payment%'
AND duration > 10
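In New Relic the tiers live in alert conditions and notification policies, but it can help to keep the thresholds in code as well, for example to unit test them or to reuse them in internal health checks. The following is a rough sketch of the numeric rules listed above. The metric keys and channel names are hypothetical, and the non-numeric rules (pool exhaustion, MT4/MT5 connectivity, certificate expiry) are left out.

```php
<?php
// Thresholds copied from the tiers above; the metric keys are hypothetical names.
const ALERT_RULES = [
    'webhook_processing_seconds'  => ['threshold' => 10.0, 'tier' => 'critical'],
    'financial_error_rate_pct'    => ['threshold' => 5.0,  'tier' => 'critical'],
    'commission_failure_rate_pct' => ['threshold' => 0.1,  'tier' => 'critical'],
    'response_time_p95_seconds'   => ['threshold' => 2.0,  'tier' => 'warning'],
    'queue_wait_seconds'          => ['threshold' => 30.0, 'tier' => 'warning'],
    'memory_usage_pct'            => ['threshold' => 80.0, 'tier' => 'warning'],
    'disk_usage_pct'              => ['threshold' => 75.0, 'tier' => 'warning'],
];

// Returns 'critical' or 'warning' when the reading exceeds its threshold,
// or null when it is within bounds or the metric has no rule.
function alertTier(string $metric, float $value): ?string
{
    $rule = ALERT_RULES[$metric] ?? null;
    if ($rule === null || $value <= $rule['threshold']) {
        return null;
    }
    return $rule['tier'];
}

// Where each tier is routed: critical pages on-call, warnings go to Slack,
// everything else stays on dashboards.
function alertChannel(string $tier): string
{
    return ['critical' => 'pager', 'warning' => 'slack'][$tier] ?? 'dashboard';
}
```

For example, alertTier('webhook_processing_seconds', 12.0) returns 'critical', which alertChannel maps to 'pager'.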
Distributed Tracing
A single deposit flows through multiple services:
Client Portal → API Gateway → Payment Service → PSP
↓
Webhook Handler → Balance Service → Notification
Distributed tracing shows the entire journey:
// Tag traces with business context
newrelic_add_custom_parameter('transaction_id', $deposit->id);
newrelic_add_custom_parameter('client_id', $deposit->client_id);
newrelic_add_custom_parameter('amount', $deposit->amount);
newrelic_add_custom_parameter('psp', $deposit->provider);
When a client reports "my deposit didn't arrive," support searches by transaction ID and sees every service the request touched, where it slowed down, and where it failed.
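The agent propagates trace context over supported HTTP clients automatically, but hops that go through a queue usually need the headers carried by hand. Here is a minimal sketch of that hand-off, assuming a PHP agent version that provides newrelic_insert_distributed_trace_headers and newrelic_accept_distributed_trace_headers. The function names, the message shape, and the trace_headers key are hypothetical.

```php
<?php
// Hypothetical helpers for carrying trace context inside a queue message.

// Producer side: copy the current trace headers into the message payload.
function attachTraceContext(array $message): array
{
    $headers = [];
    if (function_exists('newrelic_insert_distributed_trace_headers')) {
        // Fills $headers (passed by reference) with the trace context headers.
        newrelic_insert_distributed_trace_headers($headers);
    }
    $message['trace_headers'] = $headers;
    return $message;
}

// Consumer side: continue the trace started by the producer, if there is one.
function resumeTraceContext(array $message): bool
{
    $headers = $message['trace_headers'] ?? [];
    if ($headers === [] || !function_exists('newrelic_accept_distributed_trace_headers')) {
        return false;
    }
    return (bool) newrelic_accept_distributed_trace_headers($headers, 'Queue');
}
```

A PSP will not forward these headers on its webhook callback, so that leg of the journey still relies on the transaction_id parameter to tie the two traces together.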
Custom Events for Business Metrics
Track business events alongside technical metrics:
// Record custom events for business intelligence
newrelic_record_custom_event('Deposit', [
    'amount' => $deposit->amount,
    'currency' => $deposit->currency,
    'provider' => $deposit->provider,
    'client_country' => $deposit->client->country,
    'processing_time_ms' => $processingTime,
]);

newrelic_record_custom_event('CommissionPaid', [
    'ib_id' => $commission->ib_id,
    'amount' => $commission->amount,
    'depth' => $commission->hierarchy_depth,
    'trade_symbol' => $commission->trade->symbol,
]);
This creates a queryable stream of business events. Finance can ask "what was our total deposit volume by PSP this month?" directly from New Relic without hitting the production database.
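Because these events leave the application, it is worth controlling exactly which fields are sent. Event attributes are flat key-value pairs, so one possible guard is an allow-list that also drops non-scalar values. This is a sketch under that assumption; businessEventAttributes is a hypothetical helper, not part of the New Relic API.

```php
<?php
// Hypothetical guard applied before recording a business event: keeps only
// allow-listed keys with scalar values, so nested objects and unexpected
// fields (for example client names or emails) never leave the application.
function businessEventAttributes(array $raw, array $allowedKeys): array
{
    $clean = [];
    foreach ($allowedKeys as $key) {
        if (array_key_exists($key, $raw) && is_scalar($raw[$key])) {
            $clean[$key] = $raw[$key];
        }
    }
    return $clean;
}
```

The result can be passed straight to newrelic_record_custom_event('Deposit', ...), with the allow-list matching the attributes shown in the Deposit example.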
Synthetic Monitoring
Proactively test critical user flows:
- Login flow — every 5 minutes from 3 regions
- Deposit initiation — every 10 minutes (up to PSP redirect)
- Portfolio API — every 2 minutes, verify response time < 500ms
- WebSocket connection — every 5 minutes, verify handshake succeeds
Synthetic failures trigger alerts before clients report issues.
Key Takeaways
- Custom instrumentation on financial operations — commission calculations, payment processing, balance updates
- Stakeholder-specific dashboards — engineering, business, and operations each need different views
- Tiered alerting — critical pages on-call, warnings go to Slack, info stays on dashboards
- Distributed tracing with business context — trace transactions end-to-end across services
- Custom events for business metrics — deposits, commissions, and withdrawals as queryable events
- Synthetic monitoring for proactive issue detection before clients notice
Monitoring in fintech is an investment in trust. When you can detect, diagnose, and resolve issues in minutes instead of hours, you earn the confidence of clients, partners, and regulators.
