Under the Hood

The Technology

A multi-model approach that combines the best of machine learning research. Different tools for different problems—stacked for maximum accuracy.

Not All Predictions Are Equal

MAPE = Mean Absolute Percentage Error (lower is better)
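MAPE is simple to compute: average the absolute errors as a percentage of the actual values. A minimal sketch with made-up project costs (not real benchmark data):

```python
# Illustrative MAPE calculation with toy numbers, not real benchmark data.
def mape(actuals, predictions):
    """Mean Absolute Percentage Error, in percent (lower is better)."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actuals, predictions)]
    return 100 * sum(errors) / len(errors)

# Three projects: actual vs. predicted cost (in $M)
actual = [4.0, 5.0, 6.0]
predicted = [4.2, 4.8, 6.3]
print(round(mape(actual, predicted), 2))  # → 4.67
```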

Method                                  Typical Error   Best Case   Data Required
Traditional: Human Expert Estimation    10-25%          10%         Experience
ML: Linear Regression                   15-30%          12%         50+ projects
ML: Random Forest                       8-15%           6%          200+ projects
ML: XGBoost / Gradient Boosting         6-12%           5%          200+ projects
Deep: Neural Network (MLP)              8-15%           6%          500+ projects
Deep: LSTM (Time-Series)                5-10%           <5%         500+ projects with temporal data
Ensemble: Stacked Models                4-8%            3%          500+ projects
FSE's Approach: We don't pick one method—we use an ensemble that combines multiple models, letting each contribute where it's strongest. The meta-learner figures out which model to trust for each prediction.
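The core idea of stacking can be sketched in a few lines: base models make their predictions, and a simple linear meta-learner fits weights that say how much to trust each one. All numbers below are hypothetical, and the two-feature least-squares solve stands in for the full meta-learner:

```python
# Toy stacking sketch (hypothetical data, not FSE's actual pipeline):
# two base models each predict project cost; a linear meta-learner
# learns how much to trust each one via least squares.
def fit_meta_learner(preds_a, preds_b, actuals):
    # Normal equations for y ≈ w_a*a + w_b*b (2x2 system, Cramer's rule)
    saa = sum(a * a for a in preds_a)
    sbb = sum(b * b for b in preds_b)
    sab = sum(a * b for a, b in zip(preds_a, preds_b))
    say = sum(a * y for a, y in zip(preds_a, actuals))
    sby = sum(b * y for b, y in zip(preds_b, actuals))
    det = saa * sbb - sab * sab
    w_a = (say * sbb - sby * sab) / det
    w_b = (saa * sby - sab * say) / det
    return w_a, w_b

model_a = [4.3, 5.4, 6.1]   # e.g. tree-model predictions ($M)
model_b = [3.7, 4.8, 5.9]   # e.g. neural-net predictions ($M)
actual  = [4.0, 5.0, 6.0]
w_a, w_b = fit_meta_learner(model_a, model_b, actual)
blended = [w_a * a + w_b * b for a, b in zip(model_a, model_b)]
```

Because either base model on its own is one of the candidate blends (weight 1 on itself, 0 on the other), the fitted blend can never do worse than the better base model on the training data.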

Three Layers of Intelligence

From raw data to calibrated predictions

1

Structured Analysis

XGBoost + Random Forest

Tree-based models that excel at finding patterns in your project data: size, type, location, contract structure. These methods handle missing data naturally and tell us exactly which features matter most.

6-12% Typical MAPE
Minutes Training Time
High Interpretability
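A single decision stump shows the mechanism behind "handles missing data naturally": each split learns a default branch, so a project with an unknown value still gets routed to a prediction. This is the general idea behind XGBoost's sparsity-aware splits; the threshold and values below are purely illustrative:

```python
# Minimal decision-stump sketch with illustrative numbers: missing
# feature values follow a learned "default" branch instead of
# breaking the prediction (the idea behind sparsity-aware splits).
def stump_predict(floor_area, threshold=10_000,
                  left_value=3.2, right_value=7.8, default="right"):
    """Predict cost ($M) from floor area (sq ft); None = missing."""
    if floor_area is None:  # missing value: follow the default branch
        return right_value if default == "right" else left_value
    return left_value if floor_area < threshold else right_value

print(stump_predict(6_000))    # small project  → 3.2
print(stump_predict(25_000))   # large project  → 7.8
print(stump_predict(None))     # missing datum  → 7.8 (default branch)
```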
2

Temporal Patterns

LSTM Neural Networks

Construction projects unfold over time. LSTM networks "remember" patterns across sequences—seasonal cost variations, market trends, phase dependencies. They capture what static models miss.

<5% Best Case MAPE
Hours Training Time
Medium Interpretability
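The "memory" in an LSTM comes from its gates: at each time step, a forget gate decides how much old state to keep and an input gate decides how much new information to write. A heavily simplified single-unit cell in plain Python (arbitrary illustrative weights, shared across gates for brevity, not trained parameters):

```python
import math

# Simplified single-unit LSTM cell: shows the gating that carries
# memory across a sequence. Weights are arbitrary, not trained.
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, w=0.5, u=0.5, b=0.0):
    f = sigmoid(w * x + u * h + b)           # forget gate
    i = sigmoid(w * x + u * h + b)           # input gate
    o = sigmoid(w * x + u * h + b)           # output gate
    c_tilde = math.tanh(w * x + u * h + b)   # candidate cell state
    c = f * c + i * c_tilde                  # blend old memory with new
    h = o * math.tanh(c)                     # expose gated memory
    return h, c

# Feed a sequence of normalized monthly cost-index values.
h, c = 0.0, 0.0
for x in [0.1, 0.3, 0.2, 0.4]:
    h, c = lstm_step(x, h, c)
```

In a real network each gate has its own weight matrix and the cell is vectorized across hundreds of units, but the update equations are exactly this shape.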

Beyond Point Estimates

Prediction Intervals

Instead of "$5M", you get: "10% chance below $4M, 50% around $5M, 10% above $7M." Quantile regression provides calibrated uncertainty bounds for risk-aware decisions.
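Why quantile regression yields those bounds: minimizing the pinball (quantile) loss over a constant prediction recovers the corresponding empirical quantile. A sketch with toy cost data:

```python
# Pinball (quantile) loss sketch with toy data: minimizing it over a
# constant prediction recovers the empirical quantile, which is how
# quantile regression produces P10/P50/P90 bounds.
def pinball_loss(q, prediction, actuals):
    total = 0.0
    for y in actuals:
        diff = y - prediction
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(actuals)

costs = [3.8, 4.2, 4.9, 5.1, 5.6, 6.4, 7.0]   # observed costs ($M)
candidates = [c / 10 for c in range(30, 80)]  # scan 3.0 .. 7.9
p50 = min(candidates, key=lambda p: pinball_loss(0.5, p, costs))
p90 = min(candidates, key=lambda p: pinball_loss(0.9, p, costs))
print(p50, p90)  # → 5.1 7.0 (the empirical median and 90th percentile)
```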

Joint Cost + Duration

Multi-task learning predicts cost and schedule together. The model learns their relationship: delays increase costs, scope changes affect both. One model, two outputs, shared intelligence.
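Structurally, multi-task prediction means one shared representation feeding two output heads. A stripped-down sketch with made-up linear weights (real heads sit on top of a learned neural representation):

```python
# Multi-task sketch (illustrative weights, not trained parameters):
# one shared feature vector feeds two linear "heads", one for cost
# and one for duration.
features = {"floor_area_k_sqft": 120, "floors": 10, "nyc": 1}

cost_head     = {"floor_area_k_sqft": 0.03, "floors": 0.1, "nyc": 0.8}  # → $M
duration_head = {"floor_area_k_sqft": 0.05, "floors": 0.4, "nyc": 1.0}  # → months

def head_predict(head, feats):
    return sum(head[k] * feats[k] for k in feats)

cost = head_predict(cost_head, features)          # shared inputs, cost output
duration = head_predict(duration_head, features)  # shared inputs, duration output
```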

SHAP Explainability

See exactly how each factor contributes: "Floor area added $1.2M, NYC location added $0.8M, fixed contract saved $0.3M." No black boxes—full transparency on every prediction.
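The key property behind this transparency is additivity: a SHAP breakdown always sums back to the prediction. Using toy numbers that mirror the example above:

```python
# SHAP-style additive breakdown (toy numbers mirroring the example
# above): the prediction equals a baseline plus per-feature
# contributions, so the explanation always sums back to the output.
baseline = 3.3  # illustrative average cost across training projects ($M)
contributions = {
    "floor_area":     +1.2,   # "Floor area added $1.2M"
    "nyc_location":   +0.8,   # "NYC location added $0.8M"
    "fixed_contract": -0.3,   # "fixed contract saved $0.3M"
}
prediction = baseline + sum(contributions.values())
print(round(prediction, 1))  # → 5.0
```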

Transfer Learning

Pre-train on broad construction data, fine-tune on your specific projects. Limited data? The model brings general construction knowledge, then adapts to your patterns.

Architecture Deep Dive

For the engineers who want specifics

Input Processing

Text Description: Fine-tuned BERT → Feature Vector
Structured Data: Size, type, location → Normalized
Time Series: Cost indices over time → LSTM Encoder
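The "Normalized" step for structured data is standard z-score scaling: each feature is rescaled to zero mean and unit variance so no single input dominates. A sketch with illustrative values:

```python
import statistics

# Z-score normalization sketch for a structured input (illustrative
# values): rescale to zero mean, unit variance.
floor_areas = [8_000, 12_000, 20_000, 40_000]  # sq ft, raw
mean = statistics.mean(floor_areas)
std = statistics.pstdev(floor_areas)
normalized = [(x - mean) / std for x in floor_areas]
```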

Ensemble Layer

XGBoost: 100-500 trees, 6 levels deep
Neural Network: 256→128→64 neurons, ReLU
LSTM: 256 units, temporal sequences

Meta Learner

Stacking: Ridge regression combines predictions
Multi-task: Joint cost + duration heads
Quantile: P10, P50, P90 outputs
Cost: $5.2M [±$0.8M at 80% CI]
Duration: 14 months [±2 months]
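The "±" band comes straight from the quantile outputs: P10 to P90 covers 80% of the predicted distribution, so the half-width is (P90 − P10) / 2. With illustrative quantile values:

```python
# Turning P10/P50/P90 quantile outputs into a "±" band: P10..P90
# spans an 80% interval, so the half-width is (P90 - P10) / 2.
# Numbers are illustrative.
p10, p50, p90 = 4.4, 5.2, 6.0   # $M
half_width = round((p90 - p10) / 2, 1)
print(f"Cost: ${p50}M [±${half_width}M at 80% CI]")
# → Cost: $5.2M [±$0.8M at 80% CI]
```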

Key Input Features

Size
  • Total floor area
  • Number of floors
  • Building height
Design
  • Compactness ratio
  • Percentage of openings
  • Structural system
Project
  • Contract type
  • Tendering method
  • Provisional sums
Market
  • Inflation rate
  • Material price indices
  • Labor market conditions
Location
  • Geographic region
  • Soil conditions
  • Local labor rates