Guidelines for AI-driven legacy code modernization
AI will not be able to refurbish legacy systems at the push of a button. Still, with proper guidance and oversight, AI tools can speed up code modernization projects.
Most enterprise organizations run on yesterday's software.
A significant chunk of core business systems are legacy, and a large portion of IT budgets is often dedicated to keeping them alive. Legacy tech can restrict growth and block strategy; many companies identify it as a major barrier to digital transformation and a main driver of IT spend. This impacts team productivity directly, as developers end up devoting their time to legacy system maintenance.
AI won't fix that magically. Used well, however, it changes the slope of a team's productivity curve: instead of flat output or incremental gains, AI can help teams modernize faster over time.
Generative AI can read code and logs, suggest safe refactors, generate tests, and sketch platform moves. People still make the calls -- set the target architecture, check behavior and decide when to ship. Learn some practical ways AI can drive legacy code modernization efforts.
Challenges of legacy code
Legacy code hurts in two ways: what's in the code and what's around it.
Inside the code, there can be years of tangled dependencies, hidden side effects, mixed concerns, deprecated libraries and little in the way of tests or documentation. Data and behavior are coupled in surprising places; a seemingly harmless change in a handler ripples into reporting jobs or batch interfaces.
Around the code, environments are difficult to recreate, deploys are fragile, observability is thin and compliance and security posture lags. Addressing this isn't as simple as modifying the code. It's important to keep SLAs, manage risk and coordinate teams while choosing the right approach.
The hardest part of working around fragile systems is getting the order of changes -- code, configuration and data -- right and making sure the system continues to function. That means setting clear requirements, testing before changing anything, breaking work into safe steps, moving data without hiccups and avoiding the temptation to throw everything out and start over. Teams should work incrementally with guardrails and rollback mechanisms to gain progress without betting the business.
7 approaches to legacy code modernization
Modernizing legacy applications involves more than a rewrite plan. It is a business decision about how to reduce risk, improve operations and unlock new strategies. A good starting point is application rationalization. Take inventory, score each system on value, cost to maintain, technical fitness and business risk, then decide which ones to invest in, consolidate, keep as-is or retire. With that short list in hand, development teams can match each system to a practical application modernization path that fits the budget, timeline and cloud roadmap.
1. Encapsulate
Wrap the legacy system with stable APIs or events so teams can build new features around it without changing the core. This protects consumers from outdated interfaces and buys time to plan deeper changes. Encapsulation often pairs well with a strangler-style migration, where new capabilities live outside while old ones shrink over time.
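A facade in this style can be only a few lines. As a minimal sketch, assuming a legacy HTTP endpoint at legacy-host and invented field names (ORD_STS, ORD_TOT), a Flask wrapper that presents a clean contract:

# facade.py -- strangler-style wrapper (illustrative; legacy URL and fields are assumptions)
from flask import Flask, jsonify
import requests

app = Flask(__name__)
LEGACY_BASE = "http://legacy-host:8080"  # assumed legacy service

@app.route("/api/v1/orders/<order_id>")
def get_order(order_id):
    # Forward to the legacy endpoint, then reshape the response
    r = requests.get(f"{LEGACY_BASE}/OrderServlet", params={"id": order_id}, timeout=5)
    r.raise_for_status()
    legacy = r.json()
    # New consumers see a stable contract, whatever happens behind it
    return jsonify({"id": order_id, "status": legacy.get("ORD_STS"), "total": legacy.get("ORD_TOT")})

New features integrate against /api/v1/orders, and the servlet behind it can later be moved or rewritten without touching them.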
2. Rehost
Lift and shift the workload to new infrastructure with minimal code changes -- for example, moving a VM from a data center to the cloud. This approach provides quicker wins in reliability, cost transparency and basic automation, but the legacy code and architecture stay the same. Use this when speed matters and deeper changes would add undue risk.
3. Replatform
Move to a modern runtime or managed service with small, targeted changes. Common examples are migrating to containers, a managed database or serverless front doors such as AWS API Gateway. This approach can improve operations and scalability without redesigning the application. This is a solid middle ground that provides cloud benefits quickly while only slightly modifying the code.
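As a minimal sketch of how small these edits can be -- assuming a Python service on psycopg2 and invented environment variable names -- the replatform change is often just configuration:

# Before: connection details hardcoded for the data center database
# conn = psycopg2.connect(host="db01.internal", dbname="orders", user="app", password="...")
import os
import psycopg2

# After: the same code points at a managed database through environment config
conn = psycopg2.connect(
    host=os.environ["DB_HOST"],  # e.g., the managed Postgres endpoint
    dbname=os.environ.get("DB_NAME", "orders"),
    user=os.environ["DB_USER"],
    password=os.environ["DB_PASSWORD"],
)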
4. Refactor
Restructure the code to improve maintainability without changing external behavior. Refactoring can involve splitting large modules, adding tests, removing dead code and updating libraries. The goal is to make the legacy code easier to maintain and safer to evolve. Refactoring pays off when the business logic is still valuable but the codebase has aged.
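As a hedged before-and-after sketch (the function is invented for illustration), a refactor splits a tangled routine into named steps while inputs and outputs stay identical:

# Before: parsing, business rules and formatting mixed in one function
def process(line):
    parts = line.strip().split(",")
    total = float(parts[2]) * (1.07 if parts[1] == "taxable" else 1.0)
    return f"{parts[0]}:{total:.2f}"

# After: same behavior, but each step is testable on its own
def parse(line):
    sku, tax_class, amount = line.strip().split(",")
    return sku, tax_class, float(amount)

def apply_tax(amount, tax_class):
    return amount * (1.07 if tax_class == "taxable" else 1.0)

def process_refactored(line):
    sku, tax_class, amount = parse(line)
    return f"{sku}:{apply_tax(amount, tax_class):.2f}"

A golden-file test pack, like the one shown later in this article, proves the two versions agree before the old one is deleted.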
5. Rearchitect
Change the system's design to meet new quality goals such as scalability, resilience or speed of delivery. Examples include moving from a monolith to well-bounded services, adopting event-driven patterns or decoupling data stores. This is where architecture leads the transformation and often unlocks the most long-term agility.
6. Rebuild
Rewrite the application from the ground up while preserving scope and core behavior. This approach offers a clean slate for technology choices and testing practices, which can dramatically improve velocity. Rebuilds are best when the current implementation blocks change, yet the domain logic still matches how the business works.
7. Replace
Retire the custom system and adopt a commercial or SaaS product. This can cut ongoing cost and risk if the market provides what the business needs. Replacing works well for commodity capabilities such as HR, finance or CRM. The tradeoff is customization flexibility, so validate the replacement's fit and integration early.
These paths are not mutually exclusive. Many teams encapsulate first, rehost or replatform to stabilize operations, then refactor and rearchitect targeted areas. Rebuild or replace can follow once the organization is ready. The sequence depends on risk tolerance, available talent and the urgency of digital transformation.
How can AI enable legacy code modernization?
AI is not a magic wand. It is an efficient colleague who reads everything, keeps perfect notes and does the boring parts without complaining. Used well, AI shortens the path from "we should modernize" to running software. Below are eight hands-on examples of AI-assisted code modernization tasks, generated by Claude Code CLI.
1. Turn legacy code into living specs and tests
Capture what the system really does today in plain English and executable tests. This approach freezes current behavior so developers can improve structure, fix design issues or change technology without losing business rules.
Prompt
"Scan ./cobol/ and ./jcl/ plus sample outputs in ./samples/ratings.csv. Produce a concise human spec of rating rules and a Python test pack that locks in current behavior. Use golden files based on ratings.csv. Keep the language business friendly."
Output
Premium rules v1
- If driver_age < 25 and vehicle_class in {"sport","performance"} apply surcharge S3 = 0.12
- If prior_claims >= 2 apply penalty P2 = 0.07
- Multi-vehicle discount 0.05 if household_vehicles >= 2
- Round to nearest cent after all adjustments
---
# tools/build_golden.py
import csv, json, os

os.makedirs("tests/golden", exist_ok=True)
with open("samples/ratings.csv") as f:
    for i, row in enumerate(csv.DictReader(f), start=1):
        case = {"input": row, "expected_total": float(row["total_premium"])}
        with open(f"tests/golden/case_{i}.json", "w") as o:
            json.dump(case, o, indent=2)
print("golden files written")
---
# tests/test_rating_rules.py
import glob, json, math
from legacy_adapter import calc_premium  # wrapper around COBOL batch calc

def cases():
    for path in glob.glob("tests/golden/*.json"):
        with open(path) as f:
            yield path, json.load(f)

def money_eq(a, b):
    return math.isclose(a, b, rel_tol=0, abs_tol=0.01)

def test_golden_parity():
    for path, case in cases():
        got = calc_premium(case["input"])
        assert money_eq(got, case["expected_total"]), f"mismatch in {path}"

def test_young_driver_surcharge_applies():
    base = {"driver_age": 22, "vehicle_class": "sport", "prior_claims": 0, "base": 1000}
    assert money_eq(calc_premium(base), 1120.00)
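Once the golden tests lock in behavior, the same spec can guide a clean reimplementation that those tests validate. A minimal sketch, assuming the adjustments sum before being applied to the base (consistent with the 1,120.00 expectation above):

# rating.py -- candidate reimplementation of the documented rules (illustrative)
def calc_premium(case):
    base = float(case["base"])
    adjustment = 0.0
    if int(case["driver_age"]) < 25 and case["vehicle_class"] in {"sport", "performance"}:
        adjustment += 0.12  # surcharge S3
    if int(case.get("prior_claims", 0)) >= 2:
        adjustment += 0.07  # penalty P2
    if int(case.get("household_vehicles", 0)) >= 2:
        adjustment -= 0.05  # multi-vehicle discount
    return round(base * (1 + adjustment), 2)  # round to nearest cent last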
2. Apply minor, targeted code changes at scale
Make many small, mechanical edits across one or more codebases to meet a platform or runtime goal. Examples include dependency upgrades, API swaps, logging standards or container readiness updates. The business logic stays the same while the code becomes consistent and compatible.
Prompt
Create an OpenRewrite recipe to move services from Java 8 to 17 and Spring Boot 2.7 to 3.2. Replace Apache HttpClient with java.net.http.HttpClient.
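OpenRewrite expresses this as a declarative recipe backed by type-aware transformations. As a rough Python sketch of the underlying codemod idea only -- not the recipe itself, and with paths and import strings invented -- a script that swaps one deprecated import across a tree looks like this:

# codemod.py -- toy mass edit (real migrations should use AST-aware tools like OpenRewrite)
import pathlib

OLD = "import org.apache.http.client.HttpClient;"
NEW = "import java.net.http.HttpClient;"

for path in pathlib.Path("src").rglob("*.java"):
    text = path.read_text()
    if OLD in text:
        path.write_text(text.replace(OLD, NEW))
        print(f"rewrote {path}")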
3. Front legacy systems with clean API facades
Put a clean API in front of a legacy system and route traffic through it. The team can attach new features to the facade while gradually moving or rewriting internals behind the scenes. This reduces coupling and buys time to evolve the core safely.
Prompt
From ./traffic/*.har derive OpenAPI v3 for Orders with POST /orders and GET /orders/{id}. Generate an Express facade that maps to the legacy endpoints.
4. Discover usage patterns and guide application rationalization
Use real usage and cost data to decide what to keep, consolidate or retire. The goal is to shrink scope before modernizing, so investment flows to the few applications that matter while redundant or idle ones are removed.
Prompt
Analyze api_logs.csv and service_costs.csv. List endpoints unused in 180 days, cluster services by usage and cost and suggest retire or consolidate actions.
Output
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
logs = pd.read_csv("api_logs_features.csv")  # per-service features derived from api_logs.csv: req_per_day, latency_p95_ms, distinct_flows, service
costs = pd.read_csv("service_costs.csv") # monthly_cost, service
df = logs.merge(costs, on="service")
X = StandardScaler().fit_transform(df[["req_per_day","latency_p95_ms","monthly_cost","distinct_flows"]])
df["cluster"] = KMeans(n_clusters=4, n_init="auto").fit_predict(X)
candidates = df[(df["req_per_day"] < 5) & (df["monthly_cost"] > 2000)]
print("retire candidates:\n", candidates[["service","monthly_cost"]].sort_values("monthly_cost", ascending=False))
---
"Retire: hr-letter-2, hr-letter-3 (zero traffic 6 months). Consolidate into hr-letter-1.
Replace: expense-legacy with SaaS-X (85 percent feature overlap). Integration cost low.
Keep: payroll-core (high traffic, unique features)."
5. Synthesize infrastructure, runbooks and smoke tests for safer rehosting
Turn tribal setup knowledge into infrastructure as code (IaC), a clear cutover plan and quick health checks. The lift and shift then carries fewer surprises because provisioning and validation are scripted, repeatable and easy to audit.
Prompt
From /runbooks/windows-farm.md, /scripts/*.ps1, and /configs/*.ini generate Terraform for EC2, a runbook titled Cutover Tuesday 7 pm, and a bash smoke test that hits /healthz and creates a sample order.
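The prompt asks for a bash smoke test; the same checks can be sketched in Python for consistency with the other samples (the host and payload here are assumptions):

# smoke_test.py -- post-cutover health checks (illustrative)
import requests

BASE = "https://orders.example.com"  # assumed new environment

# 1. Service answers
assert requests.get(f"{BASE}/healthz", timeout=5).status_code == 200

# 2. Core flow works end to end
r = requests.post(f"{BASE}/orders", json={"sku": "TEST-001", "qty": 1}, timeout=10)
assert r.status_code in (200, 201), r.text
print("smoke test passed, order id:", r.json().get("id"))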
6. Extract domains and events to unlock a better architecture
Identify natural boundaries in code and data, then define the events those domains share. For example, teams could split a monolith into well-bounded services in small steps, improving scalability and delivery speed without a risky big bang.
Prompt
Analyze ./src and docs/erd.png. Propose bounded contexts for catalog, pricing, inventory, checkout. Produce JSON schemas for key events and minimal Kafka producer and consumer samples.
Output
from kafka import KafkaConsumer
import json
consumer = KafkaConsumer("price-events", bootstrap_servers=["kafka:9092"], value_deserializer=lambda v: json.loads(v.decode()))
for msg in consumer:
    evt = msg.value
    # TODO update read model or invalidate cache
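The matching producer side, sketched with kafka-python (the event fields are assumptions; in practice they would follow the generated JSON schemas):

# producer sketch: publish a price-change event (illustrative)
from kafka import KafkaProducer
import json

producer = KafkaProducer(
    bootstrap_servers=["kafka:9092"],
    value_serializer=lambda v: json.dumps(v).encode(),
)
producer.send("price-events", {"sku": "ABC-123", "new_price": 17.99, "currency": "USD"})
producer.flush()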
7. Guide a rebuild with scaffolds, seeds and property-based tests
Start fresh on a modern stack while proving the new behavior matches the old one during the transition. Scaffolding speeds up setup and dual-run or replay harnesses, while property-based tests catch edge cases early.
Prompt
Generate a Go HTTP service with /price. Add a Python parity test that compares new pricing to the legacy function. Provide a small dual-write adapter for order creation.
Output
// cmd/pricing/main.go
package main

import ("encoding/json"; "log"; "net/http"; "time")

type PriceRequest struct{ SKU string; Qty int }
type PriceResponse struct{ SKU string; Price float64; Currency string }

func priceHandler(w http.ResponseWriter, r *http.Request) {
    start := time.Now()
    var req PriceRequest
    if err := json.NewDecoder(r.Body).Decode(&req); err != nil { w.WriteHeader(400); return }
    resp := PriceResponse{SKU: req.SKU, Price: 17.99, Currency: "USD"} // TODO call domain
    json.NewEncoder(w).Encode(resp)
    log.Printf("path=/price sku=%s qty=%d latency_ms=%d", req.SKU, req.Qty, time.Since(start).Milliseconds())
}

func main() { http.HandleFunc("/price", priceHandler); log.Fatal(http.ListenAndServe(":8081", nil)) }
# tests/test_parity.py
from hypothesis import given, strategies as st
from new_impl import price as new_price
from legacy_impl import price as legacy_price
@given(sku=st.from_regex(r"[A-Z0-9\-]{3,12}"), qty=st.integers(min_value=1, max_value=10))
def test_new_matches_legacy_for_many_inputs(sku, qty):
    assert round(new_price(sku, qty), 2) == round(legacy_price(sku, qty), 2)
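The prompt also asks for a dual-write adapter. A minimal sketch of the idea, with endpoints and payloads assumed: the legacy system stays the system of record, the new service receives a mirrored write and mismatches are logged rather than surfaced to users.

# dual_write.py -- transition-period adapter (illustrative)
import logging
import requests

log = logging.getLogger("dual_write")

def create_order(order):
    # Legacy remains the system of record during the transition
    legacy = requests.post("http://legacy-host/orders", json=order, timeout=10)
    legacy.raise_for_status()
    try:
        # Mirror to the new service; failures are logged, never user-facing
        new = requests.post("http://new-host:8081/orders", json=order, timeout=10)
        if new.status_code != legacy.status_code:
            log.warning("dual-write mismatch: legacy=%s new=%s", legacy.status_code, new.status_code)
    except requests.RequestException as exc:
        log.warning("new-service write failed: %s", exc)
    return legacy.json()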
---
8. Make buy-versus-build decisions with total cost of ownership and fit simulations
Compare long-term cost, feature fit and integration effort across three paths: keep plus refactor, rebuild or replace with SaaS. The outcome is a clear, numbers-backed plan that shows where custom code still adds value.
Prompt
Using usage.csv, vendor_pricing.csv, and run_costs.csv, compute five-year NPV for keep plus refactor, rebuild, and replace with SaaS at a 10 percent discount rate. List top integration hotspots and generate a TypeScript adapter stub for the vendor journal API.
Output
# thousands of dollars per year; illustrative streams standing in for the CSV-derived figures
keep_refactor = [300, 120, 120, 120, 120]
rebuild = [800, 90, 90, 90, 90]
saas = [400, 70, 70, 70, 70]

def npv(stream, rate=0.10):
    # year 0 is undiscounted; later years discount at 10 percent
    return sum(v / ((1 + rate) ** t) for t, v in enumerate(stream))

print("NPV keep+refactor:", round(npv(keep_refactor), 1))  # 680.4
print("NPV rebuild:", round(npv(rebuild), 1))              # 1085.3
print("NPV replace with SaaS:", round(npv(saas), 1))       # 621.9
---
Top integration hotspots:
1. Journal posting and reconciliation
2. User provisioning and SSO mapping
3. Historical report backfill
At its best, AI turns modernization plans into working code the user can see and trust. It writes plain-English specs and tests that lock in business rules, runs code mods that clean up APIs and fronts brittle systems with simple facades so teams can improve the core behind them. It trims scope by showing what to keep or retire, scripts rehosting with IaC and smoke checks, and suggests clear domain boundaries with events.
When a rebuild is right, AI proves parity before cutover, and a smart replacement strategy backs the call with numbers and adapter stubs. The result is steady, low-risk progress: commits, tests and runbooks that make legacy code easier to maintain and primed for a cloud-ready architecture.
Limitations of AI-driven code modernization
AI can speed up modernization, but it has real limits to plan around.
Context and data quality. Models only see what they're fed. Outdated specs, missing logs and edge cases outside the sample set lead to wrong conclusions and brittle code.
Behavior coverage. Generated tests and golden files can lock in existing bugs or miss cross-system side effects. Subject matter experts, exploratory testing and production safeguards are still necessary.
Security, privacy and compliance. Code generation can introduce vulnerabilities, mishandle secrets or reuse licensed snippets. Guardrails, SAST/DAST, SBOMs and clear data handling rules are still necessary.
Architecture and change management. AI can propose edits, but it cannot own trade-offs, sequencing or organizational alignment. Without governance, CI quality bars and rollback paths, mass changes raise risk.
Used with these limits in mind, AI turns strategy into steady, observable progress, but it is a power tool, not an autopilot.
Nick Femia is a Tech Lead and full-stack engineer with over six years of experience driving product engineering and AI innovation.