How to Analyze a Loan Tape Step by Step
How to analyze a loan tape step by step: scrub addresses, flag missing data, request seller info, and prepare a bulk portfolio for indicative pricing.
What Does It Mean to Analyze a Loan Tape?
Knowing what the columns on a data tape represent and knowing how to actually work with the data are two different skills. If you need a primer on the fields themselves — UPB, lien position, charge-off balance, and the rest — start with How to Read a Non-Performing Mortgage Loan Data Tape. This guide picks up where that one leaves off.
Analyzing a loan tape is the hands-on process of scrubbing raw seller data into something clean enough to price, flagging what is missing, and communicating with the seller to fill the gaps. It is the bridge between receiving a spreadsheet and submitting an indicative bid. For portfolios with 100 to 200+ loans, this process needs to be systematic — otherwise you waste time on bad data, miss red flags on high-balance assets, and lose credibility with the seller by asking questions the tape already answers.
The workflow covered here follows a real bank tape of 207 non-performing loans across both first and second lien positions. Every step translates directly to any bulk portfolio you receive from an institutional seller.
How Do You Start the Initial Review?
Open the tape and resist the urge to start pricing. The first pass is purely observational. You are cataloging what you have, what you do not have, and what does not make sense.
Add Dynamic Subtotals
Before filtering or sorting anything, add a subtotal row at the top of every dollar-denominated column. Use Excel's SUBTOTAL function with function number 9 (equivalent to SUM) rather than a plain SUM formula. The difference matters: SUBTOTAL(9, range) recalculates dynamically when you apply filters, showing you the total for only the visible rows. A regular SUM always returns the total for the entire column regardless of filtering, which will mislead you the moment you start segmenting the data.
Apply this to the unpaid principal balance column, the charge-off balance column, and any other dollar fields. On the example portfolio, the UPB subtotal immediately confirmed a $7.38 million portfolio — a figure you can verify with the seller to ensure everyone is working from the same dataset and the same loan count.
Scan Every Column Using the Filter Dropdowns
Click the filter dropdown on each column header and review the distinct values. This is faster than scrolling through 200 rows and gives you an immediate picture of data quality. On a bank tape, you will encounter fields that exist purely for internal use — investor IDs, line-of-business codes, vendor-specific scrub results — that mean nothing to an outside buyer. Identify them, note which ones might contain useful information buried under opaque labels, and move on.
Pay attention to columns with mixed values. On this tape, a "PROLI results" column contained secured-status indicators from the bank's own vendor: "address valid, subject lien matched," "address invalid," and "address valid but no match to property data." That is valuable intelligence for your own due diligence — but trust and verify. You will run your own secured scrub regardless.
Verify the Lien Position and UPB Labels
Bank tapes frequently mislabel columns in ways that would derail your pricing if taken at face value. In this example, a blended portfolio of first and second mortgages listed every loan under a column called "First Principal Balance." That label reflected the bank's internal system, not the actual lien position. The column was actually the subject loan's UPB regardless of whether it was a first or second lien.
This is why step one is always observation, not calculation. If you had built pricing formulas on the assumption that "First Principal Balance" was the senior lien balance, every second lien in the portfolio would have been mispriced.
What Should You Request from the Seller?
Once you have inventoried the tape, build a checklist of what is present and what is missing. The bare minimum data points for pricing — often called the "need-to-have" fields — are the loan number, lien position, borrower name, UPB, verified property address, and the last payment or next due date. Everything beyond that either accelerates your process or refines your numbers.
Identify the Critical Gaps
On this 207-loan tape, several important data points were absent:
- Senior lien balances — Without these, you cannot calculate equity on the second liens, and equity is arguably the biggest pricing driver after secured status
- Property tax balances — Taxes are a super-priority lien that eats directly into your equity coverage
- Estimated fair market values — No property values means no LTV calculation and no way to determine collateral coverage
- Property type — Single-family, condo, multi-family, and vacant land all carry different risk profiles
Some of these you can gather yourself through vendor orders, but doing so adds time. If the seller has estimated values and senior balances available — and many institutional sellers do — requesting them upfront compresses your timeline significantly.
Frame Your Questions Strategically
The way you communicate with the seller matters as much as what you ask. Sellers at banks and large institutions appreciate buyers who demonstrate competence by figuring out the obvious independently and only escalating genuine ambiguities. Pair your data requests with higher-level questions that show you understand the transaction:
- Collateral custody: Who holds the collateral files, and are the assignments of mortgage recorded to the seller's entity?
- Servicing: Where is servicing currently held? Will the loans transfer to a new servicer or remain with the existing one?
- Representations and warranties: Does the loan sale agreement include reps and warranties?
- Legal activity: The tape showed foreclosure status codes and loss mitigation codes on roughly 21 loans — ask the seller to clarify what those statuses mean and whether any legal actions are currently pending
Keep your questions concise and easy for the seller to answer. You are not outsourcing your due diligence to them. You are gathering the information needed to move quickly and professionally.
Confirm Your Participation in the Trade
Whether the offering is competitive or exclusive, notify the seller that you are actively working the tape and give them a realistic timeline. A typical cadence: approximately one week for indicative pricing, acceptance and exclusivity within 10 days, and final due diligence plus closing within 30 days after that. Being upfront about timing sets expectations and builds the kind of relationship that gets you access to the next portfolio.
How Do You Clean the Property Addresses?
Address data quality is one of the most overlooked problems in tape analysis — and one of the most consequential. Every vendor order you submit downstream (BPOs, tax research, title searches, secured-status scrubs) depends on clean addresses. Bad input produces bad output, or no output at all.
Bank tapes are notorious for address formatting issues. The example tape had three recurring problems: leading spaces (one address had three spaces before the street number), missing spaces between directionals and street names ("125S Main" instead of "125 S Main"), and double spaces scattered throughout.
Remove Excess Spaces
The most reliable method is to use Excel's text-to-columns function rather than find-and-replace. Copy the entire property address column to a new worksheet tab, then run text-to-columns using a space delimiter. This separates each component of the address — street number, directional, street name, suffix, unit — into individual cells and strips all excess spacing in the process.
Once separated, concatenate the fields back together using the ampersand operator with explicit space characters between each field: =A2&" "&B2&" "&C2&" "&D2&" "&E2. This reconstructs every address with exactly one space between each component, eliminating both leading spaces and double spaces in a single operation.
A trailing space at the end of a rebuilt address will not cause problems for your vendors, so there is no need to trim those.
Fix Missing Directional Spaces Manually
After rebuilding, scroll through and look for addresses where a directional (N, S, E, W, NW, SW, NE, SE) is concatenated with the street name or number. These require manual correction — add the missing space in each case.
While this manual pass is tedious, it serves a secondary purpose: you are getting eyes on every property address in the portfolio. You start to develop a feel for the geographic distribution, the types of properties (the presence of unit numbers suggests multi-family or condos), and any addresses that look incomplete or unusual. That situational awareness pays off later when you are reviewing vendor data and something does not match.
Fix Truncated Zip Codes
Four-digit zip codes are a common artifact of Excel formatting. When a five-digit zip code starts with zero (common in the northeastern United States), Excel's default number formatting drops the leading zero. Fix this by adding a new column with the TEXT function: =TEXT(A2, "00000"). This forces every zip code to five digits, restoring any dropped zeros.
Flagging Red Flags During the Scrub
The initial scrub is not just about cleaning data — it is your first opportunity to identify loans that need deeper investigation or different pricing treatment.
Statute of Limitations Exposure
If the tape includes a next payment due date or last payment date, compare it against the statute of limitations for the property's state. A loan where the last payment was six or more years ago may be approaching or past the statutory window, which fundamentally changes its value and your legal remedies. Flag these immediately rather than discovering them during advanced due diligence when you have already committed resources.
Inconsistent or Missing Data
Watch for fields populated with placeholder values. On this tape, the loan term column contained "999" entries — filler data from the bank's system, not actual loan terms. Interest rate columns with zeros may indicate adjustable-rate loans where the current rate is not populated, or they may indicate missing data entirely. Note these inconsistencies for follow-up with the seller.
Active Legal Statuses
Foreclosure status codes and loss mitigation codes are signals that legal proceedings may be in progress. Banks typically scrub active bankruptcies from sale tapes due to regulatory constraints, but foreclosure and loss mitigation activity can still be present. Isolate these loans and request clarification from the seller before pricing them.
Occupancy and Property Type Anomalies
If the tape includes occupancy data, filter by status and review the distribution. A portfolio that is 80% owner-occupied prices differently than one that is 50% investment properties. If occupancy data is missing, that becomes a vendor order you need to place before finalizing your indicative pricing.
What Does the Completed Scrub Look Like?
By the end of this process, your internally updated tape should have:
- Dynamic subtotal rows on all dollar columns using
SUBTOTAL(9, range)so filtered views reflect accurate totals - Cleaned property addresses with excess spaces removed, directionals properly separated, and any unusual addresses noted
- Corrected zip codes with leading zeros restored via the
TEXTfunction - A completed checklist documenting which data points are present and which are missing
- Flagged loans — any assets with statute of limitations concerns, placeholder data, active legal statuses, or missing critical fields isolated for follow-up
- A seller communication — questions sent, data requests submitted, participation confirmed, and timeline expectations set
This scrubbed tape is what you carry into cursory research and indicative pricing, where you will layer on property values, equity calculations, and pricing formulas to generate a loan-level bid. The quality of that pricing depends entirely on the quality of the data underneath it — which is why this scrub exists in the first place.
How Long Should This Process Take?
For a portfolio of 200 loans, an experienced analyst can complete the initial scrub and seller communication in a single working session — roughly three to four hours. The address cleanup is the most time-consuming piece, particularly the manual directional spacing review. First-time analysts should budget a full day.
The temptation is to skip the scrub and jump straight to pricing. Do not. Every hour spent cleaning the tape here saves multiples in rework later. A misspelled address that produces a failed BPO order costs you the order fee plus the turnaround time to resubmit. A misidentified UPB column that inflates your bid by 15% costs you real money. The scrub is not overhead — it is the foundation that every downstream decision rests on.
Where Does the Initial Data Scrub Fit in the Bulk Process?
This step is the first of several in the bulk due diligence workflow. After the scrub, the sequence continues:
- Initial data scrub and seller requests — what this guide covers
- Cursory research — layering on property values, senior balances, and equity calculations (see the due diligence checklist for the full sequence)
- Indicative pricing — building a pricing matrix and generating loan-level bids
- Letter of intent — submitting your bid and negotiating exclusivity
- Advanced due diligence — ordering BPOs, title searches, credit reports, and verifying every assumption from the earlier stages
- Final pricing and closing — adjusting your bid based on findings and funding the transaction
Each stage builds on the work of the previous one. The data you clean and the questions you ask during the initial scrub directly shape the speed and accuracy of everything that follows. Sellers notice which buyers show up with organized questions and clean data — and those are the buyers who get access to the next deal.
Get personalized guidance for your note investing strategy from industry experts.