
Every serious BFSI institution in India is investing in AI right now. The PwC India survey data is unambiguous - 90% of Indian financial institutions have made AI a core technology priority. Budgets have been approved. Vendors have been selected. Pilots are running or have concluded. The sector is, collectively, in the middle of the largest simultaneous technology buildout it has ever attempted.
And yet the performance gap between institutions is not narrowing.
If anything, it is widening. Loan processing turnaround times (TATs) that should be converging as AI adoption spreads are not converging. Claims settlement speeds that should be compressing as automation scales are compressing for some players and staying flat for others. Customer acquisition costs that should be falling as targeting improves are falling sharply at some institutions and barely moving at others.
The variable that explains this gap is not the AI model.
It is the data infrastructure that feeds it.
The institutions pulling ahead are not running more sophisticated machine learning. In most cases, their models are comparable to those of their peers. What they have that their competitors do not is a real-time, reliable, unified data layer that gives their models current, verified inputs to work from. The model is the engine. The data pipe delivers the fuel.
And right now, the institutions that built better pipes are outrunning the ones that built better engines.
Why data infrastructure is a moat, not a commodity
The instinct in most BFSI technology discussions is to treat data infrastructure as a commodity - the plumbing that enables the interesting work, not the interesting work itself. Models, agents, decisioning systems, customer experience layers - these are the visible outputs that get presented to boards and featured in annual reports. The data pipes that feed them are invisible by design, and their invisibility has led many institutions to underinvest in them relative to the models built on top.
This is a strategic error, and the evidence is becoming hard to ignore.
Consider two NBFCs running similar credit models for salaried personal loans. The first pulls employment and income data from applicant-submitted documents - salary slips, employment letters - processed through OCR and entered into the credit system. The second pulls the same data directly from the applicant’s employer’s HRMS via a live API, with consent, at the moment of application.
The models are comparable. The data is not. The first NBFC is underwriting on a document that was accurate when it was issued, which may have been three weeks ago. The second is underwriting on the current state of the applicant’s employment relationship. The second NBFC approves faster, with less ops overhead, on better-quality data.
Its default rate on the marginal cases - the applicants where employment stability is a key swing factor - is lower. Its conversion rate is higher because the decision comes in minutes rather than days.
Over time, this data quality advantage compounds. Better data produces better model training data. Better model training data produces better models. Better models produce better decisions and lower defaults. Lower defaults allow tighter pricing, which improves win rates on better applicants. The cycle reinforces itself - and it starts with the data pipe, not the model.
90% of Indian BFSI institutions now prioritise AI as a core investment. AI investment in Indian BFSI is forecast to double (2x) by 2026. Most of that investment is going into models, not the data feeding them.
Three places the data pipe advantage shows up
Speed to decision. Real-time data connectivity is the single largest driver of TAT reduction in salaried lending and insurance underwriting. The bureaus, the fraud checks, the internal policy rules - these are already fast. What slows the decision down is waiting for human-sourced data: a salary slip the applicant hasn’t uploaded, an employment verification the ops team hasn’t completed, an HRMS export that hasn’t arrived yet. Institutions with live API connections to employment data sources eliminate this wait entirely. The decision is as fast as the slowest automated check - which, in a well-built pipeline, is seconds.
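The latency argument can be sketched in a few lines, assuming the checks run in parallel so that the decision waits only for the slowest one. The check names and timings below are illustrative assumptions, not measured figures.

```python
from dataclasses import dataclass

DAY = 24 * 60 * 60  # seconds in a day


@dataclass(frozen=True)
class Check:
    name: str
    seconds: float  # illustrative latency, not a measured figure


def gating_check(checks: list[Check]) -> Check:
    """Return the check that gates the decision when all checks run in parallel."""
    return max(checks, key=lambda c: c.seconds)


# Automated checks that are already fast (hypothetical timings).
AUTOMATED = [
    Check("bureau_pull", 2.0),
    Check("fraud_screen", 1.5),
    Check("policy_rules", 0.2),
]

# Same automated checks, but employment data arrives via a human-sourced document.
document_flow = AUTOMATED + [Check("salary_slip_upload_and_ops_review", 2 * DAY)]

# Same automated checks, with employment data from a live HRMS API call.
live_api_flow = AUTOMATED + [Check("hrms_employment_api", 3.0)]

slow = gating_check(document_flow)
fast = gating_check(live_api_flow)
print(f"document flow gated by {slow.name}: {slow.seconds / DAY:.1f} days")
print(f"live API flow gated by {fast.name}: {fast.seconds:.1f} seconds")
```

In both flows the bureau, fraud, and policy checks are identical; only the source of the employment data changes which check gates the decision.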
Accuracy at the margin. The applicants where data quality matters most are not the clear approvals or the clear declines. They are the borderline cases - the applicants whose bureau score is acceptable but employment stability is uncertain, whose stated income is within range but needs verification, whose risk profile is fine if they are currently employed but problematic if they are not.
Real-time employment data resolves these cases correctly. Document-based verification resolves them approximately.
The difference in default rates on the marginal book is where the data pipe advantage is most financially meaningful.
Cross-sell conversion. The institutions with live HRMS connections to their corporate clients know, in real time, when employees get promoted, when salaries change, when new joiners appear.
These are the triggers that drive intelligent, timely cross-sell - the pre-approved loan offer sent the week after a salary revision, the credit card upgrade triggered by a designation change, the wealth management conversation initiated when a senior hire’s joining appears in the HRMS.
Institutions without this data layer send campaign-based offers on a schedule, to a population defined by last month’s database, at moments that bear no relationship to the customer’s actual financial situation. The conversion difference is not marginal. It is structural.
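A minimal sketch of event-triggered cross-sell, assuming the HRMS connection delivers change events per employee. The event kinds, field names, and offer labels are hypothetical and simply mirror the three examples above; a production rule set would also apply eligibility and consent checks.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HrmsEvent:
    employee_id: str
    kind: str                     # hypothetical: "salary_revision", "designation_change", "new_joiner"
    seniority: str = "standard"   # hypothetical: "standard" or "senior"


def offer_for(event: HrmsEvent) -> Optional[str]:
    """Map a single HRMS event to an offer, or None if the event is not a trigger."""
    if event.kind == "salary_revision":
        return "pre_approved_loan"
    if event.kind == "designation_change":
        return "credit_card_upgrade"
    if event.kind == "new_joiner" and event.seniority == "senior":
        return "wealth_management_intro"
    return None


events = [
    HrmsEvent("E1", "salary_revision"),
    HrmsEvent("E2", "designation_change"),
    HrmsEvent("E3", "new_joiner", seniority="senior"),
    HrmsEvent("E4", "new_joiner"),  # standard joiner: no trigger in this sketch
]

offers = {e.employee_id: offer_for(e) for e in events}
print(offers)
```

The point of the design is that the offer is computed when the event arrives, rather than when a scheduled campaign happens to run.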
The insurance parallel
In insurance, the data pipe advantage expresses itself differently but is equally real.
Insurers with live HRMS connections to their corporate group clients know their covered population in real time. New joiners are added to coverage immediately. Exits trigger deactivation. Salary changes update the sum assured. The premium calculation is always based on current data. The claims experience on this book is cleaner - fewer contested claims tracing back to stale enrollment data, fewer coverage disputes arising from endorsement lags, fewer reconciliation arguments at renewal.
Insurers without this connectivity are managing their group book on periodic data snapshots. The covered population they think they have and the covered population they actually have diverge continuously between sync cycles. This divergence is not just an operational inconvenience - it is a pricing risk and a claims liability. Over a large enough portfolio, the financial impact of systematic data lag is material.
The insurer that fixes this at the infrastructure level - connecting directly to employer HRMS systems via a unified API rather than managing periodic file exchanges - is not just running more efficiently. It is managing its group book on fundamentally better information. That is a risk management advantage as much as an operational one.
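The reconciliation between the insurer's covered population and the employer's current roster can be sketched as a simple diff. This is an illustration under stated assumptions: the sum assured is taken to be a fixed multiple of annual salary (a common group-cover convention, assumed here rather than stated in the text), and the employee IDs and amounts are made up.

```python
def reconcile(
    covered: dict[str, int],   # employee_id -> sum assured currently on the insurer's book
    roster: dict[str, int],    # employee_id -> current annual salary from the HRMS
    multiple: int = 3,         # assumed: sum assured = multiple x annual salary
) -> tuple[list[str], list[str], dict[str, int]]:
    """Return (additions, deactivations, sum-assured updates) needed to match the roster."""
    additions = sorted(set(roster) - set(covered))        # new joiners not yet covered
    deactivations = sorted(set(covered) - set(roster))    # exits still on the book
    updates = {
        emp: roster[emp] * multiple
        for emp in sorted(set(roster) & set(covered))
        if covered[emp] != roster[emp] * multiple          # salary changed since last sync
    }
    return additions, deactivations, updates


# Illustrative data: E3 has left, E4 has joined, E2 has had a salary revision.
covered = {"E1": 3_000_000, "E2": 2_400_000, "E3": 1_800_000}
roster = {"E1": 1_000_000, "E2": 900_000, "E4": 700_000}

additions, deactivations, updates = reconcile(covered, roster)
print("add:", additions)
print("deactivate:", deactivations)
print("update sum assured:", updates)
```

With periodic file exchanges, this diff accumulates between sync cycles; with a live connection, each change can be applied as it happens, so the diff stays close to empty.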
Why most institutions are still underinvesting here
If the data pipe advantage is this clear, why are most BFSI institutions still underinvesting in it relative to models and AI applications?
Three reasons, all of which are understandable and all of which are strategic mistakes in the current environment.
The first is visibility. AI models produce outputs that can be demonstrated in a board presentation. Data infrastructure produces faster, more accurate decisions - but the causal link is invisible. The CDO who built the real-time employment data layer does not get a slide crediting them for the 15% improvement in default rates on the marginal salaried book. The model team does.
The second is the build illusion. Most BFSI technology teams believe they can build data connectivity in-house when needed. In isolation, they are right - any competent engineering team can build an HRMS integration.
What they underestimate is the breadth of the problem: 50+ HRMS platforms, each with its own API design, authentication method, and maintenance requirements. The first integration is straightforward. The tenth is an ongoing commitment. The fiftieth is a full-time programme.
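The breadth problem is easier to see in code. The sketch below assumes an adapter-per-platform design in which every HRMS is normalised to one common record. The two vendor classes, their payload fields, and the in-memory payloads are hypothetical stand-ins for real platform APIs; authentication and network calls are omitted.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class EmploymentRecord:
    """Common shape the credit system consumes, whatever the source platform."""
    employee_id: str
    employed: bool
    monthly_income: int


class HrmsAdapter(Protocol):
    def fetch(self, employee_id: str) -> EmploymentRecord: ...


class VendorAAdapter:
    """Hypothetical platform that reports a status string and annual CTC."""

    def __init__(self, payloads: dict[str, dict]) -> None:
        self._payloads = payloads  # stands in for an authenticated API call

    def fetch(self, employee_id: str) -> EmploymentRecord:
        raw = self._payloads[employee_id]
        return EmploymentRecord(employee_id, raw["status"] == "ACTIVE", raw["annual_ctc"] // 12)


class VendorBAdapter:
    """Hypothetical platform that reports a termination date and monthly pay."""

    def __init__(self, payloads: dict[str, dict]) -> None:
        self._payloads = payloads

    def fetch(self, employee_id: str) -> EmploymentRecord:
        raw = self._payloads[employee_id]
        terminated: Optional[str] = raw["termination_date"]
        return EmploymentRecord(employee_id, terminated is None, raw["monthly_pay"])


def verify(adapters: dict[str, HrmsAdapter], platform: str, employee_id: str) -> EmploymentRecord:
    """Single entry point: callers never see platform-specific payloads."""
    return adapters[platform].fetch(employee_id)


adapters: dict[str, HrmsAdapter] = {
    "vendor_a": VendorAAdapter({"E1": {"status": "ACTIVE", "annual_ctc": 1_200_000}}),
    "vendor_b": VendorBAdapter({"E2": {"termination_date": "2025-01-31", "monthly_pay": 80_000}}),
}

print(verify(adapters, "vendor_a", "E1"))
print(verify(adapters, "vendor_b", "E2"))
```

Each new platform means one more adapter to write and, more importantly, to keep working as that vendor's API changes, which is why the maintenance load grows with the platform count rather than staying a one-time build.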
The third is sequencing. AI model projects get approved and staffed before the data infrastructure question is fully resolved. The model goes into production. The data quality problems surface. The remediation happens reactively, at higher cost, while the model is already live and underperforming.
The institutions getting this right are doing it in the opposite order - investing in real-time, unified data connectivity first, then deploying AI on top of a foundation that is actually capable of supporting it.
The infrastructure decision is a strategy decision
The framing that matters here is not operational. It is strategic.
Real-time data connectivity - unified API access to employment data, income data, HRMS data across the full diversity of enterprise systems - is not an IT project. It is a competitive capability. The institution that builds it creates a compounding advantage: better decisions today produce better training data tomorrow, which produces better models next year, which produce better decisions the year after.
The institution that does not build it is running the same models as its competitors on worse data, and the gap widens with every decision cycle.
This is where Tartan’s HyperSync sits - not as a vendor solving an integration problem, but as the data infrastructure layer that makes real-time, unified employment and income data accessible to any BFSI institution through a single API. Eighty-plus HRMS platforms, one connection, current data at the moment it is needed, with consent management and an audit trail built in.
The BFSI institutions that will look back on 2025 as a turning point will not be the ones that deployed the most AI. They will be the ones that built the data infrastructure that made their AI actually work. The model is the product. The data pipe is the moat. And right now, the moat is being built - or not - by the decisions being made this year.
Tartan helps teams integrate, enrich, and validate critical customer data across workflows, not as a one-off step but as an infrastructure layer.




