What if you could achieve the output of a much larger engineering team without the associated hiring costs? That’s the strategic advantage many companies are looking for. For Ping Data Intelligence, a platform that uses ML to clean up complex insurance data, this wasn't just a goal—it was a necessity. With a lean team, they needed to build a sophisticated AI product without getting bogged down by infrastructure management. By using Cake to manage their open-source MLOps stack, they saved the equivalent of two to three full-time engineers. This case study explores how that partnership empowers them to innovate faster.
Ping saves the equivalent of 2-3 FTE engineers by managing MLOps through Cake.
With Cake's support, Ping has built a seamless, end-to-end data extraction pipeline, empowering all stakeholders across the Insurance value chain. Together, we're transforming the Commercial Property Insurance ecosystem.
Data collection moves several times faster than before, enabling quicker model development.
Cake’s managed open source approach offers significant savings for Ping vs. managed cloud-based SaaS products.
Ping, a data intelligence platform, has used ML capabilities to power state-of-the-art insurance underwriting since 2021. Ping applies advanced ML to cleanse data and calculate accurate premiums on property insurance quotes. With ML at its core, Ping uses Cake as its MLOps platform to build sophisticated products with a small team and stay at the cutting edge of machine learning.
At its core, data intelligence is the practice of turning raw information into a strategic advantage. Think about all the data your business collects from sales, marketing, customer interactions, and product usage. Data intelligence is the process that organizes and analyzes this information to help you understand customer behavior, track product performance, and get a clear picture of your finances. It’s about transforming messy, complex data into clean, accurate insights that your team can actually trust and use. The ultimate goal is to empower everyone in the organization to make smarter, faster decisions based on a solid foundation of reliable information, rather than just gut feelings or guesswork. It’s less about just having data and more about making that data work for you.
It’s easy to confuse data intelligence with data analytics, but they serve different purposes. Data analytics is like looking in the rearview mirror; it examines past information to identify patterns and figure out what happened and why. It’s descriptive and diagnostic. Data intelligence, on the other hand, looks ahead. It uses historical data not just to understand the past, but to predict future trends, identify potential risks, and spot new opportunities. It acts as the bridge between raw data and forward-thinking business strategy, helping you decide what to do next. While analytics tells you where you’ve been, intelligence helps you chart the course for where you’re going.
To see data intelligence in action, look no further than the commercial property insurance industry. It’s a field swimming in complex, variable data, which makes it a perfect place for innovation. This is where Ping comes in. The company provides smart data tools specifically for this sector, with a clear mission: to deliver rich, accurate, and reliable data directly into their clients' daily workflows. For insurance underwriters, who have to assess risk based on mountains of information, getting clean data quickly is a game-changer. Ping’s platform takes on the heavy lifting of data processing, allowing insurers to focus on what they do best—making informed underwriting decisions without getting bogged down in tedious data cleanup.
The challenge Ping tackles is massive. Globally, the insurance industry spends billions of dollars each year just cleaning up risk data. This information often arrives in unstructured formats—think PDFs, lengthy email chains, and inconsistent spreadsheets. Manually sifting through these documents to extract key details is slow, expensive, and prone to human error. This messy data creates bottlenecks, delays quotes, and can lead to inaccurate risk assessments. It’s a systemic problem that affects everyone from the broker to the carrier, making it harder to price policies correctly and serve customers efficiently. Ping addresses this multi-billion dollar headache head-on by automating the cleanup process.
Ping’s mission is to bring order to the chaos of insurance data. As their LinkedIn profile states, they create advanced technology for the insurance industry, using Artificial Intelligence (AI) and Machine Learning to make insurance data clean, organized, and easy to use. By applying sophisticated ML models, they can extract, standardize, and enrich data from highly variable sources, turning unstructured information into a structured, usable asset. To accomplish this at scale, Ping relies on a powerful and efficient MLOps platform. Having a streamlined infrastructure like the one Cake provides allows their team to build, deploy, and manage these complex models effectively, ensuring they can deliver the speed and accuracy their clients depend on.
Ping uses machine learning primarily to transform unstructured data into structured data. As a service for insurance companies, Ping’s models typically combine several qualitative attributes about buildings as inputs—for example, occupancy, construction materials, and location.
This information usually arrives in email bodies, PDFs, and other unstructured and highly variable formats. Structured spreadsheets and forms also have variability from year to year (e.g., shifting locations of a specific checkbox or poor data entry), and even individual data points can have inconsistent formatting that can complicate the modeling process (e.g., non-standard address patterns).
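To make the formatting problem concrete, here is a minimal sketch of the kind of address cleanup that has to happen before modeling. It is an illustration only, not Ping's implementation: the `SUFFIXES` table and the `normalize_address` function are hypothetical names, and a production system would need a much larger abbreviation list plus handling for ambiguous cases (for example, "St" meaning "Saint").

```python
import re

# Hypothetical abbreviation table; real address data needs far more entries.
SUFFIXES = {"st": "Street", "ave": "Avenue", "rd": "Road", "blvd": "Boulevard"}


def normalize_address(raw: str) -> str:
    """Strip stray punctuation, collapse whitespace, and expand common suffixes."""
    # Turn commas and periods into spaces; split() then collapses whitespace runs.
    tokens = re.sub(r"[.,]", " ", raw).split()
    out = []
    for tok in tokens:
        key = tok.lower()
        if key in SUFFIXES:
            out.append(SUFFIXES[key])      # "ST" -> "Street"
        elif tok.isalpha():
            out.append(tok.capitalize())   # "main" -> "Main"
        else:
            out.append(tok.upper())        # "12b" -> "12B"
    return " ".join(out)


print(normalize_address("  123  main ST. "))
print(normalize_address("45 OAK AVE, Unit 12b"))
```

Rule-based cleanup like this handles the common cases; the long tail of inputs that no rule anticipates is where the source describes ML tools becoming essential.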
“We started this project building a purely heuristic approach,” shared Scott Stafford, Chief Enterprise Architect at Ping. “Well-designed heuristics can be effective in many scenarios, but the current crop of ML tools are an absolutely essential component to solve the long tail of problematic inputs.”
For anyone in commercial property insurance, forms like Statements of Values (SOVs) and ACORDs are a daily reality. The problem is that the critical data within them is often a mess. This information is frequently buried in emails, PDFs, and spreadsheets with highly variable formatting. Even when the data is in a structured form, inconsistencies from year to year—like a checkbox moving its location or simple data entry errors—can throw a wrench in the works. Details like property addresses often come in non-standard formats, making it incredibly difficult to automate the data modeling process and get an accurate picture of risk.
To solve these data challenges, Ping developed a suite of smart data tools specifically for the commercial property insurance industry. Their core mission is to deliver rich, accurate, and reliable data directly into their clients' daily workflows, making the underwriting process faster and more precise. The product suite is designed to handle the entire data journey, from pulling information out of messy documents to enriching it with external data and visualizing it for better risk assessment. This end-to-end approach transforms a slow, manual process into a streamlined, data-driven operation.
The first step in Ping’s process is getting the data out of its original source. Ping.Extraction is the tool that pulls key information from documents like SOVs and PDFs. Once the data is extracted, Ping.Data takes over by gathering additional property data from the web. This enriches the initial dataset, filling in gaps and adding valuable context that wasn't in the original forms. Together, these tools create a comprehensive and structured foundation for analysis.
Understanding exactly where a property is located is fundamental to assessing its risk. Ping.Location is used to look up and verify specific building details, ensuring the highest level of accuracy. This clean location data then feeds into Ping.Maps, which helps companies visualize their property risks on interactive maps. This visual context allows underwriters to better understand their portfolio's exposure to various hazards, from flood zones to wildfire-prone areas.
For large-scale risks, Ping offers Ping.Catastrophe. This tool connects the enriched property data to sophisticated models that predict the impact of major disasters. By integrating their clean, reliable data with catastrophe models, insurance companies can more effectively manage their exposure to large-scale events and make smarter decisions about their portfolios. This is where accurate data becomes essential for managing massive financial risks.
Two key technologies power Ping’s accuracy. The first is Ensemble Geocoding, a smart method for determining a property's precise location, even when the original address data is incomplete or confusing. This overcomes the common issue of non-standard address formats. The second is the PING Taxonomy, which acts as a universal translator for property insurance risk data. It standardizes the information, creating a common language that makes it easier for different systems to work together seamlessly. This ensures that the data is not only accurate but also interoperable across the entire insurance ecosystem.
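The source does not describe how Ensemble Geocoding works internally, so the following is only a generic sketch of the ensemble idea: ask several geocoders for a location and combine their answers so that one bad result does not dominate. The `ensemble_geocode` function and the choice of a coordinate-wise median are assumptions made for illustration, not Ping's method.

```python
from statistics import median
from typing import List, Optional, Tuple

Coordinate = Tuple[float, float]  # (latitude, longitude)


def ensemble_geocode(candidates: List[Optional[Coordinate]]) -> Optional[Coordinate]:
    """Combine candidate coordinates from several geocoders into one estimate.

    `None` entries stand for geocoders that returned no match.
    """
    found = [c for c in candidates if c is not None]
    if not found:
        return None
    # The median is less sensitive to a single outlier than the mean.
    lat = median(lat for lat, _ in found)
    lon = median(lon for _, lon in found)
    return (lat, lon)


# Three hypothetical geocoder results, one of them clearly wrong, plus a miss.
results = [(40.0, -75.0), (40.2, -75.2), (10.0, 20.0), None]
print(ensemble_geocode(results))
```

A coordinate-wise median is a deliberately simple choice. It would misbehave near the antimeridian, and a real system would likely also weigh each provider's reported match quality.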
Ping needed to scale its training data collection process but did not have the necessary ML infrastructure internally—“we just had an S3 drive and a dream,” as Stafford described it.
Off-the-shelf commercial ML tools did not meet Ping’s requirements, were too inflexible, or were too expensive to consider. Due to PII concerns, Ping could not send data to LLM vendors or other managed data extraction tools. High source data variability led to regular quality concerns with standard tools. Ping had also considered using a collection of cloud-hosted SaaS tools; however, as Machine Learning Engineering Lead Bill Granfield described, “we'd end up investing significantly in a range of hosted services.”
With a lean team and a need to build its own services, Ping selected Cake to manage its ML infrastructure. Working with the Ping team, Cake helped select the appropriate stack of components from the open source ML ecosystem and offered a simple, unified approach to ML application development.
As Stafford outlined the challenge, “We knew what we needed at a high level, but we also knew we didn't want to build out a team of 10 to do all this work. Cake came along, and we’ve been really happy with the whole engagement so far.”
Composed of AI/ML infrastructure experts, the Cake team served as a resource for Ping during the initial implementation. Cake begins engagements with consultative recommendations on which open source systems and tools might be most applicable to a particular challenge.
As Stafford described, “The expertise that Cake provided was extremely beneficial. We described our situation and what we were struggling with, and the Cake team had a comprehensive knowledge of the entire open source space in order to make tailored recommendations.”
“We've got a list of ML desires a mile long, and with Cake, those moved instead of standing still.”—Scott Stafford, Chief Enterprise Architect at Ping
Among other projects, the Ping team has used Cake to build a PDF data extraction pipeline, and core parts of its algorithm use annotated data created with a Cake-managed version of an open source data labeling solution. Ping uses these image annotation capabilities to annotate the different versions of PDFs and build an algorithmic OpenCV-style extraction library. Ping’s system can now automatically parse PDFs and successfully extract the needed data.
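As a rough illustration of how annotated regions can drive algorithmic extraction, the sketch below reads a checkbox from a grayscale page using a bounding box stored per form version. Everything here is hypothetical: the `TEMPLATES` table, the field name, and the thresholds are invented for the example, and plain nested lists stand in for the image arrays an OpenCV-based library would use.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive
Page = List[List[int]]           # grayscale pixels, 0 = black, 255 = white

# Hypothetical annotations: the same checkbox sits in a different place per version.
TEMPLATES: Dict[str, Dict[str, Box]] = {
    "v1": {"sprinklered": (0, 0, 2, 2)},
    "v2": {"sprinklered": (2, 2, 4, 4)},
}


def dark_fraction(page: Page, box: Box, threshold: int = 128) -> float:
    """Return the share of pixels inside `box` that are darker than `threshold`."""
    top, left, bottom, right = box
    pixels = [page[r][c] for r in range(top, bottom) for c in range(left, right)]
    if not pixels:
        return 0.0
    return sum(1 for p in pixels if p < threshold) / len(pixels)


def extract_checkboxes(page: Page, version: str, min_dark: float = 0.2) -> Dict[str, bool]:
    """Treat a box as checked when enough of its pixels are dark."""
    return {
        field: dark_fraction(page, box) >= min_dark
        for field, box in TEMPLATES[version].items()
    }


# A 4x4 white page with two dark pixels in the top-left corner.
page = [[255] * 4 for _ in range(4)]
page[0][0] = 0
page[1][1] = 0
print(extract_checkboxes(page, "v1"))
print(extract_checkboxes(page, "v2"))
```

Keying the boxes by form version is what lets the same extraction code cope with a checkbox that moves from one year's form to the next, provided the version is identified correctly first.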
“With Cake, we're collecting this data several times faster than we were before,” Granfield explained, “in a way that makes it easy to analyze what we've got, understand the provenance of annotations, and measure the performance of different annotators. It’s enabled us to release a new model much faster than we would have otherwise.”
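One simple way to measure annotator performance, sketched here under assumptions, is to compare each annotator's labels against a reviewed reference set while keeping the annotator's name attached to every record for provenance. The `Annotation` record and `annotator_accuracy` function are hypothetical and do not reflect the API of any particular labeling tool.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class Annotation(NamedTuple):
    item_id: str    # which document or region was labeled
    annotator: str  # who produced the label (provenance)
    label: str


def annotator_accuracy(
    annotations: Iterable[Annotation], reference: Dict[str, str]
) -> Dict[str, float]:
    """Return each annotator's share of labels that match the reference label."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for ann in annotations:
        if ann.item_id not in reference:
            continue  # no reviewed label yet, so this item cannot be scored
        total[ann.annotator] += 1
        if ann.label == reference[ann.item_id]:
            correct[ann.annotator] += 1
    return {name: correct[name] / total[name] for name in total}


# Hypothetical labels for a construction-type field.
annotations = [
    Annotation("doc1", "alice", "masonry"),
    Annotation("doc2", "alice", "frame"),
    Annotation("doc1", "bob", "frame"),
    Annotation("doc2", "bob", "frame"),
    Annotation("doc3", "bob", "steel"),
]
reference = {"doc1": "masonry", "doc2": "frame"}
print(annotator_accuracy(annotations, reference))
```

Accuracy against a reference is only one possible metric; agreement between annotators is a common alternative when no reviewed labels exist.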
Ping unlocked additional efficiency by consolidating tools and services into one Kubernetes environment. The ‘before’ state was a “mishmash” of vendor-managed cloud services and SageMaker models that lacked central management.
Cake streamlined the environment for the Ping team. Bill Granfield outlined the benefits of this new environment, stating, “We know everything is managed with GitOps, and there is one single MLOps repo with the config for our entire environment. Not only do we have version control history for all the changes to our environment, but it’s much faster to make changes over time.”
“Our partnership with Cake has been a clear strategic choice – we're achieving the impact of two to three technical hires with the equivalent investment of half an FTE.”—Scott Stafford, Chief Enterprise Architect at Ping
For Ping, a significant benefit of working with Cake has been resource savings. The current market for hiring MLOps talent is highly competitive, and it is challenging to find people who understand machine learning theory, can implement it, and are creative enough to help tackle new problems. Without Cake, the Ping team estimates it would likely have hired at least two more full-time, senior-level engineers to handle its current challenges.
“Cake empowers us to achieve more with our existing resources,” Stafford explained. “While expanding the team would be ideal, finding, onboarding, and integrating skilled hires can be both time-consuming and costly. With Cake, our current team can focus on high-impact work, as they’re no longer tasked with building and maintaining infrastructure—effectively doubling our operational efficiency.”
Ping’s engineers have been able to focus on higher-level ML and business problems now that Cake handles the underlying infrastructure reliably.
The open source approach initially raised concerns for Ping because of the sensitive nature of its customer data. Most open source tools lack security and privacy features such as role-based access controls, or offer them only in cost-prohibitive enterprise versions.
Cake allowed the Ping team to deploy an open source stack safely. As Stafford described, “Security is a top priority for Ping, and as we developed this system, Cake’s enhancements to the basic security measures found in open-source tools were crucial. Their work added an essential layer of protection for client data, providing the level of security any client-focused company would demand.”
Manually cleaning and structuring data is a major operational expense for insurance companies, and it's a challenge Ping’s ML-powered solutions directly address. By automating the extraction and cleansing of information from highly variable sources, Ping helps its clients cut their costs for data cleaning by 50% or more. This level of efficiency isn't just the result of a clever algorithm; it's built on a solid foundation. Because Cake manages its MLOps stack, Ping’s team can dedicate its time to refining the models that produce these savings, rather than getting bogged down with infrastructure management. This focus allows them to deliver a powerful, cost-effective product that transforms a traditionally slow and expensive process.
For many organizations, processing huge volumes of unstructured data from sources like PDFs and emails is a bottleneck that can take days, if not weeks. Ping faced this exact problem, but with its infrastructure managed by Cake, the team turned it into a competitive advantage. Their data extraction pipeline can now process enormous datasets at a remarkable speed. As Machine Learning Engineering Lead Bill Granfield noted, “With Cake, we're collecting this data several times faster than we were before.” This acceleration has a direct impact on their business, allowing them to develop and release new models much more quickly and deliver results to insurance clients in a fraction of the time.
Ping's powerful data intelligence platform doesn't operate in a vacuum. To deliver the most comprehensive insights to the insurance industry, Ping forms strategic partnerships with other specialized data and technology providers. These collaborations are key to extending its capabilities and reaching new markets. By integrating external data sources and technologies, Ping enriches its own offerings, providing a more complete picture of risk for its clients. This ability to seamlessly connect with partners and manage complex data flows is supported by the scalable and reliable ML infrastructure that Cake helps manage, ensuring that new integrations are smooth and efficient.
A great example of this strategy in action is Ping's work with AdvantageGo, a software provider for the global insurance and reinsurance industry. Together, they are tackling the time-consuming challenge of cleaning up messy property data. Insurance companies often receive crucial information in inconsistent formats, like Statements of Values (SOVs). Ping's technology automates the process of structuring this data, making it instantly usable. This partnership means insurers using AdvantageGo's platform can move much faster on essential tasks like catastrophe modeling and underwriting, bringing new levels of efficiency to the London market.
Ping has also joined forces with Property Guardian to address the growing challenge of wildfire risk for commercial insurers. Wildfires are a significant and complex peril, and accurate data is essential for effective underwriting. Through this collaboration, Property Guardian’s detailed, property-specific wildfire risk data is integrated directly into the Ping platform. This gives insurers immediate access to critical information, allowing them to make faster, more informed decisions about which properties to insure and how to price them. This team-up is a perfect illustration of how combining specialized data with a powerful intelligence platform creates immense value for the insurance ecosystem.
In contrast with the slow-moving insurance industry, cutting-edge AI/ML continues to evolve with breathtaking speed. Ping currently offers a best-in-class product for the insurance ecosystem by leveraging the latest ML technologies. Maintaining its leadership position requires matching the pace of innovation in the AI/ML ecosystem.
Cake offers Ping an easy route to maintaining a modern ML infrastructure stack. Cake continuously integrates new popular open source technologies (and new versions of existing technologies) for customers. Cake is a strategic partner for Ping due to its ability to support Ping’s AI-focused competitive differentiation.
Scott Stafford summarized the long-term benefit of the partnership: “Staying at the forefront of ML advancements is essential for us, even as a young company, to remain competitive and agile. Having Cake as a key partner on this journey provides invaluable confidence that we’re equipped to evolve alongside these rapid changes.”
What does Cake actually provide for a company like Ping? Think of us as your dedicated MLOps infrastructure team. Instead of your engineers spending their time building and maintaining the complex systems needed to run machine learning models, we handle all of that for you. We set up, manage, and secure the entire open-source stack, which lets your team focus completely on developing your core product and solving business problems.
Why did Ping choose Cake instead of just using various cloud-based AI tools? While you can certainly piece together different cloud services, it often leads to a complicated and expensive setup. Ping wanted to avoid managing multiple vendor services and the costs that come with them. We provide a single, unified platform built on powerful open-source tools. This streamlines their entire workflow, gives them one place to manage everything, and is more cost-effective than paying for a collection of separate SaaS products.
How can you ensure an open-source stack is secure enough for sensitive client data? This is a top priority for us and our clients. While many open-source tools are powerful, they often lack the robust security features that enterprises need right out of the box. We bridge that gap by adding an essential layer of security, including features like role-based access controls, on top of the open-source foundation. This gives companies like Ping the confidence to deploy these tools safely with their sensitive customer data.
Does our team need to be experts in MLOps to use Cake? Not at all, and that's really the point. Our team brings deep expertise in the entire open-source AI ecosystem to the table. We act as an extension of your team, guiding you on the best tools for your specific challenges and managing the infrastructure. This frees up your engineers to focus on what they do best—building great ML models and features—without needing to become specialists in MLOps themselves.
Is the main benefit of working with Cake just saving money on salaries? The resource savings are significant, but the bigger strategic advantage is speed. By taking infrastructure management off their plate, we enabled Ping's team to move much faster on their product goals. They could collect data more quickly, build new models sooner, and get their product into production without the typical delays. It's about empowering your existing team to do more, innovate faster, and stay ahead of the competition.