Most electronic health records are severely limited by their own design. When a doctor types a detailed observation about your symptoms, that text usually sits in a digital filing cabinet, entirely unreadable by automated research tools. Traditional systems only understand neat checkboxes and standardized billing codes.
That rigid structure leaves the most valuable human insights completely isolated. OMNY Health just changed the math on medical research by unlocking billions of these hidden documents for the entire industry.
80 Percent of Medical Data Was Sitting in the Dark
According to studies published in Healthcare Informatics Research, a staggering majority of patient information never makes it into a clean spreadsheet. About 80 percent of medical data is unstructured, existing only as free-text paragraphs typed in a rush by busy healthcare providers. This category includes everything from nuanced symptom descriptions to complex treatment rationales.
Historically, researchers who wanted to study these notes had to perform a manual chart review. A human being had to open each file, read the doctor’s handwriting or typed shorthand, and manually log the findings. This slow process meant that crucial details about adverse medication events or social determinants of health were routinely abandoned by medical centers.
Dr. Mitesh Rao, the founder and CEO of OMNY Health, explicitly compared the difficulty of searching raw clinical texts to looking for a needle in a haystack. The sheer volume of unstructured information made broad analysis impossible without the right technological infrastructure.
The consequences of ignoring this data are significant for medical advancement. When pharmaceutical companies look for trends in patient outcomes, they usually rely on structured claims data. However, claims data lacks the human context required to understand why a specific treatment failed or succeeded.
By capturing unstructured information, researchers gain access to several critical elements:
- Detailed accounts of disease progression over months or years
- Specific reasons why a doctor chose one medication over another
- Observations regarding a patient’s living situation or environmental risks
- Early warning signs of adverse reactions before a formal diagnosis

Turning Free Text Into Research Grade Information
Raw clinical notes are notoriously messy, filled with inconsistent abbreviations, typos, and incomplete sentences. Turning this chaotic resource into a reliable dataset requires substantial computing power. OMNY Health tackled this bottleneck by deploying large language models and proprietary natural language processing systems across its entire network.
The company built a data platform designed specifically for health tech and artificial intelligence developers. The system scans the free-text paragraphs, cleans the formatting, and extracts the vital clinical facts into structured fields. This allows an AI researcher to query the database for specific symptom clusters without reading a single paragraph manually.
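To make the idea of turning free text into structured fields concrete, here is a deliberately tiny Python sketch. It is not OMNY Health's pipeline, which the company says relies on large language models and proprietary natural language processing; this toy version swaps in simple string matching and a regular expression. The abbreviation map, the symptom list, and the function names `normalize`, `extract`, and `has_symptom_cluster` are all hypothetical and exist only for illustration.

```python
import re

# Hypothetical abbreviation map and symptom list, for illustration only.
ABBREVIATIONS = {
    "pt": "patient",
    "c/o": "complains of",
    "sob": "shortness of breath",
}
SYMPTOMS = ("shortness of breath", "chest pain", "nausea", "fatigue")

# Matches "<drug> <number> mg" or "<drug> <number>mg" in normalized text.
DOSE_PATTERN = re.compile(r"\b([a-z]+) (\d+) ?mg\b")


def normalize(note: str) -> str:
    """Lowercase the note, collapse whitespace, expand known abbreviations."""
    text = re.sub(r"\s+", " ", note.strip().lower())
    words = [ABBREVIATIONS.get(w.strip(".,;"), w) for w in text.split(" ")]
    return " ".join(words)


def extract(note: str) -> dict:
    """Turn one free-text note into a small structured record."""
    text = normalize(note)
    return {
        "symptoms": [s for s in SYMPTOMS if s in text],
        "medications": [
            {"name": name, "dose_mg": int(dose)}
            for name, dose in DOSE_PATTERN.findall(text)
        ],
    }


def has_symptom_cluster(record: dict, cluster: set) -> bool:
    """Return True when every symptom in the cluster appears in the record."""
    return cluster <= set(record["symptoms"])


record = extract("Pt c/o SOB and   chest pain.  Started metoprolol 25 mg daily.")
print(record)
print(has_symptom_cluster(record, {"shortness of breath", "chest pain"}))
```

Once notes are reduced to records like this, a researcher can filter on fields instead of rereading paragraphs. Real clinical text is far messier than this example, which is why production systems lean on trained language models rather than fixed word lists.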
“For years, critical patient insights have been locked away in free-text clinical notes, inaccessible to researchers and healthcare innovators. Our network’s expansion changes that by making this data available on a large scale.” – Dr. Mitesh Rao, MD, CEO and Founder of OMNY Health
Security and privacy present an equally difficult hurdle when handling billions of sensitive documents. Before any note enters the searchable network, the system strips out all personally identifiable information. OMNY Health follows the de-identification protocols required by the HIPAA Privacy Rule to protect patient anonymity.
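As a rough illustration of what stripping identifiers from a note can look like, the Python sketch below replaces a few easily recognized patterns with placeholders. It is an assumption-laden toy, not OMNY Health's method and not a HIPAA-compliant tool: the pattern list and the `scrub` function are hypothetical, and regular expressions alone cannot reliably catch names, addresses, or many of the other identifier types the Privacy Rule covers.

```python
import re

# Hypothetical placeholder/pattern pairs; a real de-identification system
# covers many more identifier types and does not rely on regex alone.
PATTERNS = [
    ("[SSN]", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("[PHONE]", re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")),
    ("[EMAIL]", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
    ("[DATE]", re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")),
    ("[MRN]", re.compile(r"\bMRN[:# ]*\d+\b", re.IGNORECASE)),
]


def scrub(note: str) -> str:
    """Replace each matched identifier with its placeholder, in list order."""
    for placeholder, pattern in PATTERNS:
        note = pattern.sub(placeholder, note)
    return note


raw = "Seen 03/14/2024, MRN: 884213. Call 555-867-5309 or jdoe@example.com."
print(scrub(raw))
# prints: Seen [DATE], [MRN]. Call [PHONE] or [EMAIL].
```

Placeholders are used instead of outright deletion so the sentence structure of the note survives for later analysis.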
| Data Type | Primary Function | Research Limitation |
|---|---|---|
| Structured Claims Data | Insurance billing and financial tracking | Lacks context regarding patient symptoms |
| Free-Text Observations | Detailed narrative of the patient visit | Difficult to search without AI tools |
| OMNY Structured Notes | Research and clinical trial acceleration | Requires intense processing resources |
The result is a longitudinal view of patient health that simply did not exist a few years ago. By translating raw data into a clean format, the healthcare industry finally gains a common language to discuss patient journeys.
100 Million Patient Journeys Connected Across All 50 States
The scale of this integration pushes health informatics into uncharted territory. The initial breakthrough involved 4 billion notes sourced directly from a vast network of academic medical centers and provider organizations. By mid-2025, that number expanded to 6.5 billion clinical documents representing 100 million de-identified patient records.
To achieve this volume, OMNY Health partners directly with major institutions rather than scraping disconnected databases. Health systems like Bon Secours Mercy Health, St. Luke’s University Health Network, and Johns Hopkins Medicine supply the raw information. The platform currently encompasses more than 500,000 providers across 200 specialties in all fifty states.
This level of data liquidity also helps hospitals navigate current regulatory requirements. By standardizing free-text documents, healthcare networks have an easier time complying with the Information Blocking Rules established by the Office of the National Coordinator for Health IT. The 21st Century Cures Act prohibits providers from artificially restricting the exchange of electronic health information, including clinical notes.
Organizing this much information creates a foundational resource for the entire medical sector. Biotech firms no longer have to build custom data pipelines for every new study they launch.
Speeding Up Drug Development and Disease Tracking
Accessing billions of structured notes allows researchers to compress timelines that used to take years. When a pharmaceutical company develops a new treatment, it needs to identify exactly how a disease progresses in untreated patients. To support this, OMNY Health adds more than 300 clinical assessment measures directly to its platform.
Mark Townsend, the Chief Clinical Digital Ventures Officer at Bon Secours Mercy Health, referred to unstructured records as a treasure trove of untapped insights. Health organizations can now leverage this trove to drive operational efficiencies and foster genuine innovation at the bedside.
The applications for this clean data stretch far beyond basic academic research. Some of the immediate uses include:
- Training Diagnostic AI: Artificial intelligence models require massive amounts of clean, varied data to learn how to identify rare diseases accurately.
- Personalized Medicine: Doctors can tailor specific treatments to individual patients by matching their unique symptoms against historical cases documented in the network.
- Health Equity Studies: Researchers can finally quantify how social and environmental factors impact recovery rates across different geographical regions.
As the healthcare industry shifts toward personalized medicine, having a complete picture of a patient’s history is no longer optional. The data bottleneck that restricted innovation for decades is finally breaking apart.
Data management is rarely the most glamorous part of modern medicine, but it dictates the speed of every other scientific breakthrough. This milestone in healthcare innovation suggests that the cure for many ongoing medical challenges might already be written down in a doctor’s chart. With platforms like OMNY Health organizing those fragmented notes into a unified language, researchers can finally start reading the answers.
Disclaimer: This article discusses healthcare data regulations, medical research technology, and HIPAA compliance for informational purposes only. It does not constitute legal or regulatory advice. Healthcare organizations should consult qualified legal and compliance professionals regarding data de-identification and the 21st Century Cures Act Information Blocking Rules.



