Leveraging Advanced Analytics to Identify Drivers of Endometriosis’ Onset

In recent years, endometriosis, a common female health disorder, has been brought to the forefront of women’s health. The disease is characterized by tissue similar to the lining of the uterus growing on other parts of the body, including the ovaries, fallopian tubes, and large intestine. The condition is one of the most common causes of pelvic pain and infertility in females. In the U.S., one in every ten women of reproductive age has endometriosis [1]. The actual cause of the condition is still unknown, as it is quite difficult to diagnose and often goes untreated for years. There are several theories regarding the cause; however, many questions remain about how to diagnose endometriosis effectively. The primary risks of infertility and other health complications related to the condition could be greatly reduced if the likelihood of endometriosis and the condition’s drivers were known well in advance.

Advancements in data science and machine learning techniques have opened the door to applying these approaches in healthcare [2]. In recent years, healthcare providers and Integrated Delivery Networks (IDNs) have also shown interest in leveraging data science methods in diagnostic procedures. Disease prediction that applies data mining and machine learning to a patient’s medical history (diagnoses, medical and surgical procedures, therapeutics and treatment regimens, hospitalizations, etc.) has slowly been introduced to aid decision making in the healthcare industry [3, 4, 5]. As a result, proper medical care and treatment can be given to impacted patients well in advance, improving patients’ survival rate and quality of life. Machine learning algorithms such as Decision Tree, Random Forest, Extreme Gradient Boosting (XGBoost), Neural Networks, and Support Vector Machines can be leveraged to identify drivers of endometriosis onset and can also be applied to predicting other conditions.

When the best performing machine learning approaches are used to predict disease occurrence, the key drivers of endometriosis onset can be identified. They include a selected set of condition-relevant diagnosis codes, medical and surgical procedure codes, and drugs, as well as the physician specialties that often support patients through their healthcare journey.
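Before any of these algorithms can be applied, a patient’s coded history has to be turned into model inputs. The sketch below shows one common way to do this, assuming a hypothetical `Event` record (patient, kind of code, code value) that is not taken from the study itself: each distinct code becomes a yes/no column, and each patient becomes one row of indicators.

```python
from collections import namedtuple

# Hypothetical record layout (an assumption for illustration):
# one row per coded event in a patient's medical history.
Event = namedtuple("Event", ["patient_id", "kind", "code"])


def build_feature_matrix(events):
    """Turn coded events into one binary indicator row per patient."""
    # Each distinct (kind, code) pair becomes one column, in sorted order.
    columns = sorted({(e.kind, e.code) for e in events})
    index = {col: i for i, col in enumerate(columns)}

    rows = {}
    for e in events:
        # Start each patient with all-zero indicators, then flag observed codes.
        row = rows.setdefault(e.patient_id, [0] * len(columns))
        row[index[(e.kind, e.code)]] = 1
    return columns, rows


# Made-up example events; the labels echo the feature types named in the text.
events = [
    Event("p1", "diagnosis", "dysmenorrhea"),
    Event("p1", "specialty", "obstetrics and gynecology"),
    Event("p2", "diagnosis", "pelvic and perineal pain"),
    Event("p2", "drug", "acetaminophen"),
    Event("p2", "diagnosis", "dysmenorrhea"),
]

columns, rows = build_feature_matrix(events)
for patient_id, row in sorted(rows.items()):
    print(patient_id, row)
```

A matrix of this shape, paired with a label indicating whether the patient was later diagnosed, is the kind of input the tree-based and other classifiers mentioned earlier typically expect.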

For example, the application of data science methods reveals that diagnosis codes including ‘non inflammatory disorder of uterus’, ‘dysmenorrhea’, ‘pelvic and perineal pain’, and ‘unspecified condition associated with female genital organs and menstrual cycle’ can be found to directly correlate with the risks and symptoms of endometriosis. In addition, diagnosis codes such as ‘submucous leiomyoma of uterus’, ‘ovarian cyst’, ‘hypertrophy of uterus’, ‘excessive bleeding in the premenopausal period’, ‘deep dyspareunia’, and ‘family history of malignant neoplasm of ovary’ are also highly significant in diagnosing the condition. These features have also been identified in several medical publications as important when confirming a patient’s diagnosis [6]. Furthermore, recent clinical research studies noted that women of reproductive age with ‘chronic stress’ are also at a higher risk of developing endometriosis [7].
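To make the idea of a code "correlating" with endometriosis concrete, the sketch below ranks a few indicator features by a smoothed odds ratio between diagnosed and non-diagnosed patients. This is a deliberately simple univariate screen, offered as a stand-in and not as the model-based importance ranking the analysis itself would rely on. All data are made up, and ‘influenza vaccine’ is a hypothetical unrelated feature added only for contrast.

```python
def smoothed_odds_ratio(feature, outcome):
    """Odds ratio of the outcome given the feature, with 0.5 added to each cell."""
    # Haldane-Anscombe correction: start every cell at 0.5 to avoid division by zero.
    a = b = c = d = 0.5
    for x, y in zip(feature, outcome):
        if x and y:
            a += 1  # feature present, diagnosed
        elif x:
            b += 1  # feature present, not diagnosed
        elif y:
            c += 1  # feature absent, diagnosed
        else:
            d += 1  # feature absent, not diagnosed
    return (a * d) / (b * c)


def rank_features(features, outcome):
    """Return (name, odds ratio) pairs sorted from strongest to weakest association."""
    scores = {name: smoothed_odds_ratio(col, outcome) for name, col in features.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy, fabricated-for-illustration data: 1 = diagnosed / code present, 0 = not.
outcome = [1, 1, 1, 0, 0, 0, 0, 1]
features = {
    "dysmenorrhea":      [1, 1, 1, 0, 0, 0, 1, 1],
    "ovarian cyst":      [1, 0, 1, 0, 1, 0, 0, 0],
    "influenza vaccine": [0, 1, 0, 1, 0, 1, 0, 1],  # hypothetical unrelated feature
}

ranking = rank_features(features, outcome)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

An odds ratio well above 1 marks a code seen more often among diagnosed patients. Such a screen shows association only; it does not account for interactions between features, which is one reason multivariate models are preferred in practice.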

Having had medical and surgical procedures such as ‘anesthesia of lower abdomen for laparoscopy’, ‘vaginal hysterectomy including biopsy’, ‘cystourethroscopy’, and ‘laparoscopy, surgical with fulguration or excision of lesions of the ovary, peritoneal surface’ could also help identify the disease in advance, as these procedures are often conducted before the disease is discovered [8, 9]. Furthermore, drugs such as acetaminophen and lidocaine HCl have also been found to be strong predictors of endometriosis, as these medications are often prescribed as analgesics, as birth control and treatment of endometrial cancer, and to numb the skin and muscles [10, 11].

From the perspective of the patient medical journey and healthcare access, patients often consult a variety of healthcare provider specialties, including ‘emergency medicine’, ‘family medicine’, and ‘obstetrics and gynecology’, when experiencing endometriosis-related symptoms and gynecological issues. As a result, having reliable predictive tools for diagnosing patients might help healthcare providers deliver the proper treatment and care faster.

As presented above, leveraging data science and machine learning approaches can aid early prediction of the disease and offer patients an opportunity to receive the needed medical treatment earlier in the patient journey, thereby improving patient care. Typing tools based on data science methods and algorithms, integrated into Electronic Health Records (EHR) systems and easily accessed by healthcare providers, could further improve diagnosing activities and inform diagnostic processes. This would result in timely and precise conclusions on the patient’s health state, ultimately improving patients’ access to care, the delivery of care, and patients’ quality of life.

Keywords: endometriosis, prediction likelihood, disease onset, machine learning algorithms


  1. Endometriosis: Symptoms, Treatment, Diagnosis; https://www.uclahealth.org/obgyn/endometriosis
  2. Doupe P, Faghmous J, Basu S. Machine Learning for Health Services Researchers. Value Health. 22(7): 808-815,
  3. William H. Crown, PhD. Potential application of machine learning in health outcomes research and some statistical cautions. International Society for Pharmacoeconomics and Outcomes Research (ISPOR), 2015. DOI: https://doi.org/10.1016/j.jval.2014.12.005
  4. Marzyeh Ghassemi, Tristan Naumann, Peter Schulam, Andrew L. Beam, Irene Y. Chen, Rajesh Ranganath. A review of challenges and opportunities in machine learning for health. arXivLabs. 2019 v4, https://arxiv.org/abs/1806.00388
  5. Varun H Buch, Irfan Ahmed, Mahiben Maruthappu. Artificial intelligence in medicine: current trends and future possibilities. British Journal of General Practice 2018; 68 (668): 143-144. DOI: https://doi.org/10.3399/bjgp18X695213
  6. Endometriosis – Risks, Signs, Symptoms, Diagnosis and Treatment
  7. Fernando M. Reis, Larissa M. Coutinho, Silvia Vannuccini, Stefano Luisi & Felice Petraglia. Is Stress a Cause or a Consequence of Endometriosis? Reproductive Sciences, volume 27, pages 39–45 (2020). DOI: https://doi.org/10.1007/s43032-019-00053-0
  8. OBG Manag. Endometriosis and infertility: Expert answers to 6 questions to help pinpoint the best route to pregnancy. Mdedge ObGyn 27(6):30-35 (2015). https://www.mdedge.com/obgyn/article/99912/surgery/endometriosis-and-infertility-expert-answers-6-questions-help-pinpoint/
  9. Jon K. Hathaway, MD, PhD, FACS. Decoding Coding. What is the Best Way to Code for Endometriosis? NewsScope, volume 33, issue 2 (2019). https://newsscope.aagl.org/volume-33-issue-2/decoding-coding-what-is-the-best-way-to-code-for-endometriosis/
  10. Bo Liang, Yang-Gui Xie, Xiao Ping Xu, and Chun-Hong Hu. Diagnosis and treatment of submucous myoma of the uterus with interventional ultrasound. NCBI, PMC Oncol Lett (2018). DOI: https://doi.org/10.3892/ol.2018.8122
  11. Endometriosis Adenomyosis: Similarities and Differences https://www.healthline.com/health/womens-health/adenomyosis-vs-endometriosis

About the Author

Ewa J. Kleczyk, PhD is an analytics leader with a proven record of establishing high performing analytics teams and delivering innovative analytics and solutions to the healthcare industry. Currently, Dr. Kleczyk is a Vice President leading the Advanced Analytics group at Symphony Health, a PRA Health Sciences Company. Her experience spans commercial effectiveness, health economics, outcomes research, digital & media analytics, as well as forecasting & promotional impact measurement. Dr. Kleczyk is also a highly sought-after conference speaker with experience speaking at leading industry conferences, including Pharmaceutical Marketing Sciences Association, Intellus, DTC Perspectives, Conference for Business and Economics at Harvard University, etc. She has also published in multiple academic & industry journals and is a board member of several peer-reviewed publications, including the Pharmaceutical Marketing Sciences Association Journal. Dr. Kleczyk has been an active advocate of mentoring future women leaders of the pharmaceutical industry, for which she has been recognized with multiple leadership awards, including HBA’s ‘Rising Star’ & ‘Luminary’ recognitions. Dr. Kleczyk earned her PhD in Economics from Virginia Tech and has been a part-time graduate faculty member in the School of Economics at the University of Maine.

Ewa J. Kleczyk, PhD
Vice President of Advanced Analytics
Symphony Health, A PRA Health Sciences Company
Email: ewa.kleczyk@symphonyhealth.com


