Diagnostic Accuracy & Usability
Independent Accuracy Testing
(Using Gastro-Intestinal Disease Publishing Set and NxOpinion 1.0)
NIMS (Nizam’s Institute of Medical Sciences), Hyderabad, India, and Central Michigan University, Mt. Pleasant, MI
In a collaborative effort between the Nizam’s Institute of Medical Sciences (NIMS) of Hyderabad, India, and Central Michigan University (CMU) of Mt. Pleasant, Michigan, NxOpinion 1.0 was tested for diagnostic accuracy on Gastro-Intestinal (GI) diseases by 4 general practice physicians treating a total of 190 GI patients. A repeated-measures design was used, balanced for the order of the diagnostic process: each patient saw a physician either with NxOpinion first and without NxOpinion second, or vice versa. All physician diagnoses were compared against the final decision of a GI specialist, which was taken as the ‘gold standard’ for accuracy.
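The order balancing described above can be illustrated with a minimal sketch. The report does not say how patients were assigned to each order, so the simple alternation below, along with the names `assign_orders`, `with_first` and `without_first`, is a hypothetical illustration and not the study's actual procedure.

```python
from collections import Counter


def assign_orders(patient_ids):
    """Alternate the two visit orders so each order gets an equal share.

    Assumption: plain alternation; a real study would more likely
    randomize the assignment within balanced blocks.
    """
    orders = ("with_first", "without_first")
    return {pid: orders[i % 2] for i, pid in enumerate(patient_ids)}


# 190 patients, as reported in the study description.
assignment = assign_orders(range(1, 191))
print(Counter(assignment.values()))  # 95 patients in each order
```

With an even number of patients, each order is used exactly the same number of times, which is what "balanced for order" requires.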
The final analysis, dated May 30, 2007, indicated that NxOpinion was independently able to achieve the correct diagnosis nearly as often as the general practitioner and that it even raised diagnostic accuracy by 18% for certain general practitioners. The study was considered successful enough for NIMS and the Government of India to initiate a Letter of Intent and pursue additional field testing of NxOpinion by non-physicians (physician extenders) in rural India as part of the Rural Health Initiative. This testing will begin in July 2007 using NxOpinion Version 3.4.
In-House Accuracy Testing
(Using Rural Disease Publishing Set of 164+ most common, global diseases and NxOpinion 3.0)
“Textbook Testing”
Target common diseases are selected from current medical textbooks and common presentation evidence is entered into the software.
Passing criteria:
1. Case diagnosis ranks in the top 2 with 10 relevant findings placed into evidence.
2. Additional impertinent evidence (3 pieces) does not alter the disease’s top 2 standing.
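The two passing criteria can be expressed as a short check. This is a sketch under assumptions: it reads "does not alter the top 2 standing" as "the target disease is still within the top 2", and the function name, list-based rankings and disease names are hypothetical and not part of NxOpinion.

```python
def passes_ranking_test(ranked_before, ranked_after, target, top_n=2):
    """Return True when `target` is within the first `top_n` diagnoses both
    with relevant findings only (`ranked_before`) and after impertinent
    evidence has been added (`ranked_after`)."""
    in_top_before = target in ranked_before[:top_n]
    in_top_after = target in ranked_after[:top_n]
    return in_top_before and in_top_after


# Hypothetical rankings, most likely diagnosis first.
before = ["appendicitis", "gastroenteritis", "cholecystitis"]
after = ["gastroenteritis", "appendicitis", "cholecystitis"]
print(passes_ranking_test(before, after, "appendicitis", top_n=2))  # True
```

The same check covers the other test types in this document by changing `top_n` to 5 or 10.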
Latest Test Data:
50 Test Cases
· 96% pass
“Leak Testing”
“Leak factors” represent pieces of evidence which are not pertinent to refining a diagnosis and may dilute the strength of the probability of a correct diagnosis.
Both Case Study Testing and Textbook Testing check for leak factors. In this dedicated leak test, only innocuous patient information is added to the relevant symptoms.
Latest Test Data:
· No diagnostic dilution occurred, even at a ratio of 5 relevant pieces of evidence to 10 impertinent pieces of evidence.
(Using Rural Disease Publishing Set of 188 diseases and NxOpinion 3.4 release)
“Case Study Testing”
After every new technical release and knowledgebase addition, diagnostics are tested by independently transcribing evidence from published patient case studies into the software. The target cases are split equally among four categories: common presentations of common diseases, uncommon presentations of common diseases, common presentations of uncommon diseases, and uncommon presentations of uncommon diseases. Evidence not pertinent to each case’s diagnosis is then added to simulate an inexperienced user who does not know which evidence is pertinent.
Passing criteria:
1. Case diagnosis ranks in the top 5 with up to 10 relevant findings placed into evidence.
2. Additional impertinent evidence (5 pieces) does not alter the disease’s top 5 standing.
Latest Test Data:
16 Test Cases
· Common presentations of common diseases: 100% pass
· Uncommon presentations of common diseases: 75% pass
· Common presentations of uncommon diseases: 100% pass
· Uncommon presentations of uncommon diseases: 25% pass
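The per-category rates above can be combined into an overall figure. The sketch below assumes the 16 cases are split 4 per category, which follows from the statement that the categories are equally represented but is not given as a count in the test data; the overall rate is therefore derived, not reported.

```python
CASES_PER_CATEGORY = 4  # assumption: 16 cases spread equally over 4 categories

# Reported pass rates, in percent, per category.
pass_rates = {
    "common presentation, common disease": 100,
    "uncommon presentation, common disease": 75,
    "common presentation, uncommon disease": 100,
    "uncommon presentation, uncommon disease": 25,
}

# Convert each percentage back into a whole number of passing cases.
passed = sum(rate * CASES_PER_CATEGORY // 100 for rate in pass_rates.values())
total = CASES_PER_CATEGORY * len(pass_rates)
print(f"{passed} of {total} cases passed ({passed / total:.0%})")
# prints "12 of 16 cases passed (75%)"
```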
“NxOpinion vs. Isabel Testing”
Data was entered into the Isabel software online, using a 30-day free trial. The NxOpinion diagnostic ranking of each disease tested was compared with the corresponding result from Isabel. Target common diseases were selected from current medical textbooks, and common presentation evidence was entered into the software. Impertinent evidence was also added to simulate the inexperienced provider not knowing what evidence is relevant to a case.
Testing Assumption:
All diseases tested were diagnosable by both Isabel and NxOpinion. The diseases occurred in the disease list of NxOpinion by name and while the same type of disease list was not accessible in Isabel, every other indication suggested that the target diseases were included in Isabel’s database.
Passing criteria:
1. Case diagnosis ranks in the top 10 with 10 relevant findings placed into evidence.
2. Additional impertinent evidence (3 pieces) does not alter the disease’s top 10 standing.
Latest Test Data:
25 Test Cases
· NxOpinion: 92% pass, all ranking in the top 2 with pertinent and impertinent evidence
· Isabel: 84% pass, all ranking in the top 10
Disadvantages we saw with Isabel:
1. Isabel arranges diagnoses by body system and, apart from identifying a top 10, in no particular order.
2. Isabel requires correct spelling and correct medical terminology.
3. Isabel does not offer a smart search of terms.
4. Isabel does not inform the user which evidence is relevant and is being considered in the diagnosis, and advises the user that it “may not have influenced the result”.
5. Isabel does not prompt the user to look at more evidence to further refine the diagnosis.
6. Isabel tends to provide general disease categories rather than the specific
7. Isabel includes the disclaimer “if you use a lot of synonyms, the results may be skewed”, meaning “leak” is an issue for Isabel.
8. No case storage in trial version.
9. Information retrieval requires extra steps to identify age and gender.
“NxOpinion vs. Epocrates Essentials”
Target common diseases were selected from current medical textbooks and common presentation evidence was entered into the software. Data was entered into Epocrates Sx Dx Version 1.1 via a Personal Digital Assistant (PDA), while data was entered into NxOpinion using a laptop computer. The diagnostic rankings from each program were compared. Unrelated evidence was then added to simulate the inexperienced provider not knowing what evidence is relevant to a case.
Testing Assumptions:
1. All diseases tested were diagnosable by both Epocrates and NxOpinion. The diseases occurred in the disease lists of both software programs.
2. It is assumed that Epocrates uses a rank order of diagnostic suggestions even though no statement to this effect is given.
Passing criteria #1:
1. Case diagnosis ranks in the top 2 with 10 relevant findings placed into evidence.
2. Additional impertinent evidence (3 pieces) does not alter the disease’s top 2 standing.
Latest Test Data:
25 Test Cases
· NxOpinion: 96% pass
· Epocrates: 61% pass
Passing criteria #2:
1. Case diagnosis ranks in the top 5 with 10 relevant findings placed into evidence.
2. Additional impertinent evidence (3 pieces) does not alter the disease’s top 5 standing.
Latest Test Data:
25 Test Cases
· NxOpinion: 100% pass
· Epocrates: 83% pass
Disadvantages we saw with Epocrates:
1. Epocrates is intuitive primarily to the experienced user.
2. Disease rankings are ordinal and do not provide any information on the degree of confidence in the order.
3. Epocrates uses limited medical history, medications and family history evidence.
4. Epocrates primarily relies on medical jargon/vocabulary.
5. Epocrates lacks case storage features.
6. Important features are restricted to tabs that are easily missed and not well managed.
7. Epocrates lacks direct links to add symptoms into evidence wherever they appear, requiring excess navigational steps.