This website describes projects that are part of the Harvard Program in Therapeutic Science (HiTS), at Harvard Medical School.




INDRA (Integrated Network and Dynamical Reasoning Assembler) is an automated model assembly system, originally developed for molecular systems biology and currently being generalized to other domains. INDRA draws on natural language processing systems and structured databases to collect mechanistic and causal assertions, represents them in a standardized form (INDRA Statements), and assembles them into various modeling formalisms including causal graphs and dynamical models. INDRA also provides knowledge assembly procedures that operate on INDRA Statements and correct certain errors, find and resolve redundancies, infer missing information, filter to a scope of interest and assess belief.
Code   Docs
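
The assembly steps described above (collecting assertions from several sources, resolving redundancies, assessing belief, filtering to a scope and emitting a causal graph) can be illustrated with a small, self-contained sketch. This is not INDRA's API: the `Statement` class, the source names, the error rates in `ERROR_RATE` and the simple "at least one source is right" belief formula are all assumptions made up for illustration, and INDRA's own Statements and belief model are considerably richer.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """Toy stand-in for a mechanistic assertion (hypothetical, not the INDRA class)."""
    kind: str    # type of mechanism, e.g. "Phosphorylation"
    subj: str    # upstream entity
    obj: str     # downstream entity
    source: str  # reader or database that produced the assertion


# Assumed, made-up probability that a given source reports a wrong assertion.
ERROR_RATE = {"reader_a": 0.3, "reader_b": 0.4, "database": 0.05}


def assemble(statements):
    """Merge duplicate assertions and assign each unique one a belief score.

    Belief is 1 minus the probability that every distinct supporting
    source is wrong (a simplifying assumption for this sketch).
    """
    sources_by_key = defaultdict(set)
    for stmt in statements:
        sources_by_key[(stmt.kind, stmt.subj, stmt.obj)].add(stmt.source)
    beliefs = {}
    for key, sources in sources_by_key.items():
        p_all_wrong = 1.0
        for src in sources:
            p_all_wrong *= ERROR_RATE[src]
        beliefs[key] = 1.0 - p_all_wrong
    return beliefs


def filter_scope(beliefs, entities, min_belief=0.0):
    """Keep assertions whose subject and object are both in scope."""
    return {
        key: belief
        for key, belief in beliefs.items()
        if key[1] in entities and key[2] in entities and belief >= min_belief
    }


def to_causal_graph(beliefs):
    """Emit a sorted edge list (subject, object, kind) as a minimal causal graph."""
    return sorted((subj, obj, kind) for kind, subj, obj in beliefs)


if __name__ == "__main__":
    raw = [
        Statement("Phosphorylation", "MAP2K1", "MAPK1", "reader_a"),
        Statement("Phosphorylation", "MAP2K1", "MAPK1", "database"),
        Statement("Activation", "MAPK1", "ELK1", "reader_b"),
    ]
    for edge in to_causal_graph(assemble(raw)):
        print(edge)
```

Keying duplicates on the assertion content rather than on its source is what lets independent evidence accumulate: an assertion reported by two sources ends up with a higher belief than one reported by either source alone.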

Bob with Bioagents dialogue system

Bob with Bioagents is a machine partner you can chat with about molecular biology to solve problems together. Suppose you want to explain an experimental observation or get ideas for a new hypothesis. You can talk with the machine agent in English to discuss topics such as drugs, transcription factors, miRNAs and their targets, as well as various mechanisms described in the literature and databases. You can also build up a model of a mechanism by describing it during the dialogue, and then ask questions about the properties of the model to see whether it behaves as expected.

INDRA-IPM (Interactive Pathway Map)

The INDRA-IPM allows you to build pathway maps using natural language descriptions. You simply describe the set of mechanisms to include in English, and then click a button to assemble and lay out a pathway map. The pathway map can be exported to formats such as SBML, SBGN and Kappa, among others.

INDRA DepMap Explainer

The DepMap Explainer builds on INDRA assembly of literature extractions produced by natural language processing systems to construct mechanistic explanations for correlations between genes involved in CRISPR screens of cancer cell lines found at


IndraBot is a chat bot which is deployed on the #indra_bot channel of the Harvard Program in Therapeutic Science Slack workspace. It can answer natural language questions about mechanisms such as "what phosphorylates ELK1?" or "does RHOA interact with MYL12B?" by querying a database of INDRA assemblies of mechanisms extracted from the literature and pathway databases.
Access: If you are interested in deploying the IndraBot in your Slack workspace, contact our team.



A REST API providing access to many of INDRA's key services, including reading, preassembly and model assembly.
Docs   API access:


A REST API providing access to the accumulated mechanistic knowledge obtained by reading all the available medical literature with multiple reading systems, integrating existing pathway databases, and assembling the results using INDRA.
Docs   API access: please contact us for access details


Big Mechanism

The DARPA Big Mechanism program set out to automate the reading, assembly and modeling of mechanisms from the scientific literature. We built INDRA, an automated model assembly system that draws on natural language processing systems and assembles their output into various predictive and explanatory models.
Funded by the Defense Advanced Research Projects Agency under award W911NF-14-1-0397.

Communicating with Computers

The DARPA Communicating with Computers (CwC) program develops technologies for a new generation of human-machine interaction in which machines act as proactive collaborators rather than merely problem-solving tools. We are developing an interactive dialogue system which allows scientists to interact with a computer partner – one that is able to harness knowledge extracted from the biomedical literature – to construct and test hypotheses about molecular systems.
Funded by the Defense Advanced Research Projects Agency under award W911NF-15-1-0544.

Automated Scientific Discovery Framework

The DARPA Automated Scientific Discovery Framework (ASDF) program will develop algorithms and software for reasoning about complex mechanisms operating in the natural world, explaining large-scale data, assisting humans in generating actionable, model-based hypotheses and testing these hypotheses empirically.
Funded by the Defense Advanced Research Projects Agency under award W911NF-18-1-0124.

World Modelers

The DARPA World Modelers program aims to develop automated information collection and computational modeling techniques to understand the complex dynamics of global processes such as food security, migration and public health. We are developing the INDRA-GEM (Integrated Network and Dynamical Reasoning Assembler for Generalized Ensemble Modeling) automated model assembly system, which integrates information from diverse sources and implements novel probabilistic assembly techniques that can account for the uncertain nature of information in models.
Funded by the Defense Advanced Research Projects Agency under award W911NF-18-1-0014.

Automating Scientific Knowledge Extraction

The DARPA ASKE program is part of DARPA's broader Artificial Intelligence Exploration program, whose goal is to develop technologies for the "Third Wave" of AI. We are developing EMMAA (Ecosystem of Machine-maintained Models with Automated Assembly), a set of self-updating models of cancer biology that run analyses proactively and report meaningful changes in their conclusions to users.
Funded by the Defense Advanced Research Projects Agency under award HR00111990009.

Panacea

The STOP PAIN project, as part of DARPA’s Panacea program, aims to develop novel drugs for the treatment of pain and inflammation using innovative research platforms. Unlike many modern drug discovery campaigns, which are target-focused, we combine target-agnostic screening with network inference tools to create causal and mechanistic networks used to identify previously unknown relationships between targets and chemical ligands.
Funded by the Defense Advanced Research Projects Agency under award HR00111920022.


Gyori BM, Bachman JA, Subramanian K, Muhlich JL, Galescu L, Sorger PK. From word models to executable models of signaling networks using automated assembly. Molecular Systems Biology. 2017 13(11):954.
Summary video:

Bachman JA, Gyori BM, Sorger PK. FamPlex: a resource for entity recognition and relationship resolution of human protein families and complexes in biomedical text mining. BMC Bioinformatics. 2018 19(1):248.
Repository: FamPlex

Todorov PV, Gyori BM, Bachman JA, Sorger PK. INDRA-IPM: interactive pathway modeling using natural language with automated assembly. Bioinformatics. 2019.

Sharp R, Pyarelal A, Gyori BM, Alcock K, Laparra E, Valenzuela-Escárcega MA, Nagesh A, Yadav V, Bachman JA, Tang Z, Lent H, Luo F, Paul M, Bethard S, Barnard K, Morrison C, Surdeanu M. Eidos, INDRA, & Delphi: From Free Text to Executable Causal Models. NAACL. 2019.

Hoyt C, Domingo-Fernández D, Aldisi R, Xu L, Kolpeja K, Spalek S, Wollert E, Bachman J, Gyori BM, Greene P, Hofmann-Apitius M. Re-curation and rational enrichment of knowledge graphs in Biological Expression Language. Database. 2019.


Our research on human-machine collaboration was featured in WIRED UK, in the article "The merging of humans and machines is happening now", written by the then director of DARPA, Arati Prabhakar.

Ben Gyori and John Bachman were interviewed by The Guardian in the tech podcast "Siri of the Cell". In it, they introduce our approach to human-machine communication and the assembly of models from the scientific literature.

Ben Gyori and John Bachman were interviewed for an article in Harvard Medicine Magazine. In "A Closer Read" (see section WALL-E), they talk about natural language processing and the INDRA system.


Our team is part of the Harvard Program in Therapeutic Science and the Laboratory of Systems Pharmacology at Harvard Medical School.

Core members

Peter Sorger, PhD

PI, Otto Krayer Professor of Systems Pharmacology

Benjamin Gyori, PhD

Project Lead, Research Associate in Therapeutic Science

John Bachman, PhD

Project Lead, Research Associate in Therapeutic Science

Patrick Greene

Scientific Software Developer

Klas Karis

Scientific Software Developer

Albert Steppi, PhD

Scientific Software Developer

Diana Kolusheva

Scientific Software Developer

Catherine Luria, PhD

Program Manager


  • Robert Sheehan, PhD
  • Lily Chylek, PhD
  • Kartik Subramanian, PhD
  • Jeremy Muhlich
  • Artem Sokolov, PhD
  • Mohammed AlQuraishi, PhD
  • Joe Cunningham
  • Alyce Chen, PhD
  • Fabian Froehlich, PhD

Past members

  • Petar Todorov
  • Isabel Latorre, PhD
  • P.S. Thiagarajan, PhD
  • Daniel Milstein
  • William Chen, PhD