Reproducing the creation of hetionet #34

olszewskip · 2020-09-30T15:06:49Z

Hi! Not sure if this is the right place to ask this, but here goes:

I've read through https://think-lab.github.io/p/rephetio/. Hetionet seems extremely impressive and useful. I need something very similar, if not identical, but with the emphasis on diagnosing rare diseases. I would also strongly prefer to be able to automatically update or add new data to my database, e.g. to include some new GWAS findings, tailor the specificity of the disease terms to my needs, or maybe add other node types like genetic variants. Hence, I'm wondering: how hard would it be for a group of a couple of people to reproduce something like Hetionet from scratch, possibly in little steps? I see that https://think-lab.github.io/p/rephetio/#methods has some detailed information about what steps were taken, and also quite a number of links to files hosted on Zenodo. Would you say that all the information is there, or should I also look elsewhere? Was the main "mode of operation" to download text files from the internet, parse/preprocess/unify/join the data using Python scripts, and then inject into Neo4j?

Apologies for a vague question. Many thanks for any suggestions! :)

dhimmel · 2020-09-30T16:23:16Z

Sounds like you're most interested in https://github.com/dhimmel/integrate. This repo does the following:

> download text files from the internet, parse/preprocess/unify/join the data using python scripts, and then inject into Neo4j?

Particularly, the integrate.ipynb notebook will be of interest.

Note that most datasets don't come directly from the upstream resource, but rather from an intermediate repo that performs pre-processing. In total, there are dozens of repositories that work together to create Hetionet, but the creation is all orchestrated in the dhimmel/integrate repo.
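
For readers who want a feel for the "parse, unify, join, then inject into Neo4j" workflow described above, here is a minimal sketch in Python. It is not the code from dhimmel/integrate: the sample tables, the column names, the `ASSOCIATES` relationship type, and the helper functions `read_tsv`, `build_graph`, and `to_cypher` are all hypothetical, chosen only for illustration. Small inline strings stand in for the downloaded, pre-processed files, and the output is a list of parameterized Cypher statements rather than a live database write.

```python
import csv
import io

# Made-up sample tables standing in for pre-processed TSV files (assumption).
DISEASE_TSV = "identifier\tname\nD1\tdisease one\nD2\tdisease two\n"
GENE_TSV = "identifier\tname\nG1\tgene one\n"
# The last row points at a gene that is absent from GENE_TSV.
ASSOCIATION_TSV = "disease_id\tgene_id\nD1\tG1\nD2\tG9\n"


def read_tsv(text):
    """Parse tab-separated text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def build_graph(disease_rows, gene_rows, association_rows):
    """Unify the tables into nodes and edges, dropping dangling edges."""
    nodes = {}
    for label, rows in (("Disease", disease_rows), ("Gene", gene_rows)):
        for row in rows:
            nodes[(label, row["identifier"])] = row["name"]
    edges = []
    for row in association_rows:
        source = ("Disease", row["disease_id"])
        target = ("Gene", row["gene_id"])
        # Join step: keep an edge only if both endpoints are known nodes.
        if source in nodes and target in nodes:
            edges.append((source, "ASSOCIATES", target))
    return nodes, edges


def to_cypher(nodes, edges):
    """Turn nodes and edges into parameterized (query, parameters) pairs."""
    statements = []
    for (label, identifier), name in nodes.items():
        # Labels cannot be Cypher parameters; here they come from code constants.
        query = f"MERGE (n:{label} {{identifier: $identifier}}) SET n.name = $name"
        statements.append((query, {"identifier": identifier, "name": name}))
    for (src_label, src_id), rel_type, (dst_label, dst_id) in edges:
        query = (
            f"MATCH (a:{src_label} {{identifier: $source}}), "
            f"(b:{dst_label} {{identifier: $target}}) "
            f"MERGE (a)-[:{rel_type}]->(b)"
        )
        statements.append((query, {"source": src_id, "target": dst_id}))
    return statements


nodes, edges = build_graph(
    read_tsv(DISEASE_TSV), read_tsv(GENE_TSV), read_tsv(ASSOCIATION_TSV)
)
for query, parameters in to_cypher(nodes, edges):
    print(query, parameters)
```

Using `MERGE` rather than `CREATE` means the statements can be re-run after adding new rows without duplicating existing nodes or relationships, which is one possible way to approach the incremental updates asked about in the question. With the official `neo4j` Python driver, each pair could be executed as `session.run(query, parameters)`.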

olszewskip · 2020-09-30T20:07:13Z

Awesome! Thank you.

dhimmel closed this as completed Sep 7, 2021

dhimmel mentioned this issue Sep 7, 2021

How to add new disease and anatomy nodes #42

Open
