Creating high-quality scientific figures can be time-consuming and challenging, even though sketching ideas on paper is relatively easy. Furthermore, recreating existing figures that are not stored in formats preserving semantic information is equally complex. To tackle this problem, we introduce DeTikZify, a novel multimodal language model that automatically synthesizes scientific figures as semantics-preserving TikZ graphics programs based on sketches and existing figures. We also introduce an MCTS-based inference algorithm that enables DeTikZify to iteratively refine its outputs without the need for additional training.
*Showcase video: Showcase.mp4*
- 2025-03-17: We release TikZero adapters, which plug directly into DeTikZifyv2 (8b) and enable zero-shot text conditioning, as well as TikZero+, which adds end-to-end fine-tuning. For more information, see our paper and the usage examples below.
- 2024-12-05: We release DeTikZifyv2 (8b), our latest model, which surpasses all previous versions in our evaluation, and make it the new default model in our Hugging Face Space. Check out the model card for more information.
- 2024-09-24: DeTikZify was accepted at NeurIPS 2024 as a spotlight paper!
Tip
If you encounter difficulties with installation and inference on your own hardware, consider visiting our Hugging Face Space (please note that restarting the space can take up to 30 minutes). Should you experience long queues, you have the option to duplicate it with a paid private GPU runtime for a more seamless experience. Additionally, you can try our demo on Google Colab. However, setting up the environment there might take some time, and the free tier only supports inference for the 1b models.
The Python package of DeTikZify can be easily installed using pip:
pip install 'detikzify[legacy] @ git+https://github.com/potamides/DeTikZify'
The [legacy] extra is only required if you plan to use the DeTikZifyv1 models; if you only plan to use DeTikZifyv2, you can remove it. If your goal is to run the included examples, it is easier to clone the repository and install it in editable mode like this:
git clone https://github.com/potamides/DeTikZify
pip install -e DeTikZify[examples]
In addition, DeTikZify requires a full TeX Live 2023 installation, ghostscript, and poppler, which you have to install through your package manager or by other means.
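If you are unsure whether these system dependencies are set up correctly, a quick way to check is to look for the corresponding binaries on your PATH. The snippet below is only a rough sanity check; the exact binaries DeTikZify invokes internally may differ:

from shutil import which

# Rough sanity check: look for typical TeX Live, ghostscript, and poppler
# binaries on PATH (the exact binaries used internally may differ).
for binary in ["lualatex", "pdflatex", "gs", "pdftoppm"]:
    print(f"{binary}: {'found' if which(binary) else 'MISSING'}")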
Tip
For interactive use and general usage tips, we recommend checking out our web UI, which can be started directly from the command line (use --help for a list of all options):
python -m detikzify.webui --light
If all required dependencies are installed, the full range of DeTikZify features, such as compiling, rendering, and saving TikZ graphics, as well as MCTS-based inference, can be accessed through its programming interface:
DeTikZify Example
from operator import itemgetter
from detikzify.model import load
from detikzify.infer import DetikzifyPipeline
image = "https://w.wiki/A7Cc"
pipeline = DetikzifyPipeline(*load(
    model_name_or_path="nllg/detikzify-v2-8b",
    device_map="auto",
    torch_dtype="bfloat16",
))
# generate a single TikZ program
fig = pipeline.sample(image=image)
# if it compiles, rasterize it and show it
if fig.is_rasterizable:
    fig.rasterize().show()
# run MCTS for 10 minutes and generate multiple TikZ programs
figs = set()
for score, fig in pipeline.simulate(image=image, timeout=600):
    figs.add((score, fig))
# save the best TikZ program
best = sorted(figs, key=itemgetter(0))[-1][1]
best.save("fig.tex")
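The example above reads the input figure from a URL. As a minimal sketch, and assuming that pipeline.sample also accepts pre-loaded PIL images (the file name sketch.png is a placeholder), a locally stored hand-drawn sketch can be processed with the same pipeline:

from PIL import Image

# Sketch: reuse the pipeline from the example above with a local hand-drawn
# sketch. Assumes pipeline.sample also accepts PIL images; "sketch.png" is a
# placeholder file name.
sketch = Image.open("sketch.png").convert("RGB")
fig = pipeline.sample(image=sketch)
if fig.is_rasterizable:
    fig.rasterize().show()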
Through TikZero adapters and TikZero+ it is also possible to synthesize graphics programs conditioned on text (cf. our paper for details). Note that this is currently only supported through the programming interface:
TikZero+ Example
from detikzify.model import load
from detikzify.infer import DetikzifyPipeline
caption = "A multi-layer perceptron with two hidden layers."
pipeline = DetikzifyPipeline(*load(
    model_name_or_path="nllg/tikzero-plus-10b",
    device_map="auto",
    torch_dtype="bfloat16",
))
# generate a single TikZ program
fig = pipeline.sample(text=caption)
# if it compiles, rasterize it and show it
if fig.is_rasterizable:
    fig.rasterize().show()
TikZero Example
from detikzify.model import load, load_adapter
from detikzify.infer import DetikzifyPipeline
caption = "A multi-layer perceptron with two hidden layers."
pipeline = DetikzifyPipeline(
    *load_adapter(
        *load(
            model_name_or_path="nllg/detikzify-v2-8b",
            device_map="auto",
            torch_dtype="bfloat16",
        ),
        adapter_name_or_path="nllg/tikzero-adapter",
    )
)
# generate a single TikZ program
fig = pipeline.sample(text=caption)
# if it compiles, rasterize it and show it
if fig.is_rasterizable:
    fig.rasterize().show()
We upload all our DeTikZify models and datasets to the Hugging Face Hub (TikZero models are available here). However, please note that for the public release of the DaTikZv2 and DaTikZv3 datasets, we had to remove a considerable portion of TikZ drawings originating from arXiv, as the arXiv non-exclusive license does not permit redistribution. We do, however, release our dataset creation scripts and encourage anyone to recreate the full version of DaTikZ themselves.
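For working with the public datasets programmatically, they can be loaded with the Hugging Face datasets library. The snippet below is only a sketch; the dataset identifier nllg/datikz-v3 is an assumption, so please check the Hub for the exact name:

from datasets import load_dataset

# Sketch: load the public DaTikZ release from the Hugging Face Hub.
# The identifier "nllg/datikz-v3" is an assumption; check the Hub for
# the exact dataset name.
datikz = load_dataset("nllg/datikz-v3", split="train")
print(datikz)            # features and number of rows
print(datikz[0].keys())  # fields of a single example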
If DeTikZify and TikZero have been beneficial for your research or applications, we kindly request you to acknowledge this by citing them as follows:
@inproceedings{belouadi2024detikzify,
  title={{DeTikZify}: Synthesizing Graphics Programs for Scientific Figures and Sketches with {TikZ}},
  author={Jonas Belouadi and Simone Paolo Ponzetto and Steffen Eger},
  booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
  year={2024},
  url={https://openreview.net/forum?id=bcVLFQCOjc}
}

@misc{belouadi2025tikzero,
  title={{TikZero}: Zero-Shot Text-Guided Graphics Program Synthesis},
  author={Jonas Belouadi and Eddy Ilg and Margret Keuper and Hideki Tanaka and Masao Utiyama and Raj Dabre and Steffen Eger and Simone Paolo Ponzetto},
  year={2025},
  eprint={2503.11509},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2503.11509},
}
The implementation of the DeTikZify model architecture is based on LLaVA and AutomaTikZ (for v1) and on Idefics 3 (for v2). Our MCTS implementation is based on VerMCTS. The TikZero architecture draws inspiration from Flamingo and LLaMA 3.2-Vision.