CorrelAid/h4sg25_cdl_challenge

CDL @ Hack4SocialGood 2025: Extracting Funding Amounts from State Funding Programs

  • The goal is to develop a method that automatically extracts funding amounts from funding programs.
  • The problem can be framed as a Named Entity Recognition (NER) or information extraction task.
  • The dataset consists of funding programs listed in the German federal government's "Förderdatenbank" (funding database). These have been scraped and published here.
  • The challenge is motivated by an attempt to quantify how much the German government spends on promoting democracy. As a first step, a classifier has already been developed to identify democracy funding programs. Extracting the funding amounts is the next step. An article on the project (in German) is available here.

Data

  • The data originates from the website: www.foerderdatenbank.de
  • A description of the scraped dataset, as well as the link to the data, can be found here.
  • An example of how to read the data with Python is available here.

Possible Approaches

  • NER using the Python package spaCy.
  • Fine-tuning language models like BERT.
  • In-context learning with generative LLMs.
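
Before investing in any of the approaches above, a simple rule-based baseline can serve as a point of comparison. The following is a minimal sketch, not one of the listed approaches: it assumes that amounts appear as German-formatted numbers (dot as thousands separator, comma as decimal separator) followed by a currency marker such as "Euro", "EUR" or "€". The function name `extract_amounts`, the pattern and the scale table are illustrative assumptions and not part of the challenge materials.

```python
import re
from decimal import Decimal

# Multipliers for German scale words (assumed spellings; extend as needed).
SCALE = {
    "mio": 1_000_000,
    "million": 1_000_000,
    "millionen": 1_000_000,
    "mrd": 1_000_000_000,
    "milliarde": 1_000_000_000,
    "milliarden": 1_000_000_000,
}

# A number, an optional scale word, then a currency marker.
# Limitation: amounts written with the currency first ("EUR 200.000")
# are not covered by this pattern.
AMOUNT_RE = re.compile(
    r"(?<![\d.,])"                                # do not start inside a number
    r"(?P<number>\d{1,3}(?:\.\d{3})+|\d+)"        # 50.000 or 1500
    r"(?P<decimals>,\d+)?\s*"                     # optional German decimals: ,5
    r"(?:(?P<scale>Mio\.?|Mrd\.?|Millionen|Million|Milliarden|Milliarde)\s*)?"
    r"(?:Euro|EUR|€)",
    re.IGNORECASE,
)


def extract_amounts(text: str) -> list[Decimal]:
    """Return all Euro amounts found in `text`, in order of appearance."""
    amounts = []
    for match in AMOUNT_RE.finditer(text):
        # Convert German number formatting to a plain decimal string.
        number = match.group("number").replace(".", "")
        decimals = (match.group("decimals") or "").replace(",", ".")
        value = Decimal(number + decimals)
        # Apply the scale word, if any ("Mio." -> "mio").
        scale = (match.group("scale") or "").rstrip(".").lower()
        amounts.append(value * SCALE.get(scale, 1))
    return amounts


example = ("Die Förderung beträgt bis zu 50.000 Euro, "
           "insgesamt stehen 1,5 Millionen Euro bereit.")
print(extract_amounts(example))
```

A baseline like this does not distinguish a maximum grant from a total budget or a minimum project size. Closing that gap is where the NER and language-model approaches listed above would have to add value.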

Important Considerations

  • The method should be evaluated with suitable metrics, such as the F1 score or accuracy.
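
As an illustration of such an evaluation, the sketch below computes precision, recall and the F1 score for extracted amounts. It assumes exact matching: a prediction counts as correct only if the same amount was annotated for the same document. The function name `precision_recall_f1`, the `(document_id, amount)` representation and the example values are hypothetical. The challenge materials do not prescribe an annotation format.

```python
from collections import Counter


def precision_recall_f1(predicted, gold):
    """Micro-averaged scores over (document_id, amount) pairs.

    Counters are used so that an amount occurring twice in a document
    must also be predicted twice to count fully.
    """
    pred_counts = Counter(predicted)
    gold_counts = Counter(gold)
    # Multiset intersection: pairs present in both prediction and gold.
    true_positives = sum((pred_counts & gold_counts).values())
    n_pred = sum(pred_counts.values())
    n_gold = sum(gold_counts.values())

    precision = true_positives / n_pred if n_pred else 0.0
    recall = true_positives / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations and predictions for two documents.
gold = [("doc1", 50000), ("doc1", 1500000), ("doc2", 10000)]
predicted = [("doc1", 50000), ("doc2", 10000), ("doc2", 2025)]

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Exact matching is strict. A looser variant could accept amounts within a tolerance, or score character spans instead of values, depending on how the gold annotations are defined.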
