Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unit tests profiling #822

Merged
merged 12 commits into from
Mar 6, 2024
Merged

Unit tests profiling #822

merged 12 commits into from
Mar 6, 2024

Conversation

GernotMaier
Copy link
Contributor

@GernotMaier GernotMaier commented Mar 4, 2024

Unit tests take are taking a long time on github (e.g., >2h in https://github.com/gammasim/simtools/actions/runs/8141547537/job/22249144146).

This PR solves this issue:

  1. pytest fixtures depend on the sequence. The call of simtelpath in the following line:
corsika_runner(corsika_config_data, io_handler, simtel_path, db_config):

overwrites (!) environmental variables to mock values. This leads to a re-definition of the DB-related configuration parameters and the runner spends a long time and trying to access the DB with wrong access values (and the gitlab repo). This was not noted locally, as in this case the .env file was read for the DB configuration parameters.
I've tried to fix this with this PR.

  1. SiteModel is generated several times and each time the DB is accessed. Removed here one obvious and easy to fix call to SiteModel (the other calls are part of the discussed improvement of ArrayModel/SiteModel/TelescopeModel configuration.

Tests require now 10-15 min instead of 2-3 hours. Due to the increased number of DB calls with the introduction of the new simulation model parameters, an increase compared to the main branch is expected.

@GernotMaier GernotMaier changed the base branch from main to unit-vs-units March 5, 2024 13:44
@GernotMaier GernotMaier linked an issue Mar 5, 2024 that may be closed by this pull request
@GernotMaier GernotMaier marked this pull request as ready for review March 5, 2024 15:26
Copy link
Contributor

@orelgueta orelgueta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the change made, see comment inline.

if simtel_path.exists():
return simtel_path
def simtel_path(tmp_test_directory):
_simtel_path = Path(str(tmp_test_directory) + "/simtel")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, but this is not the sim_telarray path, this is some non-existent directory in the test directory. The unit tests work with this because they never call sim_telarray. Not sure how the integration tests work with this (do they use something else?).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but the issue before was the calling simtel_path called mock_settings_env_vars which reset all environmental variables for the database. So this should not be the case and I have disentangled that. This is the line moved done from mock_settings_env_vars.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure but this function can now be replaced by return "" and it would do the same. I understand the need not to call mock_settings_env_vars again, but this function has no meaning now.

I think in general its name and the name of the function below it are confusing. It should be that the other function is the "real" one that returns the real sim_telarray path and this one should be "mock_simtel_path" if it's needed. Though I must admit I am a bit confused about it right now, so maybe I misunderstand the value of the two functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will have later another look. I didn't care so much about the meaning of simtel_path, as I was concentrating on the issue with the DB setting.

In general, I am very confused by a lot of pytest fixtures. I do think we have made them too complicated, with deep cascades of fixtures on calling another.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked this again:

  • integration tests test_applications_from_config.py are getting the simtel path and DB parameters from the environment and don't depend on these pytest fixtures
  • tests/integration_tests/test_ray_tracing.py has the only tests which depends on simtel_path_no_mock: this retrieves the correct setting and returns the simtel path (which is required to be set in when calling RayTracing); this is special about RayTracing, which is the only class we test additional outside of the application as a whole (I guess this is a historic thing)
  • all unit tests do not depend on simtelpath and are therefore fine to use the 'wrong' path in tmp directory. The unit test need however the correct DB settings, which has been restored in the PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you can essentially remove this entire fixture and it won't make a difference, no?
As it is, I don't understand the idea behind the code in this fixture.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need for some classes a "Path" object given as the simtel_source_path argument. Yes, it seems for the unit tests it can point to any path and is not checked during the tests (I guess because we don't run simtel during unit tests).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that in fact you are returning "" always.
So, I suggest to make it clearer and replace this fixture with:

@pytest.fixture
def simtel_path(tmp_test_directory):
    """
    This fixture does not really set the sim_telarray path because it is used only in unit tests which do not run sim_telarray
    """"
    return ""

If you really need a Path (I don't think so), the return can change to Path("").

This is a compromise though because I think this fixture can technically just be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK! Changed this accordingly.

Copy link
Contributor

@orelgueta orelgueta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, if tests pass, we can merge.

@GernotMaier GernotMaier merged commit 7f68a41 into unit-vs-units Mar 6, 2024
13 checks passed
@GernotMaier GernotMaier deleted the unit-tests-profiling branch March 6, 2024 13:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pytest fixture simtel_path changes DB configuration
2 participants