
Python gas-water simulator #6075 (Draft)

svenn-t wants to merge 7 commits into master
Conversation

svenn-t
Contributor

svenn-t commented Mar 13, 2025

Added a Python gas-water simulator to be able to run, e.g., CO2STORE or H2STORE data files. Also added the option to pass command-line arguments to the constructor so that both ways of instantiating the simulator classes have that ability.

I open this PR in draft mode because most of the code is a duplicate of the PyBlackOilSimulator implementation, which is not optimal. I've made futile attempts at making a base simulator that the other Python simulators can inherit. If anybody has suggestions on how to make a base simulator (or wants to take up the mantle?), I'm happy to hear them!

svenn-t added 3 commits March 10, 2025 09:30
This is a duplicate of PyBlackOilSimulator and should be refactored to avoid too much code duplication
Updated docstrings with args parameter.
@hakonhagland
Contributor

most of the code is a duplicate of the PyBlackOilSimulator implementation, which is not optimal. I've made futile attempts at making a base simulator that the other Python simulators can inherit.

@svenn-t Interesting. It is not immediately clear to me what the difficulty is with making a base simulator class. Can you elaborate on that?

@svenn-t
Contributor Author

svenn-t commented Mar 14, 2025

I made an attempt here: 45be9b8, which compiled but gave "undefined symbol" errors when I imported either of the "BlackOilSimulator" or "GasWaterSimulator" classes in Python. It might have something to do with how I made the base simulator, but I also have next to no experience with pybind, and there could be some issues with how I exported the classes.

Explicitly instantiate the base template classes in the .cpp file.
@hakonhagland
Contributor

which compiled but gave "undefined symbol" errors

@svenn-t I think we need to instantiate the templated base class explicitly for the two template values we are using, see this commit: hakonhagland@b551864. This fixed the undefined symbol error for me.
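
As a minimal self-contained sketch of the technique (made-up class and tag names, not the actual OPM code): when a class template's member functions are defined in a .cpp file, no symbols are emitted unless the template is explicitly instantiated there for the types actually used, which is what produces "undefined symbol" errors when the Python module is imported.

// py_base_simulator_demo.hpp -- a hypothetical templated base class
template <class TypeTag>
class PyBaseSimulatorDemo {
public:
    int stepInit();   // defined in the .cpp file below
};

// py_base_simulator_demo.cpp -- member definitions plus explicit instantiations
template <class TypeTag>
int PyBaseSimulatorDemo<TypeTag>::stepInit()
{
    return 0;   // placeholder for the real initialization
}

// Without these explicit instantiations, no code is emitted for stepInit() in
// this translation unit, so any other object file (e.g. the pybind11 bindings)
// that calls it links against a missing symbol and the Python import fails
// with "undefined symbol".
struct BlackOilTagDemo {};
struct GasWaterTagDemo {};
template class PyBaseSimulatorDemo<BlackOilTagDemo>;
template class PyBaseSimulatorDemo<GasWaterTagDemo>;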

@svenn-t
Contributor Author

svenn-t commented Mar 17, 2025

which compiled but gave "undefined symbol" errors

@svenn-t I think we need to instantiate the templated base class explicitly for the two template values we are using, see this commit: hakonhagland@b551864. This fixed the undefined symbol error for me.

You are correct, this fixed the "undefined symbol" errors for me as well. I did get some other issues when running examples using the python classes that I need to look into before updating this PR. But thanks for fixing the immediate errors!

@hakonhagland
Contributor

I did get some other issues when running examples using the python classes that I need to look into before updating this PR

@svenn-t Yes, I am getting this error when I run the unit tests: The MPI_Comm_free() function was called after MPI_FINALIZE was invoked. I will have a look at why this happens.

@hakonhagland
Contributor

hakonhagland commented Mar 17, 2025

The MPI_Comm_free() function was called after MPI_FINALIZE was invoked.

Seems like this happens because the derived class PyBlackOilSimulator destructor is called before the base class PyBaseSimulator destructor, and the derived class destructor implicitly frees its Opm::Main pointer which will cause MPI_Finalize() to be called, then the base class destructor is called which will implicitly free its Opm::FlowMain pointer, which will implicitly destroy the communicator, see:

std::unique_ptr<Parallel::Communication, DestroyComm> comm_;
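
As a minimal self-contained sketch of the ordering (made-up names, not the OPM code): members of a derived class are destroyed before the base-class subobject and its members, so a Main-like pointer held by the derived class is released, and can finalize MPI, before a communicator held by the base class is freed.

#include <iostream>
#include <memory>

struct MainLike { ~MainLike() { std::cout << "Main destroyed (MPI_Finalize would run here)\n"; } };
struct CommLike { ~CommLike() { std::cout << "Communicator destroyed (MPI_Comm_free would run here)\n"; } };

struct PyBaseSimulatorDemo {
    std::unique_ptr<CommLike> comm_ = std::make_unique<CommLike>();
};

struct PyBlackOilSimulatorDemo : PyBaseSimulatorDemo {
    std::unique_ptr<MainLike> main_ = std::make_unique<MainLike>();
};

int main()
{
    PyBlackOilSimulatorDemo sim;
    // When sim goes out of scope, its own member main_ is destroyed first
    // ("MPI_Finalize"), and only then the base-class member comm_
    // ("MPI_Comm_free"), reproducing the reported ordering problem.
}

Moving the Main-like pointer into the base class, as done further down in the thread, puts both members in one class, where the declaration order (members are destroyed in reverse order of declaration) can guarantee the communicator is freed before MPI is finalized.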

@svenn-t
Contributor Author

svenn-t commented Mar 17, 2025

Seems like this happens because the derived class PyBlackOilSimulator destructor is called before the base class PyBaseSimulator destructor, and the derived class destructor implicitly frees its Opm::Main pointer which will cause MPI_Finalize() to be called, then the base class destructor is called which will implicitly free its Opm::FlowMain pointer, which will implicitly destroy the communicator, see

Ok, so the issue is having the main_ pointer located in the derived classes, e.g. PyBlackOilSimulator, while flow_main_ is located in the base class PyBaseSimulator? I couldn't figure out how to have main_ in the base class since it seemed to be very simulator-specific...

Move the PyMain pointer from the derived class to the base class.
@hakonhagland
Contributor

I couldn't figure out how to have main_ in the base class

@svenn-t The following seems to work for me: hakonhagland@36ccf4a

@svenn-t
Contributor Author

svenn-t commented Mar 18, 2025

@svenn-t The following seems to work for me: hakonhagland@36ccf4a

Indeed, this worked for me as well. Thanks a lot for the fixes! Unfortunately, I get a lot of "violates the C++ One Definition Rule" errors when I compile, and running cases with the GasWaterSimulator crashes. Since there are so many ODR errors, I suspect there is a fundamental issue, e.g. that the BlackOilTwoPhaseIndices for the gas-water simulator have not been properly included or that some header file is missing. I'll look into it.

@hakonhagland
Contributor

running cases with the GasWaterSimulator crashes

@svenn-t Do you have any test cases for the gas-water simulator? Then I can also have a look at it.

@svenn-t
Contributor Author

svenn-t commented Mar 18, 2025

@svenn-t Do you have any test cases for the gas-water simulator? Then I can also have a look at it.

Yes, I have a modified SPE1CASE1 file here:
SPE1CASE1_GW.zip

@hakonhagland
Contributor

@svenn-t Thanks for the test case!

I get a lot of "violates the C++ One Definition Rule" errors when I compile

I do not get any errors when I compile; can you give more details? When I run a Python script with these statements:

from opm.simulators import GasWaterSimulator
sim = GasWaterSimulator(filename="SPE1CASE1_GW.DATA")
sim.step_init()
sim.step()
sim.step_cleanup()

The step_init() above runs fine, but then sim.step() crashes with:

===============Saturation Functions Diagnostics===============

System:  Water-Gas system.
Relative permeability input format: Saturation Family III. 
                                          Only valid for CO2STORE or H2STORE cases with GAS and WATER.
Relative permeability input format: Saturation Family III (GSF/WSF).
Number of saturation regions: 1



================ Starting main simulation loop ===============


Report step  0/10 at day 0/3650, date = 01-Jan-2015
Using Newton nonlinear solver.

Starting time step 0, stepsize 1 days, at day 0/365, date = 01-Jan-2015
python3.12: /home/hakon/test/opm8/opm/opm-simulators/opm/simulators/wells/StandardWell_impl.hpp:66: Opm::StandardWell<TypeTag>::StandardWell(const Opm::Well&, const Opm::ParallelWellInfo<typename Opm::WellInterface<TypeTag>::Scalar>&, int, const typename Base::ModelParameters&, const typename Base::RateConverterType&, int, int, int, int, const std::vector<Opm::PerforationData<typename Opm::WellInterface<TypeTag>::Scalar> >&) [with TypeTag = Opm::Properties::TTag::FlowGasWaterProblem; typename Opm::WellInterface<TypeTag>::Scalar = double; typename Base::ModelParameters = Opm::BlackoilModelParameters<double>; Base = Opm::WellInterface<Opm::Properties::TTag::FlowGasWaterProblem>; typename Base::RateConverterType = Opm::RateConverter::SurfaceToReservoirVoidage<Opm::BlackOilFluidSystem<double, Opm::BlackOilDefaultIndexTraits, Opm::VectorWithDefaultAllocator, std::shared_ptr>, std::vector<int> >]: Assertion `this->num_components_ == numWellConservationEq' failed.
[hakon-Precision-3570:556886] *** Process received signal ***
[hakon-Precision-3570:556886] Signal: Aborted (6)
[hakon-Precision-3570:556886] Signal code:  (-6)

@svenn-t
Contributor Author

svenn-t commented Mar 18, 2025

I do not get any errors when I compile; can you give more details?

I see now that I get warnings and not errors. I have the build output here:
Build_log.zip
if you are interested, but essentially there are a lot of statements like these:

[build] /home/AD.NORCERESEARCH.NO/svtv/simulators/mupsi/opm-simulators/opm/simulators/flow/BlackoilModel.hpp:61:7: warning: type ‘struct BlackoilModel’ violates the C++ One Definition Rule [-Wodr]
[build]    61 | class BlackoilModel
[build]       |       ^
[build] /home/AD.NORCERESEARCH.NO/svtv/simulators/mupsi/opm-simulators/opm/simulators/flow/BlackoilModel.hpp:61: note: a different type is defined in another translation unit
[build]    61 | class BlackoilModel
[build]       | 

and

[build] /home/AD.NORCERESEARCH.NO/svtv/simulators/mupsi/opm-simulators/opm/simulators/wells/StandardWell_impl.hpp:2493: warning: type of ‘computeCurrentWellRates’ does not match original declaration [-Wlto-type-mismatch]
[build]  2493 |     StandardWell<TypeTag>::
[build]       | 
[build] /home/AD.NORCERESEARCH.NO/svtv/simulators/mupsi/opm-simulators/opm/simulators/wells/StandardWell_impl.hpp:2493:5: note: ‘computeCurrentWellRates’ was previously declared here
[build]  2493 |     StandardWell<TypeTag>::
[build]       |     ^
[build] /home/AD.NORCERESEARCH.NO/svtv/simulators/mupsi/opm-simulators/opm/simulators/wells/StandardWell_impl.hpp:2493:5: note: code may be misoptimized unless ‘-fno-strict-aliasing’ is used

I just do a simple make all (with the VS Code CMake extension, but nevertheless) and make install afterwards.

The step_init() above runs fine, but then sim.step() crashes with:

I get the same crash report as you when I run the case as well.

@hakonhagland
Contributor

I see now that I get warnings

@svenn-t Interesting. I do not get any warnings. Here is my build log build.zip. I compiled like this from the terminal:

$ cmake -DCMAKE_BUILD_TYPE=Debug -DOPM_ENABLE_PYTHON=ON -DOPM_ENABLE_EMBEDDED_PYTHON=ON  ..
$ make -j4

Assertion `this->num_components_ == numWellConservationEq' failed.

This is from:

assert(this->num_components_ == numWellConservationEq);
and in gdb I can see that this->num_components_ is 2, whereas numWellConservationEq is 3.

@svenn-t
Contributor Author

svenn-t commented Mar 19, 2025

@svenn-t Interesting. I do not get any warnings. Here is my build log build.zip. I compiled like this from the terminal

When I build with the Debug variant, I also get no compilation warnings. The build log I attached was with the Release variant.

and in gdb I can see that this->num_components_ is 2, whereas numWellConservationEq is 3.

I suspect that the part in PyGasWaterSimulator.cpp where we set BlackOilTwoPhaseIndices is not done properly, or is not set at all. I tried to move that part to PyBaseSimulator.cpp, but I get the same crash when running the test case.

@hakonhagland
Contributor

The build log I attached was with the Release variant.

@svenn-t I see; then I am able to reproduce the warnings as well. There are many; the first one I get is:

/home/hakon/test/opm8/opm/opm-simulators/opm/simulators/wells/WellInterface.hpp:75:7: warning: type ‘struct WellInterface’ violates the C++ One Definition Rule [-Wodr]
   75 | class WellInterface : public WellInterfaceIndices<GetPropType<TypeTag, Properties::FluidSystem>,
      |       ^
/home/hakon/test/opm8/opm/opm-simulators/opm/simulators/wells/WellInterface.hpp:75:7: note: a type with the same name but different base type is defined in another translation unit
   75 | class WellInterface : public WellInterfaceIndices<GetPropType<TypeTag, Properties::FluidSystem>,
      |       ^
/home/hakon/test/opm8/opm/opm-simulators/opm/simulators/wells/WellInterfaceIndices.hpp:33:7: note: type name ‘Opm::WellInterfaceIndices<Opm::BlackOilFluidSystem<double, Opm::BlackOilDefaultIndexTraits, Opm::VectorWithDefaultAllocator, std::shared_ptr>, Opm::BlackOilTwoPhaseIndices<0u, 0u, 0u, 0u, false, false, 0u, 0u, 0u> >’ should match type name ‘Opm::WellInterfaceIndices<Opm::BlackOilFluidSystem<double, Opm::BlackOilDefaultIndexTraits, Opm::VectorWithDefaultAllocator, std::shared_ptr>, Opm::BlackOilIndices<0u, 0u, 0u, 0u, false, false, 0u, 0u> >’
   33 | class WellInterfaceIndices : public WellInterfaceFluidSystem<FluidSystem>
      |       ^

I believe this happens when building the Python shared library simulators.cpython-312-x86_64-linux-gnu.so, since we are linking it with both PyBlackOilSimulator.cpp.o and PyGasWaterSimulator.cpp.o: the former uses Opm::BlackOilIndices and the latter uses Opm::BlackOilTwoPhaseIndices, so both variants of WellInterface get linked into the shared object. An idea could be to create two separate shared objects, but there might be other solutions that let us keep everything in a single shared object.
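
As a minimal two-file sketch of this failure mode (made-up names, not the OPM code): the same class name acquires two incompatible definitions in two translation units that are then linked into one shared object.

// well_demo.hpp -- included by both translation units below
#ifdef GASWATER_MODULE
struct WellDemo { int num_components_ = 2; };   // two-phase variant
#else
struct WellDemo { int num_components_ = 3; };   // three-phase (blackoil) variant
#endif
inline int numComponents() { return WellDemo{}.num_components_; }

// blackoil_tu.cpp -- compiled WITHOUT -DGASWATER_MODULE
//   #include "well_demo.hpp"
//   int blackoilComponents() { return numComponents(); }

// gaswater_tu.cpp -- compiled WITH -DGASWATER_MODULE
//   #include "well_demo.hpp"
//   int gaswaterComponents() { return numComponents(); }

// Linking blackoil_tu.o and gaswater_tu.o into one shared object violates the
// One Definition Rule: there are two incompatible definitions of WellDemo and
// two different bodies for the inline function numComponents(), but the linker
// keeps only one of each. With link-time optimization, GCC reports -Wodr
// warnings like the ones quoted above, and whichever definition survives can
// make the other module misbehave at run time (compare num_components_ == 2
// versus numWellConservationEq == 3 in the assertion failure).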

Split the simulators python shared object into two shared libraries
BlackOil and GasWater to avoid ODR violations in the simulators shared
library
@hakonhagland
Contributor

An idea could be to create two separate shared objects

@svenn-t Here is an example of splitting into two separate modules: hakonhagland@1ff143a. This compiles without ODR violations for me.
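
As a rough sketch of what such a split can look like (made-up module and class names, with a stand-in class; the real layout is in the linked commit): each simulator variant is compiled into its own pybind11 extension module, so the blackoil and gas-water configurations never share a translation unit or a shared object.

#include <pybind11/pybind11.h>
#include <string>
#include <utility>

namespace py = pybind11;

namespace {
// Stand-in for the real gas-water simulator class; only here to make the
// sketch self-contained.
class GasWaterSimulatorDemo {
public:
    explicit GasWaterSimulatorDemo(std::string filename) : filename_(std::move(filename)) {}
    void stepInit() {}   // placeholder
    void step() {}       // placeholder
private:
    std::string filename_;
};
} // namespace

// Built as its own extension target (producing e.g. a separate
// simulators_gaswater.cpython-312-x86_64-linux-gnu.so); a sibling module would
// export BlackOilSimulator the same way, so the two index configurations never
// meet in one shared object.
PYBIND11_MODULE(simulators_gaswater, m) {
    py::class_<GasWaterSimulatorDemo>(m, "GasWaterSimulator")
        .def(py::init<std::string>(), py::arg("filename"))
        .def("step_init", &GasWaterSimulatorDemo::stepInit)
        .def("step", &GasWaterSimulatorDemo::step);
}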

@hakonhagland
Contributor

If we are going to split the python module into two, one issue is that each of the two python modules will duplicate much of the opm-simulators and opm-common static libraries. To avoid this, we can compile opm-common, opm-grid and opm-simulators as shared libraries (option -DBUILD_SHARED_LIBS=ON); then the code will not be duplicated in the python libraries.

Restructured the pybind11 modules and python modules to make the unit tests pass
@hakonhagland
Contributor

Here is an example of splitting into two separate modules: hakonhagland/opm-simulators-1@1ff143a.

To also make the unit tests pass I needed to make some further changes to the pybind11 module names and python import statements, see: hakonhagland@6d24fe1

@svenn-t
Contributor Author

svenn-t commented Mar 20, 2025

@svenn-t Here is an example of splitting into two separate modules: hakonhagland@1ff143a. This compiles without ODR violations for me.

To also make the unit tests pass I needed to make some further changes to the pybind11 module names and python import statements, see: hakonhagland@6d24fe1

With the new commits, I also do not get the ODR warnings, and running both the gas-water and blackoil versions of the SPE1CASE1 case, I do not get any crashes. Thanks a lot for making these fixes and getting the simulators to work!

If we are going to split the python module into two, one issue is that each of the two python modules will duplicate much of the opm-simulators and opm-common static libraries. To avoid this, we can compile opm-common, opm-grid and opm-simulators as shared libraries (option -DBUILD_SHARED_LIBS=ON); then the code will not be duplicated in the python libraries.

I admit this is beyond my knowledge of pybind, cmake, etc. Are there any issues with doing this?

I will update this PR with the new commits since it seems to be compiling and working correctly.

@akva2
Member

akva2 commented Mar 20, 2025

Relying on dynamic linking to opm libraries is problematic for pip deployment. I much prefer if you track down the ODR violation instead: something is instantiated in the Python module that should not be instantiated there.

@hakonhagland
Contributor

Relying on dynamic linking to opm libraries is problematic for pip deployment.

@akva2 Is it auditwheel that is the problem? In principle we should be able to install all dependencies in a manylinux container, right? However, I assume it can be difficult to find the correct versions that fit together?

I much prefer if you track down the ODR violation instead

@akva2 There are quite a lot of ODR violations, so I think tracking them down could become a difficult task. As a workaround we could build the two python modules with the static opm libraries (i.e. include them in the shared objects). The drawback is then only the code duplication in the two modules (that is, the modules will be larger than strictly necessary).

@akva2
Member

akva2 commented Mar 26, 2025

While we can in principle install shared dependencies with the python module, it is opening a can of worms I'm very reluctant to take aboard. The problem is that in the wild west of python package management, there are ways users can screw themselves over and end up loading wrong versions of the shared libraries from system paths and not the shipped libs, in particular if they don't use venvs and such.

While I'm usually a fan of shared linking, in this case I'll take larger modules any day to avoid the problems, so if there is no other way, that is what I'd do.

@hakonhagland
Contributor

there are ways users can screw themselves over and end up loading wrong versions of the shared libraries from system paths and not the shipped libs, in particular if they don't use venvs and such.

@akva2 Interesting. Looking at the source, it seems auditwheel adjusts the rpath so the wheel's internally shipped shared objects should take precedence over any equivalent libraries from the user's system path. Also, it looks like it renames the libraries, see repair.py, so a name conflict should then be even less likely, right?

It would be interesting to test this. Can I just run docker build . from the python directory?

@akva2
Member

akva2 commented Mar 26, 2025

Right, so there are new hacks in the tools to try to fix the situation then. My trust in python packaging tools is still exactly zero though, so I'm sure there is some tool that screws it up somehow, somewhere in the world.
Anyway,

Basically yes, the invocation on Jenkins is:

DOCKER_BUILDKIT=1 docker build -t manylinux_2_28_opm:built --build-arg build_jobs=16 --build-arg version_tag="$2" --build-arg version="$3" . -f Dockerfile --output $1/wheelhouse

where $1 is the output directory, $2 is the tag to append to the package version, and version is the tag to build from (could be "master"). The build args default to tag="master" and version="", where I usually give .dev$DATE for the nightly builds; if not, it will use the version (if given, its version from dune.module-$version).

@hakonhagland
Contributor

DOCKER_BUILDKIT=1 docker build -t manylinux_2_28_opm:built --build-arg build_jobs=16 --build-arg version_tag="$2" --build-arg version="$3" . -f Dockerfile --output $1/wheelhouse

@akva2 Yes, this worked for static opm libraries after a small change to the generate-pypi-package.sh script, see hakonhagland@1b690f9. However, adding -DBUILD_SHARED_LIBS=ON to the cmake flags and trying to build shared opm libraries failed. It seems like the first problem occurs when trying to build a shared libopmgrid.so, because only the static version of lapack (lapack.a) is available in the manylinux container, see:

dnf install -y blas-static lapack-static suitesparse-static ninja-build

I will try to replace these packages with ones including the shared versions and see how far I get.

@hakonhagland
Contributor

I will try to replace these packages with ones including the shared versions and see how far I get.

It seems to work after these changes: hakonhagland@84a9041. I will try to install the generated wheels in some containers to check if they also install successfully.

@hakonhagland
Contributor

I am no longer able to build in the container since it now uses cmake 4.0, see: #6122

@hakonhagland
Contributor

I will try to install the generated wheels in some containers to check if they also install successfully.

I have now tested the wheels with both shared and static opm libraries in an Ubuntu 22.04 docker container, see #6136. All unit tests passed there.
