ESMF_RegridWeightGen not working for conservative fields #216
Comments
ESMF build notes (for v8.0.1) are here: |
We can't upgrade the openmpi version in conda because VDI doesn't have 4.0.2. We're trying to see if we can repackage ESMPY to point to the ESMF version in /g/data/hh5 and see if this would solve the openmpi inconsistency error. |
Shouldn't you be setting |
You could also retry your own build of ESMF using openmpi/4.0.1 instead of 4.0.2. |
Thanks, I'll try those suggestions. FYI I also tried putting
in
|
Why do you need to load |
|
I think the analysis3 would be needed for netcdf4. I'm using these modules:
And am getting an error like:
[gadi-cpu-clx-0550:1274464:0:1274464] mm_ep.c:168 Fatal: Failed to attach to remote mmid:5474103322476718. Shared memory error
This may not be the root cause though because the log files are showing:
20200817 220019.692 ERROR PET40 ESMF_Grid.F90:5222 ESMF_GridCreate Wrong argument specified - - Bad corner array in SCRIP file |
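With many PETs it can be tedious to find which log holds the real ESMF error. The Python sketch below is one possible way to collect ERROR lines. It assumes every log line follows the layout of the single line quoted above (date, time, level, PET, source file and line, message) and that the logs match PET*.Log in the current directory; the function names are made up for illustration.

```python
import glob
import re

# Layout assumed from the one log line quoted in this thread:
# date  time  LEVEL  PETn  file:line  rest-of-message
LOG_LINE = re.compile(
    r"^(?P<date>\d{8})\s+(?P<time>\d{6}\.\d+)\s+(?P<level>[A-Z]+)\s+"
    r"(?P<pet>PET\d+)\s+(?P<source>\S+):(?P<line>\d+)\s+(?P<message>.*)$"
)


def parse_log_line(text):
    """Return a dict of fields for a line in the assumed layout, else None."""
    match = LOG_LINE.match(text.strip())
    return match.groupdict() if match else None


def collect_errors(pattern="PET*.Log"):
    """Yield (filename, fields) for every ERROR line in files matching pattern."""
    for name in sorted(glob.glob(pattern)):
        with open(name, errors="replace") as handle:
            for raw in handle:
                fields = parse_log_line(raw)
                if fields and fields["level"] == "ERROR":
                    yield name, fields


if __name__ == "__main__":
    for name, fields in collect_errors():
        print(name, fields["source"], fields["line"], fields["message"])
```

Lines that do not match the assumed layout are skipped silently, so a different log format would simply produce no output rather than a wrong report.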
I tried compiling ESMF 8.0.1 with openmpi 4.0.1 like so:
and the relevant PET*.Log files contain
|
I'm not sure fully how the conversion to SCRIP format is done (or what is really going on in |
Well spotted @russfiedler, but I think the faulty mosaic info is an unrelated issue. The three resolutions It seems this has been true for a long time, e.g. it is also the case for Also |
Interesting - I'm able to make both weights (patch and conserve) at 1deg and 0.25deg it is just 0.1deg that has a problem. ESMF is difficult to debug but I do have some code that writes out the weights files in a more friendly format so I'm going to use that to check things. |
@nichannah that's great news - can you show me how you got conserve at 0.25deg to work? That's the case that's urgent right now. |
I'm pretty sure that the logic for detecting corners is completely wrong for the tripolar case when the corners are located at the tripole. In the 1 degree case each cell has 2 corners at (-280,65). That means there will always be a match between 2 corners for adjacent points in the j direction. Lines 5278 onward in the latest version. See the first and third blocks below.
Column 1: GLATW is GLAT[J=1:2]. If you decompose such that the southernmost row is south of the tripole then the problem is avoided, i.e. use a lot fewer processors, as I can't see an obvious way to set the layout. Alternatively, we could hack the code to move the checks 1 point east by adding |
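To make the ambiguity concrete, here is a small Python sketch. It is not the ESMF algorithm: it only counts which corners coincide between two cells that are adjacent in j, which is the kind of match the comment above says the corner detection depends on. The cell coordinates are invented for illustration; only the tripole point (-280, 65) comes from the thread.

```python
def shared_corners(cell_a, cell_b, tol=1e-9):
    """Return (i, j) pairs where corner i of cell_a coincides with corner j of cell_b."""
    return [
        (i, j)
        for i, (xa, ya) in enumerate(cell_a)
        for j, (xb, yb) in enumerate(cell_b)
        if abs(xa - xb) <= tol and abs(ya - yb) <= tol
    ]


# Ordinary cells, corners listed counter-clockwise from the south-west.
lower = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
upper = [(0.0, 1.0), (1.0, 1.0), (1.0, 2.0), (0.0, 2.0)]
regular = shared_corners(lower, upper)

# Hypothetical cells touching the tripole: two corners of each collapse onto it.
TRIPOLE = (-280.0, 65.0)
cell_south = [TRIPOLE, (-279.0, 65.0), (-279.0, 65.4), TRIPOLE]
cell_north = [TRIPOLE, (-279.0, 65.4), (-279.0, 65.8), TRIPOLE]
degenerate = shared_corners(cell_south, cell_north)

if __name__ == "__main__":
    print("regular cells:", regular)
    print("tripole cells:", degenerate)
```

For the ordinary pair the matches identify the shared edge uniquely. For the made-up tripole pair every collapsed corner matches every other collapsed corner, so the matches no longer say which edge is shared, consistent with the failure described above.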
Hmmm, interesting. Was the logic the same in v7.1.0r? We were using v7.1.0r successfully on raijin for a couple of years: @nichannah what version are you using? |
Ok. As suspected, the problem was that checking the orientation of corners failed on domains where the south-east corner lies on the tripole. The corners to the north also lie on the tripole so there are always some matches and locating the I developed a quick fix for See |
Awesome sleuthing @russfiedler :-) |
Hi @russfiedler and @aekiss, this is presumably always going to carry over to the CICE grids and also to the lat_bnds, where we have just had issues in CMIP6. Have you tried any runs yet with these new grids, or are you just testing with these new weights? Should I look elsewhere on GitHub for them? |
Hi @ofa001, this is for the atm -> cice coupling. AFAIK the ACCESS-CM model is still using SCRIP, rather than ESMF. If you would like to try ESMF we have tools that can help with that. |
For the record, Russ' tripole bug fix to |
I've put an executable using Russ' tripole bug fix here: This was built with https://github.com/COSIMA/access-om2/blob/eb2dcde1148b84ed7c8a2bc9a1539ec5a42270d1/tools/contrib/build_fixed_esmf_on_gadi.sh |
That build script makes an executable that doesn't link properly to libesmf.so, so I've replaced |
We don't have a version of ESMF_RegridWeightGen on Gadi that can generate conservative remapping weights. This is needed for updating the 0.25deg land mask: #210.
Gadi doesn't have the esmf/7.1.0r-intel module that was on raijin.
ESMF is available via
but the conserve remap calculation fails with
when I run versions of make_remap_weights.sh and make_remap_weights.py in /scratch/v45/aek156/bathymetry/tools/025_deg_test/access-om2/tools/ that use ESMF_RegridWeightGen from esmf-nuWRF/7.1.0.
They generate the 0.25 patch file JRA55_MOM025_patch.nc but fail when doing the conserve version.
I understand this is an MPI problem in /g/data/hh5/public/apps/esmf-nuWRF/7.1.0/bin/ESMF_RegridWeightGen.
This was compiled with /apps/openmpi/4.0.2/lib/libmpi_cxx.so.40 (0x00007fe087a54000) which is the default openmpi/4.0.2 version picked up with module load openmpi.
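For anyone wanting to check which MPI an executable was linked against, a rough Python sketch follows. It assumes a Linux system with ldd available and the usual "name => path (address)" output layout, and it guesses the OpenMPI version from an install path of the form /apps/openmpi/<version>/, as in the path quoted above. The function names are hypothetical.

```python
import re
import subprocess


def linked_libraries(ldd_output):
    """Map library name -> resolved path (None when ldd reports 'not found')."""
    libs = {}
    for line in ldd_output.splitlines():
        parts = line.strip().split(" => ")
        if len(parts) != 2:
            continue  # lines without '=>' (e.g. the vDSO or the loader) are skipped
        name, target = parts
        path = target.split(" (")[0].strip()
        libs[name] = None if path == "not found" else path
    return libs


def openmpi_versions(libs):
    """Return version strings found in paths such as /apps/openmpi/4.0.2/lib/..."""
    found = set()
    for path in libs.values():
        match = re.search(r"/openmpi/(\d+(?:\.\d+)*)/", path or "")
        if match:
            found.add(match.group(1))
    return found


def inspect(executable):
    """Run ldd and return (OpenMPI versions seen, names of unresolved libraries)."""
    result = subprocess.run(
        ["ldd", executable], capture_output=True, text=True, check=True
    )
    libs = linked_libraries(result.stdout)
    missing = sorted(name for name, path in libs.items() if path is None)
    return openmpi_versions(libs), missing
```

Calling inspect() on the ESMF_RegridWeightGen binary and comparing the result with the openmpi module loaded at run time would show a version mismatch, and the list of unresolved libraries would also flag an executable that cannot find libesmf.so.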
I've also downloaded and built the latest (8.0.1) ESMF here: /home/156/aek156/github/esmf-org/esmf.
I built this with
When I ran make_remap_weights.sh with /home/156/aek156/github/esmf-org/esmf/apps/appsO/Linux.gfortran.64.openmpi.default/ESMF_RegridWeightGen for 0.25 deg this worked for the patch weights but failed for the conserve weights with the same error as above.
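As a way of reproducing the patch versus conserve comparison outside the scripts, the sketch below builds the two ESMF_RegridWeightGen command lines in Python. It uses only the -s, -d, -m and -w options and an optional mpirun -np prefix; the real make_remap_weights.py may pass further options. The grid file names are placeholders, and the conserve output name is a guess by analogy with JRA55_MOM025_patch.nc.

```python
import shlex


def regrid_command(exe, src_grid, dst_grid, method, weights, nprocs=1):
    """Build an ESMF_RegridWeightGen argument list for one regridding method."""
    cmd = [exe, "-s", src_grid, "-d", dst_grid, "-m", method, "-w", weights]
    if nprocs > 1:
        # Prefix with mpirun only when more than one process is requested.
        cmd = ["mpirun", "-np", str(nprocs)] + cmd
    return cmd


if __name__ == "__main__":
    # Placeholder inputs; substitute the real SCRIP grid files and executable path.
    for method in ("patch", "conserve"):
        cmd = regrid_command(
            "ESMF_RegridWeightGen",
            "src_grid.nc",
            "dst_grid.nc",
            method,
            f"JRA55_MOM025_{method}.nc",
            nprocs=4,
        )
        print(shlex.join(cmd))
```

The commands are only printed, so they can be inspected or pasted into a job script. Keeping everything except the method identical makes it easier to confirm that only the conserve case fails.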
Building with
didn't help.
I have also tried following the instructions in https://github.com/COSIMA/access-om2/wiki/Technical-documentation#creating-remapping-weights with a new build script /home/156/aek156/github/COSIMA/access-om2/tools/contrib/build_esmf_on_gadi.sh based on build_esmf_on_raijin.sh but I've been unable to get it to compile.
It fails with multiple ‘ESMCI_FortranStrLenArg’ has not been declared errors such as
I've tried gcc and different versions of the intel compilers to no avail.