Skip to content

[flang][runtime] Refine state associated with child I/O #150461

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 25, 2025

Conversation

klausler
Copy link
Contributor

Child I/O state needs to carry a pointer to the original non-type-bound defined I/O subroutine table, so that nested defined I/O can call those defined I/O subroutines. It also needs to maintain a mutableModes instance for the whole invocation of defined I/O, instead of having a mutableModes local to list-directed child I/O, so that a top-level data transfer statement with (say) DECIMAL='COMMA' propagates that setting down to nested child I/O data transfers.

Fixes #149885.

Copy link
Contributor

@DanielCChen DanielCChen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see my comment in issue #149885 .

Child I/O state needs to carry a pointer to the original
non-type-bound defined I/O subroutine table, so that nested
defined I/O can call those defined I/O subroutines.  It also
needs to maintain a mutableModes instance for the whole
invocation of defined I/O, instead of having a mutableModes
local to list-directed child I/O, so that a top-level data
transfer statement with (say) DECIMAL='COMMA' propagates that
setting down to nested child I/O data transfers.

Fixes llvm#149885.
@klausler klausler merged commit 918d6db into llvm:main Jul 25, 2025
9 checks passed
@klausler klausler deleted the bug149885 branch July 25, 2025 21:48
@ronlieb
Copy link
Contributor

ronlieb commented Jul 26, 2025

hi @klausler ran into a regression in our hpc2021 testing, bisects to this PR.
if you have access to the suite, should be reproducible with any of the MPI, OMP, or TGT builds
Failure seen in 519, 528.,535
if you need a smaller reproducer , i will try and produce one

Thread 1 "weather_base.cl" received signal SIGSEGV, Segmentation fault.
0x00007ffff74a547e in __GI___libc_free (mem=0x1) at ./malloc/malloc.c:3368
warning: 3368 ./malloc/malloc.c: No such file or directory
(gdb) bt
#0 0x00007ffff74a547e in __GI___libc_free (mem=0x1) at ./malloc/malloc.c:3368
#1 0x00007ffff7e406b0 in Fortran::runtime::io::UnitMap::CloseAll(Fortran::runtime::io::IoErrorHandler&) ()
from /opt/openmpi-5.0.6-flang-new/lib/libmpi_usempif08.so.40
#2 0x00007ffff7e3a0db in Fortran::runtime::io::ExternalFileUnit::CloseAll(Fortran::runtime::io::IoErrorHandler&) ()
from /opt/openmpi-5.0.6-flang-new/lib/libmpi_usempif08.so.40
#3 0x0000000000258c96 in CloseAllExternalUnits(char const*) ()
#4 0x00000000002590fd in _FortranAProgramEndStatement ()
#5 0x0000000000228012 in main () at miniWeather.fppized.f90:894
#6 0x00007ffff7429d90 in __libc_start_call_main (main=main@entry=0x228000

, argc=argc@entry=9, argv=argv@entry=0x7fffffffe6b8)
at ../sysdeps/nptl/libc_start_call_main.h:58
#7 0x00007ffff7429e40 in __libc_start_main_impl (main=0x228000 , argc=9, argv=0x7fffffffe6b8, init=,
fini=, rtld_fini=, stack_end=0x7fffffffe6a8) at ../csu/libc-start.c:392
#8 0x00000000002216b5 in _start ()

@ronlieb
Copy link
Contributor

ronlieb commented Jul 26, 2025

small reproducer, requires an openmpi compiler invocation, so some kind of interaction
$ cat goo.f90

                dt =1.0
                write(*,*) 'dt: ',dt
               end

$ OMPI_FC=/COD/2025-07-26/trunk/bin/flang /opt/openmpi-5.0.6-flang-new/bin/mpif90 -O2 goo.f90 -g
$ ./a.out
dt: 1.
Segmentation fault (core dumped)

searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Jul 26, 2025
…#150461)"

breaks 519, 528, 535 for MPI, OMP, and TGT
runtime error

This reverts commit 918d6db.
@klausler
Copy link
Contributor Author

klausler commented Jul 26, 2025

Works fine for me here. Could you try it with valgrind in your environment?

(This patch affects the input side of user-defined derived type I/O, so perhaps it exposes something in your setup, but I don't see anything in it that would be at risk of smashing memory in the runtime support library's open unit map, and your stack traceback looks like that's what led to the crash.)

@ronlieb
Copy link
Contributor

ronlieb commented Jul 26, 2025

==2250846== Conditional jump or move depends on uninitialised value(s)
==2250846== at 0x208772: Fortran::runtime::io::IoStatementState& Fortran::runtime::io::ExternalFileUnit::BeginIoStatement<Fortran::runtime::io::ExternalListIoStatementState<(Fortran::runtime::io::Direction)0>, Fortran::runtime::io::ExternalFileUnit&, char const*&, int&>(Fortran::runtime::Terminator const&, Fortran::runtime::io::ExternalFileUnit&, char const*&, int&) (in ./a.out)

@klausler
Copy link
Contributor Author

klausler commented Jul 26, 2025

Thanks for that. I'm not seeing it here with valgrind myself, using a flang-rt built with gcc 9.3.0 for x86-64. What are you building with? (It might be a red herring.)

@ronlieb
Copy link
Contributor

ronlieb commented Jul 26, 2025

i think its a mismatch of using an openmpi build that is slightly out of date with respect to latest flang sources. Just now I rebuilt openmpi using todays local build of llvm/main . test passes and valgrind errors no longer present.

i think we can close this out as stale build.

@klausler
Copy link
Contributor Author

I'm happy if you're happy. Thanks for the help.

searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Jul 28, 2025
Child I/O state needs to carry a pointer to the original non-type-bound
defined I/O subroutine table, so that nested defined I/O can call those
defined I/O subroutines. It also needs to maintain a mutableModes
instance for the whole invocation of defined I/O, instead of having a
mutableModes local to list-directed child I/O, so that a top-level data
transfer statement with (say) DECIMAL='COMMA' propagates that setting
down to nested child I/O data transfers.

Fixes llvm#149885.
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Jul 28, 2025
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Jul 28, 2025
Child I/O state needs to carry a pointer to the original non-type-bound
defined I/O subroutine table, so that nested defined I/O can call those
defined I/O subroutines. It also needs to maintain a mutableModes
instance for the whole invocation of defined I/O, instead of having a
mutableModes local to list-directed child I/O, so that a top-level data
transfer statement with (say) DECIMAL='COMMA' propagates that setting
down to nested child I/O data transfers.

Fixes llvm#149885.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[flang] Runtime failure at the READ statement of list-directed I/O with an array input item.
3 participants