Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rapid input causes LocalProcessTerminalView hosting REPL processes to hang #294

Closed
nkleemann opened this issue May 18, 2023 · 12 comments
Closed

Comments

@nkleemann
Copy link
Contributor

I"m using your package in SwiftUI on macOS to host a GHCI process. When rapidly sending large amounts of text to LocalProcessTerminalView, the view becomes unresponsive.

Steps to Reproduce

  1. Create an instance of LocalProcessTerminalView and start a REPL process like GHCI.
  2. Rapidly send large chunks of text to the instance.

Expected Behavior

The REPL process within the LocalProcessTerminalView should properly handle the incoming data and remain responsive, regardless of the input rate or size.

Actual Behavior

After rapidly sending large amounts of text, the LocalProcessTerminalView freezes and the hosted REPL process becomes unresponsive to further text input (manual or via send).


I"ve noticed the implementation of the send method uses DispatchIO. I"m wondering if it might be beneficial to provide a mechanism to notify the caller when the send operation is complete. This might potentially enable us to ensure that send is not called while it"s already running, particularly in the context of a SwiftUI @MainActor situation. This could be one route towards a resolution, although of course I understand there may be complexities that I"m not aware of.

@migueldeicaza
Copy link
Owner

I am unable to reproduce this. I suspect there is a race condition and this is why you are able to reproduce and not me.

Would you be so kind to stop the process when it hangs and get the stack traces for all the threads involved?

@nkleemann
Copy link
Contributor Author

Your are most certainly right about the problem being a race condition. I"m building an editor for music live coding. Users can, at any rate, send multiline strings of any size into a GHCI process I"m hosting with LocalProcessTerminalView.
As it"s send method uses DispatchIO, and the hang occurs every time when I"m calling send too often with large code blocks to send into GHCI, I think this is most certainly a threading problem.

Here"s the function I"m using that calls .send:

func runBlock() async {
    guard await bootTidal() else {
        log("tidal session was stopped")
        return
    }
    let blockContent = self.getCurrentBlockContent()        
    let formattedBlockContent = ":{\n\(blockContent)\n:}\n"

    // todo: find a way to get a callback when this finishes, or queue calls here
    self.ghciSessionView?.send(txt: formattedBlockContent)
}

One potential solution from my end could be to implement a queue, but it would greatly simplify things if callers of send could be notified upon its completion or if they could pass a completion handler along with the text.

In your opinion, would this be a useful feature for other situations as well, and do you think it could justify a change on your end? Or perhaps you could recommend another way to address this issue?

@migueldeicaza
Copy link
Owner

oh the issue is you are sending the data as opposed to the reading of data?

That could explain why I didn’t reproduce. How much data is being sent?

This might be even be simpler, just a deadlock based on buffer sizes. Annoying to address, but easier to reason about.

@nkleemann
Copy link
Contributor Author

nkleemann commented May 19, 2023

I ran a quick test, with this multiline string (tidal cycles expression):

let block = """
:{
do
    setcps (150/60/4)
    d1 $ every 3 rev $ fast 8 $ note "<a2 d3> <c d ds g a>" # s "plaits"
        # engine (slow 8 $ run 16)
        # sustain (slow 8 $ sine)
        # cut 1
        # level (slow 8 $ range 0.1 0.2 sine)
        # lpf "[300|800|400]"
        # mu 3
        # delay 0.2
        # delaytime "0.3333"
        # delayfeedback 0.1
    d2 $ n "[[~ ~ ~ 0 ~ ~ 0 ~], [4*4|4*16]]" # s "gpu2"
        # speed 0.86
        # lpf "1000 1200 1400"
        # gain 1.1
    d3 $ every 3 (rev) $ loopAt 2 $ splice 8 "0 <1 6 [3|2]> 2 [3|5] 7 0 5 [1|2]" $ s "break:9"
        # pan 0.4
        # hpf 200
        # room 0.2
        # sz 0.91
        # gain 0.7
        # cut 1
    d4 $ degrade $ note "{a5 d6}%2"
        # s "hs1:6"
        # gverb 1
        # gsize 200
        # gdelay 20
        # gfdbk 0.92
        # gain 0.5
        # lpf "{3000 2000 1000}%3"
        # hpf 300
        |-| note "{12 7 0}%3"
:}
"""

When I called .send(txt: block) 4 times, with a short delay between the calls (50-200ms) with this string its enough to make the REPL process unresponsive. I think it"s more than a rendering issue because not only does the LocalProcessTerminalView UI doesn"t respond anymore, the connection to the spawned process (GHCI in my case) also seems to be lost. I can see the GHCI process running in activity monitor, but I can"t send input to it anymore.

It"s important to note that any size of input is okay, as long as you don"t try to call send n times in quick succession. n seems to be directly proportional to the input size though 😁

@migueldeicaza
Copy link
Owner

Thanks, let me try to reproduce

@migueldeicaza
Copy link
Owner

Ok, I tried something like that, but can not reproduce.

I think the best thing to do is get a trace of all the threads once it get stuck. In Xcode, pause the program from the debug menu and then on the debugger console type the bt all command:

(lldb) bt all

@nkleemann
Copy link
Contributor Author

nkleemann commented May 20, 2023

Here"s the stack trace just after the terminal view freezes:

* thread #1, queue = "com.apple.main-thread", stop reason = signal SIGSTOP
  * frame #0: 0x0000000191dabf14 libsystem_kernel.dylib`mach_msg2_trap + 8
    frame #1: 0x0000000191dbe240 libsystem_kernel.dylib`mach_msg2_internal + 80
    frame #2: 0x0000000191db4b78 libsystem_kernel.dylib`mach_msg_overwrite + 604
    frame #3: 0x0000000191dac290 libsystem_kernel.dylib`mach_msg + 24
    frame #4: 0x0000000191eca8b8 CoreFoundation`__CFRunLoopServiceMachPort + 160
    frame #5: 0x0000000191ec9198 CoreFoundation`__CFRunLoopRun + 1208
    frame #6: 0x0000000191ec858c CoreFoundation`CFRunLoopRunSpecific + 612
    frame #7: 0x000000019b6fddf4 HIToolbox`RunCurrentEventLoopInMode + 292
    frame #8: 0x000000019b6fdc30 HIToolbox`ReceiveNextEventCommon + 648
    frame #9: 0x000000019b6fd988 HIToolbox`_BlockUntilNextEventMatchingListInModeWithFilter + 76
    frame #10: 0x00000001950e7f58 AppKit`_DPSNextEvent + 636
    frame #11: 0x00000001950e70f4 AppKit`-[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 716
    frame #12: 0x00000001950db558 AppKit`-[NSApplication run] + 464
    frame #13: 0x00000001950b29a8 AppKit`NSApplicationMain + 880
    frame #14: 0x00000001ba121974 SwiftUI`___lldb_unnamed_symbol82468 + 148
    frame #15: 0x00000001bb280fa0 SwiftUI`___lldb_unnamed_symbol207338 + 164
    frame #16: 0x00000001bab0243c SwiftUI`static SwiftUI.App.main() -> () + 128
    frame #17: 0x0000000104b93ea8 TidalConduit`static TCApp.$main(self=TidalConduit.TCApp) at TCApp.swift:10:1
    frame #18: 0x0000000104b94308 TidalConduit`main at TCApp.swift:0
    frame #19: 0x0000000191a93f28 dyld`start + 2236
  thread #4, name = "gputools.smt_poll.0x600002e3edc0"
    frame #0: 0x0000000191daf50c libsystem_kernel.dylib`__semwait_signal + 8
    frame #1: 0x0000000191c902d0 libsystem_c.dylib`nanosleep + 220
    frame #2: 0x0000000191c901e8 libsystem_c.dylib`usleep + 68
    frame #3: 0x00000001094439dc GPUToolsCapture`smt_poll_thread_entry(void*) + 108
    frame #4: 0x0000000108c6555c libsystem_pthread.dylib`_pthread_start + 148
  thread #7, name = "caulk.messenger.shared:17"
    frame #0: 0x0000000191dabe90 libsystem_kernel.dylib`semaphore_wait_trap + 8
    frame #1: 0x000000019b40a024 caulk`caulk::semaphore::timed_wait(double) + 212
    frame #2: 0x000000019b409ed8 caulk`caulk::concurrent::details::worker_thread::run() + 36
    frame #3: 0x000000019b409bc8 caulk`void* caulk::thread_proxy<std::__1::tuple<caulk::thread::attributes, void (caulk::concurrent::details::worker_thread::*)(), std::__1::tuple<caulk::concurrent::details::worker_thread*> > >(void*) + 96
    frame #4: 0x0000000108c6555c libsystem_pthread.dylib`_pthread_start + 148
  thread #8, name = "com.apple.audio.toolbox.AUScheduledParameterRefresher"
    frame #0: 0x0000000191dabe90 libsystem_kernel.dylib`semaphore_wait_trap + 8
    frame #1: 0x000000019b40a024 caulk`caulk::semaphore::timed_wait(double) + 212
    frame #2: 0x000000019b409ed8 caulk`caulk::concurrent::details::worker_thread::run() + 36
    frame #3: 0x000000019b409bc8 caulk`void* caulk::thread_proxy<std::__1::tuple<caulk::thread::attributes, void (caulk::concurrent::details::worker_thread::*)(), std::__1::tuple<caulk::concurrent::details::worker_thread*> > >(void*) + 96
    frame #4: 0x0000000108c6555c libsystem_pthread.dylib`_pthread_start + 148
  thread #9, name = "com.apple.NSEventThread"
    frame #0: 0x0000000191dabf14 libsystem_kernel.dylib`mach_msg2_trap + 8
    frame #1: 0x0000000191dbe240 libsystem_kernel.dylib`mach_msg2_internal + 80
    frame #2: 0x0000000191db4b78 libsystem_kernel.dylib`mach_msg_overwrite + 604
    frame #3: 0x0000000191dac290 libsystem_kernel.dylib`mach_msg + 24
    frame #4: 0x0000000191eca8b8 CoreFoundation`__CFRunLoopServiceMachPort + 160
    frame #5: 0x0000000191ec9198 CoreFoundation`__CFRunLoopRun + 1208
    frame #6: 0x0000000191ec858c CoreFoundation`CFRunLoopRunSpecific + 612
    frame #7: 0x0000000195212508 AppKit`_NSEventThread + 172
    frame #8: 0x0000000108c6555c libsystem_pthread.dylib`_pthread_start + 148
  thread #14
    frame #0: 0x0000000108c6fa64 libsystem_pthread.dylib`start_wqthread
  thread #15
    frame #0: 0x0000000191dadbc8 libsystem_kernel.dylib`__workq_kernreturn + 8
  thread #16
    frame #0: 0x0000000108c6fa64 libsystem_pthread.dylib`start_wqthread
  thread #17
    frame #0: 0x0000000000000000

If this doesn"t help I"ll try to come up with a minimal example we can both run and verify in the coming days. Do you have GHC installed? If not I"ll try it with a Python REPL.

@migueldeicaza
Copy link
Owner

Mhm. This stack trace does not show a deadlock.

@nkleemann
Copy link
Contributor Author

I"ve come up with this quick&dirty example using a python REPL. On my machine I can reproduce the issue every time after pressing the button. You can tweak send call delay time and numbers in the Constants struct if you can"t.

import SwiftUI
import SwiftTerm

struct Constants {
    
    static let python3Path = "/usr/bin/python3"
    // delay between .send calls in seconds
    static let sendDelay = 0.05
    // number of times to trigger .send
    static let sendCount = 4
    // dummy python code to send into repl
    static let blockContent = """
    import time

    def is_prime(n):
        if n <= 1:
            return False
        elif n <= 3:
            return True
        elif n % 2 == 0 or n % 3 == 0:
            return False
        i = 5
        while i * i <= n:
            if n % i == 0 or n % (i + 2) == 0:
                return False
            i += 6
        return True

    def print_first_n_primes(n):
        count = 0
        num = 2
        while count < n:
            if is_prime(num):
                print(num)
                count += 1
            num += 1

    start = time.time()
    print_first_n_primes(10000)
    end = time.time()
    print(f"Time taken: {end - start} seconds")
    """
}

@main
struct FreezingREPLApp: App {
    
    @StateObject var replViewManager = REPLViewManager()
    
    var body: some Scene {
        WindowGroup {
            ContentView()
                .environmentObject(replViewManager)
        }
    }
}

struct ContentView: View {
    
    @EnvironmentObject var replViewManager: REPLViewManager
    
    var body: some View {
        
        VStack {
            REPLView()
            
            Button(action: {
                replViewManager.sendTextToREPL()
            }) {
                Text("freeze the repl")
            }
        }
        .padding()
    }
}

class REPLViewManager: ObservableObject {
    
    private var replView: LocalProcessTerminalView?
    
    func setupREPLView(_ view: LocalProcessTerminalView) {
        view.font = .monospacedSystemFont(ofSize: 10, weight: .regular)
        replView = view
        replView?.startProcess(executable: Constants.python3Path)
    }
    
    
    func sendTextToREPL() {
        guard let repl = replView else {
            print("REPLView is not initialized")
            return
        }

        for i in 0..<Constants.sendCount {
            DispatchQueue.main.asyncAfter(deadline: .now() + Constants.sendDelay * Double(i)) {
                repl.send(txt: Constants.blockContent)
            }
        }
    }
}

struct REPLView: NSViewRepresentable {
    
    @EnvironmentObject var replViewManager: REPLViewManager
    
    func makeCoordinator() -> Coordinator {
        Coordinator(self)
    }

    func makeNSView(context: Context) -> LocalProcessTerminalView {
        let view = LocalProcessTerminalView(frame: CGRect(x: 0, y: 0, width: 200, height: 200))
        replViewManager.setupREPLView(view)
        return view
    }

    func updateNSView(_ nsView: LocalProcessTerminalView, context: Context) {}

    class Coordinator: NSObject {
        var parent: REPLView

        init(_ parent: REPLView) {
            self.parent = parent
        }
    }
}

After pressing the button I can see some output in the terminal - until it quickly stops. Then it becomes unresponsive to further input. If you watch the Activity Monitor before and after clicking the button you"ll see that the python process doesn"t crash - but somehow the LocalProcessTerminalView looses access to it (I guess).

@migueldeicaza
Copy link
Owner

This is fabulous! Thank you, I’ll take a look!

migueldeicaza added a commit that referenced this issue May 22, 2023
#294

It seems like the problem is the use of the convenience DispatchQueue
read/write APIs that use a file descriptor as a parameter.  They would
stop delivering data even if data was avaialble, blocking the child
process.

I created a dedicated channel for reading and it works fine now (might
consider doing this for writing to).

Additionally, there is a new dispatchQueue introduced, this is not
necessary to fix this particular bug, but while I was testing, I
noticed a nicer scrolling behavior with a lot of output if I received
the data on a queue, and then dispatched the results to the main queue
than if I processed this directly in the main queue, which had a
visual behavior of "waves".
@migueldeicaza
Copy link
Owner

migueldeicaza commented May 22, 2023

Fixed: your test case was invaluable, thank you so much for taking the time to put it together.

@nkleemann
Copy link
Contributor Author

I can only give that back, thank you for the quick fix!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants