Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Terminal crashes deallocating til::bitmap's vector run w/ PMR allocator (??) #9410

Closed
ralish opened this issue Mar 8, 2021 · 9 comments · Fixed by #9618
Closed

Terminal crashes deallocating til::bitmap's vector run w/ PMR allocator (??) #9410

ralish opened this issue Mar 8, 2021 · 9 comments · Fixed by #9618
Assignees
Labels
Area-TerminalControl Issues pertaining to the terminal control (input, selection, keybindings, mouse interaction, etc.) Issue-Bug It either shouldn't be doing this or needs an investigation. Priority-1 A description (P1) Product-Terminal The new Windows Terminal. Resolution-Fix-Committed Fix is checked in, but it might be 3-4 weeks until a release. Severity-Crash Crashes are real bad news.

Comments

@ralish
Copy link

ralish commented Mar 8, 2021

Environment

Windows build number: 10.0.19042.844
Windows Terminal version: 1.6.10571.0

Steps to reproduce

Unable to reproduce reliably but suspect it's related to split panes across numerous tabs. I can't remember the last time I saw Windows Terminal crash, but I usually don't use split planes much. This time I had numerous tabs (common) and a vertical split in most of those tabs. Probably not material but all terminals were PowerShell 7 and most had open WinRM sessions.

I was able to retrieve a crash dump which I've uploaded via Feedback Hub here. A quick analysis in WinDbg points to heap corruption. The lack of symbols though makes interpreting the stack trace of the faulting thread a bit more involved.

Expected behavior

Not to crash :-)

Actual behavior

Crash of Windows Terminal and the loss of all open terminals.

Other notes

It'd be nice if symbols could be published at least for official releases. This could be as simple as uploading them in a separate Zip attached to the GitHub releases page, instead of publishing them on the official Microsoft symbol server. In turn, this would allow people like me to perform more analysis of issues themselves, hopefully leading to more detailed bug reports :-)

@ghost ghost added Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting Needs-Tag-Fix Doesn't match tag requirements labels Mar 8, 2021
@zadjii-msft
Copy link
Member

@DHowett I'm assigning you because I can't open feedback hub links anymore 😕

@zadjii-msft zadjii-msft added Product-Terminal The new Windows Terminal. Severity-Crash Crashes are real bad news. labels Mar 8, 2021
@DHowett DHowett changed the title Windows Terminal crash with heap corruption Terminal crashes deallocating til::bitmap's vector run w/ PMR allocator (??) Mar 15, 2021
@DHowett
Copy link
Member

DHowett commented Mar 15, 2021

Copied from @asklar

Stack trace:

# Child-SP          RetAddr               Call Site
00 000000fb`e759dc10 00007ff8`8dc47013     ntdll!RtlReportFatalFailure+0x9 [minkernel\ntos\rtl\rtlutil.c @ 150] 
01 000000fb`e759dc60 00007ff8`8dc5008a     ntdll!RtlReportCriticalFailure+0x97 [minkernel\ntos\rtl\rtlutil.c @ 218] 
02 000000fb`e759dd50 00007ff8`8dc5036a     ntdll!RtlpHeapHandleError+0x12 [minkernel\ntos\rtl\heaplog.c @ 352] 
03 000000fb`e759dd80 00007ff8`8dc5a9e9     ntdll!RtlpHpHeapHandleError+0x7a [minkernel\ntos\rtl\heaplog.c @ 678] 
04 000000fb`e759ddb0 00007ff8`8dbf9456     ntdll!RtlpLogHeapFailure+0x45 [minkernel\ntos\rtl\heaplibcommon.c @ 159] 
05 (Inline Function) --------`--------     ntdll!RtlpHpLfhReportError+0x20 [minkernel\ntos\rtl\heaplfh.c @ 6978] 
06 000000fb`e759dde0 00007ff8`8db662a0     ntdll!RtlpHpLfhSubsegmentFreeBlock+0x92d06 [minkernel\ntos\rtl\heaplfh.c @ 6004] 
07 (Inline Function) --------`--------     ntdll!RtlpHpLfhContextFree+0x9 [minkernel\ntos\rtl\heaplfh.c @ 598] 
08 (Inline Function) --------`--------     ntdll!RtlpHpSegFree+0xbf [minkernel\ntos\rtl\heapsegmentalloc.c @ 662] 
09 (Inline Function) --------`--------     ntdll!RtlpHpFreeHeap+0x366 [minkernel\ntos\rtl\heapsegshared.c @ 1794] 
0a 000000fb`e759de60 00007ff8`8db65ade     ntdll!RtlpFreeHeapInternal+0x3c0 [minkernel\ntos\rtl\heap.c @ 2549] 
0b (Inline Function) --------`--------     ntdll!RtlpHpHeapFreeRedirectLayer+0x1a [minkernel\ntos\rtl\heappublic.c @ 243] 
0c 000000fb`e759df20 00007ff8`8db65a0d     ntdll!RtlpHpFreeWithExceptionProtection+0x1e [minkernel\ntos\rtl\heappublic.c @ 281] 
0d 000000fb`e759df90 00007ff8`8b90219b     ntdll!RtlFreeHeap+0x6d [minkernel\ntos\rtl\heappublic.c @ 351] 
0e 000000fb`e759dfd0 00007ff8`3174c735     ucrtbase!_free_base+0x1b [minkernel\crts\ucrt\src\appcrt\heap\free_base.cpp @ 105] 
0f (Inline Function) --------`--------     TerminalControl!std::pmr::memory_resource::deallocate+0x17 [C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.28.29333\include\xpolymorphic_allocator.h @ 154] 
10 (Inline Function) --------`--------     TerminalControl!std::pmr::polymorphic_allocator<til::rectangle>::deallocate+0x1b [C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.28.29333\include\xpolymorphic_allocator.h @ 222] 
11 (Inline Function) --------`--------     TerminalControl!std::vector<til::rectangle,std::pmr::polymorphic_allocator<til::rectangle> >::_Tidy+0x2b [C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.28.29333\include\vector @ 1696] 
12 (Inline Function) --------`--------     TerminalControl!std::vector<til::rectangle,std::pmr::polymorphic_allocator<til::rectangle> >::{dtor}+0x2b [C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.28.29333\include\vector @ 673] 
13 (Inline Function) --------`--------     TerminalControl!std::_Destroy_in_place+0x2b [C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.28.29333\include\xmemory @ 277] 
14 (Inline Function) --------`--------     TerminalControl!std::_Optional_destruct_base<std::vector<til::rectangle,std::pmr::polymorphic_allocator<til::rectangle> >,0>::reset+0x34 [C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.28.29333\include\optional @ 102] 
15 000000fb`e759e000 00007ff8`31748a07     TerminalControl!til::details::bitmap<std::pmr::polymorphic_allocator<unsigned __int64> >::set+0x155 [E:\BA\1\s\src\inc\til\bitmap.h @ 362] 
16 (Inline Function) --------`--------     TerminalControl!Microsoft::Console::Render::DxEngine::_InvalidateRectangle+0xa8 [E:\BA\1\s\src\renderer\dx\DxRenderer.cpp @ 1025] 
17 000000fb`e759e090 00007ff8`31740cc4     TerminalControl!Microsoft::Console::Render::DxEngine::Invalidate+0x1a7 [E:\BA\1\s\src\renderer\dx\DxRenderer.cpp @ 1049] 
18 (Inline Function) --------`--------     TerminalControl!Microsoft::Console::Render::Renderer::TriggerRedraw::__l10::<lambda_2d0d9b7ab976172a5fedf827605f350e>::operator()+0x16 [E:\BA\1\s\src\renderer\base\renderer.cpp @ 239] 
19 (Inline Function) --------`--------     TerminalControl!std::for_each+0x41 [C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.28.29333\include\algorithm @ 308] 
1a 000000fb`e759e110 00007ff8`3175cd62     TerminalControl!Microsoft::Console::Render::Renderer::TriggerRedraw+0x194 [E:\BA\1\s\src\renderer\base\renderer.cpp @ 238] 
1b 000000fb`e759e170 00007ff8`3175e2e4     TerminalControl!Microsoft::Terminal::Core::Terminal::_InvalidateFromCoords+0x62 [E:\BA\1\s\src\cascadia\TerminalCore\Terminal.cpp @ 722] 
1c (Inline Function) --------`--------     TerminalControl!Microsoft::Terminal::Core::Terminal::_InvalidatePatternTree::__l2::<lambda_9a52256f1328819638ef6d4b55293fb7>::operator()+0x64 [E:\BA\1\s\src\cascadia\TerminalCore\Terminal.cpp @ 687] 
1d (Inline Function) --------`--------     TerminalControl!std::for_each+0x78 [C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.28.29333\include\algorithm @ 308] 
1e 000000fb`e759e1f0 00007ff8`3175dca8     TerminalControl!interval_tree::IntervalTree<til::point,unsigned __int64>::visit_all<<lambda_9a52256f1328819638ef6d4b55293fb7> >+0xc4 [E:\BA\1\s\oss\interval_tree\IntervalTree.h @ 307] 

@DHowett
Copy link
Member

DHowett commented Mar 15, 2021

@miniksa may be interested in this (and there's a chance it'll hit conhost, too.)

@DHowett
Copy link
Member

DHowett commented Mar 15, 2021

I suspect this is due to a lack of locking around clearing the pattern interval tree on scroll. In trying to reproduce it, though, I hit another issue that seems very similar.

@DHowett DHowett added Area-TerminalControl Issues pertaining to the terminal control (input, selection, keybindings, mouse interaction, etc.) Issue-Bug It either shouldn't be doing this or needs an investigation. Priority-1 A description (P1) labels Mar 15, 2021
@ghost ghost removed the Needs-Tag-Fix Doesn't match tag requirements label Mar 15, 2021
@zadjii-msft
Copy link
Member

oh god in my refactor I totally found a place where we weren't taking the lock while updating the pattern interval tree (or something like that) and fixed it, but I doubt that I did it in an atomic commit. I'd be even more surprised if the commit message was even 1% useful in identifying that code change 😕

@DHowett
Copy link
Member

DHowett commented Mar 25, 2021

There's a few places we're not taking the lock. Here's the patch I'm testing...

From 7b1cec066d2c41c81bc4be12a81b7075381565ce Mon Sep 17 00:00:00 2001
From: Dustin Howett <[email protected]>
Date: Thu, 25 Mar 2021 11:44:29 -0500
Subject: [PATCH] LOCK IT UP

---
 src/cascadia/TerminalControl/TermControl.cpp | 24 ++++++++++++++++----
 src/cascadia/TerminalCore/Terminal.cpp       |  3 +++
 2 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/src/cascadia/TerminalControl/TermControl.cpp b/src/cascadia/TerminalControl/TermControl.cpp
index 08b78ec84..6b97c1ffd 100644
--- a/src/cascadia/TerminalControl/TermControl.cpp
+++ b/src/cascadia/TerminalControl/TermControl.cpp
@@ -1815,7 +1815,14 @@ namespace winrt::Microsoft::Terminal::Control::implementation
         }
 
         // Clear the regex pattern tree so the renderer does not try to render them while scrolling
-        _terminal->ClearPatternTree();
+        {
+            // We're taking the lock here instead of in ClearPatternTree because ClearPatternTree is
+            // sometimes called from an already-locked context. Here, we are sure we are not
+            // already under lock (since it is not an internal scroll bar update)
+            // GH#XXXX refine locking around pattern tree
+            auto lock = _terminal->LockForWriting();
+            _terminal->ClearPatternTree();
+        }
 
         const auto newValue = static_cast<int>(args.NewValue());
 
@@ -2434,6 +2441,9 @@ namespace winrt::Microsoft::Terminal::Control::implementation
         }
 
         // Clear the regex pattern tree so the renderer does not try to render them while scrolling
+        // We're **NOT** taking the lock here unlike _ScrollbarChangeHandler because
+        // we are already under lock (since this usually happens as a result of writing).
+        // GH#XXXX refine locking around pattern tree
         _terminal->ClearPatternTree();
 
         _scrollPositionChangedHandlers(viewTop, viewHeight, bufferSize);
@@ -3334,8 +3344,13 @@ namespace winrt::Microsoft::Terminal::Control::implementation
 
         _lastHoveredCell = terminalPosition;
 
+        uint16_t newId{ 0u };
+        // we can't use auto here becuase we're pre-declaring newInterval.
+        decltype(_terminal->GetHyperlinkIntervalFromPosition(COORD{})) newInterval{ std::nullopt };
         if (terminalPosition.has_value())
         {
+            auto lock = _terminal->LockForReading(); // Lock for the duration of our reads.
+
             const auto uri = _terminal->GetHyperlinkAtPosition(*terminalPosition);
             if (!uri.empty())
             {
@@ -3361,15 +3376,16 @@ namespace winrt::Microsoft::Terminal::Control::implementation
                 OverlayCanvas().SetLeft(HyperlinkTooltipBorder(), (locationInDIPs.x() - SwapChainPanel().ActualOffset().x));
                 OverlayCanvas().SetTop(HyperlinkTooltipBorder(), (locationInDIPs.y() - SwapChainPanel().ActualOffset().y));
             }
-        }
 
-        const uint16_t newId = terminalPosition.has_value() ? _terminal->GetHyperlinkIdAtPosition(*terminalPosition) : 0u;
-        const auto newInterval = terminalPosition.has_value() ? _terminal->GetHyperlinkIntervalFromPosition(*terminalPosition) : std::nullopt;
+            newId = _terminal->GetHyperlinkIdAtPosition(*terminalPosition);
+            newInterval = _terminal->GetHyperlinkIntervalFromPosition(*terminalPosition);
+        }
 
         // If the hyperlink ID changed or the interval changed, trigger a redraw all
         // (so this will happen both when we move onto a link and when we move off a link)
         if (newId != _lastHoveredId || (newInterval != _lastHoveredInterval))
         {
+            auto lock = _terminal->LockForWriting();
             _lastHoveredId = newId;
             _lastHoveredInterval = newInterval;
             _renderEngine->UpdateHyperlinkHoveredId(newId);
diff --git a/src/cascadia/TerminalCore/Terminal.cpp b/src/cascadia/TerminalCore/Terminal.cpp
index 3d0a0b810..4b5999778 100644
--- a/src/cascadia/TerminalCore/Terminal.cpp
+++ b/src/cascadia/TerminalCore/Terminal.cpp
@@ -453,6 +453,7 @@ bool Terminal::IsTrackingMouseInput() const noexcept
 // - The position
 std::wstring Terminal::GetHyperlinkAtPosition(const COORD position)
 {
+    auto lock = LockForReading();
     auto attr = _buffer->GetCellDataAt(_ConvertToBufferCell(position))->TextAttr();
     if (attr.IsHyperlink())
     {
@@ -486,6 +487,7 @@ std::wstring Terminal::GetHyperlinkAtPosition(const COORD position)
 // - The hyperlink ID
 uint16_t Terminal::GetHyperlinkIdAtPosition(const COORD position)
 {
+    auto lock = LockForReading();
     return _buffer->GetCellDataAt(_ConvertToBufferCell(position))->TextAttr().GetHyperlinkId();
 }
 
@@ -497,6 +499,7 @@ uint16_t Terminal::GetHyperlinkIdAtPosition(const COORD position)
 // - The interval representing the start and end coordinates
 std::optional<PointTree::interval> Terminal::GetHyperlinkIntervalFromPosition(const COORD position)
 {
+    auto lock = LockForReading();
     const auto results = _patternIntervalTree.findOverlapping(COORD{ position.X + 1, position.Y }, position);
     if (results.size() > 0)
     {
-- 
2.29.0.vfs.0.0

@ghost ghost added the In-PR This issue has a related PR label Mar 25, 2021
@DHowett
Copy link
Member

DHowett commented Mar 25, 2021

The read locks in the inner leaves proved to be bad because shared_mutex isn't intended to be taken recursively, simply shared.

@zadjii-msft zadjii-msft removed the Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting label Mar 26, 2021
@ghost ghost closed this as completed in #9618 Mar 26, 2021
ghost pushed a commit that referenced this issue Mar 26, 2021
We have been seeing some crashes (#9410) originating from a
use-after-free or a double-free in the renderer. The renderer is
iterating over the dirty rects from the render engine¹ and the rect list
is being freed out from under it.

Things like this are usually the result of somebody manipulating the
renderer's state outside of lock.

Therefore, this pull request introduces some targeted locking fixes
around manipulation of the pattern buffer (which, in turn, changes the
renderer state.)

¹ This was not a problem until #8621, which made the renderer return a
span instead of a copy for the list of dirty rects.

## Validation

I ran Terminal under App Verifier, and introduced a manul delay (under
lock) in the renderer such that the invalid map would definitely have
been invalidated between the renderer taking the lock and the renderer
handling the frame. AppVerif failed us without these locking changes,
and did not do so once they were introduced.

Closes #9410.
@ghost ghost added Resolution-Fix-Committed Fix is checked in, but it might be 3-4 weeks until a release. and removed In-PR This issue has a related PR labels Mar 26, 2021
DHowett added a commit that referenced this issue Apr 2, 2021
We have been seeing some crashes (#9410) originating from a
use-after-free or a double-free in the renderer. The renderer is
iterating over the dirty rects from the render engine¹ and the rect list
is being freed out from under it.

Things like this are usually the result of somebody manipulating the
renderer's state outside of lock.

Therefore, this pull request introduces some targeted locking fixes
around manipulation of the pattern buffer (which, in turn, changes the
renderer state.)

¹ This was not a problem until #8621, which made the renderer return a
span instead of a copy for the list of dirty rects.

## Validation

I ran Terminal under App Verifier, and introduced a manul delay (under
lock) in the renderer such that the invalid map would definitely have
been invalidated between the renderer taking the lock and the renderer
handling the frame. AppVerif failed us without these locking changes,
and did not do so once they were introduced.

Closes #9410.

(cherry picked from commit ea3e56d)
@ghost
Copy link

ghost commented Apr 14, 2021

🎉This issue was addressed in #9618, which has now been successfully released as Windows Terminal v1.7.1033.0.:tada:

Handy links:

@ghost
Copy link

ghost commented Apr 14, 2021

🎉This issue was addressed in #9618, which has now been successfully released as Windows Terminal Preview v1.8.1032.0.:tada:

Handy links:

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-TerminalControl Issues pertaining to the terminal control (input, selection, keybindings, mouse interaction, etc.) Issue-Bug It either shouldn't be doing this or needs an investigation. Priority-1 A description (P1) Product-Terminal The new Windows Terminal. Resolution-Fix-Committed Fix is checked in, but it might be 3-4 weeks until a release. Severity-Crash Crashes are real bad news.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants