I have upgraded lots of SQL Servers from one version to the next, and for the most part, the process has been smooth. That's not always the case, and there have been some long nights where the Operations staff had to scramble to fix things, script out old logins, call Microsoft support, and perform various data exports and imports to get a new instance running. I've been a part of quite a few of those teams.
Most of my upgrades were with relatively small databases, at least small for that time period. However, I have upgraded a few "large" databases in the past. We had a 400GB database on SQL Server 6.5 in 1999 that was a challenge to move to SQL Server 7. I also upgraded an 800GB database in 2001 from 6.5 to 2000 for our Financial team, which involved a lot of stress.
I haven't upgraded a 1TB database, much less a 4TB one, but I know this can be very time-consuming to move all that data. Even with much faster hardware and networks these days, working with that much data can require a decent amount of downtime. That's not very acceptable these days, especially for applications that are used by customers all around the world and all around the clock.
I found an interesting upgrade story from a SaaS provider that runs PostgreSQL. I don't use that platform, but I found the write-up of their process to be aligned with some of my experiences with SQL Server. They delayed upgrading an old version of PostgreSQL and then had the desire to move to as current a version as possible and delay future upgrades (again). I've certainly experienced that. Their goal was also to minimize downtime, as their customers are constantly connecting.
The plan for the upgrade made sense to me. Use replication to move most data and then minimize downtime. That's a technique that can work in SQL Server, though we'd be more likely to use log shipping to simulate this. They also trimmed and cleaned some data, removing the need to upgrade some of the tables. That's something many of us might be able to do every year in some databases, especially for large logging tables where older data might rarely be read. I'd even think about moving that data to another database and using a synonym to access it if it were needed. That's just a good idea for general DR planning.
I also appreciated them creating a runbook and testing the process multiple times in staging. Their big takeaway here: practice over and over with a realistic workload. They created some problems for themselves by not using a real enough workload. That ought to be something your organization does on a regular basis to test your software and ensure new code performs well. Using the same process for upgrades is a bonus.
Large upgrades are stressful and often they are "big bang" deployments where you can't go back to the old system. Practice as much as you can, make sure you have backups, and then be ready to adapt to whatever challenges come up. Plan ahead, and be prepared for a long night. If you do, you'll likely have success. If you don't, I am fairly sure something will go wrong and you'll regret not thinking ahead.