I've had similar experiences with improving cursors. I had to look at someone's report because they weren't getting good data from it anymore. I ran the existing proc on my machine just to see what it would produce. Five minutes later I still had no result, but I had started reading the code and understood why it was slow, though not yet what it was trying to do. Ten minutes after that I understood what it was trying to do, and the proc was still running. I opened a new window and started getting answers piecemeal.
If I execute this select, I'll get all the users they are interested in. Yep. If I join this information, I'll find out which offices they are involved in. Yep. Hmmm, how am I going to include all the related offices? Oh, this should do it, but it could loop indefinitely, so before I start it, I'll put a separate stop condition in the loop. (That came in handy later, when I botched the data comparison and caused an infinite loop.)
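A minimal sketch of that kind of guarded loop, for illustration only. It is not the original proc: it uses Python's sqlite3 module rather than the production database, and the table office_link, its columns office_id and related_id, the temp table found, and the max_passes cap are all hypothetical names. Each pass adds every newly reachable office in one set-based insert. The loop ends when a pass adds nothing, and the pass cap is the separate stop in case the comparison is wrong.

```python
import sqlite3


def expand_related_offices(conn, seed_offices, max_passes=50):
    """Return the seed offices plus every office reachable through office_link.

    office_link(office_id, related_id) is a hypothetical table of
    "office A is related to office B" rows.
    """
    cur = conn.cursor()
    cur.execute(
        "CREATE TEMP TABLE IF NOT EXISTS found (office_id INTEGER PRIMARY KEY)"
    )
    cur.execute("DELETE FROM found")
    cur.executemany(
        "INSERT OR IGNORE INTO found (office_id) VALUES (?)",
        [(office,) for office in seed_offices],
    )

    def count_found():
        return cur.execute("SELECT COUNT(*) FROM found").fetchone()[0]

    for _ in range(max_passes):
        before = count_found()
        # One set-based step: add every related office not already found.
        cur.execute(
            """
            INSERT INTO found (office_id)
            SELECT DISTINCT l.related_id
            FROM office_link AS l
            JOIN found AS f ON f.office_id = l.office_id
            WHERE l.related_id NOT IN (SELECT office_id FROM found)
            """
        )
        if count_found() == before:
            break  # normal stop: nothing new was added on this pass
    else:
        # Safety stop: the cap was reached without a pass that added nothing.
        raise RuntimeError(f"no stable result after {max_passes} passes")

    return [row[0] for row in cur.execute(
        "SELECT office_id FROM found ORDER BY office_id")]


def build_demo_db():
    """Small in-memory database with a cycle (1 -> 2 -> 3 -> 1) in the links."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE office_link (office_id INTEGER, related_id INTEGER)")
    conn.executemany(
        "INSERT INTO office_link VALUES (?, ?)",
        [(1, 2), (2, 3), (3, 1), (4, 5)],
    )
    return conn


if __name__ == "__main__":
    print(expand_related_offices(build_demo_db(), [1]))  # prints [1, 2, 3]
```

The cap is deliberately crude: it also fires if the last allowed pass still added rows, so it should be set well above the number of passes the data could plausibly need.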
Anyway, by the end of the day I had reproduced the report information they wanted (including the information the original proc was dropping). I left my workstation up overnight because, after 6 hours of running, the original proc still hadn't given me an answer on my machine.
The next morning I killed the 20-hour process, wrote up a new proc, and got a report in 30 seconds. I heard the old processes took 4 hours for one DB and 2 hours for another on the faster production machine. (Each DB had its own custom process.) My proc gets reports from both DBs in 28 seconds.