I'm sure many of you have tried a GenAI LLM to do something. Maybe write some code, maybe get some sort of recommendation or suggestion, maybe to rewrite something or summarize text. I'm sure you have had some feelings about whether the tool made you more or less productive.
The Australian Department of the Treasury conducted a trial of Microsoft's 365 Copilot, asking for volunteers to participate and use the tool in their daily work. They used it and then completed a survey, the results of which are summarized in this piece. Only 218 people went through the trial, and the findings are interesting.
The headline says that the staff rated the GenAI less useful than expected. Those last two words are interesting because your expectations shape a lot of how you view anything in the world. If you expect little and get a little more, you might be happy. If you expect a lot and don't get it, you might be very disappointed.
The sub-headline and the first sentence note that there still is an ROI from the tool. It isn't as helpful and isn't as widely applicable as people expected, but they chalk some of this up to product limitations and some to limited use by people. It was useful in summarizing things and drafting content, which are what they call basic administrative tasks. That's interesting and likely where GenAI tools can help quite a bit.
Maybe the most interesting thing to me is that if Copilot saves 13 minutes a week for mid-level workers, it pays for itself. I don't know how much time it would have to save me, but an hour or two a week might make me use it more. It certainly would be worth a small monthly cost. So far, I haven't committed to regular work with the tools, and I think I still spend more time learning and typing with GenAI tools than I'd like. I'm not sure if I am saving time over just doing the task myself. Some of that is because I have habits that allow me to work quickly, and when I use a Copilot, my pace slows.
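If you want to run the same break-even arithmetic against your own numbers, here is a rough sketch in Python. The function names are hypothetical, and the license price and hourly labor cost in the example are made-up placeholders, not figures from the Treasury report. It also assumes 52 working weeks a year, which is a simplification.

```python
WEEKS_PER_YEAR = 52  # assumption: no allowance for leave or holidays


def break_even_minutes_per_week(monthly_license_cost: float,
                                hourly_labor_cost: float) -> float:
    """Minutes a worker must save each week for the license to pay for itself."""
    if hourly_labor_cost <= 0:
        raise ValueError("hourly_labor_cost must be positive")
    # Spread the monthly license fee evenly across the weeks of the year.
    weekly_license_cost = monthly_license_cost * 12 / WEEKS_PER_YEAR
    # Cost of one minute of the worker's time.
    cost_per_minute = hourly_labor_cost / 60
    return weekly_license_cost / cost_per_minute


def weekly_net_value(minutes_saved_per_week: float,
                     monthly_license_cost: float,
                     hourly_labor_cost: float) -> float:
    """Value of the time saved each week minus the weekly license cost."""
    weekly_license_cost = monthly_license_cost * 12 / WEEKS_PER_YEAR
    return minutes_saved_per_week * hourly_labor_cost / 60 - weekly_license_cost


if __name__ == "__main__":
    # Placeholder inputs for illustration only; substitute your own.
    license_cost = 30.0  # hypothetical monthly license price
    labor_cost = 60.0    # hypothetical fully loaded hourly cost of a worker
    minutes = break_even_minutes_per_week(license_cost, labor_cost)
    print(f"Break-even: {minutes:.1f} minutes saved per week")
    print(f"Net weekly value at 60 minutes saved: "
          f"{weekly_net_value(60, license_cost, labor_cost):.2f}")
```

The point of the sketch is that the break-even threshold scales with the ratio of license price to labor cost, so the same tool can pay for itself at very different savings levels depending on who is using it.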
This also brings up something I wonder about with these GenAI services. With the cost of compute services, there might not be a lot of margin for vendors to raise prices if people are only slightly more productive. I can see lots of companies starting to use these tools, realizing there isn't as much value as they expected from increased productivity, and then dropping the cost from their budget. That might be some of what we saw in this year's State of the Database Landscape, with fewer people using AI for database management tasks. I suspect some of the hype has died down and people aren't finding the tools as useful as they expected.
I do think GenAI is helpful, but just helpful. It can't do the work, and it can't be trusted more than a junior worker. At least not yet. Maybe that will change, but I haven't seen it to date.