I'm looking to host a compute-intensive app on AWS and can't afford to wait for Lambda cold starts. The app needs to handle up to 300 concurrent users without performance degradation, but it won't have 300 users at all times, so it needs to be able to scale up and down. I've been benchmarking both Lambda with provisioned concurrency and EC2, and here are my first conclusions:
Lambda gives me very satisfying response times when configured with high memory (which also allocates more vCPU)
EC2 gives me OK results (with c5a.large)
It works better for me to have many small EC2 instances rather than one big instance (ideally, one instance per active user)
Now here is the big question: it would be cheaper to run as many EC2 instances as the number of Lambdas I plan to provision (since my Lambdas are configured with 10 GB of memory), but I'm not sure whether that would cause any problems with AWS, and I feel that provisioned concurrency on Lambda would be a better fit despite being more expensive. So what would you do in this situation? Are there other parameters I should consider?
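For context, here is the back-of-envelope math behind my cost claim. The prices are illustrative figures I pulled for us-east-1 on-demand and may be out of date, so please check them against the current AWS pricing pages before trusting the exact numbers:

```python
# Rough hourly cost comparison: 300 provisioned 10 GB Lambdas vs 300 c5a.large
# instances. Prices are ASSUMED (roughly us-east-1, on-demand) — verify against
# the current AWS pricing pages. This also ignores Lambda duration/request
# charges and EC2 EBS/data-transfer costs, so it's only an order-of-magnitude check.

PC_PRICE_PER_GB_SECOND = 0.0000041667  # provisioned-concurrency charge, $/GB-s (assumed)
LAMBDA_MEMORY_GB = 10
C5A_LARGE_PER_HOUR = 0.077             # c5a.large on-demand, $/h (assumed)
COUNT = 300

lambda_hourly = PC_PRICE_PER_GB_SECOND * LAMBDA_MEMORY_GB * 3600 * COUNT
ec2_hourly = C5A_LARGE_PER_HOUR * COUNT

print(f"Provisioned Lambda: ${lambda_hourly:.2f}/h")
print(f"EC2 c5a.large:      ${ec2_hourly:.2f}/h")
```

With these assumed prices, keeping 300 warm 10 GB Lambdas costs roughly twice as much per hour as 300 c5a.large instances, which is why the EC2 route tempts me.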
PS: I thought about pinging my functions myself to keep them warm, but that feels very fragile to me when trying to warm around 300 Lambdas at the same time (even though it would be much cheaper).
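To show what I mean, here is roughly the warmer I had in mind (a sketch only; `my-function` is a placeholder name, and the actual invocations need boto3 and AWS credentials). The fragile part is that all 300 invocations must genuinely overlap in time, otherwise they reuse a handful of execution environments instead of keeping 300 warm:

```python
# Sketch of the "ping to keep warm" idea: fan out N concurrent no-op
# invocations so ~N execution environments stay warm. The handler would need
# to short-circuit on the warm-up payload. If invocations don't truly overlap,
# Lambda reuses the same few environments and most stay cold — hence my doubts.
import json
from concurrent.futures import ThreadPoolExecutor

WARM_PAYLOAD = json.dumps({"warmup": True})  # handler returns immediately on this

def build_warm_calls(function_name: str, count: int) -> list:
    """Build the argument list for `count` concurrent warm-up invocations."""
    return [
        {"FunctionName": function_name, "Payload": WARM_PAYLOAD}
        for _ in range(count)
    ]

def warm(function_name: str = "my-function", count: int = 300) -> None:
    """Fire `count` overlapping invocations (requires boto3 + credentials)."""
    import boto3  # imported here so the sketch stays importable without boto3
    client = boto3.client("lambda")
    with ThreadPoolExecutor(max_workers=count) as pool:
        for kwargs in build_warm_calls(function_name, count):
            pool.submit(client.invoke, **kwargs)
```

Even if this works, it has to run on a schedule forever and race against Lambda reclaiming idle environments, which is why provisioned concurrency still feels like the safer option to me.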