Why not go for an App -> AWS Firehose -> S3 architecture? Firehose automatically takes care of batching logs and writing them out together, and it can also create partitions. No need to use Lambda at all.
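To make that concrete, here is a minimal sketch of the app side, assuming boto3's Firehose client and a delivery stream that already points at an S3 bucket. The helper names, the event fields, and the stream name "app-logs" are made up for illustration.

```python
import json
import time


def build_firehose_record(event: dict) -> dict:
    """Serialize one log event as a newline-terminated JSON record."""
    # Firehose concatenates records as-is, so the trailing newline keeps
    # one event per line in the objects that end up in S3.
    line = json.dumps(event, separators=(",", ":")) + "\n"
    return {"Data": line.encode("utf-8")}


def send_log(firehose_client, stream_name: str, message: str, **fields) -> dict:
    """Send a single log event; batching into S3 objects is left to Firehose."""
    event = {"ts": int(time.time() * 1000), "message": message, **fields}
    return firehose_client.put_record(
        DeliveryStreamName=stream_name,
        Record=build_firehose_record(event),
    )


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   send_log(boto3.client("firehose"), "app-logs", "user signed in", app="web")
```

The client is passed in rather than created inside the helper, so the same code can be exercised with a stub in tests.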
Don’t assume the author made an informed but poor choice; they likely made something work with the tools they knew or found. It’s really difficult to keep up with all the various AWS offerings if you’re not part of a team that’s immersed in them. Constructively suggesting alternatives is welcome; I’ve got some new tools to read up on this morning.
I run some Lambda systems at scale and CloudWatch Logs has worked very well for me. It was completely painless, and CloudWatch Logs comes with a lot of tooling (backups, streaming, indexing with Elasticsearch, etc.).
Mainly searching and exporting. Exporting logs to S3 manually for debugging/post-mortem is harder than it needs to be - maybe this isn't a huge use-case, but we like to attach logs to tickets.
And searching is very limited. Yes, you can stream them to Elasticsearch => Logstash/Kibana, etc., but that's extra steps, and I can do similar things entirely without CloudWatch Logs; e.g. Sentry and SumoLogic also integrate fine with Lambda.
I'm not hating on CloudWatch Logs, and alarms are good. It's just not a fully featured logging solution, and it sounds like you agree with that?
I agree you need more tools around it to make it a full logging solution. Doing data visualization is more or less impossible without extra tools. But I've overall liked it.
Search is powerful enough to do something like, say, "get me all messages with this trace GUID", and S3 exports can be automated[1]. You could probably use the "task-name" to associate an export with your ticketing system.
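Roughly what I have in mind, as a sketch assuming boto3's CloudWatch Logs client. The function names, the "ticket-<id>" task naming, and the "tickets/" prefix are my own conventions, not anything AWS prescribes.

```python
import time


def find_by_trace_id(logs_client, log_group: str, trace_id: str,
                     lookback_hours: int = 24) -> list:
    """Return every message in log_group that contains trace_id."""
    now_ms = int(time.time() * 1000)
    kwargs = {
        "logGroupName": log_group,
        # Quote the term so hyphens in a GUID are matched literally.
        "filterPattern": f'"{trace_id}"',
        "startTime": now_ms - lookback_hours * 3600 * 1000,
        "endTime": now_ms,
    }
    messages = []
    while True:
        page = logs_client.filter_log_events(**kwargs)
        messages.extend(e["message"] for e in page.get("events", []))
        token = page.get("nextToken")
        if not token:
            return messages
        kwargs["nextToken"] = token  # keep paging until exhausted


def export_for_ticket(logs_client, log_group: str, bucket: str,
                      ticket_id: str, from_ms: int, to_ms: int) -> str:
    """Start an S3 export whose task name carries the ticket id."""
    resp = logs_client.create_export_task(
        taskName=f"ticket-{ticket_id}",  # hypothetical naming scheme
        logGroupName=log_group,
        fromTime=from_ms,
        to=to_ms,
        destination=bucket,
        destinationPrefix=f"tickets/{ticket_id}",
    )
    return resp["taskId"]
```

The export runs asynchronously, so whatever attaches the logs to the ticket would still need to wait for the task to finish. The bucket also needs a policy that lets CloudWatch Logs write to it.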
I was considering Kinesis, but my use case (multiple small apps not running on AWS, sending medium-sized messages a few times per user session) values simple components and keeping the long-term maintenance effort to a minimum.
I'm not sure what your understanding of Kinesis Firehose is, but in this situation it would literally remove some moving parts; it would not add complexity. It's also potentially cheaper than SQS anyway.
Kinesis works well for low volume, and isn't any more or less complicated than SQS. You should give it a try; it will certainly make some things simpler.
Even though it'd work, Kinesis would probably not fit this use case very well (sporadic/low-volume logging), since you pay for each shard every hour, whether you're using it (or barely using it) or not.
Firehose would work, though... no per-hour charges.
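A back-of-the-envelope way to compare the two billing models, as a sketch only. The rates are parameters on purpose: plug in the current numbers from the pricing pages. Per-request charges on streams and S3 storage costs are ignored here.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month


def stream_monthly_cost(shards: int, rate_per_shard_hour: float,
                        hours: float = HOURS_PER_MONTH) -> float:
    """Provisioned stream: billed per shard-hour, even when idle."""
    return shards * rate_per_shard_hour * hours


def firehose_monthly_cost(gb_ingested: float, rate_per_gb: float) -> float:
    """Firehose: billed on ingested volume, so idle time costs nothing."""
    return gb_ingested * rate_per_gb


# Example with placeholder rates (NOT real prices):
#   stream_monthly_cost(shards=1, rate_per_shard_hour=0.01)
#   firehose_monthly_cost(gb_ingested=0.5, rate_per_gb=0.05)
```

The point is just that the first function has a floor set by the shard count, while the second goes to zero when nothing is being logged.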
Cool idea. I love that you're very clear where this makes sense and where it doesn't. For higher-volume log files I'd definitely consider something like the Kinesis Agent: https://github.com/awslabs/amazon-kinesis-agent
But for lower-volume logging this seems like a cost-effective solution, and then you can use Athena or EMR to further process or analyze the logs. I didn't see any partitioning of the logs by date; that might be a nice enhancement (unless I missed it).
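On the date partitioning, this is the kind of thing I mean: a small sketch that builds Hive-style `year=/month=/day=` S3 keys, which is a layout Athena can treat as partitions. The function name, the "logs" prefix, and the file naming are just illustrative.

```python
import uuid
from datetime import datetime, timezone


def partitioned_key(prefix: str, ts: datetime, suffix: str = ".json") -> str:
    """Build an S3 object key partitioned by UTC date."""
    # Normalize to UTC so a batch never lands in the wrong day's partition.
    # Note: a naive datetime is interpreted as local time by astimezone().
    ts = ts.astimezone(timezone.utc)
    return (
        f"{prefix.rstrip('/')}/"
        f"year={ts:%Y}/month={ts:%m}/day={ts:%d}/"
        # Random suffix avoids collisions between writers in the same second.
        f"{ts:%H%M%S}-{uuid.uuid4().hex}{suffix}"
    )


# Usage:
#   key = partitioned_key("logs", datetime.now(timezone.utc))
#   s3_client.put_object(Bucket="my-log-bucket", Key=key, Body=batch_bytes)
```

With keys laid out like this, a query that filters on year/month/day only has to scan the matching prefixes instead of the whole bucket.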
Yes, if the goal is to archive stuff event by event, I'd use CloudWatch Logs. You can create whatever stream you want and log to it.
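Something like this, as a sketch assuming boto3's CloudWatch Logs client and a log group that already exists. The function names and the group/stream names in the usage comment are placeholders, and older SDK versions may also want a sequenceToken on put_log_events.

```python
import time


def ensure_stream(logs_client, group: str, stream: str) -> None:
    """Create the log stream if it does not exist yet."""
    try:
        logs_client.create_log_stream(logGroupName=group, logStreamName=stream)
    except logs_client.exceptions.ResourceAlreadyExistsException:
        pass  # someone already created it, which is fine


def put_events(logs_client, group: str, stream: str, messages: list) -> dict:
    """Write a batch of messages to the stream, stamped with the current time."""
    now_ms = int(time.time() * 1000)
    events = [{"timestamp": now_ms, "message": m} for m in messages]
    return logs_client.put_log_events(
        logGroupName=group,
        logStreamName=stream,
        logEvents=events,
    )


# Usage (assumes boto3 and credentials; names are placeholders):
#   import boto3
#   logs = boto3.client("logs")
#   ensure_stream(logs, "/my-app", "session-events")
#   put_events(logs, "/my-app", "session-events", ["user signed in"])
```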
It's probably cheaper, but Lambda and S3 are equally dirt cheap, so hey, why not.
Edit: Two possibilities here. Either the choice of technology is deliberate and there's a reason I can't quite figure out, or AWS has reached the state where they do so much that it's really, really hard for people to know which service to use for each problem.
Which actually makes me wonder what I am doing right now that could be done easily using a pre-baked AWS solution.
I teach some AWS courses and that's something I always bring up: if you are thinking of building a system, it always makes sense to see whether AWS has a prebaked solution and evaluate it. The solution may not meet your needs, but if it does, it is likely to be available at a price/performance ratio (especially considering fully weighted operational costs) that makes it compelling.
No, you can use CloudWatch anywhere you can install the AWS CLI, including non-AWS assets. Just point the log monitor at a file or folder with the right CLI commands and you can pull logs into CloudWatch in real time.