Distributed System Logging Best Practices (3) — Automation

Tsuyoshi Ushio
Jan 12, 2023
Log for automation

We use logging for automation, which saves a lot of time, especially if your team works in a DevOps model. Good logging practices help you automate several things.

How are logs used beyond querying?

Dashboard

Many log query tools support building dashboards. Because a dashboard runs its queries again on every refresh, the cost of each query matters. The “Unique” and “typed/prefixed” practices help with this usage.
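
To make the “typed/prefixed” idea concrete, here is a minimal sketch. It assumes a hypothetical convention in which every log message starts with a unique event ID in square brackets; the IDs, messages, and the `count_events` helper are invented for illustration and do not come from any specific log query tool. With such a prefix, a dashboard query can be a cheap prefix match instead of a free-text search.

```python
import re
from collections import Counter

# Hypothetical convention: each message starts with a unique event ID, e.g. "[CAP0002]".
EVENT_PREFIX = re.compile(r"^\[(?P<event_id>[A-Z]+\d{4})\]")


def count_events(lines):
    """Count log lines per event ID; lines without a prefix are ignored."""
    counts = Counter()
    for line in lines:
        match = EVENT_PREFIX.match(line)
        if match:
            counts[match.group("event_id")] += 1
    return counts


# Made-up sample data standing in for the result of a log query.
sample_logs = [
    "[SCALE0001] Scale-out requested: workers=3",
    "[SCALE0001] Scale-out requested: workers=4",
    "[CAP0002] Capacity exhausted: region=westus",
    "free-form message without a prefix",
]

print(count_events(sample_logs))
```

A real dashboard would run the equivalent aggregation inside the log query tool; the point is only that a unique, typed prefix makes the grouping key trivial to extract.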

Alert

We might want to raise an alert when a particular event occurs, for example when capacity runs short or a known issue appears. Here too, it is important to keep the cost of the query low.
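
As a rough sketch of such an alert rule, the snippet below fires when a given event has been logged at least a threshold number of times within a recent time window. The function name, window, and threshold are assumptions chosen for illustration; real alerting systems usually express this rule in their own query language.

```python
from datetime import datetime, timedelta


def should_alert(timestamps, now, window=timedelta(minutes=5), threshold=3):
    """Return True if at least `threshold` events fall within the last `window`."""
    recent = [t for t in timestamps if now - window <= t <= now]
    return len(recent) >= threshold


# Made-up timestamps of a hypothetical "capacity exhausted" event.
now = datetime(2023, 1, 12, 2, 0, 0)
capacity_events = [
    now - timedelta(minutes=30),  # too old, outside the window
    now - timedelta(minutes=4),
    now - timedelta(minutes=2),
    now - timedelta(seconds=10),
]

print(should_alert(capacity_events, now))
```

Because the rule only needs timestamps for one event type, the underlying query stays cheap as long as that event is easy to select.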

Self-Healing

You probably don’t want to wake up at 2 am. What if you find a known issue that can’t be fixed until next month? This is where self-healing comes in. For example, if an exception that indicates the known issue appears in the logs, you can simply restart the VM automatically. If capacity runs short, the system can increase it automatically, and so on. In the cloud world, this kind of automation helps reduce your operational cost. The logs should contain enough information to trigger the operation. The best practices in the “State” category might help.
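
The following sketch shows one way a self-healing trigger could be wired up. It assumes logs are emitted as JSON with a hypothetical `event_id` field plus the state needed to act (`vm`, `pool`, `current`); the event IDs and the `restart_vm` and `increase_capacity` functions are placeholders that only describe the action instead of calling a real cloud API.

```python
import json


def restart_vm(event):
    # Placeholder: a real implementation would call the cloud provider's API.
    return f"restart vm {event['vm']}"


def increase_capacity(event):
    # Placeholder: scale the pool by one instance based on the logged state.
    return f"scale {event['pool']} to {event['current'] + 1}"


# Hypothetical mapping from known event IDs to remediation actions.
REMEDIATIONS = {
    "KNOWN0001": restart_vm,
    "CAP0002": increase_capacity,
}


def heal(log_line):
    """Pick a remediation for one JSON log line; return None if no rule matches."""
    event = json.loads(log_line)
    action = REMEDIATIONS.get(event.get("event_id"))
    return action(event) if action else None


print(heal('{"event_id": "KNOWN0001", "vm": "vm-7"}'))
print(heal('{"event_id": "CAP0002", "pool": "workers", "current": 3}'))
```

Note that the remediation works only because the log line carries the VM name and the current capacity. If that state were missing from the log, the automation would have nothing to act on.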

Auto-Diagnostics

You might not want to answer every customer question when their system goes wrong. Instead, you can write a system that diagnoses the issue automatically. That lets you focus on coding rather than operations.

One example is “Diagnose and solve problems” on Azure Functions. Logs are the most essential element of this system.

Auto-Diagnose

Testing

In a distributed system, logs are also used to validate behavior in unit, integration, and e2e tests. Mocking the logger is one technique we can use.
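
As an illustration, here is a minimal sketch using Python’s standard `unittest`, whose `assertLogs` context manager captures log records so a test can assert on them. It plays a role similar to a mocked logger. The `process_order` function, the `orders` logger name, and the `ORDER0001` event ID are hypothetical.

```python
import logging
import unittest

logger = logging.getLogger("orders")


def process_order(order_id):
    """Hypothetical unit under test: logs a typed event when an order is accepted."""
    logger.info("[ORDER0001] Order accepted: order_id=%s", order_id)
    return True


class ProcessOrderLoggingTest(unittest.TestCase):
    def test_logs_order_accepted(self):
        # assertLogs captures records emitted on the "orders" logger at INFO or above.
        with self.assertLogs("orders", level="INFO") as captured:
            process_order("A-42")
        self.assertEqual(
            captured.output,
            ["INFO:orders:[ORDER0001] Order accepted: order_id=A-42"],
        )
```

Saved as a module, the test can be run with `python -m unittest`. The same idea applies to integration and e2e tests, where the assertion would typically run against the collected logs of several services.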

Log as API contract

I’d like to introduce a concept-level best practice. My friend Chris Gillum introduced me to this concept: we should think of logs as an API contract. We build several kinds of automation on top of logging, so if the log “API” changes, the automation and validation could break. That is why we should treat logs as an API contract.
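
One way to make this contract explicit is to write it down and check log output against it, for example in a test. The sketch below assumes JSON logs and a hypothetical `LOG_CONTRACT` table that lists the required fields and types per event ID; none of these names come from a real library.

```python
import json

# Hypothetical contract: required fields and their types for each event ID.
LOG_CONTRACT = {
    "CAP0002": {"event_id": str, "pool": str, "current": int},
}


def contract_violations(log_line):
    """Return a list of contract problems for one JSON log line (empty if it conforms)."""
    event = json.loads(log_line)
    schema = LOG_CONTRACT.get(event.get("event_id"))
    if schema is None:
        return [f"unknown event_id: {event.get('event_id')!r}"]
    problems = []
    for field, expected_type in schema.items():
        if field not in event:
            problems.append(f"missing field: {field}")
        elif not isinstance(event[field], expected_type):
            problems.append(f"wrong type for {field}")
    return problems


# A conforming line, then one where "pool" was renamed to "pool_name".
print(contract_violations('{"event_id": "CAP0002", "pool": "workers", "current": 3}'))
print(contract_violations('{"event_id": "CAP0002", "pool_name": "workers", "current": 3}'))
```

A check like this turns a silent breaking change, such as a renamed field, into a failing test before the dashboards, alerts, or self-healing rules that depend on the log stop working.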

Conclusion

The following sample app represents a distributed system with multiple microservices. You can run it, modify it, and observe how it behaves. The sample doesn’t include all the best practices; however, you can see several of them.

LoggingSampleFunction/Function1.cs at master · TsuyoshiUshio/LoggingSampleFunction (github.com)

I hope you enjoy this post.
