Sparta v1.5.0: The Observability Edition

This month marks the three-year (!) anniversary of my work on Sparta. Sparta is a framework that transforms a go binary into a GitOps-friendly, self-deploying, and operationally aware application that targets AWS Lambda as its execution environment.

Given a function similar to this and a bit of bootstrapping code:

    func helloWorld() (string, error) {
        return "Hello World", nil
    }

With a single mage command you can compile, package, upload, and create a CloudFormation-managed microservice. (The log is more colorful in person.)

See the SpartaHelloWorld project for more details.

One of the guiding principles for Sparta’s development is that functional and non-functional requirements should be uniformly expressed. Metaparticle.io does something similar at a library level. Languages and the assumptions they tacitly embed often foster distinct language communities that can come to see one another as “different”. This perceived distance can hinder collaboration, which, from a customer perspective, often means that things “don’t work”. Customers don’t care that a service self-reports it’s working at five nines if the bash-to-Chef migration caused the 12-factor config to drift a bit, a critical environment variable is now missing, and, as far as the customer is concerned, the service is not working at all. That last part is something customers most definitely care about, until they don’t.

On a new project it might be accurate at first that infrastructure is relatively rigid and the business logic seems endlessly fluid. (Those customers, what exactly do they want?) However, perhaps a few years later the service has taken off, the business logic is virtually in lock-down, and the team has disbanded. Only now the organization is moving to new regions, providers, or execution environments, or maybe even eliminating a piece of infrastructure entirely in favor of a Service Full migration. What was fluid is no longer so, and what was presumed fixed is now in a constant state of churn. Rates of change…change.

Promoting operational responsibilities to the same tier as business logic opens up the possibility of making more adaptive and internally consistent service deployments. There’s a possibility that a deployment can close over all its typically dangling dependencies and ensure that the ids, metrics, alerts, functions, secrets, dashboards, per-environment behaviors and everything else over in JSON/YAML/XML/ConfigLand can be expressed in a uniform way.

Which sets the stage for Sparta 1.5.0: The Observability Edition. At the risk of enraging both serverless and observability purists, I’m going to stick with the general idea of “How can I gain a better understanding of my existing service?” There are others who can speak much better to observability itself, including JBD, Charity Majors, and Cindy Sridharan. I’m limiting things to the case of how a Sparta service can provide more transparency using the same programming constructs used to define the service behavior.

Metrics

To round out your service’s core business logic, Sparta provides an opportunity to decorate the CloudFormation template that defines your infrastructure. For example, you can provision a CloudWatch Dashboard using a DashboardDecorator to produce an application-centric view of your service (similar to the recently announced Applications view).

The past few Sparta releases have extended those capabilities to include support for:

  • CloudWatchErrorAlarmDecorator: create and associate a CloudWatch Alarm that’s triggered by a configurable number of AWS Lambda errors over a given period.
  • RegisterLambdaUtilizationMetricPublisher: register a periodic task that publishes container-level metrics to CloudWatch. Define custom metric dimensions including your deployment’s BuildID (compile time) or InstanceID (runtime) values, or both. Among the metrics included are CPUPercent and DiskUsedPercent… not that I’ve had issues with hitting the 512 MB limit or anything.
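To make the decorator idea concrete, here is a minimal sketch of what an error-alarm decorator could look like. It is not Sparta's implementation: the `Template`, `Resource`, `Decorator`, and `errorAlarmDecorator` names are hypothetical stand-ins written for this illustration. Only the `AWS::CloudWatch::Alarm` resource type, its property names, and the `AWS/Lambda` `Errors` metric come from CloudFormation and CloudWatch themselves.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Template is a simplified stand-in for a CloudFormation template.
// Hypothetical type for this sketch; Sparta uses its own template types.
type Template struct {
	Resources map[string]Resource `json:"Resources"`
}

// Resource is a single CloudFormation resource entry.
type Resource struct {
	Type       string                 `json:"Type"`
	Properties map[string]interface{} `json:"Properties,omitempty"`
}

// Decorator mutates a template before it is handed to CloudFormation.
type Decorator func(tmpl *Template) error

// errorAlarmDecorator (hypothetical name) returns a Decorator that adds a
// CloudWatch Alarm on the AWS/Lambda "Errors" metric for one function.
func errorAlarmDecorator(lambdaLogicalID string, periods, periodSeconds int, threshold float64) Decorator {
	return func(tmpl *Template) error {
		if _, ok := tmpl.Resources[lambdaLogicalID]; !ok {
			return fmt.Errorf("unknown lambda resource %q", lambdaLogicalID)
		}
		tmpl.Resources[lambdaLogicalID+"ErrorAlarm"] = Resource{
			Type: "AWS::CloudWatch::Alarm",
			Properties: map[string]interface{}{
				"Namespace":          "AWS/Lambda",
				"MetricName":         "Errors",
				"Statistic":          "Sum",
				"Period":             periodSeconds,
				"EvaluationPeriods":  periods,
				"Threshold":          threshold,
				"ComparisonOperator": "GreaterThanOrEqualToThreshold",
				// Ref on a Lambda function resolves to the function name.
				"Dimensions": []map[string]interface{}{
					{"Name": "FunctionName", "Value": map[string]string{"Ref": lambdaLogicalID}},
				},
			},
		}
		return nil
	}
}

func main() {
	tmpl := &Template{Resources: map[string]Resource{
		"HelloWorld": {Type: "AWS::Lambda::Function"},
	}}
	// Alarm when at least one error occurs within a single 60 second period.
	if err := errorAlarmDecorator("HelloWorld", 1, 60, 1)(tmpl); err != nil {
		panic(err)
	}
	out, err := json.MarshalIndent(tmpl, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The point of the sketch is that the alarm lives next to the function it watches, so renaming or removing the function fails at build time rather than leaving a dangling alarm behind.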

Logs

Logging is a critical observability capability for any microservice deployment. Having a durable and searchable centralized logging store is especially important for serverless-based solutions. Yan Cui’s Centralized logging for AWS Lambda is an excellent introduction to the topic.

Sparta has always supported structured logging via logrus and aggregation to CloudWatch Logs. However, as Yan discusses, there are other options that may offer more features or work better within your organization. To support this use case, Sparta now includes a LogAggregatorDecorator that forwards all referenced log statements to a Kinesis Stream for asynchronous processing.

Ensuring Kinesis has a copy of all log statements doesn’t really help on the observability front, though. Even better than having the logs is the ability to effectively search them. The SpartaPProf example turns the Kinesis stream into Stackdriver log events and forwards them to Google Stackdriver. Use Stackdriver logging to search your AWS Lambda logs!

Image: Searching AWS Lambda logs in Google Stackdriver #multicloud
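A consumer of the aggregated stream first has to unpack each record. The sketch below assumes the stream carries the standard CloudWatch Logs subscription payload (gzip-compressed JSON with a `logEvents` array); whether your stream looks exactly like this depends on how it is fed, so treat that as an assumption. The `logBatch`, `logEvent`, `decodeLogBatch`, and `encodeLogBatch` names are hypothetical, and the forwarding step to Stackdriver is left out entirely.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/json"
	"fmt"
)

// logEvent is one log line inside a CloudWatch Logs subscription batch.
type logEvent struct {
	ID        string `json:"id"`
	Timestamp int64  `json:"timestamp"`
	Message   string `json:"message"`
}

// logBatch mirrors the fields of the subscription payload used here.
type logBatch struct {
	MessageType string     `json:"messageType"`
	LogGroup    string     `json:"logGroup"`
	LogStream   string     `json:"logStream"`
	LogEvents   []logEvent `json:"logEvents"`
}

// decodeLogBatch gunzips a Kinesis record's data and parses the JSON batch.
func decodeLogBatch(data []byte) (*logBatch, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, fmt.Errorf("open gzip payload: %w", err)
	}
	defer zr.Close()

	var batch logBatch
	if err := json.NewDecoder(zr).Decode(&batch); err != nil {
		return nil, fmt.Errorf("decode log batch: %w", err)
	}
	return &batch, nil
}

// encodeLogBatch builds a payload in the same shape for local experiments.
func encodeLogBatch(batch logBatch) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if err := json.NewEncoder(zw).Encode(batch); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	payload, err := encodeLogBatch(logBatch{
		MessageType: "DATA_MESSAGE",
		LogGroup:    "/aws/lambda/HelloWorld",
		LogStream:   "example-stream",
		LogEvents: []logEvent{
			{ID: "1", Timestamp: 1538352000000, Message: `{"level":"info","msg":"Hello World"}`},
		},
	})
	if err != nil {
		panic(err)
	}
	batch, err := decodeLogBatch(payload)
	if err != nil {
		panic(err)
	}
	for _, ev := range batch.LogEvents {
		fmt.Printf("%s %s\n", batch.LogGroup, ev.Message)
	}
}
```

Because each message is already structured JSON from logrus, the decoded events can be mapped onto whatever fields the downstream log store expects.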

Profiling

While profiling doesn’t typically constitute one of the three pillars of observability, it’s still often helpful to understand your service’s performance. Leveraging go’s ability to expose profiling data, Sparta 1.5.0 makes it straightforward to send profiling information extracted from AWS Lambda to Google Stackdriver Profiling for visualization. Using Google credentials stored in AWS Systems Manager Parameter Store, you can now install a profiling task to really understand why your function has crossed the 100ms billing increment level.

See the SpartaPProf sample project for a complete example.

Image: Stackdriver profiling of AWS Lambda executions #multicloud
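The go facility underneath all of this is the standard library's `runtime/pprof` package. As a rough sketch of the capture step only (not how Sparta wires it up, and without any upload to Stackdriver), the hypothetical `profileCPU` helper below records a CPU profile around a piece of work and returns the raw profile bytes. `busyWork` is just a placeholder workload.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
)

// profileCPU runs work while the CPU profiler is active and returns the
// profile in the format that `go tool pprof` reads.
func profileCPU(work func()) ([]byte, error) {
	var buf bytes.Buffer
	// StartCPUProfile fails if a CPU profile is already being collected.
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return nil, err
	}
	work()
	// StopCPUProfile returns only after the profile has been fully written.
	pprof.StopCPUProfile()
	return buf.Bytes(), nil
}

// busyWork is a placeholder for the code you actually want to measure.
func busyWork() int {
	sum := 0
	for i := 0; i < 50000000; i++ {
		sum += i % 7
	}
	return sum
}

func main() {
	data, err := profileCPU(func() { busyWork() })
	if err != nil {
		panic(err)
	}
	fmt.Printf("captured %d bytes of CPU profile data\n", len(data))
}
```

Writing the returned bytes to a file lets you inspect them locally with `go tool pprof` before shipping them anywhere.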

Additional Treats

There are a host of other improvements and bug fixes, including:

  • Creating your own go CloudFormation CustomResources! Want to do something that isn’t supported in CloudFormation? Want to call out to a third party API as part of your service’s normal lifecycle? Custom resources provide an “escape hatch” for those times when nothing but custom code will get the job done.
  • Publishing an S3 artifact as part of your service’s lifecycle with an S3ArtifactPublisherDecorator. This is a great fit for those times when you want your service to leave some sort of metadata receipt in a bucket.
  • New archetype constructors to eliminate much of the boilerplate around typical serverless patterns such as subscribing to S3 or DynamoDB events.
  • Putting a CloudFront Distribution and custom domain in front of your S3-backed static site.
  • Defining Validator WorkflowHooks that receive an immutable version of the complete CloudFormation template. Validator hooks are useful to define team or organization-wide stack policies (e.g., prevent Resource: * in IAM policies when possible). Those policies can be distributed as standard go packages.
  • Using magefile tasks for cross-platform friendly actions such as provisioning a service or applying a tool to all *.go files in your source tree. Moving from Makefiles to mage makes it much easier to support mixed *nix/Windows teams. You can largely copy the new magefile.go files across projects and they will Just Work. Here’s an example.
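As an illustration of the kind of policy a validator hook could enforce, here is a minimal sketch that scans a template for wildcard IAM resources. It does not use Sparta's hook signature: `validateTemplate`, `findWildcardResources`, and `isWildcard` are hypothetical helpers, and the template is assumed to arrive as a JSON byte slice, which also keeps the check read-only.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// isWildcard reports whether a policy "Resource" value is "*" or a list
// that contains "*".
func isWildcard(v interface{}) bool {
	switch r := v.(type) {
	case string:
		return r == "*"
	case []interface{}:
		for _, item := range r {
			if s, ok := item.(string); ok && s == "*" {
				return true
			}
		}
	}
	return false
}

// findWildcardResources walks decoded JSON and returns the path of every
// "Resource" key whose value is a wildcard.
func findWildcardResources(node interface{}, path string) []string {
	var hits []string
	switch v := node.(type) {
	case map[string]interface{}:
		for key, child := range v {
			childPath := path + "/" + key
			if key == "Resource" && isWildcard(child) {
				hits = append(hits, childPath)
				continue
			}
			hits = append(hits, findWildcardResources(child, childPath)...)
		}
	case []interface{}:
		for i, child := range v {
			hits = append(hits, findWildcardResources(child, fmt.Sprintf("%s/%d", path, i))...)
		}
	}
	return hits
}

// validateTemplate returns an error listing every wildcard IAM resource.
func validateTemplate(templateJSON []byte) error {
	var tmpl interface{}
	if err := json.Unmarshal(templateJSON, &tmpl); err != nil {
		return fmt.Errorf("parse template: %w", err)
	}
	hits := findWildcardResources(tmpl, "")
	if len(hits) == 0 {
		return nil
	}
	sort.Strings(hits) // map iteration order is random; keep output stable
	return fmt.Errorf("wildcard IAM resources at: %s", strings.Join(hits, ", "))
}

func main() {
	template := []byte(`{"Resources":{"Role":{"Type":"AWS::IAM::Role","Properties":{
		"Policies":[{"PolicyDocument":{"Statement":[
			{"Effect":"Allow","Action":"s3:GetObject","Resource":"*"}]}}]}}}}`)
	if err := validateTemplate(template); err != nil {
		fmt.Println("validation failed:", err)
	}
}
```

Packaged as an ordinary go module, a check like this can be shared across services so every provision step applies the same rule.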

Get On Board the WASM Train

Finally, go 1.11 added experimental support for WebAssembly. This compiler target opens up the possibility of using go to write front end code as well!

As the Sparta provisioning lifecycle supports running go:generate as part of the cross compilation step, it’s now feasible to compile go to WASM and deploy that to your static S3 site as part of a single provision step. See the SpartaWASM repo for an example.

And full disclosure, the WASM train is desperately in need of more rail. Pull Requests are most definitely appreciated!

Wrapping Up

The Sparta 1.5.0 release is designed to streamline the developer experience and provide facilities to better understand the state and shortcomings of your Sparta service. While the speed of serverless development and deployment is addictive, I think it’s also important to ensure that the longer-term operational aspects of your service enjoy the same level of integration and expressiveness as your core logic.

And maybe even more importantly, the WASM train is coming and I look forward to seeing many of you at the station with me!

Comments are appreciated, and if you run into any problems, please open an issue at the Sparta repo. Additional documentation is available at https://gosparta.io and full change notes at CHANGES. Build something awesome.
