Data Mechanics is a cloud-native Spark platform designed specifically for data engineers. It allows users to run continuously optimized Apache Spark workloads on a Kubernetes cluster in their cloud account, whether it's AWS, GCP, or Azure. With Data Mechanics, users can focus on their data while the platform handles the mechanics.
Data Mechanics offers a developer-friendly experience by letting users bring their own tools. Docker gives them a faster and more reliable development workflow: they can use pre-built optimized images or build custom ones to package their dependencies. Users can interact with Spark through Jupyter notebooks, or programmatically via the platform's REST API or Airflow connector.
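To make the programmatic route concrete, here is a minimal sketch of how a client might assemble a submission request for a Spark application packaged as a Docker image. It only builds the request and never sends it. The endpoint path, header name, payload fields, URL, and image name are all placeholders chosen for illustration, not the documented Data Mechanics API; consult the platform's API reference for the real schema.

```python
import json
import urllib.request


def build_submit_request(base_url, api_key, job_name, image, main_file):
    """Build (but do not send) a POST request describing one Spark application.

    Every field name and the '/apps' path are hypothetical placeholders.
    """
    payload = {
        "jobName": job_name,               # hypothetical field name
        "image": image,                    # Docker image that packages the dependencies
        "mainApplicationFile": main_file,  # hypothetical field name
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/apps",  # hypothetical endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-API-Key": api_key,            # hypothetical header name
        },
        method="POST",
    )


# Example values are placeholders, not real endpoints or credentials.
request = build_submit_request(
    base_url="https://spark.example.com/api",
    api_key="REPLACE_ME",
    job_name="daily-report",
    image="example.registry.io/team/daily-report:1.0",
    main_file="local:///opt/app/main.py",
)
print(request.get_method(), request.full_url)
print(json.loads(request.data))
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test before it is wired into a scheduler such as Airflow.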
The platform is transparent and flexible, offering the power of Kubernetes without the operational complexity. Data Mechanics is deployed on a managed Kubernetes cluster inside the user's own cloud account and virtual private cloud, so sensitive data never leaves that environment and users retain full control. The platform also provides a monitoring dashboard for tracking each application's logs, metrics, and costs over time.
One of the standout features of Data Mechanics is its cost-effectiveness. The platform automatically scales applications and Kubernetes nodes up and down with load. It also tunes configurations automatically, including instance types, disks, container memory and CPU allocations, and Spark settings, based on historical runs of each Spark pipeline. This optimization approach has reportedly resulted in 50-75% cost reductions for customers migrating from competing platforms such as Databricks or EMR.
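The idea of tuning from historical runs can be illustrated with a toy heuristic. The sketch below is not the platform's actual algorithm, which the text does not describe; it simply assumes that the peak container memory of past runs is known and recommends an allocation with some headroom. The function name, the 20% headroom, the 256 MiB rounding step, and the 1024 MiB floor are all assumptions made for illustration.

```python
import math


def recommend_executor_memory_mib(peak_usage_mib, headroom=0.2, floor_mib=1024):
    """Suggest a container memory size (MiB) from peak usage seen in past runs.

    Illustrative heuristic only: take the highest observed peak, add headroom,
    round up to a 256 MiB step, and never go below a minimum floor.
    """
    if not peak_usage_mib:
        # No history yet: fall back to the floor.
        return floor_mib
    target = max(peak_usage_mib) * (1 + headroom)
    rounded = math.ceil(target / 256) * 256
    return max(floor_mib, rounded)


# Peak memory (MiB) observed across four hypothetical past runs of a pipeline.
history = [3100, 2800, 3350, 2950]
print(recommend_executor_memory_mib(history))  # prints 4096
```

A pipeline that had been statically provisioned with, say, 8192 MiB per container would be sized down to 4096 MiB under this heuristic, which is the kind of right-sizing that drives cost savings.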
In June 2021, NetApp acquired Data Mechanics. The platform has since evolved into Ocean for Apache Spark, part of Spot by NetApp; users can visit the Spot by NetApp website to learn more.
Overall, Data Mechanics is a powerful and user-friendly platform that empowers data engineers to efficiently run Apache Spark workloads in the cloud. Its features, including developer friendliness, transparency and flexibility, and cost-effectiveness, make it an attractive choice for organizations seeking to optimize their data infrastructure.