
Dremio 3.0 introduces new ways to empower data scientists

Data-as-a-service platform provider Dremio has announced a major release of its open-source platform.

According to the company, the new features in Dremio 3.0 will support data initiatives by providing shorter lead times, lower operational costs, greater security and governance, and more self-service to a wider variety of roles.

“Everything-as-a-Service has been embraced by IT for the past five years – encompassing infrastructure, platforms and applications,” said Kelly Stirman, vice president of strategy and CMO at Dremio. “Today, we are providing these same benefits to our customers for their data initiatives, by offering tools for data engineers to be more productive, and for data consumers to be more self-sufficient.”

Dremio 3.0 includes a new built-in data catalog that makes it easier to discover, organize, describe and self-serve data from many sources. It has a Google-like search interface to make it easy to find data and immediately start curating, blending, or analyzing it, Dremio explained.

New security features include native integration with Apache Ranger for centralized access control and support for end-to-end TLS encryption. For AWS deployments, the release now also supports EC2 instance profiles for secure access to S3.

The release also includes a new multi-tenant feature that enables data engineering teams to manage and optimize resources across multiple workloads and users. Workload management policies can control resource allocation based on user, group membership, time of day, data source, query type, or other runtime factors.
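To make the idea of rule-based workload management concrete, here is a minimal Python sketch of how a query might be routed to a resource queue by user group, time of day, data source, or query type. It is an illustration only and does not reflect Dremio's actual configuration syntax or API; the `Query` and `Rule` types, the `assign_queue` function, and the queue names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import time
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class Query:
    """Hypothetical runtime attributes a policy could inspect."""
    user: str
    groups: FrozenSet[str]
    source: str
    query_type: str
    submitted_at: time


@dataclass(frozen=True)
class Rule:
    """A named predicate that maps matching queries to a queue."""
    name: str
    queue: str
    matches: Callable[[Query], bool]


def assign_queue(query: Query, rules: List[Rule], default: str = "default") -> str:
    """Return the queue of the first rule that matches, else the default."""
    for rule in rules:
        if rule.matches(query):
            return rule.queue
    return default


# Example policy (assumed, not taken from Dremio): rules are checked in order.
RULES = [
    Rule("analysts-business-hours", "interactive",
         lambda q: "analysts" in q.groups
         and time(8, 0) <= q.submitted_at < time(18, 0)),
    Rule("etl-jobs", "batch", lambda q: q.query_type == "etl"),
    Rule("s3-scans", "low-priority", lambda q: q.source == "s3"),
]

daytime_analyst = Query("ann", frozenset({"analysts"}), "s3", "adhoc", time(10, 30))
nightly_etl = Query("bob", frozenset({"engineering"}), "hdfs", "etl", time(2, 0))

print(assign_queue(daytime_analyst, RULES))
print(assign_queue(nightly_etl, RULES))
```

Because the first matching rule wins, the order of the rules expresses their priority: in this sketch, an analyst querying S3 during business hours is routed by group membership before the data source rule is ever checked.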

Dremio 3.0 also makes the Gandiva Initiative for Apache Arrow available, which the company says delivers 100x greater efficiency on queries and operations. According to Dremio, this performance improvement will lead to lower operational costs, better user experiences, and the ability to support more workloads with existing hardware.

Other features include a Docker image and templates for deployments using Kubernetes, a new engine for relational push-downs, and new data sources such as Azure Data Lake Store, Elasticsearch 6, AWS S3 GovCloud, and Teradata.

Source: SDTimes

About the author

Julius Appiah

Julius has been a passionate blogger for several years, with a particular interest in science and technology. When he is not writing, he spends his time surfing the web and keeping up with the tech world. Reach him at: juliusappiah34@gmail.com