Channel: performance – Alluxio
Browsing all 46 articles

Building fast and scalable big data and ML platforms at Pinterest and JD.com

This Alluxio Meetup features a chance to interact with other Alluxio users and developers, as well as three talks. Thanks to our joint host Data Council! The post Building fast and scalable big data...

View Article



Hybrid Environments for Data Analytics is a Possibility

As the data ecosystem becomes massively complex and more and more disaggregated, data analysts and end users have trouble adapting and working with hybrid environments. The proliferation of compute...

View Article

Building fast and scalable big data and ML platforms at Pinterest and JD.com

This talk shares our design, implementation, and optimization of the Alluxio metadata service to address scalability challenges, focusing on how to apply and combine techniques including tiered metadata...

View Article

Getting Started with the Alluxio-Presto Sandbox

The Alluxio-Presto sandbox is a Docker application featuring installations of MySQL, Hadoop, Hive, Presto, and Alluxio. The sandbox lets you easily dive into an interactive environment where you can...

View Article

Scalable Filesystem Metadata Services with RocksDB

Alluxio maintainer and founding engineer Calvin Jia presents on Scalable Filesystem Metadata Services with RocksDB at the RocksDB meetup at Twitter. The post Scalable Filesystem Metadata Services with...

View Article


Alluxio New York Meetup: Accelerating Analytical Workloads for Public &...

Jointly hosted Alluxio New York meetup, with talks including: Embracing hybrid cloud for data-intensive analytic workloads and Alluxio on AWS EMR (fast storage access and sharing for Spark). The post...

View Article

NetEase and Alluxio joint meetup

Joint meetup in Hangzhou discusses: An introduction to new features of big data storage system Alluxio and optimization of cache performance, Practice & exploration of Spark & Alluxio, and the...

View Article

Accelerating Write-intensive Data Workloads on AWS S3

Alluxio is an open-source data orchestration system widely used to speed up data-intensive workloads in the cloud. Alluxio v2.0 introduced Replicated Async Write to allow users to complete writes to...

View Article


Community Office Hour: Building a Cloud Native Stack with EMR Spark, Alluxio,...

Learn how to set up EMR Spark with Alluxio so Spark jobs can seamlessly read from and write to S3. See a performance comparison between Spark running directly on S3 and Spark with Alluxio on S3. The post Community...

View Article


Why Data Orchestration?

Today’s pace of innovation is hindered by the need to reinvent the wheel so that applications can access data efficiently. When an engineer or scientist wants to write an...

View Article

Online Meetup: Powering Data Science and AI with Apache Spark, Alluxio, and IBM

Learn why leading companies are moving towards a decoupled compute and storage architecture, and the associated challenges and requirements. Hear about how Spark and Alluxio together can solve the...

View Article

Apache Iceberg – A Table Format for Huge Analytic Datasets

This talk covers why Netflix needed to build Iceberg and the project’s high-level design, and highlights the details that unblock better query performance. The post Apache Iceberg – A Table Format...

View Article

How to Develop and Operate Cloud Native Data Platforms and Applications

In this talk, we share our lessons in building and rebuilding our monitoring systems and data platforms at Electronic Arts (EA). The post How to Develop and Operate Cloud Native Data Platforms and...

View Article


Enterprise Distributed Query Service Powered by Presto & Alluxio Across...

This session talks about challenges associated with querying diverse data sources at Walmart and how those are tackled using Presto & Alluxio. The post Enterprise Distributed Query Service Powered...

View Article

The Practice of Presto & Alluxio in E-Commerce Big Data Platform

JD.com is China’s largest online retailer. It uses Alluxio to provide support for ad hoc and real-time stream computing, using Alluxio-compatible HDFS URLs and Alluxio as a pluggable optimization...

View Article


Integrating Google Cloud Dataproc with Alluxio for faster performance in the...

Learn how to set up Google Cloud Dataproc with Alluxio so jobs can seamlessly read from and write to Cloud Storage. See how to run Dataproc Spark against a remote HDFS cluster. The post Integrating...

View Article

Tech Talk: Integrating Google Cloud Dataproc with Alluxio for faster...

Chris Crosbie and Roderick Yao from the Google Dataproc team and Dipti Borkar of Alluxio demo how to set up Google Cloud Dataproc with Alluxio so jobs can seamlessly read from and write to Cloud...

View Article


What’s new in Alluxio 2.2

With this release comes the General Availability (GA) of Alluxio Structured Data Services (SDS), the subsystem of Alluxio responsible for managing and transforming structured data, such as databases,...

View Article

Optimizing Query Performance by Decoupling Presto and Hive Data Warehouse

Ideally, Presto would access data independently of how the data was originally stored or managed. Alluxio, as a data orchestration layer, provides the physical data independence for Presto to...

View Article