Working with Disk Engine using Apache Zeppelin

By Vitaliy Rudnytskiy

Disk-based storage allows you to use relational capabilities without loading data into memory.

You will learn

How to process data using the SAP Vora disk engine.


Step 1: Disk Engine

In addition to its relational in-memory engine, SAP Vora provides the following execution engines: document store, graph engine, time series engine, and disk engine.

These engines are integrated into Spark as either a Spark SQL data source (full integration) or raw data source (partial integration).

The disk engine uses the data source com.sap.spark.engines.disk. SQL statements issued against the disk engine are fully integrated into Spark SQL, so disk engine tables behave in exactly the same way as Spark SQL tables.

Step 2: Running 3_Data_on_Disk

The first engine to look at is the disk engine. Switch to the Zeppelin notebook 3_Data_on_Disk.

First, create a disk engine table. A disk engine table needs a partition function and a partition scheme derived from that function. This is what you do in the first two paragraphs of the notebook.
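
To make the roles of these two objects more concrete, here is a minimal conceptual sketch in Python. It is not SAP Vora syntax and not how the disk engine is implemented. It assumes a hash-style partition function and a scheme that assigns partitions to nodes round-robin; the function names, the partition count, and the node names are all hypothetical and do not come from the notebook.

```python
import zlib

NUM_PARTITIONS = 4            # hypothetical partition count
NODES = ["node-a", "node-b"]  # hypothetical node names


def partition_function(complaint_id, num_partitions=NUM_PARTITIONS):
    """Map a key to a partition number (hash-style; an assumption of this sketch)."""
    digest = zlib.crc32(str(complaint_id).encode("utf-8"))
    return digest % num_partitions


def partition_scheme(partition, nodes=NODES):
    """Derive a placement from the partition number (round-robin; also an assumption)."""
    return nodes[partition % len(nodes)]


def place_rows(rows):
    """Group rows by the partition that the partition function assigns to them."""
    placement = {}
    for row in rows:
        partition = partition_function(row["COMPLAINT_ID"])
        placement.setdefault(partition, []).append(row)
    return placement


if __name__ == "__main__":
    sample = [{"COMPLAINT_ID": i} for i in range(10)]  # hypothetical rows
    for partition, rows in sorted(place_rows(sample).items()):
        print(partition, partition_scheme(partition), len(rows))
```

The point of the sketch is only the division of labour: the function decides which partition a row belongs to, and the scheme built on top of it decides where that partition is placed.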

Create a second disk engine table, then verify that both tables were created.

Run a simple cross-engine query. You can then continue by writing your own SQL paragraph based on this query.

%vora
SELECT COMPLAINTS_DISK.COMPLAINT_ID, PRODUCT
FROM COMPLAINTS_DISK
INNER JOIN PRODUCTS
ON COMPLAINTS_DISK.COMPLAINT_ID = PRODUCTS.COMPLAINT_ID
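
If the join itself is unfamiliar, the following Python sketch illustrates the inner-join semantics of the query above on plain in-memory lists. It says nothing about how SAP Vora executes the query across engines. Only the table and column names come from the query; the sample rows, and the assumption that PRODUCT is a column of PRODUCTS, are hypothetical.

```python
# Hypothetical sample rows; only the column names come from the query.
complaints_disk = [
    {"COMPLAINT_ID": 1},
    {"COMPLAINT_ID": 2},
    {"COMPLAINT_ID": 3},
]
products = [
    {"COMPLAINT_ID": 1, "PRODUCT": "Mortgage"},
    {"COMPLAINT_ID": 3, "PRODUCT": "Credit card"},
]


def join_complaints_with_products(complaints, product_rows):
    """Return (COMPLAINT_ID, PRODUCT) pairs for IDs present in both inputs."""
    # Index the right-hand side by the join key.
    by_id = {}
    for row in product_rows:
        by_id.setdefault(row["COMPLAINT_ID"], []).append(row)

    # Keep only left-hand rows that have at least one match (inner join).
    result = []
    for complaint in complaints:
        for match in by_id.get(complaint["COMPLAINT_ID"], []):
            result.append((complaint["COMPLAINT_ID"], match["PRODUCT"]))
    return result


joined = join_complaints_with_products(complaints_disk, products)
print(joined)
```

Complaint 2 has no matching product row in this sample, so it does not appear in the result; that is the behaviour an inner join is expected to have.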

Updated 05/23/2017

Time to Complete

15 Min

Beginner
