SAP Vora comes with some examples that you can run and examine.
You can run the examples as the vora user by executing /etc/vora/run_examples.sh hdfs.
Ignore any "Address already in use" error messages. To reduce the amount of information in the output, modify the logger configuration in the file /opt/hadoop-2.7.3/etc/hadoop/log4j.properties by changing the line
hadoop.root.logger=INFO,console
to
hadoop.root.logger=ERROR,console
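The logger change described above can also be scripted. The following is a minimal sketch, not part of the Vora distribution: the function name quiet_hadoop_logger is hypothetical, and it assumes the properties file contains the line exactly as shown above. It keeps a backup copy with the suffix .bak before rewriting the file.

```shell
# Hypothetical helper (not shipped with Vora): switch hadoop.root.logger
# from INFO to ERROR in the given log4j.properties file.
# Assumption: the line reads exactly "hadoop.root.logger=INFO,console".
quiet_hadoop_logger() {
  # Keep a backup next to the original, then rewrite the original from it.
  cp "$1" "$1.bak" &&
    sed 's/^hadoop\.root\.logger=INFO,console$/hadoop.root.logger=ERROR,console/' \
      "$1.bak" > "$1"
}
```

With the path from this section, the call would be: quiet_hadoop_logger /opt/hadoop-2.7.3/etc/hadoop/log4j.properties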
It is convenient to write the output of execution into a file to examine the results later:
/etc/vora/run_examples.sh hdfs > output_from_examples.log
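The redirection above captures only standard output. If you also want messages written to standard error in the same file, and still want to see the output on the terminal, a small wrapper along these lines may help. This is a sketch under assumptions: run_and_log is a hypothetical name, and the exit status reported is that of tee, not of the wrapped command.

```shell
# Hypothetical wrapper: run a command, show its output on the terminal,
# and save both stdout and stderr to the given log file.
# Usage: run_and_log LOGFILE COMMAND [ARGS...]
run_and_log() {
  log="$1"
  shift
  # 2>&1 merges stderr into stdout before the pipe; tee writes the log.
  "$@" 2>&1 | tee "$log"
}
```

For the examples script, the call would be: run_and_log output_from_examples.log /etc/vora/run_examples.sh hdfs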
You can also look at the source code, which is at /opt/vora/lib/vora-spark/examples.
The example source code can also be copied and pasted into spark-shell, so that it can be executed step by step.
To check that everything works, you can also run the examples one by one and check whether the output matches what is expected.
- Find the jar file with the Vora examples:
export DATASOURCE_DIST=/opt/vora/lib/vora-spark/lib/spark-sap-datasources-*-assembly.jar
- Copy the test data to HDFS:
/opt/spark/bin/spark-submit --class com.sap.spark.vora.examples.tools.CopyExampleFilesToHdfs $DATASOURCE_DIST
- When echo $? prints 0, the copy was successful.
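The steps above depend on the wildcard in the jar path matching exactly one file, and on checking the exit status by hand. The sketch below shows one way to make both checks explicit. The function names resolve_single_jar and run_step are hypothetical, and the sketch assumes a POSIX-style shell and a path without spaces.

```shell
# Hypothetical helper: expand a glob pattern and print the match only if
# it resolves to exactly one existing file; otherwise return non-zero.
resolve_single_jar() {
  set -- $1   # unquoted on purpose so the shell expands the pattern
  [ "$#" -eq 1 ] && [ -f "$1" ] && printf '%s\n' "$1"
}

# Hypothetical helper: run a command and report its exit status,
# which is the same value that "echo $?" would print afterwards.
run_step() {
  "$@"
  status=$?
  if [ "$status" -eq 0 ]; then
    echo "step succeeded"
  else
    echo "step failed with exit status $status"
  fi
  return "$status"
}
```

A possible use is to set DATASOURCE_DIST from resolve_single_jar with the pattern in single quotes, and then to prefix the spark-submit command with run_step.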
Now you can run the individual examples and check the output. Ignore all the Spark debug output about starting and finishing jobs.
If the expected snippet occurs in the output, the example ran successfully.
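Searching for the expected snippet can be done with grep on a saved output file. The following is a sketch only: check_snippet is a hypothetical name, the snippet is treated as a fixed string rather than a regular expression, and the check is meant for single-line snippets.

```shell
# Hypothetical helper: report whether an expected single-line snippet
# occurs in a saved output file.
# Usage: check_snippet OUTPUT_FILE 'expected text'
check_snippet() {
  # -F matches the snippet literally, -q suppresses grep's own output.
  if grep -qF -- "$2" "$1"; then
    echo "OK: found '$2'"
  else
    echo "MISSING: '$2'"
    return 1
  fi
}
```

The function returns 0 when the snippet is found and 1 otherwise, so it can be combined with other commands in a script.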