May 28, 2021 06:15 pm GMT

The way to launch Apache Spark + Apache Zeppelin + InterSystems IRIS

Hi all. Yesterday I tried to connect Apache Spark, Apache Zeppelin, and InterSystems IRIS. During the process, I had trouble connecting it all together and did not find a useful guide, so I decided to write my own.

Introduction

Let's look at what Apache Spark and Apache Zeppelin are and how they work together. Apache Spark is an open-source cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance, so it is very useful when you need to work with Big Data. Apache Zeppelin is a notebook that provides a convenient UI for analytics and machine learning. Together, they work like this: IRIS provides the data, Spark reads it, and we work with the data in a notebook.

Note: I have done the following on Windows 10.

Apache Zeppelin

Now, we will install all the necessary programs. First of all, download Apache Zeppelin from the official Apache Zeppelin site. I used zeppelin-0.8.0-bin-all.tgz, which includes Apache Spark, Scala, and Python. Extract it to any folder. After that, you can launch Zeppelin by calling bin\zeppelin.cmd from the root of your Zeppelin folder. Wait until the "Done, zeppelin server started" string appears, then open http://localhost:8080 in your browser. If everything is okay, you will see the "Welcome to Zeppelin!" message.
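If you would rather confirm from code that the server is answering, a small check like the one below may help. It is only a sketch: the names ZeppelinCheck and isUp are made up for illustration, and it assumes Zeppelin listens on the http://localhost:8080 address mentioned above.

```scala
import java.net.{HttpURLConnection, URI}
import scala.util.Try

// Hypothetical helper, not part of Zeppelin or Spark.
object ZeppelinCheck {
  /** Returns true if an HTTP GET on `address` answers with status 200 within the timeout.
    * Any failure (malformed address, refused connection, timeout) yields false. */
  def isUp(address: String, timeoutMs: Int = 3000): Boolean =
    Try {
      val conn = URI.create(address).toURL.openConnection().asInstanceOf[HttpURLConnection]
      conn.setConnectTimeout(timeoutMs)
      conn.setReadTimeout(timeoutMs)
      conn.setRequestMethod("GET")
      try conn.getResponseCode == HttpURLConnection.HTTP_OK
      finally conn.disconnect()
    }.getOrElse(false)
}
```

Calling ZeppelinCheck.isUp("http://localhost:8080") should return true once the start-up message has appeared. If it returns false, the server is most likely still starting or is using a different port.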


Note: I assume that InterSystems IRIS is already installed. If not, download and install it before the next step.

Apache Spark

So, we have a browser window open with the Zeppelin notebook. In the upper-right corner, click anonymous and then click Interpreter. Scroll down and find spark.


Next to spark, find the edit button and click it. Scroll down and add dependencies on intersystems-spark-1.0.0.jar and intersystems-jdbc-3.0.0.jar. I installed InterSystems IRIS to the C:\InterSystems\IRIS\ directory, so the artifacts I need to add are at:

[Screenshot]

My files are here:

[Screenshot]

Then save the settings.
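If the interpreter later fails to load the connector, one thing worth ruling out is a wrong path in the dependency entries. The sketch below checks that both jars named above exist in a given folder. ArtifactCheck and missingJars are hypothetical names, and the folder is passed in as a parameter because it depends on your own IRIS installation.

```scala
import java.nio.file.{Files, Path}

// Hypothetical helper, not part of Zeppelin or InterSystems IRIS.
object ArtifactCheck {
  // Jar file names as given in the article; adjust them if your versions differ.
  val requiredJars: Seq[String] =
    Seq("intersystems-spark-1.0.0.jar", "intersystems-jdbc-3.0.0.jar")

  /** Returns the names of the required jars that are NOT regular files in `libDir`. */
  def missingJars(libDir: Path, jars: Seq[String] = requiredJars): Seq[String] =
    jars.filterNot(name => Files.isRegularFile(libDir.resolve(name)))
}
```

An empty result from ArtifactCheck.missingJars(java.nio.file.Paths.get(yourLibFolder)) means both files were found, where yourLibFolder stands for the folder that holds the jars on your machine.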

Check that it works

Let us check it. Create a new note, and paste the following code into a paragraph:

// Read an IRIS table into a Spark DataFrame through the InterSystems Spark connector
val dataFrame = spark.read
  .format("com.intersystems.spark")
  .option("url", "IRIS://localhost:51773/NAMESPACE")
  .option("user", "UserLogin")
  .option("password", "UserPassword")
  .option("dbtable", "Sample.Person") // dbtable - name of your table
  .load()

url is the IRIS address. It is formed as follows: IRIS://ipAddress:superserverPort/namespace

  • protocol: IRIS is a JDBC connection over TCP/IP that offers a Java shared memory connection;

  • ipAddress: the IP address of the InterSystems IRIS instance. If you are connecting locally, use 127.0.0.1 instead of localhost;

  • superserverPort: the superserver port number for the IRIS instance, which is not the same as the web server port number. To find the superserver port number, in the Management Portal, go to System Administration > Configuration > System Configuration > Memory and Startup;

  • namespace: an existing namespace in the InterSystems IRIS instance. In this demo, we connect to the USER namespace.
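To make the URL format concrete, here is a minimal sketch that assembles the address from its parts and rejects obviously invalid values. IrisUrl and build are hypothetical names used only for illustration and are not part of the InterSystems connector. The port 51773 is the one used in the example above, and yours may differ.

```scala
// Hypothetical helper, not part of the InterSystems Spark connector.
object IrisUrl {
  /** Builds "IRIS://host:port/namespace" after basic sanity checks. */
  def build(host: String, superserverPort: Int, namespace: String): String = {
    require(host.trim.nonEmpty, "host must not be empty")
    require(superserverPort > 0 && superserverPort <= 65535, "port must be in 1..65535")
    require(namespace.trim.nonEmpty, "namespace must not be empty")
    s"IRIS://${host.trim}:$superserverPort/${namespace.trim}"
  }
}
```

For example, IrisUrl.build("127.0.0.1", 51773, "USER") yields IRIS://127.0.0.1:51773/USER, which can be passed as the url option when reading the table.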

Run the paragraph. If everything is okay, you will see FINISHED.

My notebook:

[Screenshot]

Conclusion

In conclusion, we found out how Apache Spark, Apache Zeppelin, and InterSystems IRIS can work together. In my next articles, I will write about data analysis.

Links

Original Link: https://dev.to/intersystems/the-way-to-launch-apache-spark-apache-zeppelin-intersystems-iris-45p9
