Topic: Developing Spark applications on Windows.

Someone from the forum contacted me by mail with roughly this question:
how to develop Spark applications without building a jar and uploading it to a cluster.
I'd like to note that I see no point in answering by mail.
A forum is a forum precisely because it can be useful to more than one person, and others can extend or correct my answer.
So, without claiming any expertise in the field, I'll describe a few points; maybe they will help someone get started.
1. I would suggest starting with the simplest example that runs both on Windows and on the cluster.
I dug up an old scala snippet of mine (not super ancient though; you'll need roughly Spark 2.0+ to run it).
The code counts the number of lines in a text file that contain "a", and likewise for "b". It does this via Spark.

import org.apache.log4j.{Level, LogManager, Logger}

object obj {
  def main(args: Array[String]): Unit = {
    println(s"default logging level ${Logger.getLogger("org").getLevel}")
    println(s"setting to Level.ALL")
    LogManager.getLogger("org").setLevel(Level.ALL)
    println("========================================1")
    // local[*] runs Spark inside this JVM, using all available cores
    val spark = org.apache.spark.sql.SparkSession.builder()
      .master("local[*]")
      .appName("dummy app")
      .getOrCreate()
    println("========================================2")
    val logData = spark.read.textFile("C:\\sqlldr\\test.txt").cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("========================================3")
    println(s"$numAs $numBs")
    println("========================================4")
    spark.stop()
  }
}

If this runs, it means you have managed to pull in all the dependencies, and you can move on; a sketch of an sbt build for that follows below.
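For reference, "pulling in the dependencies" with sbt might look like this minimal build.sbt; the Spark and Scala versions here are my assumptions, align them with your cluster. With this in place the example runs via sbt run or straight from the IDE.

// build.sbt - a minimal sketch; versions are assumptions, match them to your cluster
name := "dummy-app"
scalaVersion := "2.11.12"
// add % "provided" when submitting to a cluster where the Spark jars already exist;
// keep the default compile scope to run locally from the IDE or via sbt run
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.8"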
From the same code you can build a jar and upload it to the cluster; just don't forget to change the path passed to spark.read.textFile.
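Uploading and launching could then look roughly like this (the class name obj is from the example above; the jar name, user and host are placeholders):

pscp dummy-app.jar user@edge-node:/home/user/
spark-submit --class obj --master yarn /home/user/dummy-app.jar

One caveat: a .master("local[*]") hard-coded in the program takes precedence over --master at submit time, so it's cleaner to drop .master() from the code and pass it with spark-submit.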
Note here that if you want to use SQL (and the hive metastore), or need to specify resource manager details,
or kerberos settings, then much of this (and a heap of other things) can be passed as methods after .builder(),
e.g. .config("hive.metastore.uris", "...").
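A sketch of what that can look like; the metastore URI below is a placeholder, not a working value:

// hedged sketch: extra settings chained after builder(); values are placeholders
val spark = org.apache.spark.sql.SparkSession.builder()
  .master("yarn")
  .appName("dummy app")
  .config("hive.metastore.uris", "thrift://metastore-host:9083") // placeholder host
  .enableHiveSupport() // lets spark.sql(...) see hive tables via the metastore
  .getOrCreate()

Kerberos specifics are usually easier to pass at submit time (e.g. spark-submit --principal/--keytab on YARN) than in code.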
2. The idea of running the driver program on Windows is flawed from the start.
What is the problem with uploading the jar to the server after the build, with the same pscp?
In my case a dedicated server takes care of that.
But let's assume the fixed idea is precisely to avoid building and uploading a jar.
If you look at the docs https://spark.apache.org/docs/latest/cl … rview.html
the principal problem will be at least this:
[quote]The driver program must listen for and accept incoming connections from its executors throughout its lifetime (e.g., see spark.driver.port in the network config section).
As such, the driver program must be network addressable from the worker nodes.[/quote]
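To make the quote concrete: these are the settings a driver on a workstation would have to get right so that executors can dial back in. The host, port and master URL below are placeholders, and firewalls/NAT between Windows and the cluster are exactly where this falls apart:

// illustration of the addressability requirement; values are placeholders
val spark = org.apache.spark.sql.SparkSession.builder()
  .master("spark://cluster-master:7077")    // placeholder master URL
  .config("spark.driver.host", "10.0.0.5")  // must be reachable FROM the workers
  .config("spark.driver.port", "45000")     // fix the port so it can be opened in the firewall
  .appName("driver on a workstation")
  .getOrCreate()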

In addition, see https://stackoverflow.com/questions/370 … loy-modes.
I'm not saying it's impossible in principle, but I wouldn't even try.
For the adventurous there is also YARN on Windows and other exotica, but for me that's quite enough already.
If the desire to avoid uploading comes from the difficulty of debugging: as a rule, people just set the necessary logging level and then dig through the logs.
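The runtime knob for that, once the session is up (the levels are the standard log4j ones):

// raise or lower verbosity at runtime instead of editing log4j.properties
spark.sparkContext.setLogLevel("DEBUG") // ALL, DEBUG, ERROR, FATAL, INFO, OFF, TRACE, WARN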
As a last resort, there is remote debugging (google: agentlib:jdwp).
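A minimal sketch of how that is usually wired into a submit, assuming you then attach the IDE's remote debugger to the driver host; port 5005 is my choice:

spark-submit \
  --conf "spark.driver.extraJavaOptions=-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005" \
  --class obj dummy-app.jar

suspend=y makes the driver JVM wait for the debugger to attach before doing anything, which is what you normally want here.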
3. The simplest way to start is to download the Oracle Big Data Lite VM. Everything there is already configured and set up.
If you like, you can install intellij/eclipse or whatever else suits your taste inside the VM and develop right there under Linux.