2783 commits
47c1d56
[SPARK-7426] [MLLIB] [ML] Updated Attribute.fromStructField to allow …
dusenberrymw Jun 22, 2015
0818fde
[SPARK-8406] [SQL] Adding UUID to output file name to avoid accidenta…
liancheng Jun 22, 2015
42a1f71
[SPARK-8429] [EC2] Add ability to set additional tags
armisael Jun 22, 2015
ba8a453
[SPARK-8482] Added M4 instances to the list.
pradeepchhetri Jun 22, 2015
5d89d9f
[SPARK-8511] [PYSPARK] Modify a test to remove a saved model in `regr…
yu-iskw Jun 22, 2015
da7bbb9
[SPARK-8104] [SQL] auto alias expressions in analyzer
cloud-fan Jun 22, 2015
5ab9fcf
[SPARK-8532] [SQL] In Python's DataFrameWriter, save/saveAsTable/json…
yhuai Jun 22, 2015
afe35f0
[SPARK-8455] [ML] Implement n-gram feature transformer
Jun 22, 2015
b1f3a48
[SPARK-8537] [SPARKR] Add a validation rule about the curly braces in…
yu-iskw Jun 22, 2015
50d3242
[SPARK-8356] [SQL] Reconcile callUDF and callUdf
BenFradet Jun 22, 2015
96aa013
[SPARK-8492] [SQL] support binaryType in UnsafeRow
Jun 22, 2015
1dfb0f7
[HOTFIX] [TESTS] Typo mqqt -> mqtt
Jun 22, 2015
860a49e
[SPARK-7153] [SQL] support all integral type ordinal in GetArrayItem
cloud-fan Jun 23, 2015
6b7f2ce
[SPARK-8307] [SQL] improve timestamp from parquet
Jun 23, 2015
13321e6
[SPARK-7859] [SQL] Collect_set() behavior differences which fails the…
chenghao-intel Jun 23, 2015
c4d2343
MAINTENANCE: Automated closing of pull requests.
pwendell Jun 23, 2015
44fa7df
[SPARK-8548] [SPARKR] Remove the trailing whitespaces from the SparkR…
yu-iskw Jun 23, 2015
164fe2a
[SPARK-7781] [MLLIB] gradient boosted trees.train regressor missing m…
holdenk Jun 23, 2015
d4f6335
[SPARK-8431] [SPARKR] Add in operator to DataFrame Column in SparkR
yu-iskw Jun 23, 2015
31bd306
[SPARK-8359] [SQL] Fix incorrect decimal precision after multiplication
viirya Jun 23, 2015
9b618fb
[SPARK-8483] [STREAMING] Remove commons-lang3 dependency from Flume Si…
harishreedharan Jun 23, 2015
f0dcbe8
[SPARK-8541] [PYSPARK] test the absolute error in approx doctests
megatron-me-uk Jun 23, 2015
6ceb169
[SPARK-8300] DataFrame hint for broadcast join.
rxin Jun 23, 2015
0f92be5
[SPARK-8498] [TUNGSTEN] fix npe in errorhandling path in unsafeshuffl…
holdenk Jun 23, 2015
4f7fbef
[SQL] [DOCS] updated the documentation for explode
lockwobr Jun 23, 2015
7b1450b
[SPARK-7235] [SQL] Refactor the grouping sets
chenghao-intel Jun 23, 2015
6f4cadf
[SPARK-8432] [SQL] fix hashCode() and equals() of BinaryType in Row
Jun 23, 2015
2b1111d
[SPARK-7888] Be able to disable intercept in linear regression in ml …
holdenk Jun 23, 2015
f2022fa
[SPARK-8265] [MLLIB] [PYSPARK] Add LinearDataGenerator to pyspark.mll…
MechCoder Jun 23, 2015
f2fb028
[SPARK-8111] [SPARKR] SparkR shell should display Spark logo and vers…
Jun 23, 2015
a803118
[SPARK-8525] [MLLIB] fix LabeledPoint parser when there is a whitespa…
fe2s Jun 23, 2015
d96d7b5
[DOC] [SQL] Addes Hive metastore Parquet table conversion section
liancheng Jun 23, 2015
7fb5ae5
[SPARK-8573] [SPARK-8568] [SQL] [PYSPARK] raise Exception if column i…
Jun 23, 2015
111d6b9
[SPARK-8139] [SQL] Updates docs and comments of data sources and Parq…
liancheng Jun 24, 2015
0401cba
[SPARK-7157][SQL] add sampleBy to DataFrame
mengxr Jun 24, 2015
a458efc
Revert "[SPARK-7157][SQL] add sampleBy to DataFrame"
rxin Jun 24, 2015
50c3a86
[SPARK-6749] [SQL] Make metastore client robust to underlying socket …
ericl Jun 24, 2015
13ae806
[HOTFIX] [BUILD] Fix MiMa checks in master branch; enable MiMa for la…
JoshRosen Jun 24, 2015
09fcf96
[SPARK-8371] [SQL] improve unit test for MaxOf and MinOf and fix bugs
cloud-fan Jun 24, 2015
cc465fd
[SPARK-8138] [SQL] Improves error message when conflicting partition …
liancheng Jun 24, 2015
9d36ec2
[SPARK-8567] [SQL] Debugging flaky HiveSparkSubmitSuite
liancheng Jun 24, 2015
bba6699
[SPARK-8578] [SQL] Should ignore user defined output committer when a…
yhuai Jun 24, 2015
31f48e5
[SPARK-8576] Add spark-ec2 options to set IAM roles and instance-init…
nchammas Jun 24, 2015
1173483
[SPARK-8399] [STREAMING] [WEB UI] Overlap between histograms and axis…
BenFradet Jun 24, 2015
43e6619
[SPARK-8506] Add pakages to R context created through init.
holdenk Jun 24, 2015
b84d4b4
[SPARK-7088] [SQL] Fix analysis for 3rd party logical plan.
smola Jun 24, 2015
f04b567
[SPARK-7289] handle project -> limit -> sort efficiently
cloud-fan Jun 24, 2015
fb32c38
[SPARK-7633] [MLLIB] [PYSPARK] Python bindings for StreamingLogisticR…
MechCoder Jun 24, 2015
8ab5076
[SPARK-6777] [SQL] Implements backwards compatibility rules in Cataly…
liancheng Jun 24, 2015
dca21a8
[SPARK-8558] [BUILD] Script /dev/run-tests fails when _JAVA_OPTIONS e…
fe2s Jun 24, 2015
7daa702
[SPARK-8567] [SQL] Increase the timeout of HiveSparkSubmitSuite
yhuai Jun 24, 2015
b71d325
[SPARK-8075] [SQL] apply type check interface to more expressions
cloud-fan Jun 24, 2015
82f80c1
Two minor SQL cleanup (compiler warning & indent).
rxin Jun 25, 2015
7bac2fe
[SPARK-7884] Move block deserialization from BlockStoreShuffleFetcher…
massie Jun 25, 2015
c337844
[SPARK-8604] [SQL] HadoopFsRelation subclasses should set their outpu…
liancheng Jun 25, 2015
085a721
[SPARK-5768] [WEB UI] Fix for incorrect memory in Spark UI
rekhajoshm Jun 25, 2015
e988adb
[SPARK-8574] org/apache/spark/unsafe doesn't honor the java source/ta…
Jun 25, 2015
f9b397f
[SPARK-8567] [SQL] Add logs to record the progress of HiveSparkSubmit…
yhuai Jun 25, 2015
2519dcc
[MINOR] [MLLIB] rename some functions of PythonMLLibAPI
yanboliang Jun 25, 2015
c392a9e
[SPARK-8637] [SPARKR] [HOTFIX] Fix packages argument, sparkSubmitBinName
shivaram Jun 25, 2015
47c874b
[SPARK-8237] [SQL] Add misc function sha2
viirya Jun 26, 2015
4036011
[SPARK-8620] [SQL] cleanup CodeGenContext
cloud-fan Jun 26, 2015
1a79f0e
[SPARK-8635] [SQL] improve performance of CatalystTypeConverters
cloud-fan Jun 26, 2015
9fed6ab
[SPARK-8344] Add message processing time metric to DAGScheduler
JoshRosen Jun 26, 2015
c9e05a3
[SPARK-8613] [ML] [TRIVIAL] add param to disable linear feature scaling
holdenk Jun 26, 2015
37bf76a
[SPARK-8302] Support heterogeneous cluster install paths on YARN.
Jun 26, 2015
41afa16
[SPARK-8652] [PYSPARK] Check return value for all uses of doctest.tes…
JoshRosen Jun 26, 2015
a56516f
[SPARK-8662] SparkR Update SparkSQL Test
Jun 26, 2015
9d11817
[SPARK-8607] SparkR -- jars not being added to application classpath …
Jun 27, 2015
b5a6663
[SPARK-8639] [DOCS] Fixed Minor Typos in Documentation
Rosstin Jun 27, 2015
d48e789
[SPARK-3629] [YARN] [DOCS]: Improvement of the "Running Spark on YARN…
Jun 27, 2015
4153776
[SPARK-8623] Hadoop RDDs fail to properly serialize configuration
sryza Jun 27, 2015
0b5abbf
[SPARK-8606] Prevent exceptions in RDD.getPreferredLocations() from c…
JoshRosen Jun 27, 2015
40648c5
[SPARK-8583] [SPARK-5482] [BUILD] Refactor python/run-tests to integr…
JoshRosen Jun 28, 2015
42db3a1
[HOTFIX] Fix pull request builder bug in #6967
JoshRosen Jun 28, 2015
f510045
[SPARK-8683] [BUILD] Depend on mockito-core instead of mockito-all
JoshRosen Jun 28, 2015
52d1281
[SPARK-8649] [BUILD] Mapr repository is not defined properly
tszym Jun 28, 2015
77da5be
[SPARK-8610] [SQL] Separate Row and InternalRow (part 2)
Jun 28, 2015
ec78438
[SPARK-8686] [SQL] DataFrame should support `where` with expression r…
sarutak Jun 28, 2015
9ce78b4
[SPARK-8596] [EC2] Added port for Rstudio
koaning Jun 28, 2015
24fda73
[SPARK-8677] [SQL] Fix non-terminating decimal expansion for decimal …
viirya Jun 28, 2015
00a9d22
[SPARK-7845] [BUILD] Bumping default Hadoop version used in profile h…
liancheng Jun 29, 2015
25f574e
[SPARK-7212] [MLLIB] Add sequence learning flag
Jun 29, 2015
dfde31d
[SPARK-5962] [MLLIB] Python support for Power Iteration Clustering
yanboliang Jun 29, 2015
0b10662
[SPARK-8575] [SQL] Deprecate callUDF in favor of udf
BenFradet Jun 29, 2015
ac2e17b
[SPARK-8355] [SQL] Python DataFrameReader/Writer should mirror Scala
Jun 29, 2015
660c6ce
[SPARK-8698] partitionBy in Python DataFrame reader/writer interface …
rxin Jun 29, 2015
630bd5f
[SPARK-8702] [WEBUI] Avoid massive concating strings in Javascript
zsxwing Jun 29, 2015
5c796d5
[SPARK-8693] [PROJECT INFRA] profiles and goals are not printed in a …
Jun 29, 2015
715f084
[SPARK-8554] Add the SparkR document files to `.rat-excludes` for `./…
yu-iskw Jun 29, 2015
ea88b1a
Revert "[SPARK-8372] History server shows incorrect information for a…
Jun 29, 2015
ed413bc
[SPARK-8692] [SQL] re-order the case statements that handling catalys…
cloud-fan Jun 29, 2015
3664ee2
[SPARK-8066, SPARK-8067] [hive] Add support for Hive 1.0, 1.1 and 1.2.
Jun 29, 2015
a5c2961
[SPARK-8235] [SQL] misc function sha / sha1
tarekbecker Jun 29, 2015
492dca3
[SPARK-8528] Expose SparkContext.applicationId in PySpark
Jun 29, 2015
94e040d
[SQL][DOCS] Remove wrong example from DataFrame.scala
sarutak Jun 29, 2015
637b4ee
[SPARK-8214] [SQL] Add function hex
zhichao-li Jun 29, 2015
c6ba2ea
[SPARK-7862] [SQL] Disable the error message redirect to stderr
chenghao-intel Jun 29, 2015
be7ef06
[SPARK-8681] fixed wrong ordering of columns in crosstab
brkyvz Jun 29, 2015
afae976
[SPARK-8070] [SQL] [PYSPARK] avoid spark jobs in createDataFrame
Jun 29, 2015
27ef854
[SPARK-8709] Exclude hadoop-client's mockito-all dependency
JoshRosen Jun 29, 2015
f6fc254
[SPARK-8056][SQL] Design an easier way to construct schema for both S…
Jun 29, 2015
ecd3aac
[SPARK-7810] [PYSPARK] solve python rdd socket connection problem
Jun 29, 2015
c8ae887
[SPARK-8660][ML] Convert JavaDoc style comments inLogisticRegressionS…
Rosstin Jun 29, 2015
931da5c
[SPARK-8478] [SQL] Harmonize UDF-related code to use uniformly UDF in…
BenFradet Jun 29, 2015
ed359de
[SPARK-8579] [SQL] support arbitrary object in UnsafeRow
Jun 29, 2015
4e880cf
[SPARK-8661][ML] for LinearRegressionSuite.scala, changed javadoc-sty…
Rosstin Jun 29, 2015
4b497a7
[SPARK-8710] [SQL] Change ScalaReflection.mirror from a val to a def.
yhuai Jun 29, 2015
881662e
[SPARK-8589] [SQL] cleanup DateTimeUtils
cloud-fan Jun 29, 2015
cec9852
[SPARK-8634] [STREAMING] [TESTS] Fix flaky test StreamingListenerSuit…
zsxwing Jun 30, 2015
fbf7573
[SPARK-7287] [SPARK-8567] [TEST] Add sc.stop to applications in Spark…
yhuai Jun 30, 2015
5d30eae
[SPARK-8437] [DOCS] Using directory path without wildcard for filenam…
srowen Jun 30, 2015
d7f796d
[SPARK-8410] [SPARK-8475] remove previous ivy resolution when using s…
brkyvz Jun 30, 2015
4a9e03f
[SPARK-8019] [SPARKR] Support SparkR spawning worker R processes with…
msannell Jun 30, 2015
4c1808b
Revert "[SPARK-8437] [DOCS] Using directory path without wildcard for…
Jun 30, 2015
620605a
[SPARK-8456] [ML] Ngram featurizer python
Jun 30, 2015
ecacb1e
[SPARK-8715] ArrayOutOfBoundsException fixed for DataFrameStatSuite.c…
brkyvz Jun 30, 2015
4915e9e
[SPARK-8669] [SQL] Fix crash with BINARY (ENUM) fields with Parquet 1.7
stshe Jun 30, 2015
f9b6bf2
[SPARK-7667] [MLLIB] MLlib Python API consistency check
yanboliang Jun 30, 2015
7bbbe38
[SPARK-5161] Parallelize Python test execution
JoshRosen Jun 30, 2015
ea775b0
MAINTENANCE: Automated closing of pull requests.
pwendell Jun 30, 2015
f79410c
[SPARK-8721][SQL] Rename ExpectsInputTypes => AutoCastInputTypes.
rxin Jun 30, 2015
e6c3f74
[SPARK-8650] [SQL] Use the user-specified app name priority in SparkS…
watermen Jun 30, 2015
6c5a6db
[SPARK-5161] [HOTFIX] Fix bug in Python test failure reporting
JoshRosen Jun 30, 2015
12671dd
[SPARK-8434][SQL]Add a "pretty" parameter to the "show" method to dis…
zsxwing Jun 30, 2015
5452457
[SPARK-8551] [ML] Elastic net python code example
coderxiang Jun 30, 2015
2ed0c0a
[SPARK-7756] [CORE] More robust SSL options processing.
tellison Jun 30, 2015
08fab48
[SPARK-8590] [SQL] add code gen for ExtractValue
cloud-fan Jun 30, 2015
865a834
[SPARK-8723] [SQL] improve divide and remainder code gen
cloud-fan Jun 30, 2015
a48e619
[SPARK-8680] [SQL] Slightly improve PropagateTypes
viirya Jun 30, 2015
722aa5f
[SPARK-8236] [SQL] misc functions: crc32
qiansl127 Jun 30, 2015
689da28
[SPARK-8592] [CORE] CoarseGrainedExecutorBackend: Cannot register wit…
xuchenCN Jun 30, 2015
ada384b
[SPARK-8437] [DOCS] Corrected: Using directory path without wildcard …
srowen Jun 30, 2015
4528166
[SPARK-4127] [MLLIB] [PYSPARK] Python bindings for StreamingLinearReg…
MechCoder Jun 30, 2015
5fa0863
[SPARK-8679] [PYSPARK] [MLLIB] Default values in Pipeline API should …
MechCoder Jun 30, 2015
fbb267e
[SPARK-8713] Make codegen thread safe
Jun 30, 2015
9213f73
[SPARK-8615] [DOCUMENTATION] Fixed Sample deprecated code
Jun 30, 2015
ca7e460
[SPARK-7988] [STREAMING] Round-robin scheduling of receivers by default
nishkamravi2 Jun 30, 2015
5726440
[SPARK-8630] [STREAMING] Prevent from checkpointing QueueInputDStream
zsxwing Jun 30, 2015
d16a944
[SPARK-8619] [STREAMING] Don't recover keytab and principal configura…
SaintBacchus Jun 30, 2015
1e1f339
[SPARK-6785] [SQL] fix DateTimeUtils for dates before 1970
ckadner Jun 30, 2015
c1befd7
[SPARK-8664] [ML] Add PCA transformer
yanboliang Jun 30, 2015
b8e5bb6
[SPARK-8628] [SQL] Race condition in AbstractSparkSQLParser.parse
Jun 30, 2015
74cc16d
[SPARK-8471] [ML] Discrete Cosine Transform Feature Transformer
Jun 30, 2015
61d7b53
[SPARK-7514] [MLLIB] Add MinMaxScaler to feature transformation
hhbyyh Jun 30, 2015
79f0b37
[SPARK-8560] [UI] The Executors page will have negative if having res…
XuTingjun Jun 30, 2015
7dda084
[SPARK-2645] [CORE] Allow SparkEnv.stop() to be called multiple times…
rekhajoshm Jun 30, 2015
4bb8375
[SPARK-8372] Do not show applications that haven't recorded their app…
Jun 30, 2015
3ba23ff
[SPARK-8736] [ML] GBTRegressor should not threshold prediction
jkbradley Jun 30, 2015
8c89896
[SPARK-8705] [WEBUI] Don't display rects when totalExecutionTime is 0
zsxwing Jun 30, 2015
e725262
[SPARK-8563] [MLLIB] Fixed a bug so that IndexedRowMatrix.computeSVD(…
lee19 Jun 30, 2015
d2495f7
[SPARK-8739] [WEB UI] [WINDOWS] A illegal character `\r` can be conta…
sarutak Jun 30, 2015
58ee2a2
[SPARK-8738] [SQL] [PYSPARK] capture SQL AnalysisException in Python API
Jun 30, 2015
8d23587
[SPARK-7739] [MLLIB] Improve ChiSqSelector example code in user guide
sethah Jun 30, 2015
8133125
[SPARK-8741] [SQL] Remove e and pi from DataFrame functions.
rxin Jun 30, 2015
ccdb052
[SPARK-8727] [SQL] Missing python api; md5, log2
tarekbecker Jun 30, 2015
3bee0f1
[SPARK-6602][Core] Update Master, Worker, Client, AppClient and relat…
zsxwing Jul 1, 2015
f457569
[SPARK-8471] [ML] Rename DiscreteCosineTransformer to DCT
Jul 1, 2015
b6e76ed
[SPARK-8535] [PYSPARK] PySpark : Can't create DataFrame from Pandas d…
x1- Jul 1, 2015
64c1461
[SPARK-6602][Core]Remove unnecessary synchronized
zsxwing Jul 1, 2015
365c140
[SPARK-8748][SQL] Move castability test out from Cast case class into…
rxin Jul 1, 2015
fc3a6fe
[SPARK-8749][SQL] Remove HiveTypeCoercion trait.
rxin Jul 1, 2015
0eee061
[SQL] [MINOR] remove internalRowRDD in DataFrame
cloud-fan Jul 1, 2015
9765241
[SPARK-8750][SQL] Remove the closure in functions.callUdf.
rxin Jul 1, 2015
fdcad6e
[SPARK-8763] [PYSPARK] executing run-tests.py with Python 2.6 fails w…
cocoatomo Jul 1, 2015
69c5dee
[SPARK-7714] [SPARKR] SparkR tests should use more specific expectati…
Jul 1, 2015
4137f76
[SPARK-8752][SQL] Add ExpectsInputTypes trait for defining expected i…
rxin Jul 1, 2015
31b4a3d
[SPARK-8621] [SQL] support empty string as column name
cloud-fan Jul 1, 2015
184de91
[SPARK-6263] [MLLIB] Python MLlib API missing items: Utils
Lewuathe Jul 1, 2015
2012913
[SPARK-8308] [MLLIB] add missing save load for python example
hhbyyh Jul 1, 2015
b8faa32
[SPARK-8765] [MLLIB] [PYTHON] removed flaky python PIC test
jkbradley Jul 1, 2015
75b9fe4
[SPARK-8378] [STREAMING] Add the Python API for Flume
zsxwing Jul 1, 2015
9f7db34
[SPARK-7820] [BUILD] Fix Java8-tests suite compile and test error und…
jerryshao Jul 1, 2015
3083e17
[QUICKFIX] [SQL] fix copy of generated row
Jul 1, 2015
1ce6428
[SPARK-3444] [CORE] Restore INFO level after log4j test.
Jul 1, 2015
f958f27
[SPARK-8766] support non-ascii character in column names
Jul 1, 2015
2727789
[SPARK-8770][SQL] Create BinaryOperator abstract class.
rxin Jul 1, 2015
3a342de
Revert "[SPARK-8770][SQL] Create BinaryOperator abstract class."
rxin Jul 1, 2015
9fd13d5
[SPARK-8770][SQL] Create BinaryOperator abstract class.
rxin Jul 2, 2015
4e4f74b
[SPARK-8660] [MLLIB] removed > symbols from comments in LogisticRegre…
Rosstin Jul 2, 2015
b285ac5
[SPARK-8227] [SQL] Add function unhex
zhichao-li Jul 2, 2015
792fcd8
[SPARK-8754] [YARN] YarnClientSchedulerBackend doesn't stop gracefull…
Jul 2, 2015
646366b
[SPARK-8688] [YARN] Bug fix: disable the cache fs to gain the HDFS co…
SaintBacchus Jul 2, 2015
d14338e
[SPARK-8771] [TRIVIAL] Add a version to the deprecated annotation for…
holdenk Jul 2, 2015
15d41cc
[SPARK-8769] [TRIVIAL] [DOCS] toLocalIterator should mention it resul…
holdenk Jul 2, 2015
377ff4c
[SPARK-8740] [PROJECT INFRA] Support GitHub OAuth tokens in dev/merge…
JoshRosen Jul 2, 2015
3697232
[SPARK-3071] Increase default driver memory
Jul 2, 2015
1b0c8e6
[SPARK-8687] [YARN] Fix bug: Executor can't fetch the new set configu…
SaintBacchus Jul 2, 2015
4158836
[DOCS] Fix minor wrong lambda expression example.
sarutak Jul 2, 2015
c572e25
[SPARK-8787] [SQL] Changed parameter order of @deprecated in package …
Jul 2, 2015
1bbdf9e
[SPARK-8746] [SQL] update download link for Hive 0.13.1
ckadner Jul 2, 2015
246265f
[SPARK-8690] [SQL] Add a setting to disable SparkSQL parquet schema m…
Jul 2, 2015
99c40cd
[SPARK-8647] [MLLIB] Potential issue with constant hashCode
Jul 2, 2015
0a468a4
[SPARK-8758] [MLLIB] Add Python user guide for PowerIterationClustering
yanboliang Jul 2, 2015
5b33381
[SPARK-8223] [SPARK-8224] [SQL] shift left and shift right
tarekbecker Jul 2, 2015
afa021e
[SPARK-8747] [SQL] fix EqualNullSafe for binary type
cloud-fan Jul 2, 2015
52302a8
[SPARK-8407] [SQL] complex type constructors: struct and named_struct
yjshen Jul 2, 2015
0e553a3
[SPARK-8708] [MLLIB] Paritition ALS ratings based on both users and p…
viirya Jul 2, 2015
2e2f326
[SPARK-8581] [SPARK-8584] Simplify checkpointing code + better error …
Jul 2, 2015
34d448d
[SPARK-8479] [MLLIB] Add numNonzeros and numActives to linalg.Matrices
MechCoder Jul 2, 2015
82cf331
[SPARK-8781] Fix variables in published pom.xml are not resolved
Jul 2, 2015
fcbcba6
[SPARK-1564] [DOCS] Added Javascript to Javadocs to create badges for…
deroneriksson Jul 2, 2015
cd20355
[SPARK-7835] Refactor HeartbeatReceiverSuite for coverage + cleanup
Jul 2, 2015
52508be
[SPARK-8772][SQL] Implement implicit type cast for expressions that d…
rxin Jul 2, 2015
7d9cc96
[SPARK-3382] [MLLIB] GradientDescent convergence tolerance
Lewuathe Jul 2, 2015
fc7aebd
[SPARK-8784] [SQL] Add Python API for hex and unhex
Jul 2, 2015
488bad3
[SPARK-7104] [MLLIB] Support model save/load in Python's Word2Vec
yu-iskw Jul 2, 2015
e589e71
Revert "[SPARK-8784] [SQL] Add Python API for hex and unhex"
rxin Jul 2, 2015
d983819
[SPARK-8782] [SQL] Fix code generation for ORDER BY NULL
JoshRosen Jul 3, 2015
aa7bbc1
[SPARK-6980] [CORE] Akka timeout exceptions indicate which conf contr…
BryanCutler Jul 3, 2015
1a7a7d7
[SPARK-8213][SQL]Add function factorial
zhichao-li Jul 3, 2015
dfd8bac
Minor style fix for the previous commit.
rxin Jul 3, 2015
20a4d7d
[SPARK-8501] [SQL] Avoids reading schema from empty ORC files
liancheng Jul 3, 2015
a59d14f
[SPARK-8801][SQL] Support TypeCollection in ExpectsInputTypes
rxin Jul 3, 2015
f743c79
[SPARK-8776] Increase the default MaxPermSize
yhuai Jul 3, 2015
9b23e92
[SPARK-8803] handle special characters in elements in crosstab
brkyvz Jul 3, 2015
2848f4d
[SPARK-8809][SQL] Remove ConvertNaNs analyzer rule.
rxin Jul 3, 2015
ab535b9
[SPARK-8226] [SQL] Add function shiftrightunsigned
zhichao-li Jul 3, 2015
f0fac2a
[SPARK-7401] [MLLIB] [PYSPARK] Vectorize dot product and sq_dist betw…
MechCoder Jul 3, 2015
e92c24d
[SPARK-8810] [SQL] Added several UDF unit tests for Spark SQL
spirom Jul 4, 2015
4a22bce
[SPARK-8572] [SQL] Type coercion for ScalaUDFs
Jul 4, 2015
9fb6b83
[SPARK-8192] [SPARK-8193] [SQL] udf current_date, current_timestamp
adrian-wang Jul 4, 2015
f32487b
[SPARK-8777] [SQL] Add random data generator test utilities to Spark SQL
JoshRosen Jul 4, 2015
f35b0c3
[SPARK-8238][SPARK-8239][SPARK-8242][SPARK-8243][SPARK-8268][SQL]Add …
chenghao-intel Jul 4, 2015
6b3574e
[SPARK-8270][SQL] levenshtein distance
tarekbecker Jul 4, 2015
48f7aed
Fixed minor style issue with the previous merge.
rxin Jul 4, 2015
347cab8
[SQL] More unit tests for implicit type cast & add simpleString to Ab…
rxin Jul 4, 2015
c991ef5
[SPARK-8822][SQL] clean up type checking in math.scala.
rxin Jul 4, 2015
2b820f2
[MINOR] [SQL] Minor fix for CatalystSchemaConverter
viirya Jul 5, 2015
f9c448d
[SPARK-7137] [ML] Update SchemaUtils checkInputColumn to print more i…
rekhajoshm Jul 5, 2015
a0cb111
[SPARK-8549] [SPARKR] Fix the line length of SparkR
yu-iskw Jul 6, 2015
6d0411b
[SQL][Minor] Update the DataFrame API for encode/decode
chenghao-intel Jul 6, 2015
86768b7
[SPARK-8831][SQL] Support AbstractDataType in TypeCollection.
rxin Jul 6, 2015
39e4e7e
[SPARK-8841] [SQL] Fix partition pruning percentage log message
eglp-slindemann Jul 6, 2015
293225e
[SPARK-8124] [SPARKR] Created more examples on SparkR DataFrames
Emaasit Jul 6, 2015
0e19464
[SPARK-8837][SPARK-7114][SQL] support using keyword in column name
cloud-fan Jul 6, 2015
57c72fc
Small update in the readme file
Jul 6, 2015
37e4d92
[SPARK-8784] [SQL] Add Python API for hex and unhex
Jul 6, 2015
2471c0b
[SPARK-4485] [SQL] 1) Add broadcast hash outer join, (2) Fix SparkPla…
Jul 6, 2015
132e7fc
[MINOR] [SQL] remove unused code in Exchange
adrian-wang Jul 6, 2015
9ff2033
[SPARK-8656] [WEBUI] Fix the webUI and JSON API number is not synced
Jul 6, 2015
1165b17
[SPARK-6707] [CORE] [MESOS] Mesos Scheduler should allow the user to …
Jul 6, 2015
96c5eee
Revert "[SPARK-7212] [MLLIB] Add sequence learning flag"
mengxr Jul 6, 2015
0effe18
[SPARK-8765] [MLLIB] Fix PySpark PowerIterationClustering test issue
yanboliang Jul 6, 2015
7b467cc
[SPARK-8588] [SQL] Regression test
yhuai Jul 6, 2015
09a0641
[SPARK-8072] [SQL] Better AnalysisException for writing DataFrame wit…
Jul 6, 2015
d4d6d31
[SPARK-8463][SQL] Use DriverRegistry to load jdbc driver at writing path
viirya Jul 7, 2015
9eae5fa
[SPARK-8819] Fix build for maven 3.3.x
Jul 7, 2015
929dfa2
Revert "[SPARK-8781] Fix variables in published pom.xml are not resol…
Jul 7, 2015
1821fc1
[SPARK-6747] [SQL] Throw an AnalysisException when unsupported Java l…
maropu Jul 7, 2015
16 changes: 14 additions & 2 deletions .gitignore
@@ -5,18 +5,23 @@
*.ipr
*.iml
*.iws
*.pyc
*.pyo
.idea/
.idea_modules/
sbt/*.jar
build/*.jar
.settings
.cache
cache
.generated-mima*
/build/
work/
out/
.DS_Store
third_party/libmesos.so
third_party/libmesos.dylib
build/apache-maven*
build/zinc*
build/scala*
conf/java-opts
conf/*.sh
conf/*.cmd
@@ -49,12 +54,19 @@ dependency-reduced-pom.xml
checkpoint
derby.log
dist/
dev/create-release/*txt
dev/create-release/*final
spark-*-bin-*.tgz
unit-tests.log
/lib/
ec2/lib/
rat-results.txt
scalastyle.txt
scalastyle-output.xml
R-unit-tests.log
R/unit-tests.out
python/lib/pyspark.zip
lint-r-report.log

# For Hive
metastore_db/
27 changes: 27 additions & 0 deletions .rat-excludes
@@ -1,4 +1,5 @@
target
cache
.gitignore
.gitattributes
.project
@@ -14,20 +15,28 @@ TAGS
RELEASE
control
docs
docker.properties.template
fairscheduler.xml.template
spark-defaults.conf.template
log4j.properties
log4j.properties.template
metrics.properties
metrics.properties.template
slaves
slaves.template
spark-env.sh
spark-env.cmd
spark-env.sh.template
log4j-defaults.properties
log4j-defaults-repl.properties
bootstrap-tooltip.js
jquery-1.11.1.min.js
d3.min.js
dagre-d3.min.js
graphlib-dot.min.js
sorttable.js
vis.min.js
vis.min.css
.*avsc
.*txt
.*json
@@ -64,3 +73,21 @@ dist/*
logs
.*scalastyle-output.xml
.*dependency-reduced-pom.xml
known_translations
json_expectation
local-1422981759269/*
local-1422981780767/*
local-1425081759269/*
local-1426533911241/*
local-1426633911242/*
local-1430917381534/*
local-1430917381535_1
local-1430917381535_2
DESCRIPTION
NAMESPACE
test_support/*
.*Rd
help/*
html/*
INDEX
.lintr
22 changes: 13 additions & 9 deletions CONTRIBUTING.md
@@ -1,12 +1,16 @@
## Contributing to Spark

Contributions via GitHub pull requests are gladly accepted from their original
author. Along with any pull requests, please state that the contribution is
your original work and that you license the work to the project under the
project's open source license. Whether or not you state this explicitly, by
submitting any copyrighted material via pull request, email, or other means
you agree to license the material under the project's open source license and
warrant that you have the legal authority to do so.
*Before opening a pull request*, review the
[Contributing to Spark wiki](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark).
It lists steps that are required before creating a PR. In particular, consider:

- Is the change important and ready enough to ask the community to spend time reviewing?
- Have you searched for existing, related JIRAs and pull requests?
- Is this a new feature that can stand alone as a package on http://spark-packages.org ?
- Is the change being proposed clearly explained and motivated?

Please see the [Contributing to Spark wiki page](https://cwiki.apache.org/SPARK/Contributing+to+Spark)
for more information.
When you contribute code, you affirm that the contribution is your original work and that you
license the work to the project under the project's open source license. Whether or not you
state this explicitly, by submitting any copyrighted material via pull request, email, or
other means you agree to license the material under the project's open source license and
warrant that you have the legal authority to do so.
117 changes: 114 additions & 3 deletions LICENSE
@@ -643,10 +643,41 @@ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

========================================================================
For d3 (core/src/main/resources/org/apache/spark/ui/static/d3.min.js):
========================================================================

Copyright (c) 2010-2015, Michael Bostock
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

* The name Michael Bostock may not be used to endorse or promote products
derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL MICHAEL BOSTOCK BE LIABLE FOR ANY DIRECT,
INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

========================================================================
For Scala Interpreter classes (all .scala files in repl/src/main/scala
except for Main.Scala, SparkHelper.scala and ExecutorClassLoader.scala):
except for Main.Scala, SparkHelper.scala and ExecutorClassLoader.scala),
and for SerializableMapWrapper in JavaUtils.scala:
========================================================================

Copyright (c) 2002-2013 EPFL
@@ -770,6 +801,22 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

========================================================================
For TestTimSort (core/src/test/java/org/apache/spark/util/collection/TestTimSort.java):
========================================================================
Copyright (C) 2015 Stijn de Gouw

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

========================================================================
For LimitedInputStream
@@ -789,6 +836,68 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

========================================================================
For vis.js (core/src/main/resources/org/apache/spark/ui/static/vis.min.js):
========================================================================
Copyright (C) 2010-2015 Almende B.V.

Vis.js is dual licensed under both

* The Apache 2.0 License
http://www.apache.org/licenses/LICENSE-2.0

and

* The MIT License
http://opensource.org/licenses/MIT

Vis.js may be distributed under either license.

========================================================================
For dagre-d3 (core/src/main/resources/org/apache/spark/ui/static/dagre-d3.min.js):
========================================================================
Copyright (c) 2013 Chris Pettitt

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

========================================================================
For graphlib-dot (core/src/main/resources/org/apache/spark/ui/static/graphlib-dot.min.js):
========================================================================
Copyright (c) 2012-2013 Chris Pettitt

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

========================================================================
BSD-style licenses
========================================================================
The following components are provided under a BSD-style license. See project link for details.

(BSD 3 Clause) core (com.github.fommil.netlib:core:1.1.2 - https://github.com/fommil/netlib-java/core)
(BSD 3 Clause) JPMML-Model (org.jpmml:pmml-model:1.1.15 - https://github.com/jpmml/jpmml-model)
(BSD 3-clause style license) jblas (org.jblas:jblas:1.2.4 - http://jblas.org/)
(BSD License) AntLR Parser Generator (antlr:antlr:2.7.7 - http://www.antlr.org/)
(BSD License) Javolution (javolution:javolution:5.5.1 - http://javolution.org)
(BSD licence) ANTLR ST4 4.0.4 (org.antlr:ST4:4.0.4 - http://www.stringtemplate.org)
========================================================================
MIT licenses
========================================================================

The following components are provided under the MIT License. See project link for details.
(MIT License) SLF4J LOG4J-12 Binding (org.slf4j:slf4j-log4j12:1.7.5 - http://www.slf4j.org)
(MIT License) pyrolite (org.spark-project:pyrolite:2.0.1 - http://pythonhosted.org/Pyro4/)
(MIT License) scopt (com.github.scopt:scopt_2.10:3.2.0 - https://github.com/scopt/scopt)
(The MIT License) Mockito (org.mockito:mockito-core:1.9.5 - http://www.mockito.org)
(MIT License) jquery (https://jquery.org/license/)
(MIT License) AnchorJS (https://github.com/bryanbraun/anchorjs)
6 changes: 6 additions & 0 deletions R/.gitignore
*.o
*.so
*.Rd
lib
pkg/man
pkg/html
12 changes: 12 additions & 0 deletions R/DOCUMENTATION.md
# SparkR Documentation

SparkR documentation is generated from in-source comments annotated with
`roxygen2`. After making changes to the documentation, you can regenerate the
man pages by running the following from an R console in the SparkR home directory:

library(devtools)
devtools::document(pkg="./pkg", roclets=c("rd"))

You can verify your changes by running:

R CMD check pkg/
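If you prefer a one-shot, non-interactive invocation, the `devtools` calls above can also be run from a shell. This is only a sketch; it assumes `devtools` is already installed and that you are in the SparkR home directory:

```shell
# Regenerate the man pages without opening an interactive R session
# (same devtools::document() call as above, wrapped in R -e)
R -e 'library(devtools); devtools::document(pkg="./pkg", roclets=c("rd"))'
```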
67 changes: 67 additions & 0 deletions R/README.md
# R on Spark

SparkR is an R package that provides a light-weight frontend to use Spark from R.

### SparkR development

#### Build Spark

Build Spark with [Maven](http://spark.apache.org/docs/latest/building-spark.html#building-with-buildmvn) and include the `-Psparkr` profile to build the R package. For example, to use the default Hadoop version you can run:
```
build/mvn -DskipTests -Psparkr package
```

#### Running sparkR

You can start using SparkR by launching the SparkR shell with

./bin/sparkR

The `sparkR` script automatically creates a SparkContext, running Spark in
local mode by default. To specify the Spark master of a cluster for the
automatically created SparkContext, you can run

./bin/sparkR --master "local[2]"

To set other options, such as driver memory or executor memory, you can pass the [spark-submit](http://spark.apache.org/docs/latest/submitting-applications.html) arguments to `./bin/sparkR`.
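For instance, the usual `spark-submit` flags can be combined in a single invocation. A sketch — the flag names are standard `spark-submit` options, and the values are only illustrative:

```shell
# Launch the SparkR shell with four local cores and 2g of driver memory
./bin/sparkR --master "local[4]" --driver-memory 2g
```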

#### Using SparkR from RStudio

If you wish to use SparkR from RStudio or another R frontend, you will need to set some environment variables that point SparkR to your Spark installation. For example:
```
# Set this to where Spark is installed
Sys.setenv(SPARK_HOME="/Users/shivaram/spark")
# This line loads SparkR from the installed directory
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)
sc <- sparkR.init(master="local")
```

#### Making changes to SparkR

The [instructions](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark) for making contributions to Spark also apply to SparkR.
If you only make R file changes (i.e., no Scala changes), you can re-install the R package using `R/install-dev.sh` and test your changes.
Once you have made your changes, please include unit tests for them and run the existing unit tests using the `run-tests.sh` script as described below.

#### Generating documentation

The SparkR documentation (Rd files and HTML files) is not part of the source repository. To generate it, run the script `R/create-docs.sh`. This script uses `devtools` and `knitr` to generate the docs, so these packages need to be installed on the machine before using the script.

### Examples, Unit tests

SparkR comes with several sample programs in the `examples/src/main/r` directory.
To run one of them, use `./bin/sparkR <filename> <args>`. For example:

./bin/sparkR examples/src/main/r/dataframe.R

You can also run the SparkR unit tests (you need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first):

R -e 'install.packages("testthat", repos="http://cran.us.r-project.org")'
./R/run-tests.sh

### Running on YARN
`./bin/spark-submit` and `./bin/sparkR` can also be used to submit jobs to YARN clusters. You will need to set the YARN configuration directory before doing so. For example, on CDH you can run:
```
export YARN_CONF_DIR=/etc/hadoop/conf
./bin/spark-submit --master yarn examples/src/main/r/dataframe.R
```
13 changes: 13 additions & 0 deletions R/WINDOWS.md
## Building SparkR on Windows

To build SparkR on Windows, the following steps are required:

1. Install R (>= 3.1) and [Rtools](http://cran.r-project.org/bin/windows/Rtools/). Make sure to
include Rtools and R in `PATH`.
2. Install
[JDK7](http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html) and set
`JAVA_HOME` in the system environment variables.
3. Download and install [Maven](http://maven.apache.org/download.html). Also include the `bin`
directory in Maven in `PATH`.
4. Set `MAVEN_OPTS` as described in [Building Spark](http://spark.apache.org/docs/latest/building-spark.html).
5. Open a command shell (`cmd`) in the Spark directory and run `mvn -DskipTests -Psparkr package`.
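For step 4, a typical `MAVEN_OPTS` value on Windows might look like the following. This is a configuration sketch only — the exact memory settings are assumptions; pick values that suit your machine and the building-spark guide:

```shell
:: Give Maven enough memory for the Spark build (cmd syntax)
set MAVEN_OPTS=-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m
mvn -DskipTests -Psparkr package
```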