Adding an Apache Spark UDF? #5

@mielvds

Hi! Would it make sense to have a small addition that makes the library usable in Apache Spark? Something along the lines of:

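package com.atomgraph.etl.json;

import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import org.apache.jena.riot.system.StreamRDF;
import org.apache.jena.riot.system.StreamRDFLib;
import org.apache.spark.sql.api.java.UDF1;

public class Json2rdfUDF implements UDF1<String, String> {
    private static final long serialVersionUID = 1L;
    private final String baseURI; // assumption: supplied by whoever registers the UDF

    public Json2rdfUDF(String baseURI) { this.baseURI = baseURI; }

    @Override
    public String call(String jsonString) throws Exception {
        InputStream bis = new ByteArrayInputStream(jsonString.getBytes(StandardCharsets.UTF_8));
        Reader reader = new BufferedReader(new InputStreamReader(bis, StandardCharsets.UTF_8));

        // Write the triples out as text, since the UDF is declared to return a String
        // (assumes convert() finishes, and thereby flushes, the stream)
        StringWriter out = new StringWriter();
        StreamRDF rdfStream = StreamRDFLib.writer(out);
        new JsonStreamRDFWriter(reader, rdfStream, baseURI).convert();

        return out.toString();
    }
}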
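
A note on the reader setup: BufferedReader wraps a Reader rather than an InputStream, so an InputStreamReader with an explicit charset has to sit in between. Below is a minimal sketch of just that step, using only the JDK so it can be tried without Spark or Jena. ReaderSetup, toReader and readAll are hypothetical names for illustration and are not part of json2rdf.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of json2rdf, Jena, or Spark.
class ReaderSetup {

    // Wraps a JSON string in a buffered, UTF-8 decoding Reader.
    static Reader toReader(String json) {
        // Encode and decode with the same explicit charset so the result
        // does not depend on the platform default of the executor JVM.
        InputStream in = new ByteArrayInputStream(json.getBytes(StandardCharsets.UTF_8));
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    // Drains a Reader into a String, to check the round trip.
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String json = "{\"name\": \"Zo\u00eb\"}"; // non-ASCII character on purpose
        String roundTripped = readAll(toReader(json));
        System.out.println(roundTripped.equals(json)); // prints true
    }
}
```

If the value is already a String, as it is inside the UDF, new StringReader(jsonString) would skip the byte round trip entirely. The byte-based form mainly matters when the input arrives as raw bytes.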
