hadoop - Decompressing LZ4 compressed data in Spark
I have LZ4-compressed data in HDFS and I'm trying to decompress it into an RDD in Apache Spark. As far as I can tell, the only way to read data from HDFS is textFile
, which just reads the data as it is stored. The articles I have come across on CompressionCodec
all explain how to compress output when writing to HDFS, whereas I need to decompress data that is already on HDFS.
I am new to Spark, so I apologize in advance if I haven't been clear or if my understanding of the concepts is wrong, but it would be great if somebody could point me in the right direction.
In Spark 1.1.0, reading LZ4-compressed files through sc.textFile
works. I got it working by using a Spark build that was built against a Hadoop version which supports LZ4 (2.4.1 in my case).
After that, I built the Hadoop native libraries for my platform and linked them to Spark through the --driver-library-path
option.
Without the linking, I got "native lz4 library not loaded"
exceptions.
Depending on the Hadoop distribution you are using, the step of building the native libraries may be optional.
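Putting the steps above together, a minimal sketch of the read side (the HDFS path and app name are hypothetical; this assumes Spark was built against a Hadoop 2.4+ release with LZ4 support and is launched with the native libraries on the driver's library path):

```scala
// Launch with the Hadoop native libraries linked in, e.g.:
//   spark-submit --driver-library-path /path/to/hadoop/lib/native ... ReadLz4.jar
import org.apache.spark.{SparkConf, SparkContext}

object ReadLz4 {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("ReadLz4"))
    // Hadoop selects the compression codec from the .lz4 file extension
    // and decompresses transparently as the file is read.
    val lines = sc.textFile("hdfs:///data/input.lz4")
    println(lines.count())
    sc.stop()
  }
}
```

No explicit decompression call is needed: textFile delegates to Hadoop's input format, which applies the matching CompressionCodec on read, provided the native LZ4 library can be loaded.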