hadoop - Decompressing LZ4 compressed data in Spark
I have LZ4-compressed data in HDFS and I'm trying to decompress it into an RDD in Apache Spark. As far as I can tell, the only way to read data from HDFS is textFile
, which just reads the data as it is stored. The articles I have come across on CompressionCodec
all explain how to compress output when writing to HDFS, whereas I need to decompress data that is already on HDFS.
I am new to Spark, so I apologize in advance if I haven't been clear or if my understanding of the concepts is wrong, but it would be great if somebody could point me in the right direction.
In Spark 1.1.0, reading LZ4-compressed files through sc.textFile
works. I got it working by using a Spark build that was built against a Hadoop version which supports LZ4 (2.4.1 in my case).
After that, I built the Hadoop native libraries for my platform and linked them to Spark through the --driver-library-path
option.
Without the linking, I got "native lz4 library not loaded"
exceptions.
Depending on the Hadoop distribution you are using, the step of building the native libraries may be optional.
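Putting the steps above together, a minimal sketch of the read side (the HDFS path and app name are hypothetical; this assumes Spark was built against a Hadoop 2.4+ release with LZ4 support and is launched with the native libraries on the driver's library path):

```scala
// Launch with the Hadoop native libraries linked in, e.g.:
//   spark-submit --driver-library-path /path/to/hadoop/lib/native ... ReadLz4.jar
import org.apache.spark.{SparkConf, SparkContext}

object ReadLz4 {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("ReadLz4"))
    // Hadoop selects the compression codec from the .lz4 file extension
    // and decompresses transparently as the file is read.
    val lines = sc.textFile("hdfs:///data/input.lz4")
    println(lines.count())
    sc.stop()
  }
}
```

No explicit decompression call is needed: textFile delegates to Hadoop's input format, which applies the matching CompressionCodec on read, provided the native LZ4 library can be loaded.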