Reactive streaming SEG-Y parser
- Seg-Y version 1 format supported.
- Supports asynchronous stream processing with non-blocking, adaptive pull/push back pressure, as specified by Reactive Streams.
- Built with Akka Streams.
- Contains examples of different use cases: streaming from file source, AWS S3, transformation, visualization, statistics, parallel processing, etc.
- API for both Scala and Java.
- Configurable Seg-Y data chunk size and text encoding.
- Add GitHub badges: code coverage, stable version, etc.
- Add more examples for streaming from file source, S3, transformation, visualization, parallel processing
- Add benchmarks, taking commonly used Seg-Y parsers as a baseline
- Add full support for the Seg-Y v1 feature set (variable extended text headers, etc.)
- Add Seg-Y v2 support
- Cross-validation against other commonly used Seg-Y parsers
- Java 1.8
- sbt 1.x
Add dependency:
Sbt
libraryDependencies += "com.github.sereneant.segystream" %% "segystream-core" % "0.1.0"
Maven
<dependency>
<groupId>com.github.sereneant.segystream</groupId>
<artifactId>segystream-core_2.12</artifactId>
<version>0.1.0</version>
</dependency>
Gradle
dependencies {
compile group: 'com.github.sereneant.segystream', name: 'segystream-core_2.12', version: '0.1.0'
}
Streaming implementation is based on Akka Streams.
Scala
Setup streams:
implicit val system: ActorSystem = ActorSystem("segystream-examples")
implicit val mat: ActorMaterializer = ActorMaterializer()
Construct a stream blueprint from a Seg-Y file or another byte source (S3, HDFS, etc.):
val segySource: Source[SegyPart, Future[SegyHeaders]] = fileSource.viaMat(SegyFlow())(Keep.right)
The full spectrum of Alpakka connectors can be used for streaming from different sources and to different sinks.
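The `fileSource` in the snippet above is not defined in this README. A minimal sketch of one way to build it with Akka Streams' `FileIO`, assuming `SegyFlow()` accepts `ByteString` elements as input (as the `viaMat` call suggests); the object name, method name and file name are illustrative:

```scala
import java.nio.file.Path

import akka.stream.IOResult
import akka.stream.scaladsl.{FileIO, Source}
import akka.util.ByteString

import scala.concurrent.Future

object SegyFileSource {
  // Builds a lazy source blueprint: nothing is read until the stream is run.
  // The file is then read on downstream demand, in ByteString elements of up
  // to `readSize` bytes, so back pressure limits how fast the file is consumed.
  def fromPath(path: Path, readSize: Int = 8192): Source[ByteString, Future[IOResult]] =
    FileIO.fromPath(path, chunkSize = readSize)
}

// Usage (hypothetical file name):
//   val fileSource = SegyFileSource.fromPath(java.nio.file.Paths.get("survey.segy"))
```

The materialized `Future[IOResult]` is discarded by `Keep.right` in the blueprint above; use `Keep.both` if the file read result is also needed.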
Run the flow, make actions/transformations:
val done: Future[Done] = segySource
  .map {
    case th: TraceHeader => println(s"Trace Header: ${th.traceSequenceNumberWithinLine}")
    case td: TraceDataChunk => println(s"Trace Data Chunk: length=${td.length}")
    case _ => // NoOp
  }
  .toMat(Sink.ignore)(Keep.right) // keep the Sink's Future[Done], completed when the stream ends
  .run()
Wait for stream termination, then shut down the actor system:
implicit val ec: ExecutionContextExecutor = system.dispatcher
done.onComplete { _ =>
  system.terminate()
  println("Stream completed")
}
- Collect and print Seg-Y data stats
- Output info from Seg-Y headers
- Collect data for given in-line/cross-line section
- More to come...
Java
The full power of Akka Streams is available in Java as well.
Seg-Y trace data is streamed in chunks of configurable length; the default is 1024 bytes.
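To show what the chunk size means for a consumer, here is a minimal stdlib-only sketch (not the library's implementation) that splits one trace's sample bytes into chunks in the same spirit. The 1500-sample, 4-bytes-per-sample trace is a made-up example, and `ChunkingSketch` is an illustrative name:

```scala
object ChunkingSketch {
  // Splits a trace's raw sample bytes into chunks of at most `chunkSize` bytes
  // and returns the chunk lengths. The last chunk is shorter when the trace
  // length is not a multiple of `chunkSize`.
  def chunkSizes(traceBytes: Array[Byte], chunkSize: Int = 1024): List[Int] =
    traceBytes.grouped(chunkSize).map(_.length).toList

  def main(args: Array[String]): Unit = {
    val trace = new Array[Byte](1500 * 4) // hypothetical trace: 1500 samples x 4 bytes
    println(chunkSizes(trace)) // five 1024-byte chunks followed by one 880-byte chunk
  }
}
```

In this sketch, a chunk size that is a multiple of the sample width keeps samples from straddling chunk boundaries; whether the library enforces such a constraint is not stated here.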
Custom configuration can be passed to the SegyFlow constructor:
val segyFlow = new SegyFlow(SegyConfig(
  charset = Charset.forName("CP037"), // textual data charset (EBCDIC)
  dataChunkSize = 1024 // bytes
))
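The charset matters because the 3200-byte Seg-Y textual file header is traditionally EBCDIC-encoded and laid out as 40 lines of 80 characters. A minimal stdlib-only sketch of that decoding, independent of the library's internals (the object and method names are illustrative, and the `CP037` charset is assumed to be available in the JDK in use):

```scala
import java.nio.charset.Charset

object TextHeaderSketch {
  val HeaderSize = 3200 // bytes in the Seg-Y textual file header
  val LineWidth = 80    // characters per header line

  // Decodes the textual header with the given charset and splits it into
  // 80-character lines. CP037 is a single-byte charset, so 3200 bytes
  // decode to 3200 characters, i.e. 40 lines.
  def decode(header: Array[Byte], charset: Charset = Charset.forName("CP037")): Vector[String] = {
    require(header.length == HeaderSize, s"expected $HeaderSize bytes, got ${header.length}")
    new String(header, charset).grouped(LineWidth).toVector
  }
}
```

For files whose textual header is written in ASCII, passing `java.nio.charset.StandardCharsets.US_ASCII` as the second argument is the equivalent choice in this sketch.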
sbt package
Ivy
sbt publishLocal
Maven
sbt publishM2
sbt test
TBD
Examples are located in the examples folder.
sbt "examples/runMain com.github.sereneant.segystream.examples.CollectSegyStats SegY_file_name.segy"
- Parser does not support variable extended text headers.
- Parser does not support Data Sample Format Code 4 (4-byte fixed-point with gain, obsolete).
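Both limitations can be detected before streaming by inspecting the 400-byte binary file header that follows the 3200-byte textual header. A minimal sketch, assuming the Seg-Y rev 1 layout (big-endian values, data sample format code at file bytes 3225-3226, number of extended textual headers at file bytes 3505-3506, where -1 means a variable number). The object and method names are illustrative and not part of the library:

```scala
import java.nio.{ByteBuffer, ByteOrder}

object PreflightSketch {
  val MinBytes = 3600 // 3200-byte textual header + 400-byte binary header

  // Returns the reasons why the parser would not handle the file,
  // or an empty list if neither known limitation applies.
  def unsupportedFeatures(fileStart: Array[Byte]): List[String] = {
    require(fileStart.length >= MinBytes, s"need at least $MinBytes bytes")
    val buf = ByteBuffer.wrap(fileStart).order(ByteOrder.BIG_ENDIAN)
    val formatCode = buf.getShort(3224)     // file bytes 3225-3226 (1-based)
    val extHeaderCount = buf.getShort(3504) // file bytes 3505-3506 (1-based)
    List(
      if (formatCode == 4) Some("data sample format code 4 (fixed-point with gain)") else None,
      if (extHeaderCount == -1) Some("variable number of extended textual headers") else None
    ).flatten
  }
}
```

Only the first 3600 bytes of the file are needed, so the check can run on a small prefix read from any byte source.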
Contributions are welcome! Please open issues and pull requests on the project's GitHub page.
Please keep code clean (whatever it means for you) and comply with coding style standards:
Please keep the CHANGELOG.md file up to date; its format is based on Keep a Changelog.
SemVer is used as the versioning standard. For the available versions, see the git tags.
Licensed under the MIT License - see the LICENSE file.
- Inspired by Reactive Manifesto
- Thanks to Mikhail Aksenov for sigrun, used as a good starter in Seg-Y parsing.
- Thanks to Andriy Plokhotnyuk for his jsoniter-scala, an example of technical excellence and a well-shaped Scala project, from which the build configuration and project structure were borrowed.
All references are given in alphabetical order.