Endeavor

Client Description

The Client is a privately held exploration and production company and is one of the largest private operators in the United States. The company’s Energy Services division provides construction, roustabout, trucking, vehicle maintenance, and well and wireline services, making it vertically integrated.

Challenge

The Client has had issues with artificial lift equipment failure and would like the ability to auto detect trending conditions and pre-emptively prevent failures. Alerts need to be setup to notify the appropriate stakeholders in the company of negative trends. Also, we need to identify preventative maintenance to avert any failure and reduce cost. Finally, we needed a way to log, audit, and measure many points within the ingestion pipeline. The technology stack available to us was AWS so the pipeline needed to be built using native AWS tools.

Solution

iOLAP partnered with Endeavor team to document and identify the areas of impact. These potential failure points will be evaluated, and recommendations will be provided. Collaborative effort will be made to use the data source to predict failures. Part of the solution will be to create data experiment cycles and evaluations.

Lakehouse Components

The lakehouse architecture comprised of three primary components.

Data lake

S3 was leveraged as the data lake. Data is partitioned by day and stored in parquet format. Data stored here is close to raw format.

Ingestion Pipeline

The ingestion pipeline consists of 3 phases. The first phase handles data on the lake, followed by sending the data to the Snowflake in a staging area, and finally the data is loaded to an EDW.

Data lake ingestion starts when Endeavor uses Boomi to send data from on-premises to the S3 data lake's inbound folder. This triggers a lambda python process that does the following:

  1. Looks up to a Postgres metadata table to determine what to do with the file.
  2. Some options here include whether to run a schema check, what delimiter we expect in the file, and if any fix should be applied
  3. If no metadata is found for the file, then the file will be sent to a dead letter queue (DLQ)
  4. If metadata is found and no failures occurred, then the file will be compressed and organized by source and filename on the data lake
  5. If an error occurs, we send an alert through an SNS message

The landing of the parquet file will trigger the second lambda process. This process:

  1. Looks up to a Postgres metadata table to determine what to do with the file
  2. Data is copied to the staging layer on Snowflake
  3. In some cases, we check for duplicates based on the PK of the staging table
  4. Errors are sent via an alert using SNS

Data Warehouse

The warehouse is built on top of Snowflake. We organized Snowflake into 4 databases:

  1. Staging (STG) - where data from the data lake is stored in tables that are hydrated via Snowflake COPY
  2. Enterprise data warehouse (EDW) - where we transform data based on business requirements into EDW tables
  3. Data marts (DM) - where we build views on top of EDW objects
  4. Outbound (OB) - where views of DM objects are exposed to the end users

The EDW table loads are driven by batch jobs. These batch jobs kick off Glue jobs which in turn invoke Snowflake SQL/ELT files that are stored on S3. A Postgres metadata lookup occurs for each EDW table that is loaded which can tell the job which Snowflake warehouse (compute) to use.

Technologies Used

  • 3 - for data lake storage
  • Lambda - for AWS data pipeline workflows and minor data curation
  • Glue - for Snowflake data warehouse SQL ELT processing execution and workflows
  • Postgres RDS - for storing logging information of all ingestion pipeline components
  • Python
  • Code Commit - for code management
  • CloudFormation - for building out environments
  • ELT - for data loading in Snowflake

Reference Architecture

center-big

Benefits

Systems shall utilize IOLAP’s best practices for automation, logging, stability, and recoverability. The benefit of risk assessment is to properly identify the potential failure points and cut cost by addressing the risk before it becomes an issue. Improved maintenance cycles and risk plans in place will save Endeavor time and money. We learned to build, manage, and maintain a near real-time ingestion engine between multiple technologies. Snowflake and AWS best practices were applied from the get-go, and we learned a lot on how triggering lambda jobs by s3 events work.

center-small

Drive your business forward!

iOLAP experts are here to assist you