Data Engineer
Own Rebel's end-to-end lakehouse platform: Iceberg/Delta tables, governance, streaming ingestion, and a serverless SQL warehouse that real customers run analytics on.
- Compensation
- $100,000 – $180,000 · No equity
- Employment
- Contract
- Experience
- 7+ years
- Work policy
- Remote only (United States)
- Company location
- Virginia
- Visa sponsorship
- Not available
- Relocation
- Allowed
- Preferred timezones
- Pacific, Central, Eastern, Atlantic
The role
We're building an end-to-end lakehouse platform and need an engineer to own it. This is a builder role, not a maintenance role.
What you'll do
- Own the lakehouse layer: Iceberg/Delta tables, partitioning, time-travel, compaction, storage costs
- Build the governance plane: catalog, column-level lineage, data quality monitoring, and discovery
- Stand up declarative incremental ETL, a pipeline designer, and DAG orchestration with retries and backfills
- Implement streaming ingestion with exactly-once semantics and edge agents outside the VPC
- Run a serverless SQL warehouse with materialized views and provisioned compute tiers
- Build open-protocol data sharing and privacy-preserving clean rooms
You should have
- 7+ years on data platforms, deep in at least one lakehouse stack (Databricks, Snowflake, or Iceberg/Delta + Spark/Trino)
- Iceberg and/or Delta Lake internals: metadata layout, snapshot isolation, compaction, catalog integrations
- Spark, Flink, or Trino internals; query optimization; columnar formats
- Production streaming (Kafka, Kinesis, Flink, or Structured Streaming) with exactly-once delivery
- Python and SQL fluency; Scala or Rust a plus
- Data governance experience: catalogs, lineage, access control, PII
- AWS (S3, IAM, VPC, Kinesis) and Terraform
Nice to have
- Built an internal Databricks-like platform
- OSS contributions to Iceberg, Delta, Spark, Trino, Airflow, or dbt
- Privacy-preserving compute (differential privacy, secure enclaves, MPC)
Stack
Python, SQL, Scala, Rust, Apache Spark, Trino, Apache Airflow, dbt, Delta Lake, Apache Iceberg, Parquet, Kafka, Amazon Kinesis, AWS (S3, EC2, ELB, DynamoDB), Terraform.
About Rebel
Rebel helps enterprises turn their AI investments into real business value. We build the platform that lets large organizations make better decisions, run better operations, and use AI to automate the work that used to take rooms full of people, with the governance, evaluation, and observability needed to put it into production.
We are an equal opportunity employer. Rebel does not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability, or any other legally protected characteristic. Applicants with arrest or conviction records will be considered in accordance with applicable law.
If you need a reasonable accommodation to participate in the hiring process, reach out to contact@rebelinc.ai and we'll work with you to find one.