Data Engineer
Own Rebel's end-to-end lakehouse platform: Iceberg/Delta tables, governance, streaming ingestion, and a serverless SQL warehouse that real customers run analytics on.
- Compensation
- $100,000 – $180,000 · No equity
- Employment
- Contract
- Experience
- 7+ years
- Work policy
- Remote only (United States)
- Company location
- Virginia
- Visa sponsorship
- Not available
- Relocation
- Allowed
- Preferred timezones
- Pacific, Central, Eastern, Atlantic
The role
We're building an end-to-end lakehouse platform and need an engineer to own it. This is a builder role, not a maintenance role.
What you'll do
- Own the lakehouse layer: Iceberg/Delta tables, partitioning, time-travel, compaction, storage costs
- Build the governance plane: catalog, column-level lineage, data quality monitoring, and discovery
- Stand up declarative incremental ETL, a pipeline designer, and DAG orchestration with retries and backfills
- Implement streaming ingestion with exactly-once semantics and edge agents outside the VPC
- Run a serverless SQL warehouse with materialized views and provisioned compute tiers
- Build open-protocol data sharing and privacy-preserving clean rooms
You should have
- 7+ years on data platforms, deep in at least one lakehouse stack (Databricks, Snowflake, or Iceberg/Delta + Spark/Trino)
- Iceberg and/or Delta Lake internals: metadata layout, snapshot isolation, compaction, catalog integrations
- Spark, Flink, or Trino internals; query optimization; columnar formats
- Production streaming (Kafka, Kinesis, Flink, or Structured Streaming) with exactly-once delivery
- Python and SQL fluency; Scala or Rust a plus
- Data governance experience: catalogs, lineage, access control, PII
- AWS (S3, IAM, VPC, Kinesis) and Terraform
Nice to have
- Built an internal Databricks-like platform
- OSS contributions to Iceberg, Delta, Spark, Trino, Airflow, or dbt
- Privacy-preserving compute (differential privacy, secure enclaves, MPC)
Stack
Python, SQL, Scala, Rust, Apache Spark, Trino, Apache Airflow, dbt, Delta Lake, Apache Iceberg, Parquet, Kafka, Amazon Kinesis, AWS (S3, EC2, ELB, DynamoDB), Terraform.
About Rebel
Rebel helps enterprises turn their AI investments into real business value. We build the platform that lets large organizations make better decisions, run better operations, and use AI to automate the work that used to take rooms full of people, with the governance, evaluation, and observability needed to put it into production.
We are an equal opportunity employer. Rebel does not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability, or any other legally protected characteristic. Applicants with arrest or conviction records will be considered in accordance with applicable law.
If you need a reasonable accommodation to participate in the hiring process, reach out to contact@rebelinc.ai and we'll work with you to find one.