There's a version of this post that's a bake-off — a feature matrix of every orchestrator, every transform tool, every warehouse. We're not writing that post. Anyone who needs that comparison can find it in a vendor's marketing deck.
What we'll defend instead: pick tools that are boring in production and interesting in development. Airflow or Dagster, dbt, Snowflake or BigQuery, a managed Kafka or Kinesis, Terraform, Datadog. The list is unfashionable because it works.
The interesting decisions are not about which tools but about which seams: the boundaries where one system's responsibility stops and the next one's starts. Where does the warehouse end and the application start? Where does CDC end and event sourcing begin? Where does the dbt project live in your repo? Those choices determine whether the pipeline is still maintainable in 18 months, when half the original team is gone.
Our rule of thumb: every named system should fit on an index card. If you can't describe what a service does in one sentence, it's doing two things. Split it.
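If you want to make the index-card rule mechanical, here is a minimal sketch of a check you could run over a service catalog. Everything in it is an assumption for illustration: the `IndexCard` type, the `fits_on_card` and `cards_to_split` helpers, and the two example services are hypothetical, and the heuristic (exactly one sentence, no conjunction joining duties) is deliberately crude. It will flag some honest descriptions, so treat a hit as a prompt for a conversation, not a verdict.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexCard:
    """One named system and the sentence that describes it (hypothetical type)."""
    name: str
    description: str


# Crude, assumed heuristics: sentence-ending punctuation and duty-joining words.
_SENTENCE_END = re.compile(r"[.!?](?:\s+|$)")
_CONJUNCTION = re.compile(r"\b(and|also|plus|as well as)\b", re.IGNORECASE)


def fits_on_card(card: IndexCard) -> bool:
    """True if the description is one sentence with no conjunction in it."""
    text = card.description.strip()
    sentences = [s for s in _SENTENCE_END.split(text) if s.strip()]
    return len(sentences) == 1 and not _CONJUNCTION.search(text)


def cards_to_split(cards: list[IndexCard]) -> list[str]:
    """Names of systems whose description suggests more than one job."""
    return [card.name for card in cards if not fits_on_card(card)]


# Hypothetical catalog entries, invented for this example.
cards = [
    IndexCard("orders_cdc", "Streams row-level changes from the orders database into Kafka."),
    IndexCard("loader", "Loads events into the warehouse and sends weekly email reports."),
]

print(cards_to_split(cards))
```

Here `loader` gets flagged because its description joins two jobs with "and", which is exactly the service the rule says to split. The check is only as useful as the descriptions are honest, so the cards themselves still have to be written by someone who knows the system.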