Software Mind develops solutions that make an impact for companies around the globe. Tech giants & unicorns, transformative projects, emerging technologies and limitless opportunities – these are a few words that describe an average day for us. Building cross-functional engineering teams that take ownership and crave more means we’re always on the lookout for talented people who bring passion and creativity to every project. Our culture embraces openness, acts with respect, shows grit & guts and combines employment with enjoyment.
Job Description
Project – the aim you’ll have
Our customer provides innovative solutions and insights that enable their clients to manage risk and hire the best talent. Their advanced global technology platform supports fully scalable, configurable screening programs that meet the unique needs of over 33,000 clients worldwide. Headquartered in Atlanta, GA, they have an internationally distributed workforce of about 5,500 employees spanning 19 countries. Our partner performs over 93 million screens annually in over 200 countries and territories.
We are seeking a Senior Data Engineer with strong Python/PySpark skills to join the Data Engineering team and help build the customer's Data Analytics Platform on AWS.
Position – how you’ll contribute
- Develop reusable, metadata-driven data pipelines (see the sketch after this list)
- Automate and optimize data platform processes
- Build integrations with data sources and data consumers
- Add data transformation methods to shared ETL libraries
- Write unit tests and perform code reviews to ensure code quality
- Develop solutions for data platform monitoring and alerting (e.g., CloudWatch, third‑party tools)
- Proactively resolve performance and data quality issues in ETL processes
- Collaborate with infrastructure engineers to provision and configure cloud resources (VPC, IAM, S3, etc.)
- Contribute to platform documentation and runbooks
- Propose and implement improvements to data platform architecture
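To give a flavor of the first responsibility, here is a minimal PySpark sketch of a metadata-driven pipeline step. The config keys, S3 paths, and function names are hypothetical illustrations, not the team's actual API.

```python
# Minimal sketch of a metadata-driven pipeline step.
# All paths, config keys, and names below are hypothetical.
from pyspark.sql import SparkSession

# Example metadata record describing one ingest-and-transform step.
pipeline_config = {
    "source_path": "s3://example-bucket/raw/orders/",      # hypothetical path
    "source_format": "json",
    "target_path": "s3://example-bucket/curated/orders/",  # hypothetical path
    "target_format": "delta",  # assumes a Delta-enabled Spark session
    "partition_by": ["ingest_date"],
}

def run_pipeline(spark: SparkSession, cfg: dict) -> None:
    """Read, transform, and write a dataset as described by metadata."""
    df = spark.read.format(cfg["source_format"]).load(cfg["source_path"])
    (df.write
       .format(cfg["target_format"])
       .mode("overwrite")
       .partitionBy(*cfg["partition_by"])
       .save(cfg["target_path"]))

if __name__ == "__main__":
    spark = SparkSession.builder.appName("metadata-driven-step").getOrCreate()
    run_pipeline(spark, pipeline_config)
```

Driving pipeline behavior from metadata records like this, rather than hard-coding each job, is what makes the pipelines reusable across data sources.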
Expectations – the experience you need
- Strong database skills (SQL, data modeling, query optimization)
- Programming: Python / PySpark, SQL
- Proficient in building robust data pipelines using Spark (Databricks on AWS or EMR/EMR Serverless)
- Experienced working with large and complex datasets
- Skilled in building reusable data transformation modules organized as Python packages
- Familiar with Delta Lake optimization techniques on S3 (partitioning, Z-ordering, compaction) or equivalent table formats (Apache Iceberg, Hudi); a brief maintenance example follows this list
- Experienced in developing CI/CD pipelines (e.g., GitHub Actions, Jenkins, AWS CodePipeline)
- Experienced integrating with event brokers (Kafka, Amazon Kinesis) for ingestion and streaming use cases
- Understanding of basic networking and security in cloud environments (VPC, subnets, security groups, IAM)
- Familiar with Agile software development methodologies (Scrum)
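As an illustration of the Delta Lake maintenance mentioned above, the snippet below compacts small files and Z-orders a table. It assumes an environment where `spark` is already defined with Delta support (e.g., a Databricks notebook); the table path and column are hypothetical.

```python
# Illustrative Delta Lake table maintenance on S3.
# Assumes an active Delta-enabled Spark session (e.g., Databricks);
# the table path and column name are hypothetical.
table = "delta.`s3://example-bucket/curated/orders`"

# Compact small files and co-locate rows by a frequently filtered column.
spark.sql(f"OPTIMIZE {table} ZORDER BY (customer_id)")

# Remove files no longer referenced by the table (default 7-day retention).
spark.sql(f"VACUUM {table}")
```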
Additional skills – the edge you have
- Understanding of stream processing and Spark Structured Streaming or Kinesis Data Analytics (a streaming sketch follows this list)
- Experience with Infrastructure as Code (Terraform, AWS CloudFormation)
- Experience running containerized workloads (ECS/Fargate, EKS/Kubernetes)
- Experience building event-sourcing or CDC solutions; familiarity with Debezium is a plus
- Knowledge of AWS-native data services (AWS Glue, AWS Lambda, Amazon S3, Amazon Redshift, Amazon RDS, Amazon Athena)
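For context on the streaming skills above, here is a minimal Spark Structured Streaming sketch that reads from Kafka and writes to a Delta table. The broker address, topic, and S3 paths are hypothetical, and the job assumes the Kafka connector package is available on the cluster.

```python
# Minimal Structured Streaming sketch: Kafka -> Delta on S3.
# Broker, topic, and paths are hypothetical; assumes the
# spark-sql-kafka connector and Delta support are configured.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("kafka-ingest-sketch").getOrCreate()

events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical
          .option("subscribe", "screening-events")           # hypothetical topic
          .load()
          .select(col("key").cast("string"), col("value").cast("string")))

query = (events.writeStream
         .format("delta")
         .option("checkpointLocation", "s3://example-bucket/checkpoints/events/")
         .start("s3://example-bucket/raw/events/"))
query.awaitTermination()
```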
Our offer – professional development, personal growth
- Flexible employment and remote work
- International projects with leading global clients
- International business trips
- Non-corporate atmosphere
- Language classes
- Internal & external training
- Private healthcare and insurance
- Multisport card
- Well-being initiatives
Position at: Software Mind Poland
This role requires candidates to be based in Poland.