About the company
Appen is a leader in AI enablement for critical tasks such as model improvement, supervision, and evaluation. To do this, we leverage our global crowd of over one million skilled contractors who speak over 180 languages and dialects and represent 130 countries. In addition, we utilize the industry's most advanced AI-assisted data annotation platform to collect and label various types of data, such as images, text, speech, audio, and video. Our data is crucial for building and continuously improving the world's most innovative artificial intelligence systems, and Appen is already trusted by the world's largest technology companies. Now, with the explosion of interest in generative AI, Appen is giving leaders in automotive, financial services, retail, healthcare, and government the confidence to deploy world-class AI products.

At Appen, we are purpose-driven. Our fundamental role in AI is to ensure all models are helpful, honest, and harmless, so we firmly believe in unlocking the power of AI to build a better world. We have a learn-it-all culture that values perspective, growth, and innovation. We are customer-obsessed, action-oriented, and celebrate winning together.

At Appen, we are committed to creating an inclusive and diverse workplace. We are an equal opportunity employer that does not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
Job Summary
Key Responsibilities:
📍Design, build, and manage large-scale data infrastructure using a variety of AWS technologies such as Amazon Redshift, AWS Glue, Amazon Athena, AWS Data Pipeline, Amazon Kinesis, Amazon EMR, and Amazon RDS.
📍Design, develop, and maintain scalable data pipelines and architectures on Databricks using tools such as Delta Lake, Unity Catalog, and Apache Spark (Python or Scala), or similar technologies (a minimal illustrative sketch follows this list).
📍Integrate Databricks with cloud platforms like AWS to ensure smooth and secure data flow across systems.
📍Build and automate CI/CD pipelines for deploying, testing, and monitoring Databricks workflows and data jobs.
📍Continuously optimize data workflows for performance, reliability, and security, applying Databricks best practices around data governance and quality.
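For candidates less familiar with the Databricks stack, here is a minimal sketch of the kind of Spark/Delta Lake pipeline described above. It is illustrative only: the bucket paths, column names, and job name are hypothetical, and it assumes a Spark environment where the Delta format is available (as it is on Databricks).

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # Hypothetical job name; on Databricks the session usually already exists.
    spark = (
        SparkSession.builder
        .appName("events_bronze_to_silver")
        .getOrCreate()
    )

    # Read raw JSON landed by an upstream ingestion job (path is illustrative).
    raw = spark.read.json("s3://example-bucket/raw/events/")

    # Light cleanup: drop malformed rows, normalize the event timestamp,
    # and derive a date column to partition by.
    clean = (
        raw.dropna(subset=["event_id", "event_ts"])
           .withColumn("event_ts", F.to_timestamp("event_ts"))
           .withColumn("event_date", F.to_date("event_ts"))
    )

    # Append to a partitioned Delta table; the "delta" format is built in
    # on Databricks (elsewhere it requires the delta-spark package).
    (clean.write
          .format("delta")
          .mode("append")
          .partitionBy("event_date")
          .save("s3://example-bucket/silver/events/"))

In practice, a pipeline like this would be parameterized, scheduled as a Databricks job, and governed through Unity Catalog, which is the kind of end-to-end ownership this role involves.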
Qualifications:
📍5-7 years of hands-on experience with AWS data engineering technologies, such as Amazon Redshift, AWS Glue, AWS Data Pipeline, Amazon Kinesis, Amazon RDS, and Apache Airflow (a short illustrative Airflow sketch follows this list).
📍Hands-on experience working with Databricks, including Delta Lake, Apache Spark (Python or Scala), and Unity Catalog.
📍Demonstrated proficiency in SQL and NoSQL databases, ETL tools, and data pipeline workflows.
📍Experience with Python and/or Java.
📍Deep understanding of data structures, data modeling, and software architecture.
📍Experience with AI and machine learning technologies is highly desirable.
📍Strong problem-solving skills and attention to detail.
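As a concrete illustration of the Apache Airflow experience listed above, a minimal daily ETL workflow might look like the sketch below. All DAG and task names are invented, the task bodies are placeholders, and it assumes Airflow 2.4 or later.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        # Placeholder: pull raw records from a source system.
        print("extracting raw data")

    def transform():
        # Placeholder: clean and reshape the extracted records.
        print("transforming data")

    def load():
        # Placeholder: load curated data into the warehouse (e.g. Redshift).
        print("loading data into warehouse")

    with DAG(
        dag_id="example_daily_etl",      # hypothetical DAG id
        start_date=datetime(2024, 1, 1),
        schedule="@daily",               # Airflow 2.4+; older versions use schedule_interval
        catchup=False,
    ) as dag:
        extract_task = PythonOperator(task_id="extract", python_callable=extract)
        transform_task = PythonOperator(task_id="transform", python_callable=transform)
        load_task = PythonOperator(task_id="load", python_callable=load)

        # Run the three stages in sequence: extract, then transform, then load.
        extract_task >> transform_task >> load_task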