11 West 19th Street (22008), United States of America, New York, New York
Senior Manager, Data and Machine Learning Engineering
As a
Capital One Senior Manager of Data Engineering, you'll be part of
an Agile team dedicated to breaking the norm and pushing the limits
of continuous improvement and innovation. You will participate in
detailed technical design, development and implementation of
applications using existing and emerging technology platforms.
Working within an Agile environment, you will provide input into
architectural design decisions, develop code to meet story
acceptance criteria, and ensure that the applications we build are
always available to our customers. You'll have the opportunity to
mentor other engineers and develop your technical knowledge and
skills to keep your mind and our business on the cutting edge of
technology. As a Senior Manager of Data Engineering, you will need to understand
how to apply technologies in categories such as:
Cloud Computing Services (AWS, Azure, etc.)
Data Management Solutions (Metadata, Lineage, Quality)
Big Data Programming Frameworks (Hadoop, Spark, etc.)
Big Data Storage and Visualization Solutions
Machine Learning (ML) Programming Frameworks (scikit-learn,
PyTorch, Dask, Spark, or TensorFlow)
Performance and Scaling Techniques
Data Integration Patterns
Open Source Software
Who You Are:
You yearn to be part of cutting-edge, high-profile projects and
are motivated by delivering world-class solutions on an aggressive
Someone who has experience productionizing big data and
machine learning solutions
Someone who is not intimidated by challenges; thrives even under
pressure; is passionate about their craft; and is hyper-focused on
delivering exceptional results
You love to learn new technologies and mentor junior engineers
to raise the bar on your team
It would be awesome if you have a robust portfolio on GitHub
and/or open source contributions you are proud to share
What You'll Do:
Collaborating as part of a cross-functional Agile team to create
and enhance software that enables state-of-the-art, next-generation
Big Data & Machine Learning applications
Building efficient storage for structured and unstructured data
Collaborating with business analysts, data analysts, and data
scientists, and suggesting and leading architecture decisions
Developing and deploying distributed computing Big Data and
Machine Learning applications using Open Source frameworks like
Apache Spark, Spark MLlib, and Kafka on AWS Cloud
Utilizing programming languages like Java, Scala, and Python
Utilizing Hadoop modules such as YARN & MapReduce, and related
Apache projects such as Hive, HBase, Pig, and Cassandra
Designing, building, and scaling complex data pipelines for
machine learning models and evaluating their performance.
Leveraging DevOps techniques and practices like Continuous
Integration, Continuous Deployment, Test Automation, Build
Automation, and Test-Driven Development to enable the rapid delivery
of working code utilizing tools like Jenkins, Maven, Nexus, Chef,
Terraform, Ruby, Git and Docker
Performing unit tests and conducting reviews with other team
members to make sure your code is rigorously designed, elegantly
coded, and effectively tuned for performance
Basic Qualifications:
Bachelor’s Degree or military experience
At least 2 years of experience leading Data Engineering or
Machine Learning teams.
At least 3 years of experience with data gathering and
preparation for machine learning models.
At least 3 years of experience building, scaling, and optimizing
machine learning systems.
At least 4 years of experience programming with Python, Scala,
At least 4 years of experience working with machine learning
tools (scikit-learn, PyTorch, Dask, Spark, TensorFlow)
At least 3 years of experience developing and deploying machine
learning solutions in AWS, Azure, or Google Cloud Platform
4+ years of UNIX/Linux experience
2+ years of Agile engineering experience
Preferred Qualifications:
Master’s Degree or PhD in Computer Science, Electrical
Mastery of one or more ML model architectures such as neural
networks, decision trees, Bayesian models, association learning, or
3+ years of experience productionizing, monitoring, and
maintaining machine learning models
4+ years of experience designing and building data-intensive
solutions using distributed computing
Contributed to open source ML software.
Authored/co-authored a paper on an ML technique, model, or proof
Impacts the ML industry through conference presentations,
papers, blog posts, or patents.
Experience designing, implementing, and scaling complex data
pipelines for ML models and evaluating their performance.
Ability to communicate complex technical concepts clearly to a
variety of audiences.
Capital One will consider sponsoring a new qualified applicant
for employment authorization for this position.