Machine Learning Engineer
Posted October 10
Machine Learning Engineer - Cloud Platforms
We are seeking a highly skilled Machine Learning Engineer with extensive experience in running and scaling models on public cloud platforms. The successful candidate will join our team to optimize, implement, and maintain our organization’s cloud-based systems.
Responsibilities
- Design, develop, and deploy modular models on cloud-based systems.
- Develop and maintain cloud solutions in accordance with best practices.
- Ensure that data storage and processing run efficiently, in line with company security policies and cloud security best practices.
- Identify, analyze, and resolve infrastructure vulnerabilities and application deployment issues.
- Regularly review existing models and systems, making recommendations for improvements.
- Collaborate with engineering and development teams to evaluate and identify optimal cloud solutions.
- Modify and improve existing systems.
- Educate teams on the implementation of new cloud technologies and initiatives.
Technical Skills
- Proven work experience as a Machine Learning Engineer or in a similar role.
- Preferred: GCP certifications.
- Strong troubleshooting and analytical skills.
- Excellent communication and collaboration skills.
- Relevant training and/or certifications as a Google Cloud Engineer.
- Proficiency in Google Compute Engine, including creating, managing, and scaling virtual machines (VMs) to meet diverse workload requirements.
- Expertise in designing solution architecture on Google Cloud Platform (GCP), aligning with project-specific needs and objectives.
- Strong grasp of programming languages and runtimes such as Python, Java, Go, and Node.js.
- Expertise in programming frameworks and libraries specific to GCP, such as Google Cloud SDK, Cloud Client Libraries, and Cloud APIs.
- Understanding of serverless computing, containerization (e.g., Docker, Kubernetes), and event-driven architectures for developing scalable and efficient applications on GCP.
- Knowledge of containers and infrastructure automation tools.
- Proficiency with tools and environments including Ansible, Docker, Windows PowerShell, and Linux/Unix.
- Expertise in Google Cloud Storage.
- Familiarity with other GCP services like BigQuery, Cloud Pub/Sub, and Cloud Functions.
Software Security Skills
- Expertise in securing applications and data on Google Cloud, including:
  - Configuring Identity and Access Management (IAM) to control resource access.
  - Implementing network security measures such as firewall rules and encryption.
  - Monitoring and logging security events using tools like Cloud Logging (formerly Stackdriver Logging).
  - Ensuring data encryption at rest and in transit.
  - Securing containers and Kubernetes clusters.
  - Complying with security standards and regulations.
  - Developing incident response plans.
  - Automating security checks and vulnerability scanning.
  - Following secure development practices.
Coding and Scripting Skills
- Proficiency in programming languages like Python, Java, or Go to build and customize applications on the Google Cloud Platform.
- Knowledge of infrastructure-as-code (IaC) tools such as Terraform and/or Google Cloud Deployment Manager for defining and managing infrastructure resources as code.
- Ability to automate tasks, deployments, and resource management with scripting languages and tools such as the Cloud SDK or Cloud APIs.
- Familiarity with containerization technologies like Docker and Kubernetes for efficient deployment and management of applications.
- Experience integrating Cloud APIs and using monitoring and logging tools such as Cloud Monitoring and Cloud Logging (formerly Stackdriver) to monitor applications, troubleshoot issues, and improve overall performance.
Testing Skills
The primary objective of any GCP Engineer is to expedite the software delivery process for clients, ensuring swift and efficient deployment.
- Strong testing skills and a comprehensive understanding of the testing process. Testing is critical to automation and to successful outcomes in this role.
- Ability to establish tests that confirm each component functions as intended, and to run them at every stage from development to deployment so that new features integrate seamlessly across the system.
- A consistent focus on quality, with rigorous testing prioritized throughout to deliver high-quality work.
Continuous Integration and Continuous Deployment (CI/CD)
- Continuous Integration and Continuous Deployment (CI/CD) in Google Cloud refers to a set of practices and tools that enable developers to automate the process of building, testing, and deploying applications. CI involves automatically integrating code changes into a shared repository and running tests to ensure code quality. CD takes it further by automating the deployment of tested code to production environments.
- In Google Cloud, CI/CD pipelines can be set up using tools like Cloud Build, Cloud Source Repositories, and Cloud Deployment Manager. These pipelines help streamline development workflows, increase collaboration, and ensure rapid and reliable application delivery, allowing teams to deliver software updates more frequently and with reduced manual effort.
- Experience with Azure DevOps and Azure Pipelines.
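For candidates who want a concrete picture of the CI/CD flow described in this section, the following is a minimal Python sketch of how passing tests gate deployment. It is not tied to Cloud Build, Azure Pipelines, or any other specific tool. The `Stage` class, the `run_pipeline` function, and the stage names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    """One pipeline step; `run` returns True on success (hypothetical type)."""
    name: str
    run: Callable[[], bool]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that completed successfully.
    """
    completed: List[str] = []
    for stage in stages:
        if not stage.run():
            # A failed build or test blocks every later stage,
            # so untested code never reaches deployment.
            break
        completed.append(stage.name)
    return completed


# Placeholder callables standing in for real build, test, and deploy jobs.
passing = [
    Stage("build", lambda: True),
    Stage("test", lambda: True),
    Stage("deploy", lambda: True),
]
failing_tests = [
    Stage("build", lambda: True),
    Stage("test", lambda: False),
    Stage("deploy", lambda: True),
]

print(run_pipeline(passing))        # ['build', 'test', 'deploy']
print(run_pipeline(failing_tests))  # ['build']
```

In the second run the deploy stage never executes because the test stage fails, which is the behavior a CI/CD pipeline is expected to enforce.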
Ideal Candidate Behaviours
- Self-learning abilities
- Radiates energy & enthusiasm for their field, and can instil the same passion in others
- Naturally curious & always a step ahead
- Critical thinker, challenging others in a constructive way, continuously looking for improvements
- Can combine big-picture thinking with attention to detail