- Work collaboratively with the support and engineering teams to deploy and operate systems.
- Help automate and streamline operations and processes.
- Build and maintain tools for deployment, monitoring, and operations.
- Monitor and optimize system performance.
- Troubleshoot and resolve issues in development, test, and production environments.
- Develop, maintain, and support a large AI system deployed on-premises.
- Experience managing infrastructure, services, and databases deployed on a cloud service provider such as Azure or AWS.
- At least 2-3 years of experience in Linux administration in production or similar environments.
- Strong understanding of at least one scripting language, such as Bash, Python, or Perl.
- Strong understanding of current network protocols, architecture, and design.
- Experience with containerization tools such as Docker.
- Experience with Nginx, Grafana, Prometheus, and the ELK stack.
- Experience with Oracle SQL. NoSQL experience is a plus.
- Knowledge of IT operations best practices, and excellent verbal and written communication skills.
- Linux: 1+ years in Unix systems engineering with experience in Ubuntu, Red Hat Linux, or CentOS.
- In-depth knowledge of and hands-on experience with Docker, docker-compose, and Kubernetes.
- In-depth knowledge of networking and operating systems.
- Experience programming with Node.js.
- Good knowledge of Apache for routing and load balancing.
- Experience managing or working with a microservices architecture.
- Experience with monitoring tools such as Prometheus, Grafana, ELK stack.
- Working knowledge of TCP/IP networking, SMTP, HTTP, load balancers, and API gateways.
- Security awareness, with experience setting up access rules and configuring firewalls.
- Ability to keep systems running at peak performance, applying operating system upgrades, patches, and version upgrades as required.
- Hands-on experience with MongoDB and Oracle/MySQL.
- Hands-on experience with PM2 for managing Node.js services.